Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation

7/10 arXiv Friday, June 5, 2026

Why This Matters

Relevant to LLM integration and fine-tuning: the paper uses reinforcement learning to train models to translate languages they have never seen by drawing on linguistic material supplied in context.

Abstract

Prior work has shown that large language models (LLMs) can translate unseen or low-resource languages by undergoing continued training or even by encoding a grammar book in their context. However, both methods typically overfit specific languages, with limited zero-shot transfer at test time. To translate extremely low-resource languages at scale, we argue that LLMs must acquire the meta-skill of utilizing in-context linguistic knowledge rather than memorizing specific languages. In this paper, we propose a reinforcement learning (RL) approach to unseen language translation given rich linguistic context, using a surface-level translation metric (chrF) as the reward. Empirically, despite the lightweight reward, our RL-trained models effectively extract and apply relevant linguistic information from the provided context, leading to better translations on completely unseen languages than in-context learning or supervised fine-tuning. Our analyses suggest that outcome-based RL can extend beyond conventional reasoning tasks like math and coding to serve as a recipe for language learning from context.
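To make the "surface-level translation metric (chrF) as the reward" idea concrete, the sketch below computes a simplified character n-gram F-score between a model translation and a reference. It is an illustration under stated assumptions, not the authors' implementation: the function names `char_ngrams` and `chrf_reward`, the choice to strip whitespace, the maximum n-gram order of 6, beta = 2, and the averaging of per-order F-scores are all assumptions made here, and a real experiment would more likely call an established chrF implementation.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams after removing whitespace (an assumption of this sketch)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf_reward(hypothesis: str, reference: str,
                max_order: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF-style score in [0, 1], usable as a scalar reward.

    For each n-gram order, compute precision and recall of the hypothesis
    against the reference, combine them into an F-beta score, and average
    over the orders for which both strings have at least one n-gram.
    """
    beta_sq = beta ** 2
    scores = []
    for n in range(1, max_order + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one string is too short for this order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precision = overlap / sum(hyp.values())
        recall = overlap / sum(ref.values())
        if precision + recall == 0:
            scores.append(0.0)
            continue
        scores.append((1 + beta_sq) * precision * recall
                      / (beta_sq * precision + recall))
    return sum(scores) / len(scores) if scores else 0.0
```

In an outcome-based RL loop, a score like this would be computed once per sampled translation and passed to the policy update as the reward. With beta = 2, recall counts more than precision, so a translation that omits reference content is penalized more heavily than one that adds extra characters.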

Metadata

Authors: Hanxu Hu, Zdeněk Šnajdr, Pinzhen Chen, Jannis Vamvas, Rico Sennrich

Categories: cs.CL

Published: Friday, June 5, 2026
