โ† Back to 2026-06-07 digest

Decomposing Factual Sycophancy in Language Models: How Size and Instruction Tuning Shape Robustness

8/10 arXiv Friday, June 5, 2026

Why This Matters

This paper is relevant to fine-tuning, LLM integration, and production AI. The abstract describes a method to decompose factual sycophancy in language models and investigates the effects of size and instruction tuning on robustness. The paper presents empirical results and provides specific technical content, including the use of 56 open-weight models and 13 manipulation types.

Abstract

Factual sycophancy occurs when a language model abandons a correct, verifiable answer under social pressure. Because a flip occurs only when pressure toward a false answer exceeds the model's neutral preference for the truth, flip rates conflate two mechanisms: the strength of that baseline preference (truth margin), and how far pressure shifts it (manipulation sensitivity). We decompose factual sycophancy into these channels and use them to separate the effects of size and instruction tuning across 56 open-weight models spanning 0.3B-32B parameters and 13 manipulation types. We find that vulnerability is governed mainly by size, but instruction tuning changes how size acts: small instruction-tuned models can become less robust, whereas large instruction-tuned models usually become more robust. Instruction tuning primarily increases truth margin, but its behavioral effect depends on manipulation type. Scaling also changes the two channels differently: base models gain margin but become mildly more manipulation-sensitive, whereas instruction-tuned models gain margin faster and become less sensitive. Factual sycophancy is therefore not a single scalar property. Evaluations should report channel-specific, manipulation-specific, and size-conditioned robustness rather than flip rates alone.

Links

Metadata

Authors: Victor De Marez, Luna De Bruyne, Walter Daelemans

Categories: cs.CL

Published: Friday, June 5, 2026

Save to Vault

Save this article directly to your Obsidian vault. Opens Obsidian with the note pre-filled.

๐Ÿ“‹ Save to Obsidian Vault

Will save to: vault/inbox/signals/2026-06-07-decomposing-factual-sycophancy-in-language-models-how-size-and-instruction-tunin.md