Highly relevant to LLM integration, RAG, and fine-tuning: it proposes SARDI, a dynamic RAG framework that uses lookahead tokens to guide retrieval during denoising. The paper presents…
This paper is highly relevant to LLM integration, fine-tuning, and production AI. It proposes a novel cross-layer sparse attention architecture for long-context inference in large language models,…
This paper is highly relevant to LLM integration, fine-tuning, and production AI. It presents a system (Vortex) for efficient and programmable sparse attention serving for AI agents, with specific…
Highly relevant to LLM integration and fine-tuning: it addresses machine unlearning and proposes Alternating Token-Weighted Unlearning (ATWU), a framework that jointly learns token…
This paper is relevant to RAG, fine-tuning, and LLM integration as it introduces Code2LoRA, a hypernetwork framework that generates repository-specific LoRA adapters for code language models,…
Directly addresses LLM integration and fine-tuning through OpAI-Bench, a benchmark that provides a controlled testbed for analyzing AI-assisted writing under realistic progressive editing scenarios.…
This paper is relevant to fine-tuning, LLM integration, and RAG. It discusses the effectiveness of a specific prompt skill in improving code generation and presents a controlled study with empirical…
Relevant to LLM integration and context engineering, with specific technical content on auditing LLM-based stance simulation and counterfactual context revision.
Relevant to browser automation and RAG, as it discusses GUI agents and drag grounding, which involves controlling graphical user interfaces through vision-based models. The proposed dataset and…
This paper is relevant to fine-tuning, LLM integration, and production AI. The abstract describes a method to decompose factual sycophancy in language models and investigates the effects of size and…
Directly addresses LLM integration, agent architectures, and fine-tuning, with specific technical content on Causal Minimal Tool Filtering (CMTF) and empirical results.
Directly addresses LLM integration, with specific technical content on evaluating demographic perspective-taking in LLM hate speech annotation, including methods and results.
This paper is relevant to LLM integration, fine-tuning, and context engineering. It proposes a multitask representation engineering framework to improve the readability of LLM-generated code,…
The abstract discusses the DeepSeek V4 Flash model and its performance on local inference. It mentions the model's intelligence, efficiency, and context window scaling, which is related to LLM…