Collections
Discover the best community collections!
Collections including paper arxiv:2504.20595

- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 24
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 152
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25

- Toward Evaluative Thinking: Meta Policy Optimization with Evolving Reward Models
  Paper • 2504.20157 • Published • 37
- The Leaderboard Illusion
  Paper • 2504.20879 • Published • 72
- ReasonIR: Training Retrievers for Reasoning Tasks
  Paper • 2504.20595 • Published • 54
- RM-R1: Reward Modeling as Reasoning
  Paper • 2505.02387 • Published • 81

- Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models
  Paper • 2503.21380 • Published • 38
- Video-R1: Reinforcing Video Reasoning in MLLMs
  Paper • 2503.21776 • Published • 79
- Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks
  Paper • 2503.21696 • Published • 23
- M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
  Paper • 2504.10449 • Published • 15

- M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding
  Paper • 2411.04952 • Published • 29
- Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models
  Paper • 2411.05005 • Published • 13
- M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models
  Paper • 2411.04075 • Published • 16
- Self-Consistency Preference Optimization
  Paper • 2411.04109 • Published • 19

- I-Con: A Unifying Framework for Representation Learning
  Paper • 2504.16929 • Published • 30
- LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities
  Paper • 2504.16078 • Published • 21
- WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
  Paper • 2504.15785 • Published • 22
- OTC: Optimal Tool Calls via Reinforcement Learning
  Paper • 2504.14870 • Published • 35

- Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models
  Paper • 2502.04404 • Published • 25
- Learning Adaptive Parallel Reasoning with Language Models
  Paper • 2504.15466 • Published • 44
- TTRL: Test-Time Reinforcement Learning
  Paper • 2504.16084 • Published • 120
- THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models
  Paper • 2504.13367 • Published • 26

- Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agent
  Paper • 2304.09542 • Published • 5
- Dense X Retrieval: What Retrieval Granularity Should We Use?
  Paper • 2312.06648 • Published • 1
- Context Tuning for Retrieval Augmented Generation
  Paper • 2312.05708 • Published • 16
- Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models
  Paper • 2312.02969 • Published • 14