Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published 8 days ago • 81
Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report Paper • 2601.21051 • Published 9 days ago • 12
Sweep Next Edit Collection Locally running next edit autocomplete • 2 items • Updated 10 days ago • 4
Scaling Law Discovery Collection Dataset and results for SLD (https://arxiv.org/abs/2507.21184) • 2 items • Updated 30 days ago • 2
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models Paper • 2601.15165 • Published 16 days ago • 71
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks Paper • 2412.15204 • Published Dec 19, 2024 • 38
CIMemories: A Compositional Benchmark for Contextual Integrity of Persistent Memory in LLMs Paper • 2511.14937 • Published Nov 18, 2025 • 1
Foundation-Sec-8B Collection Foundation-Sec-8B models and quantizations. • 8 items • Updated 9 days ago • 6
GPT-OSS Pruned Experts (4.2B-20B) [IF, Science, Math, etc.] Collection Complete collection of domain-specialized GPT-OSS models (1-32 experts) optimized for science, math, medicine, law, safety, and instruction following. • 8 items • Updated Aug 13, 2025 • 10
Tool Use Reasoning Collection A collection of tool use reasoning datasets in Hermes format • 5 items • Updated Jul 23, 2025 • 9
GLiNER-PII Collection PII detection models developed in collaboration with Wordcab • 5 items • Updated 9 days ago • 21