Most agent debugging today is theatre: “trust me, it’s the same prompt” and “it worked on my run.” In production, runs diverge: sampling jitter, tool timing, memory writes, hidden state, flaky endpoints, and plain old nondeterminism. Benchmarks tell you that you failed. A single log tells you what happened once. What you actually need is a diff: where did the timelines first split, and what changed?
I built TimelineDiff (Differential Reproducibility) to do exactly that. Upload two DRP trace bundles (.zip) and it will:
• Align both timelines event-by-event
• Identify the first divergence step (the moment reality splits)
• Show the delta: missing events, changed tool outputs, memory mutations, control-flow differences
• Export a shareable evidence pack (so you can stop arguing and start fixing)
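The core idea behind the first two bullets can be sketched in a few lines. This is a minimal illustration, not TimelineDiff’s actual implementation: the event dicts, the `type`/`tool`/`output` keys, and the `first_divergence` helper are all hypothetical stand-ins for whatever the DRP bundle format really contains.

```python
from itertools import zip_longest

def first_divergence(run_a, run_b, keys=("type", "tool", "output")):
    """Walk two event timelines in lockstep and report where they first split.

    Returns (index, delta): delta maps each differing key to its
    (run_a, run_b) pair, or flags a missing event if one run is shorter.
    Returns (None, None) if the runs agree event-by-event.
    NOTE: the event schema here is a hypothetical example, not DRP's.
    """
    for i, (ea, eb) in enumerate(zip_longest(run_a, run_b)):
        if ea is None or eb is None:
            # One timeline ended early: the shorter run is missing this event.
            return i, {"missing_event_in": "run_a" if ea is None else "run_b"}
        changed = {k: (ea.get(k), eb.get(k)) for k in keys if ea.get(k) != eb.get(k)}
        if changed:
            return i, changed
    return None, None

# Two "identical" sessions that quietly diverge on a memory write:
run_a = [{"type": "tool_call", "tool": "search", "output": "ok"},
         {"type": "memory_write", "tool": None, "output": "cached"}]
run_b = [{"type": "tool_call", "tool": "search", "output": "ok"},
         {"type": "memory_write", "tool": None, "output": "stale"}]

idx, delta = first_divergence(run_a, run_b)
# idx == 1, delta == {"output": ("cached", "stale")}
```

Real traces need fuzzier alignment than strict lockstep (reordered or interleaved events), but even this naive version makes the point: the interesting artifact is the index and delta, not either log on its own.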
Space: TimelineDiff Differential Reproducibility - a Hugging Face Space by RFTSystems
If you’re shipping agents, eval tooling, or anything that relies on “reproducible” behaviour: run TimelineDiff on two sessions you swear are the same. You’ll find the split fast, and you’ll have receipts you can hand to a teammate, a reviewer, or a client.
RFTSystems, Liam
⸻
#reproducibility #mlops #observability #agenticAI #aiSafety #evals #debugging #forensics #traceability #llm #RFTSystems #TrustStack #verification