Heykuki News

1. Show HN: Terminal-Bench-RL: Training long-horizon terminal agents with RL (github.com/Danau5tin)
   125 points by Danau5tin, 10 months ago, 12 comments
2. Show HN: Multi-Agent-Coder Is #12 on Stanford's TBench. Beats Claude Code (github.com/Danau5tin)
   5 points by Danau5tin, 9 months ago, 1 comment
3. My weekend project accidentally beat Claude Code – #12 on Stanford's TBench (github.com/Danau5tin)
   2 points by Danau5tin, 9 months ago, 2 comments
4. Scaling Coding-Agent RL to 32x H100s. 160% Improvement on Stanford's TBench (github.com/Danau5tin)
   2 points by Danau5tin, 7 months ago, 1 comment