Heykuki News
Top
New
Best
Ask
Show
Jobs
1.
DeepSWE Audit: DeepSeek-v4-pro results are unreliable
(github.com/datacurve-ai)
3 points
eunos
12 hours ago
2.
DeepSWE results are unreliable – 3/3 DSv4 "failed" tasks solved with same model
(github.com/datacurve-ai)
2 points
theanonymousone
an hour ago