Heykuki News

Top | New | Best | Ask | Show | Jobs
91.
THUDM/AgentBench: A Comprehensive Benchmark to Evaluate LLMs as Agents (github.com/THUDM)
1 point
freediver
3 years ago
discuss
92.
AgentBench: A Comprehensive Benchmark to Evaluate LLMs as Agents (github.com/THUDM)
1 point
swyx
3 years ago
discuss
93.
Evaluate Multiple LLMs Easily (github.com/ray-project)
1 point
fzliu
3 years ago
discuss
94.
acorn-macros – Evaluates and replaces JavaScript macros with Acorn (github.com/heyheyhello)
1 point
ducaale
5 years ago
discuss
95.
Show HN: Evaluate Markdown code blocks within Vim (github.com/gpanders)
1 point
gpanders
6 years ago
discuss
96.
Show HN: Little tool to evaluate your cryptocurrency trades on Poloniex (github.com/enricobacis)
1 point
enricobacis
9 years ago
discuss
97.
Seeing Is Believing: Evaluates Ruby code, recording the results of each line (github.com/JoshCheek)
1 point
lobo_tuerto
9 years ago
discuss
98.
Go proposal: clarify how proposals are evaluated (github.com/golang)
1 point
sgmansfield
10 years ago
discuss
99.
Evaluate JavaScript code blocks from within markdown (github.com/reggi)
1 point
thomasreggi
11 years ago
discuss
100.
Ask HN: How do you familiarize yourself with a new codebase?
407 points
roflc0ptic
11 years ago
238 comments
101.
Show HN: Faster FastAPI with simdjson and io_uring on Linux 5.19 (github.com/unum-cloud)
290 points
ashvardanian
3 years ago
90 comments
102.
Show HN: EVA – AI-Relational Database System (github.com/georgia-tech-db)
237 points
jarulraj
3 years ago
36 comments
103.
Show HN: Open-source, browser-local data exploration using DuckDB-WASM and PRQL (github.com/pretzelai)
227 points
prasoonds
2 years ago
74 comments
104.
Show HN: Mizu.js – Lightweight HTML templating library for any-side rendering (mizu.sh)
225 points
lowlighter
a year ago
86 comments
105.
Show HN: Qwen-2.5-32B is now the best open source OCR model (github.com/getomni-ai)
211 points
themanmaran
a year ago
47 comments
106.
Show HN: PromptTools – open-source tools for evaluating LLMs and vector DBs (github.com/hegelai)
211 points
krawfy
3 years ago
24 comments
107.
Less Slow C++ (github.com/ashvardanian)
198 points
ashvardanian
a year ago
97 comments
108.
Show HN: cmux - Ghostty-based terminal with vertical tabs and notifications (github.com/manaflow-ai)
198 points
lawrencechen
4 months ago
77 comments
109.
Show HN: Use Code Llama as Drop-In Replacement for Copilot Chat (continue.dev)
187 points
sestinj
3 years ago
52 comments
110.
Show HN: Benchmarking VLMs vs. Traditional OCR (getomni.ai)
146 points
themanmaran
a year ago
40 comments
111.
Show HN: Quality News – Towards a fairer ranking algorithm for Hacker News (news.social-protocols.org)
140 points
manx
3 years ago
71 comments
112.
Show HN: Statewright – Visual state machines that make AI agents reliable (github.com/statewright)
126 points
azurewraith
25 days ago
59 comments
113.
Show HN: Terminal-Bench-RL: Training long-horizon terminal agents with RL (github.com/Danau5tin)
125 points
Danau5tin
10 months ago
12 comments
114.
Show HN: Ragas – Open-source library for evaluating RAG pipelines (github.com/explodinggradients)
121 points
shahules
2 years ago
26 comments
115.
Launch HN: Confident AI (YC W25) – Open-source evaluation framework for LLM apps
117 points
jeffreyip
a year ago
27 comments
116.
Show HN: Improve LLM Performance by Maximizing Iterative Development (github.com/palico-ai)
104 points
asif_
2 years ago
22 comments
117.
Launch HN: Grai (YC S22) – Open-Source Data Observability Platform
101 points
ersatz_username
3 years ago
44 comments
118.
Show HN: Dittofeed – 1-Click deploy, self-host Mailchimp alternative (github.com/dittofeed)
100 points
maxthegeek1
3 years ago
29 comments
119.
Show HN: Burr – A framework for building and debugging GenAI apps faster (github.com/DAGWorks-Inc)
94 points
elijahbenizzy
2 years ago
22 comments
120.
Show HN: Gambit, an open-source agent harness for building reliable AI agents (github.com/bolt-foundry)
91 points
randall
5 months ago
27 comments