Heykuki News
Top
New
Best
Ask
Show
Jobs
31.
Show HN: EvalView – pytest-style tests for AI agents (budgets, hallucinations)
(github.com/hidai25)
1 point
hidai25
6 months ago
1 comment
32.
Evaluatly is now open source and free
(github.com/evaluatly)
1 point
gpnt
6 years ago
1 comment
33.
Show HN: EvalLens – Open-source tool to evaluate structured LLM outputs
(github.com/simonrendona)
1 point
simonrendon
2 months ago
discuss
34.
Evalien – Node.js event loop agent harness
(github.com/agentbellnorm)
1 point
agentbellnorm
3 months ago
discuss
35.
Show HN: Evals skill for agents – no tooling, just Markdown and subagents
(github.com/adriancooney)
1 point
adriancooney
4 months ago
discuss
36.
Evalite: Evaluate your LLM-powered apps with TypeScript
(github.com/mattpocock)
1 point
handfuloflight
6 months ago
discuss
37.
Triilman25/evaluation-machine-for-classification-models
(github.com/triilman25)
1 point
triilman
a year ago
discuss
38.
Eval Villain update released: find those dangerous JavaScript functions
(github.com/swoops)
1 point
tony-ds
2 years ago
discuss
39.
GitHub Action for Cluster API
(github.com/evalsocket)
1 point
evalsocket
6 years ago
discuss
40.
Eval – a bot that executes arbitrary JavaScript and posts the result on Pleroma
(github.com/CosineP)
1 point
aeroplain
6 years ago
discuss
41.
Evaldb: Use your favorite language as a database
(github.com/turbio)
1 point
amasad
6 years ago
discuss
42.
Show HN: Evalfilter: A simple Golang evaluation engine for filtering via scripts
(github.com/skx)
1 point
stevekemp
7 years ago
discuss
43.
Evaluate JavaScript code blocks from within markdown
(github.com/reggi)
1 point
thomasreggi
11 years ago
discuss
44.
Show HN: Open-source alternative to ChatGPT Agents for browsing
(github.com/trymeka)
104 points
ElasticBottle
10 months ago
23 comments
45.
Show HN: ColiVara – State of the Art RAG API with Vision Models
(github.com/tjmlabs)
10 points
jonathan-adly
2 years ago
discuss
46.
Show HN: Pixeebot – a GitHub App that fixes your Sonar findings (Java/Python)
(pixee.ai)
10 points
nahsra
2 years ago
discuss
47.
Show HN: Neuron – Cognitive Multi-Agent Architecture for Reasoning
8 points
machinemusic
9 months ago
discuss
48.
Show HN: Auto LLM Ranker – Describe a task in English and get ranked models
(github.com/gauravvij)
3 points
gauravvij137
3 months ago
discuss
49.
Q Evaluation Harness: open-source evals for LLMs on q/kdb+
(github.com/KxSystems)
2 points
erfan_mhi
10 months ago
discuss
50.
Show HN: FizzBuzz purely in Rust's trait system
(github.com/doctorn)
120 points
doctor_n_
6 years ago
30 comments
51.
Show HN: Duktape-eval – an eval library built on Duktape and WebAssembly
(github.com/maple3142)
41 points
maple3142
6 years ago
6 comments
52.
Show HN: Pytest-evals – Simple LLM apps evaluation using pytest
(github.com/AlmogBaku)
13 points
almogbaku
a year ago
3 comments
53.
Show HN: Agent-evals – Claude skill to build your own evals
(github.com/fsilavong)
9 points
sauercrowd
a month ago
1 comment
54.
EvalAI: An open-source alternative to Kaggle
(github.com/Cloud-CV)
6 points
deshraj
9 years ago
discuss
55.
Estonia's voting system: a Python program on GitHub
(github.com/vvk-ehk)
5 points
leephillips
10 years ago
1 comment
56.
Gbrain-Evals
(github.com/garrytan)
4 points
mjtk
a month ago
1 comment
57.
I tested Haiku vs. Sonnet across 3 agent tasks – the cheap model won every time
(github.com/aimvik07)
3 points
aimvik07
15 days ago
discuss
58.
GPT-4o Benchmark Results
(github.com/openai)
3 points
joak
2 years ago
discuss
59.
OpenAI/Simple-Evals
(github.com/openai)
3 points
davidbarker
2 years ago
discuss
60.
Show HN: Retrieval Evaluations Framework
(github.com/DeployQL)
3 points
mtbarta
2 years ago
discuss