Heykuki News

31.
Show HN: EvalView – pytest-style tests for AI agents (budgets, hallucinations) (github.com/hidai25)
1 point
hidai25
6 months ago
1 comment
32.
Evaluatly is now open source and free (github.com/evaluatly)
1 point
gpnt
6 years ago
1 comment
33.
Show HN: EvalLens – Open-source tool to evaluate structured LLM outputs (github.com/simonrendona)
1 point
simonrendon
2 months ago
discuss
34.
Evalien – Node.js event loop agent harness (github.com/agentbellnorm)
1 point
agentbellnorm
3 months ago
discuss
35.
Show HN: Evals skill for agents – no tooling, just Markdown and subagents (github.com/adriancooney)
1 point
adriancooney
4 months ago
discuss
36.
Evalite: Evaluate your LLM-powered apps with TypeScript (github.com/mattpocock)
1 point
handfuloflight
6 months ago
discuss
37.
Triilman25/evaluation-machine-for-classification-models (github.com/triilman25)
1 point
triilman
a year ago
discuss
38.
Eval Villain update released: find those dangerous JavaScript functions (github.com/swoops)
1 point
tony-ds
2 years ago
discuss
39.
GitHub Action for Cluster API (github.com/evalsocket)
1 point
evalsocket
6 years ago
discuss
40.
Eval – a bot that executes arbitrary JavaScript and posts the result on Pleroma (github.com/CosineP)
1 point
aeroplain
6 years ago
discuss
41.
Evaldb: Use your favorite language as a database (github.com/turbio)
1 point
amasad
6 years ago
discuss
42.
Show HN: Evalfilter: A simple Golang evaluation engine for filtering via scripts (github.com/skx)
1 point
stevekemp
7 years ago
discuss
43.
Evaluate JavaScript code blocks from within markdown (github.com/reggi)
1 point
thomasreggi
11 years ago
discuss
44.
Show HN: Open-source alternative to ChatGPT Agents for browsing (github.com/trymeka)
104 points
ElasticBottle
10 months ago
23 comments
45.
Show HN: ColiVara – State of the Art RAG API with Vision Models (github.com/tjmlabs)
10 points
jonathan-adly
2 years ago
discuss
46.
Show HN: Pixeebot – a GitHub App that fixes your Sonar findings (Java/Python) (pixee.ai)
10 points
nahsra
2 years ago
discuss
47.
Show HN: Neuron – Cognitive Multi-Agent Architecture for Reasoning
8 points
machinemusic
9 months ago
discuss
48.
Show HN: Auto LLM Ranker – Describe a task in English and get ranked models (github.com/gauravvij)
3 points
gauravvij137
3 months ago
discuss
49.
Q Evaluation Harness: open-source evals for LLMs on q/kdb+ (github.com/KxSystems)
2 points
erfan_mhi
10 months ago
discuss
50.
Show HN: FizzBuzz purely in Rust's trait system (github.com/doctorn)
120 points
doctor_n_
6 years ago
30 comments
51.
Show HN: Duktape-eval – an eval library built on Duktape and WebAssembly (github.com/maple3142)
41 points
maple3142
6 years ago
6 comments
52.
Show HN: Pytest-evals – Simple LLM apps evaluation using pytest (github.com/AlmogBaku)
13 points
almogbaku
a year ago
3 comments
53.
Show HN: Agent-evals – Claude skill to build your own evals (github.com/fsilavong)
9 points
sauercrowd
a month ago
1 comment
54.
EvalAI: An open-source alternative to Kaggle (github.com/Cloud-CV)
6 points
deshraj
9 years ago
discuss
55.
Estonia's voting system: a python program on GitHub (github.com/vvk-ehk)
5 points
leephillips
10 years ago
1 comment
56.
Gbrain-Evals (github.com/garrytan)
4 points
mjtk
a month ago
1 comment
57.
I tested Haiku vs. Sonnet across 3 agent tasks – the cheap model won every time (github.com/aimvik07)
3 points
aimvik07
15 days ago
discuss
58.
GPT-4o Benchmark Results (github.com/openai)
3 points
joak
2 years ago
discuss
59.
OpenAI/Simple-Evals (github.com/openai)
3 points
davidbarker
2 years ago
discuss
60.
Show HN: Retrieval Evaluations Framework (github.com/DeployQL)
3 points
mtbarta
2 years ago
discuss