Heykuki News
Top
New
Best
Ask
Show
Jobs
211.
▲
Show HN: Tonic Validate Metrics – an open-source RAG evaluation metrics package
(github.com/TonicAI)
40 points
Ephil012
3 years ago
17 comments
212.
▲
Generic engine to evaluate logical circuits on homomorphic encryption
(github.com/virtualsecureplatform)
38 points
EvgeniyZh
5 years ago
3 comments
213.
▲
Stop Evaluating LLMs on Vibes
(github.com/truera)
35 points
shayaks
3 years ago
7 comments
214.
▲
Show HN: Create LLM graders and run evals in JavaScript with one file
(github.com/bolt-foundry)
28 points
randall
a year ago
2 comments
215.
▲
Show HN: SumEval – Multi-language evaluation framework for text summarization
(github.com/chakki-works)
25 points
icoxfog417
9 years ago
3 comments
216.
▲
λ-calculus evaluator
(zaach.github.com)
24 points
alrex021
16 years ago
5 comments
217.
▲
Evaluate Scheme in Ruby's virtual machine
(gist.github.com)
24 points
tenderlove
14 years ago
2 comments
218.
▲
Numexpr: Fast numerical array expression evaluator for Python, NumPy, Pandas
(github.com/pydata)
23 points
tosh
19 days ago
4 comments
219.
▲
Show HN: Phoenix OSS – Applying LLM Spans, Traces, and Evals for AI Insights
(github.com/Arize-ai)
23 points
jlopes2
3 years ago
3 comments
220.
▲
Show HN: I implemented evals metrics for LLMs that runs locally on your machine
(github.com/confident-ai)
22 points
3d27
2 years ago
3 comments
221.
▲
Utility to estimate tasks using PERT (Program evaluation and review technique)
(github.com/arzzen)
22 points
arzzen
10 years ago
1 comment
222.
▲
Thorn in a HaizeStack test for evaluating long-context adversarial robustness
(github.com/haizelabs)
19 points
leonardtang
2 years ago
11 comments
223.
▲
Math.mk - GNUmake eval gone wild
(github.com/adam-f)
19 points
adam_freidin
14 years ago
4 comments
224.
▲
Show HN: DeepEval – Evaluation and Unit Testing for LLMs
(github.com/confident-ai)
18 points
jacky2wong
3 years ago
8 comments
225.
▲
Python Search – eval(raw_input())
(github.com)
17 points
Nurdok
12 years ago
19 comments
226.
▲
Show HN: Ragas – Open-source library for evals and testing RAG systems
(github.com/explodinggradients)
15 points
shahules
2 years ago
9 comments
227.
▲
Show HN: An Empirical Evaluation of Linear Probing Algorithms
(github.com/senderista)
14 points
senderista
7 years ago
1 comment
228.
▲
Show HN: Evaluate LLM-based RAG Applications with automated test set generation
(github.com/Giskard-AI)
13 points
RuiLyonesse
2 years ago
discuss
229.
▲
Common Expression Language (CEL); lightweight expression evaluation
(github.com/google)
12 points
Wxc2jjJmST9XWWL
5 years ago
5 comments
230.
▲
How Erlang evaluates funs (i.e. lambdas)
(gist.github.com)
12 points
bascule
17 years ago
3 comments
231.
▲
Show HN: UpTrain (YC W23) – open-source tool to evaluate LLM response quality
(demo.uptrain.ai)
12 points
sourabh03agr
3 years ago
discuss
232.
▲
Show HN: Open-source toolkit for ML model evaluation and active learning
(github.com/encord-team)
11 points
ulrikhansen54
3 years ago
discuss
233.
▲
Fexl – Highly robust functional evaluation
(github.com/chkoreff)
10 points
fexl
12 years ago
3 comments
234.
▲
Show HN: Kiln – AI Boilerplate with Evals, Fine-Tuning, Synthetic Data, and Git
(github.com/Kiln-AI)
10 points
scosman
10 months ago
1 comment
235.
▲
Pixar just open sourced their high-performance subdivision evaluator
(github.com/PixarAnimationStudios)
10 points
ColinWright
14 years ago
discuss
236.
▲
Show HN: C++ Mathematical Expression Parser and Evaluation Benchmark
(github.com/ArashPartow)
10 points
ArashPartow
8 years ago
discuss
237.
▲
Can ELO tournaments be used to evaluate LLMs and RAG?
(github.com/zetaalphavector)
9 points
zavrel
3 years ago
1 comment
238.
▲
Show HN: Evolve expressions that evaluate to a target number
(github.com/yati-sagade)
8 points
yati
11 years ago
4 comments
239.
▲
Rllab – framework for developing and evaluating reinforcement learning algorithms
(github.com/rllab)
8 points
dementrock
10 years ago
2 comments
240.
▲
Show HN: Code-Knack – A code evaluator on your web page
(github.com/lyricat)
8 points
lyricat
7 years ago
discuss