Heykuki News
Top
New
Best
Ask
Show
Jobs
421.
▲
Show HN: Paramount – OSS package for *Human* Evals of AI support
(github.com/ask-fini)
2 points
hakimk
2 years ago
discuss
422.
▲
SDMetrics: Library for evaluating synthetic data quality
(github.com/sdv-dev)
2 points
skadamat
2 years ago
discuss
423.
▲
Promptfoo – Testing and Evaluation for LLMs
(github.com/promptfoo)
2 points
tin7in
2 years ago
discuss
424.
▲
Google DeepMind's research on uncertain ground truth in AI eval
(github.com/google-deepmind)
2 points
minraws
3 years ago
discuss
425.
▲
Show HN: Reference-free evaluation of LLM-powered chatbots
(github.com/parea-ai)
2 points
Joschkabraun
3 years ago
discuss
426.
▲
Ragas – Framework for RAG Evaluation
(github.com/explodinggradients)
2 points
izik
3 years ago
discuss
427.
▲
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models
(github.com/ruixiangcui)
2 points
accrual
3 years ago
discuss
428.
▲
RAGElo: Toolkit for evaluating RAG agents using tournament-style Elo ranking
(github.com/zetaalphavector)
2 points
barefeg
3 years ago
discuss
429.
▲
Starwhale: A new MLOps platform for Model Evaluation
(github.com/star-whale)
2 points
liutianweidlut
3 years ago
discuss
430.
▲
ChainForge now supports chat evaluation
(github.com/ianarawjo)
2 points
fatso784
3 years ago
discuss
431.
▲
Show HN: CLI for testing and evaluating LLM prompts and outputs
(github.com/promptfoo)
2 points
typpo
3 years ago
discuss
432.
▲
OSS for training, serving, and evaluating LLM based ChatBots
(github.com/lm-sys)
2 points
yujian
3 years ago
discuss
433.
▲
Show HN: XV - Expression Evaluator for C
(github.com/tidwall)
2 points
tidwall
3 years ago
discuss
434.
▲
Croner: Trigger functions or evaluate cron expressions in JavaScript or TS
(github.com/Hexagon)
2 points
kiyanwang
3 years ago
discuss
435.
▲
Haskell library for evaluating whether chess moves are allowed
(github.com/ArnoVanLumig)
2 points
tosh
3 years ago
discuss
436.
▲
Show HN: Brace Lang – parse brace groups and evaluate them however you want
(github.com/xaedes)
2 points
xaedes
4 years ago
discuss
437.
▲
Show HN: Convert VHDL to Verilog using GHDL (+ first evaluation)
(github.com/stnolting)
2 points
youre_the_voice
4 years ago
discuss
438.
▲
SIMD Library for Evaluating Elementary Functions, Vectorized Libm and DFT
(github.com/shibatch)
2 points
brrrrrm
4 years ago
discuss
439.
▲
PicoMath: Fast math evaluation library (C++ header-only)
(github.com/Nitrillo)
2 points
nitrillo
4 years ago
discuss
440.
▲
Parse and evaluate MS Excel formula in JavaScript
(github.com/LesterLyu)
2 points
eatonphil
4 years ago
discuss
441.
▲
Show HN: ANECompat, evaluate CoreML model compatibility with Apple Neural Engine
(github.com/fredyshox)
2 points
fredyshox
4 years ago
discuss
442.
▲
Paper Walkthrough: Is Automated Topic Model Evaluation Broken
(github.com/acatovic)
2 points
armcat
4 years ago
discuss
443.
▲
Lisp Evaluator for FreeBASIC
(github.com/jayrm)
2 points
eatonphil
4 years ago
discuss
444.
▲
Armory Adversarial Robustness Evaluation Test Bed
(github.com/twosixlabs)
2 points
soheil
4 years ago
discuss
445.
▲
Show HN: Lambda Calculus evaluation with type-annotations in TypeScript
(github.com/EvolveYourMind)
2 points
evolveyourmind
4 years ago
discuss
446.
▲
Damov: Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks
(github.com/CMU-SAFARI)
2 points
simonpure
5 years ago
discuss
447.
▲
JavaScript lexical scope and eval explored
(dbrans.github.com)
2 points
dbrans
15 years ago
discuss
448.
▲
ZickStandardLisp: A Lisp Evaluator in Lisp
(github.com/zick)
2 points
HerrMonnezza
5 years ago
discuss
449.
▲
Evaluate my junior project on GitHub
2 points
damklis
6 years ago
discuss
450.
▲
Datasets and Evaluation Metrics for NLP (True Open Source GPT Alternative)
(github.com/huggingface)
2 points
dragonsh
6 years ago
discuss