Search: github.com/eval | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

481.

Show HN: BloonsBench – Evaluate agent performance on Bloons Tower Defense 5 (github.com/cnqso)

1 point

3 months ago

482.

RAGScore – Evaluate RAG pipelines in 2 commands, works offline with Ollama (github.com/HZYAI)

1 point

3 months ago

483.

ZIO-OpenFeature – Feature Flag Evaluation for Scala with ZIO (github.com/EtaCassiopeia)

1 point

4 months ago

484.

Show HN: Eval based agent builder (pls roast us) (github.com/seer-engg)

1 point

6 months ago

485.

SigmaEval – statistical evaluation for GenAI apps (github.com/Itura-AI)

1 point

8 months ago

486.

AgentTrace – Open-Source Tracing and Evaluation for AI Agents by TensorStax (github.com/tensorstax)

1 point

a year ago

487.

Dingo: A Comprehensive Data Quality Evaluation Tool (github.com/DataEval)

1 point

a year ago

488.

ModelClash: Dynamic LLM Evaluation Through AI Duels (github.com/mrconter1)

1 point

2 years ago

489.

Haveged being evaluated by AI models (github.com/jirka-h)

1 point

2 years ago

490.

Promptbench: A Unified Library for Evaluating and Understanding LLMs (github.com/microsoft)

1 point

2 years ago

491.

ToolBench: An evaluation suite for LLM tool manipulation capabilities (github.com/sambanova)

1 point

3 years ago

492.

Conan does not evaluate joint compatibility of version requirements by design (github.com/conan-io)

1 point

4 years ago

493.

Torch-metrics: a model evaluation package for PyTorch (github.com/enochkan)

1 point

6 years ago

494.

How to Evaluate Your Career (github.com/kthejoker)

1 point

6 years ago

495.

Release 1.3.0 of Expr expression evaluation library (github.com/antonmedv)

1 point

7 years ago

496.

SMJSON: a homoiconic and “self evaluating” format of JSON (github.com/udexon)

1 point

7 years ago

497.

Quine Eval Server – An Experiment (gist.github.com)

1 point

11 years ago

498.

Fexl now using purely functional evaluation (github.com/chkoreff)

1 point

12 years ago

499.

New Fexl Release (default is eager evaluation instead of lazy) (github.com/chkoreff)

1 point

12 years ago

500.

Ruby evolution – class_eval-ing class_eval (github.com/jumph4x)

1 point

westonplatter31

12 years ago

501.

Localeval: Evaluate a string of JS code without access to the global object (github.com/espadrine)

1 point

13 years ago

502.

Stabilizer: Statistically Rigorous Performance Evaluation (github.com/ccurtsinger)

1 point

13 years ago

503.

Show HN: Texas Hold'em hand evaluator for node.js (github.com/decs)

1 point

13 years ago

504.

LLM INQUISITOR: Evaluating how AI models handle long, realistic tasks (github.com/AssimilatedHuman)

1 point

20 days ago

505.

Show HN: TweakIdea – 14-dimension startup idea evaluation in Claude Code (github.com/eph5xx)

1 point

2 months ago

506.

Show HN: Evaluate Python functions at their singularities (github.com/FWDhr)

1 point

calculusmachine

2 months ago

507.

Show HN: 2500 vision benchmarks / evals for Vision Language Models (github.com/Overshoot-ai)

1 point

zakariaelhjouji

2 months ago

508.

Show HN: An agent skill for eval-driven development of LLM-powered app (github.com/yiouli)

1 point

3 months ago

509.

ReqIf OPA SARIF – CI/CD semantically evaluated policy gates (github.com/PromptExecution)

1 point

elasticventures

3 months ago

510.

Show HN: Vibe Coding Review Checklist – Evaluate AI-Generated Code Quality (github.com/aiqualitylab)

1 point

4 months ago