Search: github.com/eval | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

541.

Litmus: LLM Testing and Evaluation Tool for AI App Development on Google Cloud (github.com/google)

1 point

2 years ago

542.

Llama Stack by Meta – Inference, Safety, Memory, Agentic System, Evaluation (github.com/meta-llama)

1 point

2 years ago

543.

Unibench: Vision-Language Model Evaluation (github.com/facebookresearch)

1 point

2 years ago

544.

LLM Evaluation Methods (github.com/alopatenko)

1 point

2 years ago

545.

Show HN: Serializable infix expressions and a Python evaluator (github.com/shrir)

1 point

2 years ago

546.

FreeEval: A Framework for Trustworthy and Efficient Evaluation of LLMs (github.com/WisdomShell)

1 point

2 years ago

547.

Llama.cpp: Improve CPU prompt eval speed (github.com/ggerganov)

1 point

2 years ago

548.

Evaluate LLMs in Real Time with Street Fighter III (github.com/OpenGenerativeAI)

1 point

2 years ago

549.

Evaluating Claude 3 for Converting Screenshots to Code (github.com/abi)

1 point

2 years ago

550.

Show HN: Hiring when you don't know exactly how to evaluate candidates (github.com/joelparkerhenderson)

1 point

2 years ago

551.

Multi-bitrate JPEG compression perceptual evaluation dataset 2023 (github.com/google-research)

1 point

2 years ago

552.

Show HN: Lone Arena – Self-hosted LLM human evaluation, you be the judge (github.com/Contextualist)

1 point

2 years ago

553.

IFEval: Evaluator for LLMs (github.com/Rohan2002)

1 point

2 years ago

554.

Genealogos takes outputs from Nix evaluation tools and produces SBoM files (github.com/tweag)

1 point

2 years ago

555.

Show HN: Open-source evaluations for web agents (github.com/reworkd)

1 point

3 years ago

556.

PhaseLLM Eval: run batch LLM jobs and evals via visual front-end (MIT licensed) (github.com/wgryc)

1 point

3 years ago

557.

Thudm/AgentBench: A Comprehensive Benchmark to Evaluate LLMs as Agents (github.com/THUDM)

1 point

3 years ago

558.

AgentBench: A Comprehensive Benchmark to Evaluate LLMs as Agents (github.com/THUDM)

1 point

3 years ago

559.

Evaluate Multiple LLMs Easily (github.com/ray-project)

1 point

3 years ago

560.

Show HN: ChainForge, a visual tool for evaluating LLM responses (github.com/ianarawjo)

1 point

3 years ago

561.

Lazy evaluation and infinite streams in C++ (github.com/apresta)

1 point

14 years ago

562.

Git-REPL: A Git REPL (read-eval-print loop) courtesy of rlwrap (github.com/jcsalterego)

1 point

3 years ago

563.

Show HN: Made a simple parser/evaluator of arithmetic expressions in Python (github.com/beyonddream)

1 point

3 years ago

564.

A safe eval library based on WebAssembly and Duktape/QuickJS (github.com/maple3142)

1 point

4 years ago

565.

Feature request: add PR curve in TensorFlow object detection API / eval.p (github.com/tensorflow)

1 point

4 years ago

566.

Latte: Evaluation Framework for Disentangled Latent Spaces (github.com/karnwatcharasupat)

1 point

4 years ago

567.

Show HN: Rues an Expression Evaluation Sidecar (github.com/maxpert)

1 point

4 years ago

568.

Rues Is an Expression Evaluation as a Service (github.com/maxpert)

1 point

4 years ago

569.

Show HN: A generic policy and constraint evaluator (Go) (github.com/tadasv)

1 point

5 years ago

570.

acorn-macros – Evaluates and replaces JavaScript macros with Acorn (github.com/heyheyhello)

1 point

5 years ago