Search: github.com/eval | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

31.

Show HN: EvalView – pytest-style tests for AI agents (budgets, hallucinations) (github.com/hidai25)

1 point

6 months ago

32.

Evaluatly is now open source and free (github.com/evaluatly)

1 point

6 years ago

33.

Show HN: EvalLens – Open-source tool to evaluate structured LLM outputs (github.com/simonrendona)

1 point

2 months ago

34.

Evalien – Node.js event loop agent harness (github.com/agentbellnorm)

1 point

3 months ago

35.

Show HN: Evals skill for agents – no tooling, just Markdown and subagents (github.com/adriancooney)

1 point

4 months ago

36.

Evalite: Evaluate your LLM-powered apps with TypeScript (github.com/mattpocock)

1 point

6 months ago

37.

Triilman25/evaluation-machine-for-classification-models (github.com/triilman25)

1 point

a year ago

38.

Eval Villain update released: find those dangerous JavaScript functions (github.com/swoops)

1 point

2 years ago

39.

GitHub Action for Cluster API (github.com/evalsocket)

1 point

6 years ago

40.

Eval – a bot that executes arbitrary JavaScript and posts the result on Pleroma (github.com/CosineP)

1 point

6 years ago

41.

Evaldb: Use your favorite language as a database (github.com/turbio)

1 point

6 years ago

42.

Show HN: Evalfilter: A simple Golang evaluation engine for filtering via scripts (github.com/skx)

1 point

7 years ago

43.

Evaluate JavaScript code blocks from within markdown (github.com/reggi)

1 point

11 years ago

44.

Show HN: Open-source alternative to ChatGPT Agents for browsing (github.com/trymeka)

104 points

10 months ago

45.

Show HN: ColiVara – State of the Art RAG API with Vision Models (github.com/tjmlabs)

10 points

2 years ago

46.

Show HN: Pixeebot – a GitHub App that fixes your Sonar findings (Java/Python) (pixee.ai)

10 points

2 years ago

47.

Show HN: Neuron – Cognitive Multi-Agent Architecture for Reasoning

8 points

9 months ago

48.

Show HN: Auto LLM Ranker – Describe a task in English and get ranked models (github.com/gauravvij)

3 points

3 months ago

49.

Q Evaluation Harness: open-source evals for LLMs on q/kdb+ (github.com/KxSystems)

2 points

10 months ago

50.

Show HN: FizzBuzz purely in Rust's trait system (github.com/doctorn)

120 points

6 years ago

51.

Show HN: Duktape-eval – an eval library built on Duktape and WebAssembly (github.com/maple3142)

41 points

6 years ago

52.

Show HN: Pytest-evals – Simple LLM apps evaluation using pytest (github.com/AlmogBaku)

13 points

a year ago

53.

Show HN: Agent-evals – Claude skill to build your own evals (github.com/fsilavong)

9 points

a month ago

54.

EvalAI: An open-source alternative to Kaggle (github.com/Cloud-CV)

6 points

9 years ago

55.

Estonia's voting system: a python program on GitHub (github.com/vvk-ehk)

5 points

10 years ago

56.

Gbrain-Evals (github.com/garrytan)

4 points

a month ago

57.

I tested Haiku vs. Sonnet across 3 agent tasks – the cheap model won every time (github.com/aimvik07)

3 points

15 days ago

58.

GPT-4o Benchmark Results (github.com/openai)

3 points

2 years ago

59.

OpenAI/Simple-Evals (github.com/openai)

3 points

2 years ago

60.

Show HN: Retrieval Evaluations Framework (github.com/DeployQL)

3 points

2 years ago