Search: github.com/evalstate | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

271.

Show HN: I built an MCP server to recruit employees for free (github.com/Himalayas-App)

2 points

3 months ago

272.

Show HN: Lar-JEPA – A Testbed for Orchestrating Predictive World Models (github.com/snath-ai)

2 points

3 months ago

273.

Show HN: BreakMyAgent – Open-source red-teaming sandbox for LLM system prompts

2 points

3 months ago

274.

Show HN: Agentic Gatekeeper – AI pre-commit hook to auto-patch logic errors (github.com/revanthpobala)

2 points

4 months ago

275.

Show HN: DACP – governance gateway for AI coding agents (github.com/elliot35)

2 points

4 months ago

276.

Show HN: Measuring how AI agent teams improve issue resolution on SWE-Verified (arxiv.org)

2 points

4 months ago

277.

Show HN: sc-membench for modern memory bandwidth and latency benchmarks (github.com/spareCores)

2 points

5 months ago

278.

Ask HN: Critical review of a spec-first economic protocol

2 points

5 months ago

279.

Show HN: Epistemic Summary Line for ChatGPT (github.com/il-b)

2 points

5 months ago

280.

Show HN: Episteme – Aggregating and critiquing retail investor theses with NLP (episteme.cloud)

2 points

5 months ago

281.

Show HN: ZK-auctions – experimenting with zero-knowledge sealed-bid auctions (github.com/ndrwnaguib)

2 points

5 months ago

282.

Show HN: Runtime Kubernetes Compliance Engine (Policy as Data, No SCAP XML) (github.com/scanset)

2 points

5 months ago

283.

Show HN: KARMA – An evaluation framework for Medical AI systems (karma.eka.care)

2 points

10 months ago

284.

Show HN: Dingo 1.9.0 released: With enhanced hallucination detection (github.com/MigoXLab)

2 points

10 months ago

285.

Show HN: New SWE-bench leaderboard compares LMs without fancy agent scaffolds (swebench.com)

2 points

10 months ago

286.

Show HN: Kritikos – Ready to use Go back end for LLM-as-a-critique (github.com/michelelacorte-quinck)

2 points

a year ago

287.

Show HN: I made an open-source synthetic text datasets generator (github.com/patrickfleith)

2 points

a year ago

288.

Show HN: I made an open-source synthetic text datasets generator (github.com/patrickfleith)

2 points

a year ago

289.

Show HN: Nebula – A DSL for scripting TestContainers-based demos (github.com/orbitalapi)

2 points

a year ago

290.

Show HN: Botwell – A Framework for LLM Comparative Analysis Using AI Peer Review (github.com/alanwilhelm)

2 points

a year ago

291.

Show HN: OptiLLMBench – Test how inference optimization tricks scale up LLMs

2 points

a year ago

292.

Show HN: Mandoline – Custom LLM Evaluations for Real-World Use Cases (mandoline.ai)

2 points

2 years ago

293.

Show HN: KubeFox – Open-Source At-Runtime Versioning and Virtual Environments (github.com/xigxog)

2 points

2 years ago

294.

Show HN: [OSS] Taking a Systematic Approach to Improving LLM Accuracy (github.com/palico-ai)

2 points

2 years ago

295.

Show HN: Claude 3.5 Sonnet beats GPT-4o at Competitive Programming (github.com/juvi21)

2 points

2 years ago

296.

Show HN: A GitHub Action for helping RAG apps with CI/CD (github.com/marketplace)

2 points

2 years ago

297.

Show HN: Open Source, Splitscreen Prompt Engineering (github.com/benguz)

2 points

2 years ago

298.

Show HN: Reference-free evaluation of LLM-powered chatbots (github.com/parea-ai)

2 points

3 years ago

299.

Show HN: Open-source alternative to OpenAI Assistants API (superflows.ai)

2 points

3 years ago

300.

Show HN: Play Euchre with AI Bots (euchre.fewworddotrick.com)

2 points

3 years ago