Search: github.com/b1nc | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

751.

Show HN: CivBench, a long-horizon AI benchmark for multi-agent games (clashai.live)

12 points

3 months ago

752.

Open-source LLM cascading, up to 92% cost savings on benchmarks (github.com/lemony-ai)

12 points

6 months ago

753.

An honest analysis of SpacetimeDB 2.0's insane benchmark results (gist.github.com)

12 points

brandonpollack2

3 months ago

754.

Show HN: A benchmark + latency sim for LLM db queries: ClickHouse / Postgres (github.com/514-labs)

12 points

10 months ago

755.

Yahoo Cloud Serving Benchmark (wiki.github.com)

12 points

16 years ago

756.

Google/fuzzbench: Fuzzer benchmarking as a service (github.com/google)

11 points

6 years ago

757.

A benchmark to compare synchronization techniques for multicore programming (github.com/gramoli)

11 points

10 years ago

758.

HTTP benchmarking tool written in Crystal (github.com/Sdogruyol)

11 points

11 years ago

759.

Show HN: Codex context bloat? 87% avg reduction on SWE-bench Verified traces (npmjs.com)

10 points

a month ago

760.

Show HN: LLM Debate Benchmark (github.com/lechmazur)

9 points

2 months ago

761.

Recursive grep written in Go benched against a C++ and Rust variant (github.com/bep)

9 points

a month ago

762.

LLM Persuasion Benchmark: Multi-Turn Persuasion Between Models (github.com/lechmazur)

9 points

2 months ago

763.

You Do Not Need a Vector Database (For RAG): Benchmarking IR Methods

9 points

3 years ago

764.

Trivial PHP string concatenation benchmarks, proving time better spent elsewhere. (github.com/magnetikonline)

8 points

12 years ago

765.

Real-world benchmarks (gist.github.com)

8 points

13 years ago

766.

Show HN: Bazaar – a new LLM benchmark for economic reasoning under uncertainty (github.com/lechmazur)

8 points

a year ago

767.

OpenChat_8192 Beats ChatGPT-3.5 on Vicuna GPT-4 Benchmark

8 points

3 years ago

768.

A bunch of JavaScript idiosyncrasies to beginners (github.com/odykyi)

8 points

8 years ago

769.

Raspberry Pi httpd micro benchmark (gist.github.com)

8 points

10 years ago

770.

Show HN: LLM Creative Story‑Writing Benchmark V3 (github.com/lechmazur)

8 points

9 months ago

771.

Show HN: LLM Divergent Thinking Creativity Benchmark (github.com/lechmazur)

8 points

a year ago

772.

Show HN: Iron Cushion, a CouchDB benchmark and load testing tool (github.com/mgp)

8 points

14 years ago

773.

Show HN: Mini-swe-agent achieves 65% on SWE-bench in 100 lines of python (github.com/SWE-agent)

7 points

10 months ago

774.

A caffeine driven, simplistic approach to benchmarking Node.js code. (github.com/logicalparadox)

7 points

14 years ago

775.

Multi-Agent Step Race Benchmark: LLM Collaboration and Deception Under Pressure (github.com/lechmazur)

7 points

a year ago

776.

Show HN: LLM Deceptiveness and Gullibility Benchmark (github.com/lechmazur)

7 points

2 years ago

777.

Engulf, A Graphical HTTP Benchmarker written in Clojure + D3.js (github.com/andrewvc)

7 points

14 years ago

778.

Wrk – an HTTP benchmarking tool (github.com/wg)

7 points

13 years ago

779.

Show HN: Get a report on your compliance to CIS Benchmarks (Azure and AWS) (github.com/4urcloud)

7 points

2 years ago

780.

Show HN: Ben, your benchmarking assistant, written in Go (github.com/drish)

7 points

8 years ago