Heykuki News
631.
Show HN: Rerankers – Models, benchmarks, and papers for RAG
(github.com/agentset-ai)
2 points
midamurat
5 months ago
632.
Show HN: sc-membench for modern memory bandwidth and latency benchmarks
(github.com/spareCores)
2 points
daroczig
5 months ago
633.
Show HN: Long-horizon LLM coherence benchmark (500 cycles)
(zenodo.org)
2 points
teugent
5 months ago
634.
Epiplexity to Beat DeepMind's Alchemy Meta RL Benchmark
(github.com/RandMan444)
2 points
Phillip98798
5 months ago
635.
Show HN: JSONBench, a Benchmark for Data Analytics on JSON
(github.com/ClickHouse)
2 points
saisrirampur
5 months ago
636.
Running a 270M LLM on Android (architecture and benchmarks)
2 points
ayushranjan99
6 months ago
637.
Show HN: LLM‑Simple‑Eval – Easily Benchmark LLMs for Your Use Case
(github.com/grigio)
2 points
grigio
9 months ago
638.
PostgreSQL vs. ClickHouse: Learnings from building my first database benchmark
(github.com/514-labs)
2 points
oatsandsugar
10 months ago
639.
Show HN: New SWE-bench leaderboard compares LMs without fancy agent scaffolds
(swebench.com)
2 points
lieret
10 months ago
640.
Show HN: Comprehensive Benchmark Suite for Story Visualization
(github.com/ViStoryBench)
2 points
hzwer
a year ago
641.
Show HN: Benchmarks agree with the complexity analysis of the TopoSort algorithm
(github.com/williamw520)
2 points
ww520
a year ago
642.
Show HN: I built an open-source benchmark that evaluates LLMs through gameplay
(llmshowdown.io)
2 points
jmogi
a year ago
643.
Elimination Game Benchmark: Social Reasoning, Strategy, and Deception in LLMs
(github.com/lechmazur)
2 points
amichail
a year ago
644.
C++ Showing std::swap faster than XOR trick to swap numbers via naive benchmark
(github.com/vladov3000)
2 points
signa11
2 years ago
645.
miniF2F: Formal to Formal Mathematics Benchmark
(github.com/openai)
2 points
tosh
2 years ago
646.
Pgdsat – Postgres database security assessment tool for CIS benchmarks
(github.com/HexaCluster)
2 points
avi_vallarapu
2 years ago
647.
Spam-T5: Benchmarking Large Language Models for Few-Shot Email Spam Detection
(github.com/jpmorganchase)
2 points
mariuz
2 years ago
648.
Benchmarks for JDK HTTP Server Running on Java 21 with Virtual Threads
(github.com/ebarlas)
2 points
simonpure
2 years ago
649.
BEIR: A Heterogeneous Benchmark for Information Retrieval
(github.com/beir-cellar)
2 points
dmezzetti
2 years ago
650.
Benchmarking Tool for Vector DBs
(github.com/zilliztech)
2 points
fzliu
3 years ago
651.
Blossom Bindings (Re: Backbone Events vs Ember Bindings: A Benchmark)
(fohr.github.com)
2 points
blktiger
14 years ago
652.
VectorDB benchmark for both cloud and open source
(github.com/zilliztech)
2 points
liliuleo93
3 years ago
653.
SciTS: A tool to benchmark Time-series on different databases
(github.com/jalalmostafa)
2 points
jalalmostafa
3 years ago
654.
Deduplication Solutions Benchmark
(github.com/borgbackup)
2 points
todsacerdoti
3 years ago
655.
Vector Database Performance Benchmarking
(github.com/zilliztech)
2 points
fzliu
3 years ago
656.
Yahoo Cloud Serving Benchmark (YCSB)
(github.com/brianfrankcooper)
2 points
bobbiechen
3 years ago
657.
TAP-Vid: A Benchmark for Tracking Any Point in a Video
(github.com/deepmind)
2 points
simonpure
4 years ago
658.
HyperImpute: A tool for prototyping and benchmarking data imputation methods
(github.com/vanderschaarlab)
2 points
bcebere
4 years ago
659.
Benchmarking Vite vs. Next and Turbopack HMR performance
(github.com/yyx990803)
2 points
kretaceous
4 years ago
660.
ManiSkill: Benchmark for Generalizable Manipulation Skills
(github.com/haosulab)
2 points
lnyan
5 years ago