Heykuki News

631.
Show HN: Rerankers – Models, benchmarks, and papers for RAG (github.com/agentset-ai)
2 points
midamurat
5 months ago
discuss
632.
Show HN: sc-membench for modern memory bandwidth and latency benchmarks (github.com/spareCores)
2 points
daroczig
5 months ago
discuss
633.
Show HN: Long-horizon LLM coherence benchmark (500 cycles) (zenodo.org)
2 points
teugent
5 months ago
discuss
634.
Epiplexity to Beat DeepMind's Alchemy Meta RL Benchmark (github.com/RandMan444)
2 points
Phillip98798
5 months ago
discuss
635.
Show HN: JSONBench, a Benchmark for Data Analytics on JSON (github.com/ClickHouse)
2 points
saisrirampur
5 months ago
discuss
636.
Running a 270M LLM on Android (architecture and benchmarks)
2 points
ayushranjan99
6 months ago
discuss
637.
Show HN: LLM-Simple-Eval – Easily Benchmark LLMs for Your Use Case (github.com/grigio)
2 points
grigio
9 months ago
discuss
638.
PostgreSQL vs. ClickHouse: Learnings from building my first database benchmark (github.com/514-labs)
2 points
oatsandsugar
10 months ago
discuss
639.
Show HN: New SWE-bench leaderboard compares LMs without fancy agent scaffolds (swebench.com)
2 points
lieret
10 months ago
discuss
640.
Show HN: Comprehensive Benchmark Suite for Story Visualization (github.com/ViStoryBench)
2 points
hzwer
a year ago
discuss
641.
Show HN: Benchmarks agree with the complexity analysis of the TopoSort algorithm (github.com/williamw520)
2 points
ww520
a year ago
discuss
642.
Show HN: I built an open-source benchmark that evaluates LLMs through gameplay (llmshowdown.io)
2 points
jmogi
a year ago
discuss
643.
Elimination Game Benchmark: Social Reasoning, Strategy, and Deception in LLMs (github.com/lechmazur)
2 points
amichail
a year ago
discuss
644.
C++ Showing std::swap faster than XOR trick to swap numbers via naive benchmark (github.com/vladov3000)
2 points
signa11
2 years ago
discuss
645.
miniF2F: Formal to Formal Mathematics Benchmark (github.com/openai)
2 points
tosh
2 years ago
discuss
646.
Pgdsat – Postgres database security assessment tool for CIS benchmarks (github.com/HexaCluster)
2 points
avi_vallarapu
2 years ago
discuss
647.
Spam-T5: Benchmarking Large Language Models for Few-Shot Email Spam Detection (github.com/jpmorganchase)
2 points
mariuz
2 years ago
discuss
648.
Benchmarks for JDK HTTP Server Running on Java 21 with Virtual Threads (github.com/ebarlas)
2 points
simonpure
2 years ago
discuss
649.
BEIR: A Heterogeneous Benchmark for Information Retrieval (github.com/beir-cellar)
2 points
dmezzetti
2 years ago
discuss
650.
Benchmarking Tool for Vector DBs (github.com/zilliztech)
2 points
fzliu
3 years ago
discuss
651.
Blossom Bindings (Re: Backbone Events vs Ember Bindings: A Benchmark) (fohr.github.com)
2 points
blktiger
14 years ago
discuss
652.
VectorDB benchmark for both cloud and open source (github.com/zilliztech)
2 points
liliuleo93
3 years ago
discuss
653.
SciTS: A tool to benchmark Time-series on different databases (github.com/jalalmostafa)
2 points
jalalmostafa
3 years ago
discuss
654.
Deduplication Solutions Benchmark (github.com/borgbackup)
2 points
todsacerdoti
3 years ago
discuss
655.
Vector Database Performance Benchmarking (github.com/zilliztech)
2 points
fzliu
3 years ago
discuss
656.
Yahoo Cloud Serving Benchmark (YCSB) (github.com/brianfrankcooper)
2 points
bobbiechen
3 years ago
discuss
657.
Tap-Vid: A Benchmark for Tracking Any Point in a Video (github.com/deepmind)
2 points
simonpure
4 years ago
discuss
658.
HyperImpute: A tool for prototyping and benchmarking data imputation methods (github.com/vanderschaarlab)
2 points
bcebere
4 years ago
discuss
659.
Benchmarking Vite vs. Next and turbopack HMR performance (github.com/yyx990803)
2 points
kretaceous
4 years ago
discuss
660.
ManiSkill: Benchmark for Generalizable Manipulation Skills (github.com/haosulab)
2 points
lnyan
5 years ago
discuss