Heykuki News

151.
Hypersim, Photorealistic Synthetic Dataset for Indoor Scene Understanding (github.com/apple)
122 points
homarp
5 years ago
20 comments
152.
Show HN: Dlt – Python library to automate the creation of datasets (colab.research.google.com)
114 points
MatthausK
3 years ago
54 comments
153.
Driving dataset for car autopilot AI training (github.com/commaai)
100 points
EvgeniyZh
10 years ago
44 comments
154.
Boston housing price dataset was removed from scikit-learn 1.2 (github.com/scikit-learn)
81 points
ok123456
3 years ago
84 comments
155.
RipTable – multi-threaded Python data analytics tools for numpy arrays/datasets (github.com/rtosholdings)
79 points
aldanor
6 years ago
14 comments
156.
Show HN: Hyperparam: OSS tools for exploring datasets locally in the browser (hyperparam.app)
77 points
platypii
a year ago
21 comments
157.
Comma2k19 – A dataset of over 33 hours of commute in California's 280 highway (github.com/commaai)
70 points
pd0wm
7 years ago
35 comments
158.
How to query data.gov json datasets with SQL: a case study (github.com/axibase)
68 points
rodionos
9 years ago
1 comment
159.
The Museum of Modern Art Research Dataset (github.com/MuseumofModernArt)
61 points
danso
11 years ago
15 comments
160.
Chicago Crime Trends. Analyzing 3GB Dataset from Data.gov with SQL and Graphs (github.com/axibase)
44 points
rodionos
9 years ago
3 comments
161.
Dataset of Linus Torvalds' rants ranked by hate (github.com/corollari)
42 points
fctorial
5 years ago
17 comments
162.
ClickHouse Obfuscator – A tool for dataset anonymization (github.com/ClickHouse)
39 points
rrampage
3 years ago
3 comments
163.
DeepMind's machine-reading question/answer dataset (github.com/deepmind)
37 points
andrewtbham
11 years ago
3 comments
164.
Madlad-400: A Multilingual and Document-Level Large Audited Dataset (github.com/google-research)
37 points
the_bookmaker
3 years ago
1 comment
165.
A dataset of crimes committed in Buenos Aires (github.com/ramadis)
34 points
ramadis
8 years ago
4 comments
166.
Show HN: I used streaming to skip downloading my 45GB dataset (github.com/DagsHub)
31 points
npRandom
4 years ago
discuss
167.
Toxicity Dataset (github.com/surge-ai)
25 points
CarrieLab
4 years ago
32 comments
168.
Structured Etymology Dataset (github.com/droher)
24 points
downboots
a year ago
3 comments
169.
Washington Post publishes dataset of 52,000 criminal homicides (github.com/washingtonpost)
24 points
danso
8 years ago
2 comments
170.
I have trained StyleGAN2 from scratch with a dataset of female portraits (github.com/l4rz)
20 points
EvgeniyZh
5 years ago
20 comments
171.
VoxelCNN: Order-Aware Generative Modeling Using the 3D-Craft Dataset (github.com/facebookresearch)
20 points
ingve
6 years ago
discuss
172.
Show HN: I made this tool for navigating pandas datasets (github.com/man-group)
20 points
leehcksource
6 years ago
discuss
173.
Show HN: SemHash – Fast Semantic Text Deduplication for Cleaner Datasets (github.com/MinishLab)
19 points
Pringled
a year ago
6 comments
174.
Show HN: Version code, models, & datasets together in GitHub
19 points
skadamat
3 years ago
6 comments
175.
NLP: A new datasets and metrics library from Hugging Face (github.com/huggingface)
19 points
julien_c
6 years ago
discuss
176.
Show HN: Dataset of Linus Torvalds' rants sorted by hate (github.com/corollari)
17 points
corollari
7 years ago
4 comments
177.
GitHub: Awesome-reasoning, a curated list of datasets for reasoning AIs (github.com/neurallambda)
17 points
neurallambda
2 years ago
discuss
178.
ICLR 2026 – Institutional Affiliations Dataset and Analysis (github.com/DmytroLopushanskyy)
15 points
stared
22 days ago
2 comments
179.
Easy way to load, create, version, query and visualize computer vision datasets
13 points
morpheusme
4 years ago
discuss
180.
Show HN: Dataset of 125k Medium Blog Post Titles and Subtitles (With Categories) (github.com/turbo)
13 points
minxomat
7 years ago
discuss