Heykuki News
Top
New
Best
Ask
Show
Jobs
91.
Show HN: I made this tool for navigating pandas datasets
(github.com/man-group)
20 points
leehcksource
6 years ago
discuss
92.
Show HN: SemHash – Fast Semantic Text Deduplication for Cleaner Datasets
(github.com/MinishLab)
19 points
Pringled
a year ago
6 comments
93.
Show HN: Version code, models, & datasets together in GitHub
19 points
skadamat
3 years ago
6 comments
94.
NLP: A new datasets and metrics library from Hugging Face
(github.com/huggingface)
19 points
julien_c
6 years ago
discuss
95.
GitHub: Awesome-reasoning, a curated list of datasets for reasoning AIs
(github.com/neurallambda)
17 points
neurallambda
2 years ago
discuss
96.
Datasetq: jq for Datasets; Polars-powered Parquet/JSON/CSV query lang/cli
(github.com/datasetq)
15 points
djb-at-durable
6 months ago
2 comments
97.
Easy way to load, create, version, query and visualize computer vision datasets
13 points
morpheusme
4 years ago
discuss
98.
Show HN: Create datasets more simply and improve AI model with unstructured data
(github.com/adansons)
12 points
KenichiHiguchi
4 years ago
3 comments
99.
Show HN: Download HuggingFace Models/Datasets easily and super fast
(github.com/bodaay)
10 points
qqqbodaayqqq
3 years ago
2 comments
100.
Show HN: Training synthetic models on highly complex datasets
(github.com/gretelai)
10 points
repeat_or
4 years ago
2 comments
101.
Show HN: React-like Declarative DSL for building synthetic LLM datasets
(github.com/qforge-dev)
10 points
arturwala
7 months ago
discuss
102.
Kangas: Explore Multimedia Datasets at Scale
(github.com/comet-ml)
9 points
dmoura
4 years ago
2 comments
103.
Open Thoughts: Curating the best reasoning datasets
(github.com/open-thoughts)
8 points
madiator
a year ago
discuss
104.
Show HN: Automate Variable Selection for Research on Big Datasets (Open-Source)
(github.com/MalikHarrisAhm)
8 points
mha23
2 years ago
discuss
105.
Our classifier outperforms CatBoost, XGBoost, LightGBM on 5 benchmark datasets
(github.com/LinearBoost)
6 points
hamid9
2 years ago
5 comments
106.
DatasetGPT – an open-source command line tool for generating datasets with LLMs
(github.com/radi-cho)
6 points
radicho123
3 years ago
1 comment
107.
Show HN: FiftyOne – Explore, Analyze and Curate Visual Datasets
(github.com/voxel51)
6 points
benjaminpkane
6 years ago
1 comment
108.
Show HN: Xray: N-D labeled arrays and datasets in Python
(github.com/xray)
6 points
shoyer
12 years ago
discuss
109.
Show HN: SemHash – Fast Semantic Text Deduplication for Cleaner Datasets
(github.com/MinishLab)
6 points
stephantul
a year ago
discuss
110.
Show HN: Interactively explore unstructured datasets from your dataframe
(github.com/Renumics)
6 points
sps44
3 years ago
discuss
111.
Kangas: Pandas for Multimedia Datasets
(github.com/comet-ml)
6 points
synergy20
3 years ago
discuss
112.
The fastest command-line tools for querying large JSON datasets
(github.com/dcmoura)
6 points
zX41ZdbW
4 years ago
discuss
113.
Resampling Unbalanced Datasets
(github.com/fmfn)
5 points
hrb1979
12 years ago
discuss
114.
Curated list of language modeling researches for code, plus related datasets
(github.com/codefuse-ai)
5 points
Bluestein
a year ago
discuss
115.
Show HN: Byte-Pair Encoding tokenizer for training LLMs on large datasets
(github.com/jmaczan)
5 points
yu3zhou4
2 years ago
discuss
116.
DataDM – Search and analyze datasets with LLMs
(github.com/approximatelabs)
5 points
cle
3 years ago
discuss
117.
Show HN: Create APIs for static datasets without writing a single line of code
(github.com/roapi)
5 points
houqp
5 years ago
discuss
118.
Show HN: Transform Unstructured Data into Usable Datasets
(github.com/wizenheimer)
4 points
wizenheimer
2 years ago
1 comment
119.
Show HN: pqry – A fast, lightweight CLI tool to diagnose Parquet datasets
(github.com/symblic)
4 points
setzeno
4 months ago
discuss
120.
Show HN: Lance – Open lakehouse format for multimodal AI datasets
(github.com/lance-format)
4 points
criexe
5 months ago
discuss