Heykuki News
Top
New
Best
Ask
Show
Jobs
151.
Show HN: Self-hosted DCF workspace using Damodaran datasets, LLM narratives
1 point
softcane
3 months ago
1 comment
152.
Show HN: RAG-corpus-profiler – A linter for RAG datasets (dedup, PII, quality)
(github.com/aashirpersonal)
1 point
aashirpersonal
5 months ago
1 comment
153.
Show HN: React-obj-view – A virtualized object inspector for large datasets
(github.com/vothanhdat)
1 point
datvo
6 months ago
1 comment
154.
Show HN: Wrote a small tool that turns PDFs and docs into fine-tuning datasets
(github.com/Datalore-ai)
1 point
FineTuner42
10 months ago
1 comment
155.
Show HN: DataChain – Tool to create, curate, version AI datasets
(github.com/iterative)
1 point
shcheklein
2 years ago
1 comment
156.
Face Alignment API: Simple API to align faces when creating datasets/scraping
(github.com/botoxparty)
1 point
botoxparty
3 years ago
1 comment
157.
GeoCOCO: Transform GIS annotations into COCO datasets for use in deep learning
(github.com/jaspersiebring)
1 point
qtieb
3 years ago
1 comment
158.
PLS GIVE UR FEEDBACK: DPIPE Library to easily create TensorFlow datasets
(github.com/aiporre)
1 point
arielin1
6 years ago
1 comment
159.
Not_notMNIST: Generate your own datasets
1 point
RafazZ
9 years ago
1 comment
160.
Synth-dataset-kit: Generate and audit synthetic datasets from seed data
(github.com/KazKozDev)
1 point
kazkozdev
2 months ago
discuss
161.
GABRIEL – turn messy qualitative corpora into analysis-ready datasets
(github.com/openai)
1 point
michaelsbradley
4 months ago
discuss
162.
Show HN: Vietnam Elections (open, source-linked datasets and site)
(bamboo-filing-cabinet.github.io)
1 point
vietthan
4 months ago
discuss
163.
Fasttfidf: High-performance TF-IDF vectorization for large-scale text datasets
(github.com/purijs)
1 point
jspuri
5 months ago
discuss
164.
Show HN: AI tool that walks citation graph and extracts data to create datasets
(github.com/eamag)
1 point
eamag
5 months ago
discuss
165.
Training YOLO vision models on Kaggle datasets
(github.com/mfranzon)
1 point
walterbell
7 months ago
discuss
166.
Show HN: Gaggle – A DuckDB extension for working with Kaggle datasets
1 point
habedi0
7 months ago
discuss
167.
Show HN: Django PostgreSQL Anonymizer – prod → safe dev datasets (beta)
(github.com/CuriousLearner)
1 point
sanyam-khurana
8 months ago
discuss
168.
A toolkit for improving the quality of your LeRobot datasets
(github.com/RoboticsData)
1 point
machinelearning
8 months ago
discuss
169.
A new RAG algorithm to self-heal damaged datasets and query them on a graph
(github.com/iblameandrew)
1 point
scraper02
8 months ago
discuss
170.
Show HN: Tensorpack a CLI tool for semantic discovery across datasets
1 point
AyodeleFikayomi
8 months ago
discuss
171.
Reasoning Gym – Procedural RL reasoning datasets
(github.com/open-thought)
1 point
t55
10 months ago
discuss
172.
Datasets Are All You Need (LLM Learns to Prompt from Data)
(github.com/intellectronica)
1 point
intellectronica
a year ago
discuss
173.
RKaggle: Bring Kaggle Datasets Straight into the R console
(github.com/benyamindsmith)
1 point
SuperMint
a year ago
discuss
174.
Transform and optimize datasets for fast AI model training
(github.com/Lightning-AI)
1 point
shcheklein
2 years ago
discuss
175.
Tool to prepare, curate, version datasets for AI/ML
(github.com/iterative)
1 point
shcheklein
2 years ago
discuss
176.
Transform and Optimize Datasets at Scale
(github.com/Lightning-AI)
1 point
shcheklein
2 years ago
discuss
177.
DataChain: Enrich, transform and curate datasets for ML
(github.com/iterative)
1 point
shcheklein
2 years ago
discuss
178.
TorchGeo: Datasets and pre-trained models for geospatial data
(github.com/microsoft)
1 point
zerojames
2 years ago
discuss
179.
Renumics/spotlight: Interactively explore unstructured datasets from dataframes
(github.com/Renumics)
1 point
rbanffy
2 years ago
discuss
180.
Show HN: Data Contract CLI – Test your datasets
(github.com/datacontract)
1 point
aiobe
2 years ago
discuss