Heykuki News

151.
Show HN: Self-hosted DCF workspace using Damodaran datasets, LLM narratives
1 point
softcane
3 months ago
1 comment
152.
Show HN: RAG-corpus-profiler – A linter for RAG datasets (dedup, PII, quality) (github.com/aashirpersonal)
1 point
aashirpersonal
5 months ago
1 comment
153.
Show HN: React-obj-view – A virtualized object inspector for large datasets (github.com/vothanhdat)
1 point
datvo
6 months ago
1 comment
154.
Show HN: Wrote a small tool that turns PDFs and docs into fine-tuning datasets (github.com/Datalore-ai)
1 point
FineTuner42
10 months ago
1 comment
155.
Show HN: DataChain – Tool to create, curate, version AI datasets (github.com/iterative)
1 point
shcheklein
2 years ago
1 comment
156.
Face Alignment API: Simple API to align faces when creating datasets/scraping (github.com/botoxparty)
1 point
botoxparty
3 years ago
1 comment
157.
GeoCOCO: Transform GIS annotations into COCO datasets for use in deep learning (github.com/jaspersiebring)
1 point
qtieb
3 years ago
1 comment
158.
PLS GIVE UR FEEDBACK: DPIPE Library to easily create TensorFlow datasets (github.com/aiporre)
1 point
arielin1
6 years ago
1 comment
159.
Not_notMNIST: Generate your own datasets
1 point
RafazZ
9 years ago
1 comment
160.
Synth-dataset-kit: Generate and audit synthetic datasets from seed data (github.com/KazKozDev)
1 point
kazkozdev
2 months ago
discuss
161.
GABRIEL – turn messy qualitative corpora into analysis-ready datasets (github.com/openai)
1 point
michaelsbradley
4 months ago
discuss
162.
Show HN: Vietnam Elections (open, source-linked datasets and site) (bamboo-filing-cabinet.github.io)
1 point
vietthan
4 months ago
discuss
163.
Fasttfidf: High-performance TF-IDF vectorization for large-scale text datasets (github.com/purijs)
1 point
jspuri
5 months ago
discuss
164.
Show HN: AI tool that walks citation graph and extracts data to create datasets (github.com/eamag)
1 point
eamag
5 months ago
discuss
165.
Training YOLO vision models on Kaggle datasets (github.com/mfranzon)
1 point
walterbell
7 months ago
discuss
166.
Show HN: Gaggle – A DuckDB extension for working with Kaggle datasets
1 point
habedi0
7 months ago
discuss
167.
Show HN: Django PostgreSQL Anonymizer – prod → safe dev datasets (beta) (github.com/CuriousLearner)
1 point
sanyam-khurana
8 months ago
discuss
168.
A toolkit for improving the quality of your LeRobot datasets (github.com/RoboticsData)
1 point
machinelearning
8 months ago
discuss
169.
A new RAG algorithm to self-heal damaged datasets and query them on a graph (github.com/iblameandrew)
1 point
scraper02
8 months ago
discuss
170.
Show HN: Tensorpack – A CLI tool for semantic discovery across datasets
1 point
AyodeleFikayomi
8 months ago
discuss
171.
Reasoning Gym – Procedural RL reasoning datasets (github.com/open-thought)
1 point
t55
10 months ago
discuss
172.
Datasets Are All You Need (LLM Learns to Prompt from Data) (github.com/intellectronica)
1 point
intellectronica
a year ago
discuss
173.
RKaggle: Bring Kaggle Datasets Straight into the R console (github.com/benyamindsmith)
1 point
SuperMint
a year ago
discuss
174.
Transform and optimize datasets for fast AI model training (github.com/Lightning-AI)
1 point
shcheklein
2 years ago
discuss
175.
Tool to prepare, curate, version datasets for AI/ML (github.com/iterative)
1 point
shcheklein
2 years ago
discuss
176.
Transform and Optimize Datasets at Scale (github.com/Lightning-AI)
1 point
shcheklein
2 years ago
discuss
177.
DataChain: Enrich, transform and curate datasets for ML (github.com/iterative)
1 point
shcheklein
2 years ago
discuss
178.
TorchGeo: Datasets and pre-trained models for geospatial data (github.com/microsoft)
1 point
zerojames
2 years ago
discuss
179.
Renumics/spotlight: Interactively explore unstructured datasets from dataframes (github.com/Renumics)
1 point
rbanffy
2 years ago
discuss
180.
Show HN: Data Contract CLI – Test your datasets (github.com/datacontract)
1 point
aiobe
2 years ago
discuss