Heykuki News

Top | New | Best | Ask | Show | Jobs
241.
Show HN: Create simulated datasets in Python with Simulacrum (github.com/jbrambleDC)
4 points
jbrambleDC
10 years ago
discuss
242.
A Python tool that automatically cleans data sets and readies them for analysis (github.com/rhiever)
4 points
felix_thursday
10 years ago
discuss
243.
Show HN: Kiln - Interactive LLM fine-tuning, dataset collab & synthetic data gen (github.com/Kiln-AI)
3 points
scosman
a year ago
2 comments
244.
Large New Dataset 220k AI Art Text to Image Prompts (github.com/lee101)
3 points
wrdsmsh321
2 years ago
2 comments
245.
hfsearch: a fast cli tool to discover models and datasets on HuggingFace (github.com/HenokB)
3 points
henok_ademtew
6 months ago
1 comment
246.
Show HN: Torque – A declarative, typesafe DSL for LLM training datasets (MIT) (github.com/qforge-dev)
3 points
michalwarda
7 months ago
1 comment
247.
Hugging Face AI Sheets, open-source tool to vibe test models on your datasets (github.com/huggingface)
3 points
dvilasuero
10 months ago
1 comment
248.
Promptwright: Generate large synthetic datasets using a local LLM (github.com/StacklokLabs)
3 points
trickleup
2 years ago
1 comment
249.
Easily convert YouTube, Torrent and Enterprise videos into LLM datasets (github.com/qet-lab)
3 points
m_2018
2 years ago
1 comment
250.
CodeCapybara: Code Writing LLaMa Finetuned on Deepmind Dataset (github.com/AI4Code-Research)
3 points
brucethemoose2
3 years ago
1 comment
251.
UpliftML: An uplift modeling library that handles web scale datasets (github.com/bookingcom)
3 points
TaXxEr
5 years ago
1 comment
252.
A tool for creating deep learning datasets (github.com/dicroce)
3 points
dicroce
5 years ago
1 comment
253.
Show HN: A dataset of 40k professionally-written summaries of news articles (github.com/curationcorp)
3 points
CurationCorp
6 years ago
1 comment
254.
Crossfader: Autoencoders to find structure in arbitrary datasets (github.com/bettermg)
3 points
vierja
11 years ago
discuss
255.
ExCon is an R/JavaScript tool for exploring topographic-like data sets (github.com/bryanhanson)
3 points
sebg
12 years ago
discuss
256.
Machine Learning: Access Tiny Images Dataset with Python (github.com/cioc)
3 points
cioc
13 years ago
discuss
257.
Open Data Hub Data Browser – Explore and Query Open Datasets (github.com/noi-techpark)
3 points
KadambariSuresh
3 months ago
discuss
258.
JQuery dataset() Plugin (github.com/realchaseadams)
3 points
nwienert
14 years ago
discuss
259.
WebZFS Modern Web Management for ZFS Pools/Datasets/Snapshots/Smart Monitoring (github.com/webzfs)
3 points
vermaden
5 months ago
discuss
260.
Data-morph: Morph a dataset into select shapes, while preserving the statistics (github.com/stefmolin)
3 points
ZeljkoS
9 months ago
discuss
261.
Show HN: Synthetic dataset generator for NLP and tabular data (github.com/VoxDroid)
3 points
voxdroid
a year ago
discuss
262.
DataChain: Prepare and curate datasets for AI/ML (github.com/iterative)
3 points
shcheklein
2 years ago
discuss
263.
Reladiff: High-performance diffing of large datasets across databases (github.com/erezsh)
3 points
PaulHoule
2 years ago
discuss
264.
RNNoise 0.2 – now trained using only publicly available CC-licensed datasets (github.com/xiph)
3 points
pabs3
2 years ago
discuss
265.
ClickHouse-Obfuscator – a tool for dataset anonymization (github.com/ClickHouse)
3 points
aeontech
3 years ago
discuss
266.
CommaVQ: Dataset of 100k Driving Videos (github.com/commaai)
3 points
kklisura
3 years ago
discuss
267.
Img2dataset: Turns large sets of image URLs to an image dataset (github.com/rom1504)
3 points
wildpeaks
3 years ago
discuss
268.
Dataset with Vulgar and Offensive California Vanity License Plates (github.com/veltman)
3 points
RamblingCTO
3 years ago
discuss
269.
Parse research papers into a structured dataset (github.com/neuml)
3 points
txtai
3 years ago
discuss
270.
Legal NLP Dataset With Over 39,000 Examples (github.com/TheAtticusProject)
3 points
optimalsolver
3 years ago
discuss