Heykuki News
Top
New
Best
Ask
Show
Jobs
421.
Show HN: Xray: N-D labeled arrays and datasets in Python
(github.com/xray)
6 points
shoyer
12 years ago
discuss
422.
CockroachDB – A Scalable, Geo-Replicated, Transactional Datastore
(github.com/cockroachdb)
6 points
pandemicsyn
12 years ago
discuss
423.
Show HN: Generate Fine-tuning dataset using deep research in terminal
(github.com/Datalore-ai)
6 points
FineTuner42
10 months ago
discuss
424.
Show HN: SemHash – Fast Semantic Text Deduplication for Cleaner Datasets
(github.com/MinishLab)
6 points
stephantul
a year ago
discuss
425.
Show HN: Interactively explore unstructured datasets from your dataframe
(github.com/Renumics)
6 points
sps44
3 years ago
discuss
426.
Kangas: Pandas for Multimedia Datasets
(github.com/comet-ml)
6 points
synergy20
3 years ago
discuss
427.
The fastest command-line tools for querying large JSON datasets
(github.com/dcmoura)
6 points
zX41ZdbW
4 years ago
discuss
428.
Lethe: A Basic Log-Structured Flash Datastore in Rust
(github.com/oxidecomputer)
6 points
hasheddan
4 years ago
discuss
429.
Video Classification Starter Code for Working with the YouTube-8M Dataset
(github.com/google)
6 points
tylerwhipple
9 years ago
discuss
430.
Resampling Unbalanced Datasets
(github.com/fmfn)
5 points
hrb1979
12 years ago
discuss
431.
Curated list of language modeling researches for code, plus related datasets
(github.com/codefuse-ai)
5 points
Bluestein
a year ago
discuss
432.
Show HN: Byte-Pair Encoding tokenizer for training LLMs on large datasets
(github.com/jmaczan)
5 points
yu3zhou4
2 years ago
discuss
433.
DataDM – Search and analyze datasets with LLMs
(github.com/approximatelabs)
5 points
cle
3 years ago
discuss
434.
DataDM: Open-source local-LLM code-interpreter with dataset search
(github.com/approximatelabs)
5 points
bluecoconut
3 years ago
discuss
435.
Show HN: Multiobjective Large-Scale Fashion Dataset with Distributional Shifts
(github.com/st-tech)
5 points
nanikano
5 years ago
discuss
436.
Show HN: H5records – simple large dataset for pytorch training
(github.com/theblackcat102)
5 points
polymorph1sm
5 years ago
discuss
437.
Show HN: Create APIs for static datasets without writing a single line of code
(github.com/roapi)
5 points
houqp
5 years ago
discuss
438.
Show HN: We made a dataset differ! (Free, Open source)
(github.com/qri-io)
5 points
rgardaphe
7 years ago
discuss
439.
Show HN: Qri, a free and open source distributed dataset versioning tool
5 points
rgardaphe
7 years ago
discuss
440.
Show HN: MNIST-Sequence – Generate dataset for sequences of handwritten digits
(github.com/ankitaggarwal011)
5 points
aaggarwal
9 years ago
discuss
441.
VisualNexus – Training Pipeline for Visual Dataset Segmentation and Labeling
(github.com/kyegomez)
4 points
Reclaimer
3 years ago
3 comments
442.
DeltaQL - a NodeJS datastore whose query results never get stale.
(github.com/chrisdew)
4 points
chrisdew
14 years ago
2 comments
443.
Addressing for PHP: Postal address management powered by Google's dataset
(github.com/commerceguys)
4 points
robertDouglass
12 years ago
1 comment
444.
Show HN: Transform Unstructured Data into Usable Datasets
(github.com/wizenheimer)
4 points
wizenheimer
2 years ago
1 comment
445.
Show HN: Cerebras-GPT-2.7B finetuned on Stanford Alpaca dataset
(github.com/lxe)
4 points
lxe
3 years ago
1 comment
446.
Lichess Combined Puzzle-Game Dataset
(github.com/mcognetta)
4 points
mcyc
4 years ago
1 comment
447.
Show HN: Drone Deploy Dataset – Segmentation with Pytorch
(github.com/s3nh)
4 points
s3nhxx
6 years ago
1 comment
448.
CockroachDB: A Scalable, Geo-Replicated, Transactional Datastore
(github.com/cockroachdb)
4 points
scribu
12 years ago
discuss
449.
Show HN: CRED-1 – Open domain credibility dataset for on-device pre-bunking
(github.com/aloth)
4 points
xlth
13 days ago
discuss
450.
Show HN: pqry – A fast, lightweight CLI tool to diagnose Parquet datasets
(github.com/symblic)
4 points
setzeno
4 months ago
discuss