Heykuki News

1.
Show HN: A lightweight open-source web analytics for webdevs (github.com/extractumio)
7 points
instad
3 years ago
discuss
2.
Article extraction benchmark: open-source libraries and commercial services (github.com/scrapinghub)
19 points
lopuhin
6 years ago
10 comments
3.
Scrape any website quickly with LLM (open-source) (github.com/trancethehuman)
3 points
hainghiem375
3 years ago
1 comment
4.
Show HN: Query Wikidata with DuckDB Instead of Sparql (github.com/piebro)
2 points
piebro
5 months ago
discuss
5.
Create a database of crawled HTML pages with RethinkDB and Python (github.com/lethain)
1 point
dsr12
14 years ago
discuss
6.
Show HN: Speech feature extraction package developed in Python (github.com/astorfi)
5 points
irsina
9 years ago
discuss
7.
Show HN: Contract Extraction Assistant – Fast Batch Extraction (github.com/Qleric-labs)
2 points
Mo1756
8 months ago
discuss
8.
Show HN: Contract Extraction Assistant – Local, open-source contract data tool (github.com/Qleric-labs)
2 points
Mo1756
8 months ago
discuss
9.
Show HN: Speech feature extraction package developed in Python (github.com/astorfi)
1 point
irsina
9 years ago
discuss
10.
Show HN: Yapit – PDF and webpage reader with TTS that doesn't suck (github.com/yapit-tts)
5 points
MaxWolf-01
2 months ago
1 comment
11.
Computer Vision Project: Fingerprint Minutiae Feature Extraction (github.com/Utkarsh-Deshmukh)
3 points
d_utkarsh
5 years ago
discuss
12.
Show HN: EmbedRank: Unsupervised Keyphrase Extraction Using Sentence Embeddings (github.com/swisscom)
2 points
Wronskia
7 years ago
discuss
13.
Journalism AI – Quotes extraction for modular journalism (github.com/JournalismAI-2021-Quotes)
1 point
malshe
5 years ago
discuss
14.
Tutorial: Extracting structured data from websites using Groq and Firecrawl (github.com/mendableai)
3 points
nickca
2 years ago
discuss
15.
DEDA – Tracking Dots Extraction, Decoding and Anonymisation Toolkit (github.com/dfd-tud)
286 points
pavel_lishin
a year ago
99 comments
16.
Show HN: Unblob – extraction suite for 30+ file formats (github.com/onekey-sec)
240 points
kissgyorgy
3 years ago
42 comments
17.
Zpdf: PDF text extraction in Zig (github.com/Lulzx)
217 points
lulzx
5 months ago
87 comments
18.
Show HN: Kreuzberg – Modern async Python library for document text extraction (github.com/Goldziher)
197 points
nhirschfeld
a year ago
75 comments
19.
Web Clipper Browser Extension with Automatic Content Extraction, Now Open Source (github.com/jhlyeung)
192 points
laybak
6 years ago
25 comments
20.
DeepDoctection: Document extraction and analysis using deep learning models (github.com/deepdoctection)
191 points
bpiche
3 years ago
62 comments
21.
Run structured extraction on documents/images locally with Ollama and Pydantic (github.com/vlm-run)
170 points
EarlyOom
a year ago
29 comments
22.
Tsfresh – Automatic extraction of relevant features from time series (github.com/blue-yonder)
167 points
restapi
10 years ago
8 comments
23.
Nvidia-Ingest: Multi-modal data extraction (github.com/NVIDIA)
145 points
mihaid150
a year ago
45 comments
24.
Heartleech: Automated OpenSSL private key extraction tool using Heartbleed (github.com/robertdavidgraham)
114 points
FredericJ
12 years ago
76 comments
25.
A library for audio feature extraction, regression, classification, segmentation (github.com/tyiannak)
107 points
nothrowaways
4 years ago
12 comments
26.
Show HN: Ocrbase – pdf → .md/.json document OCR and structured extraction API (github.com/majcheradam)
99 points
adammajcher
4 months ago
36 comments
27.
Coq to Rust Program Extraction (github.com/pirapira)
99 points
kushti
10 years ago
18 comments
28.
DEDA – Tracking Dots Extraction, Decoding and Anonymisation Toolkit (github.com/dfd-tud)
83 points
adulau
8 years ago
7 comments
29.
RoboSat: feature extraction from aerial and satellite imagery (github.com/mapbox)
80 points
danieljh
8 years ago
15 comments
30.
Show HN: Movie Iris - Visualizing Films Through Color Extraction (github.com/LoSinCos)
78 points
losincos
a year ago
37 comments