Heykuki News

1.
Replace OCR with Vision Language Models (github.com/vlm-run)
292 points
EarlyOom
a year ago
125 comments
2.
Run structured extraction on documents/images locally with Ollama and Pydantic (github.com/vlm-run)
170 points
EarlyOom
a year ago
29 comments
3.
A Node.js SDK for calling Vision Language Models (github.com/vlm-run)
6 points
EarlyOom
a year ago
discuss
4.
Unified Vision-Language Agents – Detect, Segment, OCR, Generate and More (github.com/vlm-run)
5 points
fzysingularity
6 months ago
1 comment
5.
Show HN: Visually parse an entire YouTube video frame by frame (github.com/vlm-run)
5 points
EarlyOom
a year ago
discuss
6.
Vlms-zero-to-hero: readings from the fundamentals to the cutting edge of VLMS (github.com/SkalskiP)
2 points
swyx
a year ago
discuss
7.
Experimental Optical Encoder for Qwen3-VLM-2B-Instruct (github.com/Volkopat)
1 point
volkopat2
7 months ago
1 comment
8.
Asn1c: The Lionet ASN.1 Compiler (github.com/vlm)
1 point
fanf2
7 months ago
discuss
9.
Show HN: r1_vlm – Open-Source Framework for Visual Reasoning with GRPO (github.com/groundlight)
5 points
skumar17
a year ago
8 comments
10.
Mlx-VLM: Fast Local VLMs and Omni Models on Apple Silicon with MLX (github.com/Blaizzy)
2 points
salkahfi
2 months ago
discuss
11.
Show HN: I achieved over 10% improvement on 3D vision PointCLIP (github.com/genji970)
2 points
genji970
a year ago
discuss
12.
Show HN: 2500 vision benchmarks / evals for Vision Language Models (github.com/Overshoot-ai)
1 point
zakariaelhjouji
2 months ago
discuss
13.
Show HN: Vlm in 3D PC, 16 shot scanobjectnn top1 acc: 99.91 (github.com/genji970)
1 point
genji970
a year ago
discuss
14.
Super fast and accurate image classification on edge devices (github.com/Paulescu)
1 point
PauLabartaBajo
8 months ago
discuss
15.
Show HN: Benchmarking VLMs vs. Traditional OCR (getomni.ai)
146 points
themanmaran
a year ago
40 comments
16.
Show HN: LoongForge-A high-performance training framework for LLM, VLM, VLA, Wan (github.com/baidu-baige)
10 points
mindzzz
14 days ago
2 comments
17.
Show HN: Cursed Browser – a VLM reads the HTML and hallucinates the page (github.com/scosman)
7 points
scosman
10 days ago
1 comment
18.
Cursed_browser: Web browser with a VLM as rendering engine (github.com/scosman)
4 points
misterdata
a month ago
discuss
19.
SketchVLM: Letting VLMs draw on images while explaining their reasoning (github.com/Brandon-Collins7)
3 points
taesiri
a month ago
1 comment
20.
Show HN: Unsiloed Chunker – VLM powered semantic chunking for RAG (github.com/Unsiloed-AI)
3 points
unsiloed-ai
a year ago
discuss
21.
Show HN: Vision AI Checkup, an Optometrist for VLMs (visioncheckup.com)
2 points
zerojames
a year ago
discuss
22.
The simplest, fastest repository for training/finetuning small-sized VLMs (github.com/huggingface)
2 points
s-macke
a year ago
discuss
23.
Advanced Quantization Algorithm for LLMs/VLMs (github.com/intel)
2 points
XnoiVeX
a year ago
discuss
24.
Show HN: LLM / VLM language agent implementations (github.com/arthurcolle)
2 points
arthurcolle
a year ago
discuss
25.
Show HN: A VLM-powered image search engine built with Ruby on Rails (github.com/neonwatty)
2 points
neonwatty
a year ago
discuss
26.
Show HN: A/B test your own VLMs for document parsing (Self-hosted Arena) (github.com/Bae-ChangHyun)
1 point
matthew624
4 months ago
discuss
27.
Show HN: Offline AI Photo Search (local VLM and semantic search) (github.com/Pankaj4152)
1 point
Pankaj4152
6 months ago
discuss
28.
"Captions With Attitude" in the browser from local VLM using llama.cpp in Go (github.com/hybridgroup)
1 point
deadprogram
7 months ago
discuss
29.
Pure Go hardware accelerated local inference on VLMs using llama.cpp (github.com/hybridgroup)
1 point
deadprogram
7 months ago
discuss
30.
Yzma = embedding+inference on VLM/LLM/SLM/TLM in pure Go using llama.cpp (github.com/hybridgroup)
1 point
deadprogram
8 months ago
discuss