Heykuki News
Top
New
Best
Ask
Show
Jobs
1.
Replace OCR with Vision Language Models
(github.com/vlm-run)
292 points
EarlyOom
a year ago
125 comments
2.
Run structured extraction on documents/images locally with Ollama and Pydantic
(github.com/vlm-run)
170 points
EarlyOom
a year ago
29 comments
3.
A Node.js SDK for calling Vision Language Models
(github.com/vlm-run)
6 points
EarlyOom
a year ago
discuss
4.
Unified Vision-Language Agents – Detect, Segment, OCR, Generate and More
(github.com/vlm-run)
5 points
fzysingularity
6 months ago
1 comment
5.
Show HN: Visually parse an entire YouTube video frame by frame
(github.com/vlm-run)
5 points
EarlyOom
a year ago
discuss
6.
Vlms-zero-to-hero: readings from the fundamentals to the cutting edge of VLMs
(github.com/SkalskiP)
2 points
swyx
a year ago
discuss
7.
Experimental Optical Encoder for Qwen3-VLM-2B-Instruct
(github.com/Volkopat)
1 point
volkopat2
7 months ago
1 comment
8.
Asn1c: The Lionet ASN.1 Compiler
(github.com/vlm)
1 point
fanf2
7 months ago
discuss
9.
Show HN: r1_vlm – Open-Source Framework for Visual Reasoning with GRPO
(github.com/groundlight)
5 points
skumar17
a year ago
8 comments
10.
Mlx-VLM: Fast Local VLMs and Omni Models on Apple Silicon with MLX
(github.com/Blaizzy)
2 points
salkahfi
2 months ago
discuss
11.
Show HN: I achieved over 10% improvement on 3D vision PointCLIP
(github.com/genji970)
2 points
genji970
a year ago
discuss
12.
Show HN: 2500 vision benchmarks / evals for Vision Language Models
(github.com/Overshoot-ai)
1 point
zakariaelhjouji
2 months ago
discuss
13.
Show HN: Vlm in 3D PC, 16 shot scanobjectnn top1 acc: 99.91
(github.com/genji970)
1 point
genji970
a year ago
discuss
14.
Super fast and accurate image classification on edge devices
(github.com/Paulescu)
1 point
PauLabartaBajo
8 months ago
discuss
15.
Show HN: Benchmarking VLMs vs. Traditional OCR
(getomni.ai)
146 points
themanmaran
a year ago
40 comments
16.
Show HN: LoongForge – A high-performance training framework for LLM, VLM, VLA, Wan
(github.com/baidu-baige)
10 points
mindzzz
14 days ago
2 comments
17.
Show HN: Cursed Browser – a VLM reads the HTML and hallucinates the page
(github.com/scosman)
7 points
scosman
10 days ago
1 comment
18.
Cursed_browser: Web browser with a VLM as rendering engine
(github.com/scosman)
4 points
misterdata
a month ago
discuss
19.
SketchVLM: Letting VLMs draw on images while explaining their reasoning
(github.com/Brandon-Collins7)
3 points
taesiri
a month ago
1 comment
20.
Show HN: Unsiloed Chunker – VLM powered semantic chunking for RAG
(github.com/Unsiloed-AI)
3 points
unsiloed-ai
a year ago
discuss
21.
Show HN: Vision AI Checkup, an Optometrist for VLMs
(visioncheckup.com)
2 points
zerojames
a year ago
discuss
22.
The simplest, fastest repository for training/finetuning small-sized VLMs
(github.com/huggingface)
2 points
s-macke
a year ago
discuss
23.
Advanced Quantization Algorithm for LLMs/VLMs
(github.com/intel)
2 points
XnoiVeX
a year ago
discuss
24.
Show HN: LLM / VLM language agent implementations
(github.com/arthurcolle)
2 points
arthurcolle
a year ago
discuss
25.
Show HN: A VLM-powered image search engine built with Ruby on Rails
(github.com/neonwatty)
2 points
neonwatty
a year ago
discuss
26.
Show HN: A/B test your own VLMs for document parsing (Self-hosted Arena)
(github.com/Bae-ChangHyun)
1 point
matthew624
4 months ago
discuss
27.
Show HN: Offline AI Photo Search (local VLM and semantic search)
(github.com/Pankaj4152)
1 point
Pankaj4152
6 months ago
discuss
28.
"Captions With Attitude" in the browser from local VLM using llama.cpp in Go
(github.com/hybridgroup)
1 point
deadprogram
7 months ago
discuss
29.
Pure Go hardware accelerated local inference on VLMs using llama.cpp
(github.com/hybridgroup)
1 point
deadprogram
7 months ago
discuss
30.
Yzma = embedding+inference on VLM/LLM/SLM/TLM in pure Go using llama.cpp
(github.com/hybridgroup)
1 point
deadprogram
8 months ago
discuss