Heykuki News
Top
New
Best
Ask
Show
Jobs
361.
Show HN: Slideo – Synchronize Slides with Video Using Computer Vision (OpenCV)
(github.com/hediet)
2 points
Gehinnn
5 years ago
1 comment
362.
pip3 install videoflow - New library for computer vision on videos
2 points
jadielam
7 years ago
1 comment
363.
JavaScript Computer Vision library.
(inspirit.github.com)
2 points
divy
13 years ago
discuss
364.
Show HN: SoMatic – Vision-based OS automation framework for AI agents
(github.com/Smyan1909)
2 points
smyansondur
16 days ago
discuss
365.
Show HN: Neuroscope – Real-time “x-ray vision” into LLMs’ minds
(github.com/cjroth)
2 points
rothific
3 months ago
discuss
366.
Alibaba releases open-source vision model for native layered image editing
(github.com/QwenLM)
2 points
bakigul
6 months ago
discuss
367.
Yzma – local Vision Language Models/LLMs in Go using llama.cpp without CGo
(github.com/hybridgroup)
2 points
deadprogram
8 months ago
discuss
368.
Show HN: Magnitude MCP – vision-first browser interaction for Claude Code
(github.com/sagekit)
2 points
anerli
8 months ago
discuss
369.
Show HN: Demo of AI-enabled voice/vision features on open source hardware [video]
(youtube.com)
2 points
mmajzoobi
9 months ago
discuss
370.
Show HN: Plug-and-play Python utils for any computer-vision pipeline
(github.com/roboflow)
2 points
birdinleconey
a year ago
discuss
371.
Show HN: I achieved over 10% improvement on 3D vision PointCLIP
(github.com/genji970)
2 points
genji970
a year ago
discuss
372.
SmolVLM – Realtime Vision Language Model Demo
(github.com/ngxson)
2 points
informal007
a year ago
discuss
373.
Search images like text using Vision Language Models
(github.com/StarlightSearch)
2 points
r0rshrk
a year ago
discuss
374.
OmniTool – Control a Windows 11 VM with OmniParser plus vision model of choice
(github.com/microsoft)
2 points
danboarder
a year ago
discuss
375.
Sparrow: Open-source data processing with ML, LLM and Vision LLM
(github.com/katanaml)
2 points
madbiz
a year ago
discuss
376.
Visual Product Search: Combining React Native, Cloud Vision, Algolia, and Remix
2 points
iliashad
a year ago
discuss
377.
ShowUI: A lightweight vision-language-action model for GUI agents
(github.com/showlab)
2 points
punkpeye
2 years ago
discuss
378.
BiomedGPT: A Generalist Vision-Language Foundation Model for Biomedical Tasks
(github.com/taokz)
2 points
giuliomagnifico
2 years ago
discuss
379.
Mini-Omni2: Towards Open-Source GPT-4o with Vision, Speech, Duplex Capabilities
(github.com/gpt-omni)
2 points
taikon
2 years ago
discuss
380.
Ollama with Experimental Vision Support
(github.com/ollama)
2 points
rspoerri
2 years ago
discuss
381.
Show HN: Created a notebook to compare the top LMSYS vision models easily
(github.com/Portkey-AI)
2 points
roh26it
2 years ago
discuss
382.
Recognize faces in photos using local models with Apple Vision
(github.com/Nexuist)
2 points
nexuist
2 years ago
discuss
383.
Show HN: I made a simple unified LLM client with tool calling and vision support
(github.com/piEsposito)
2 points
someguy12345678
2 years ago
discuss
384.
Implementation of Google's ScreenAI: Vision-Lang Model for UI and Understanding
(github.com/kyegomez)
2 points
spxneo
2 years ago
discuss
385.
Apple Vision Pro and ROG Ally: Portable console gaming setup guide
(gist.github.com)
2 points
osy
2 years ago
discuss
386.
Godot Support for VisionOS
(github.com/kevinw)
2 points
dagmx
2 years ago
discuss
387.
TrackTales: Zero-shot narrator for mpd using GPT-4-vision
(github.com/mlang)
2 points
lynx23
2 years ago
discuss
388.
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
(github.com/PKU-YuanGroup)
2 points
GaggiX
2 years ago
discuss
389.
GPT Video – Reproducing the Gemini Demo Using GPT 4 Vision
(github.com/jide)
2 points
ziptron
2 years ago
discuss
390.
GPT-Vision-first, most reliable open-source browser automation
(github.com/vignshwarar)
2 points
georgehill
2 years ago
discuss