Heykuki News

Top New Best Ask Show Jobs
361.
Show HN: Slideo – Synchronize Slides with Video Using Computer Vision (OpenCV) (github.com/hediet)
2 points
Gehinnn
5 years ago
1 comment
362.
pip3 install videoflow - New library for computer vision on videos
2 points
jadielam
7 years ago
1 comment
363.
JavaScript Computer Vision library (inspirit.github.com)
2 points
divy
13 years ago
discuss
364.
Show HN: SoMatic – Vision-based OS automation framework for AI agents (github.com/Smyan1909)
2 points
smyansondur
16 days ago
discuss
365.
Show HN: Neuroscope – Real-time “x-ray vision” into LLMs’ minds (github.com/cjroth)
2 points
rothific
3 months ago
discuss
366.
Alibaba releases open-source vision model for native layered image editing (github.com/QwenLM)
2 points
bakigul
6 months ago
discuss
367.
Yzma – local Vision Language Models/LLMs in Go using llama.cpp without CGo (github.com/hybridgroup)
2 points
deadprogram
8 months ago
discuss
368.
Show HN: Magnitude MCP – vision-first browser interaction for Claude Code (github.com/sagekit)
2 points
anerli
8 months ago
discuss
369.
Show HN: Demo of AI-enabled voice/vision features on open source hardware [video] (youtube.com)
2 points
mmajzoobi
9 months ago
discuss
370.
Show HN: Plug-and-play Python utils for any computer-vision pipeline (github.com/roboflow)
2 points
birdinleconey
a year ago
discuss
371.
Show HN: I achieved over 10% improvement on 3D vision PointCLIP (github.com/genji970)
2 points
genji970
a year ago
discuss
372.
Smolvlm – Realtime Vision Language Model Demo (github.com/ngxson)
2 points
informal007
a year ago
discuss
373.
Search images like text using Vision Language Models (github.com/StarlightSearch)
2 points
r0rshrk
a year ago
discuss
374.
OmniTool – Control a Windows 11 VM with OmniParser plus vision model of choice (github.com/microsoft)
2 points
danboarder
a year ago
discuss
375.
Sparrow: Open-source data processing with ML, LLM and Vision LLM (github.com/katanaml)
2 points
madbiz
a year ago
discuss
376.
Visual Product Search: Combining React Native, Cloud Vision, Algolia, and Remix
2 points
iliashad
a year ago
discuss
377.
ShowUI: A lightweight vision-language-action model for GUI agents (github.com/showlab)
2 points
punkpeye
2 years ago
discuss
378.
BiomedGPT: A Generalist Vision-Language Foundation Model for Biomedical Tasks (github.com/taokz)
2 points
giuliomagnifico
2 years ago
discuss
379.
Mini-Omni2: Towards Open-Source GPT-4o with Vision, Speech, Duplex Capabilities (github.com/gpt-omni)
2 points
taikon
2 years ago
discuss
380.
Ollama with Experimental Vision Support (github.com/ollama)
2 points
rspoerri
2 years ago
discuss
381.
Show HN: Created a notebook to compare the top LMSYS vision models easily (github.com/Portkey-AI)
2 points
roh26it
2 years ago
discuss
382.
Recognize faces in photos using local models with Apple Vision (github.com/Nexuist)
2 points
nexuist
2 years ago
discuss
383.
Show HN: I made a simple unified LLM client with tool calling and vision support (github.com/piEsposito)
2 points
someguy12345678
2 years ago
discuss
384.
Implementation of Google's ScreenAI: Vision-Lang Model for UI and Understanding (github.com/kyegomez)
2 points
spxneo
2 years ago
discuss
385.
Apple Vision Pro and ROG Ally: Portable console gaming setup guide (gist.github.com)
2 points
osy
2 years ago
discuss
386.
Godot Support for VisionOS (github.com/kevinw)
2 points
dagmx
2 years ago
discuss
387.
TrackTales: Zero-shot narrator for mpd using GPT-4-vision (github.com/mlang)
2 points
lynx23
2 years ago
discuss
388.
Moe-LLaVA: Mixture of Experts for Large Vision-Language Models (github.com/PKU-YuanGroup)
2 points
GaggiX
2 years ago
discuss
389.
GPT Video – Reproducing the Gemini Demo Using GPT 4 Vision (github.com/jide)
2 points
ziptron
2 years ago
discuss
390.
GPT-Vision-first, most reliable open-source browser automation (github.com/vignshwarar)
2 points
georgehill
2 years ago
discuss