Heykuki News
Top
New
Best
Ask
Show
Jobs
601.
▲
ParaAttention: Speed Up Flux and Mochi Inference with Multiple GPUs
(github.com/chengzeyi)
1 point
chengzeyi
2 years ago
1 comment
602.
▲
Llama Deck: CLI for running multiple language implementations of LLM inference
(github.com/xxxbf0222)
1 point
mikepapadim
2 years ago
1 comment
603.
▲
Benchmarked Llama2 and mistral across popular inference engines and precisions
(github.com/premAI-io)
1 point
anindya2002
2 years ago
1 comment
604.
▲
Curated List of 50 Open-Source LLM Inference Tools: Seeking Contributions
(github.com/vince-lam)
1 point
vincelam
2 years ago
1 comment
605.
▲
Show HN: Fortran inference code for the Mamba state space language model
(github.com/rbitr)
1 point
andy99
2 years ago
1 comment
606.
▲
GPT-Fast: Simple and efficient GPT inference in <1000 LOC of Python
(github.com/pytorch-labs)
1 point
Palmik
3 years ago
1 comment
607.
▲
Generate Nix packages from URLs with hash prefetching and dependency inference
(github.com/nix-community)
1 point
figsoda
3 years ago
1 comment
608.
▲
Show HN: Kylo – Simple FAQ Bot Built with Facebook's Infersent
(github.com/avinassh)
1 point
avinassh
7 years ago
1 comment
609.
▲
Clevr-Iep: Inferring and Executing Programs for Visual Reasoning
(github.com/facebookresearch)
1 point
runesoerensen
9 years ago
1 comment
610.
▲
XcodeGhost infected Apps List
(github.com/zengyun-programmer)
1 point
dengjh
11 years ago
1 comment
611.
▲
Configurable zombie infection simulation
(github.com/Ellzord)
1 point
javinpaul
11 years ago
discuss
612.
▲
Why Gemma-4 26B MoE works in HuggingFace but breaks in prod inference engines
(github.com/maeddesg)
1 point
maeddesg
23 days ago
discuss
613.
▲
Show HN: Gosd: High-performance Stable Diffusion inference in pure Go (no CGO)
(github.com/l8bloom)
1 point
krakato
a month ago
discuss
614.
▲
WebLLM is a high-performance in-browser LLM inference engine
(github.com/mlc-ai)
1 point
doener
a month ago
discuss
615.
▲
Rcarmo/gte-go: Golang inference for the GTE Small embedding model
(github.com/rcarmo)
1 point
rcarmo
a month ago
discuss
616.
▲
Show HN: JibarOS, a shared inference runtime for Android
(github.com/Jibar-OS)
1 point
rafaelvalle03
2 months ago
discuss
617.
▲
ORAC-NT MedChem Copilot that blocks synthetically infeasible molecules
(github.com/Kretski)
1 point
DREDREG
2 months ago
discuss
618.
▲
New ML inference language dropped today
(github.com/m0at)
1 point
sfffs
2 months ago
discuss
619.
▲
QuantumLeap: 2.3× faster MoE inference with intelligent expert caching
(github.com/MartinCrespoC)
1 point
ikharoz
2 months ago
discuss
620.
▲
Show HN: Mamba SSM in Rust – training and inference with custom CUDA kernels
(github.com/silvermpx)
1 point
silvermpx
3 months ago
discuss
621.
▲
Show HN: Go LLM inference with a Vulkan GPU back end that beats Ollama's CUDA
(github.com/computerex)
1 point
computerex
3 months ago
discuss
622.
▲
Speculative Speculative Decoding: Really, Really Fast LLM Inference
(github.com/tanishqkumar)
1 point
fizzbuzz07
3 months ago
discuss
623.
▲
Show HN: SAM 3 Inference on Modal in Under 10 Seconds
(github.com/TheFloatingString)
1 point
larryll
3 months ago
discuss
624.
▲
Show HN: oMLX – Native Mac inference server that persists KV cache to SSD
(github.com/jundot)
1 point
jundot
4 months ago
discuss
625.
▲
MicroGPT: train & inference in 243 lines of code
(gist.github.com)
1 point
RyanShook
4 months ago
discuss
626.
▲
MicroGPT - Train and inference a GPT in pure, dependency-free Python (200 lines)
(gist.github.com)
1 point
susam
4 months ago
discuss
627.
▲
Show HN: ARIA Protocol – P2P distributed 1-bit LLM inference at 120 tok/s on CPU
(github.com/spmfrance-cloud)
1 point
anthonymu
4 months ago
discuss
628.
▲
Show HN: ARIA – P2P distributed inference protocol for 1-bit LLMs on CPU
(github.com/spmfrance-cloud)
1 point
anthonymu
4 months ago
discuss
629.
▲
Show HN: Weed – Minimalist AI/ML inference and backpropagation in the style of Qrack
(github.com/vm6502q)
1 point
wrathfulspatula
4 months ago
discuss
630.
▲
PowerInfer: Fast LLM Inference on a Consumer-Grade GPU
(github.com/Tiiny-AI)
1 point
oldfuture
4 months ago
discuss