Heykuki News
Top
New
Best
Ask
Show
Jobs
121.
Hypura – A storage-tier-aware LLM inference scheduler for Apple Silicon
(github.com/t8)
221 points
tatef
2 months ago
85 comments
122.
Microsoft BitNet: inference framework for 1-bit LLMs
(github.com/microsoft)
173 points
galeos
2 years ago
33 comments
123.
Show HN: Apple II clock using interrupts from physical pendulum clock
(github.com/wkjagt)
157 points
wkjagt
2 years ago
29 comments
124.
Launch HN: Cactus (YC S25) – AI inference on smartphones
(github.com/cactus-compute)
123 points
HenryNdubuaku
9 months ago
63 comments
125.
Parakeet.cpp – Parakeet ASR inference in pure C++ with Metal GPU acceleration
(github.com/Frikallo)
114 points
noahkay13
3 months ago
31 comments
126.
Open source inference time compute example from HuggingFace
(github.com/huggingface)
88 points
burningion
a year ago
26 comments
127.
Fast GPT-2 inference written in Fortran
(github.com/certik)
83 points
Loic
3 years ago
13 comments
128.
Show HN: ZSE – Open-source LLM inference engine with 3.9s cold starts
(github.com/Zyora-Dev)
58 points
zyoralabs
3 months ago
9 comments
129.
GPU-accelerated Llama3.java inference in pure Java using TornadoVM
(github.com/beehive-lab)
48 points
pjmlp
a year ago
discuss
130.
Show HN: LLM, a Rust Crate/CLI for CPU Inference of LLMs (LLaMA, GPT-NeoX, etc.)
(github.com/rustformers)
45 points
Philpax
3 years ago
4 comments
131.
Show HN: React-hint – 150LoC Tooltip Component for React, Preact and Inferno
(github.com/slmgc)
37 points
slmgc
9 years ago
15 comments
132.
DoWhy is a Python library for causal inference
(github.com/py-why)
37 points
nabla9
4 years ago
2 comments
133.
ZML - High performance AI inference stack
(github.com/zml)
36 points
msoad
2 years ago
12 comments
134.
DeepSeek-V3/R1 Inference System Overview
(github.com/deepseek-ai)
27 points
meetpateltech
a year ago
6 comments
135.
Node9 – Inferno-Like Hosted OS Using LuaJIT
(github.com/jvburnes)
23 points
rcarmo
11 years ago
2 comments
136.
RTNeural – real-time neural network inferencing engine
(github.com/jatinchowdhury18)
16 points
ushakov
4 years ago
7 comments
137.
RDMA-Powered Distributed Cache for Fast AI Training and Inference
(github.com/blackbird-io)
16 points
hackercat01012
9 months ago
discuss
138.
CLIP inference in plain C/C++ with no extra dependencies
(github.com/monatis)
12 points
lawrencechen
3 years ago
2 comments
139.
DeepCamera: Local inference engine, Home Assistant intrusion detection AI camera
(github.com/SharpAI)
12 points
walterbell
4 years ago
1 comment
140.
Show HN: N0x – LLM inference, agents, RAG, Python exec in browser, no back end
(n0xth.vercel.app)
9 points
redhanuman
3 months ago
discuss
141.
BharatMLStack – Realtime Inference, MLOps
(github.com/Meesho)
8 points
shsethi
a year ago
1 comment
142.
Show HN: Bhumi–OSS Python Library w Rust Underhead for 2.5x Faster LLM Inference
(bhumi.trilok.ai)
8 points
rachpradhan
a year ago
discuss
143.
gg: "M2 Ultra is the absolute best personal LLM inference node you can buy."
(github.com/ggerganov)
8 points
behnamoh
3 years ago
discuss
144.
Alpa: Auto-parallelizing large model training and inference (by UC Berkeley)
(github.com/alpa-projects)
7 points
zhisbug
4 years ago
1 comment
145.
Show HN: Secure XGBoost training and inference on encrypted data
(github.com/mc2-project)
7 points
chesterl
6 years ago
1 comment
146.
Show HN: Composable middleware for LLM inference Optimization Passes
(github.com/liquidos-ai)
7 points
human_hack3r
3 months ago
discuss
147.
Llama.cpp: Deterministic Inference Mode (CUDA): RMSNorm, MatMul, Attention
(github.com/ggml-org)
6 points
diwank
9 months ago
discuss
148.
Rust+OpenCL+AVX2 implementation of LLaMA inference code
(github.com/Noeda)
6 points
myers
3 years ago
discuss
149.
Ask HN: What is the best tool to infer data type of tabular data?
5 points
mahalel
5 years ago
7 comments
150.
I implemented CLIP inference in plain C/C++
(github.com/monatis)
5 points
monatis
3 years ago
1 comment