Heykuki News
Top
New
Best
Ask
Show
Jobs
331.
▲
MetalChat – Llama Inference for Apple Silicon
(github.com/ybubnov)
5 points
ybubnov
4 months ago
discuss
332.
▲
Voxtral.c Voxtral Realtime 4B model inference as a C library
(github.com/antirez)
5 points
antirez
4 months ago
discuss
333.
▲
llama2.zig: Inference Llama 2 in one file of pure Zig
(github.com/cgbur)
5 points
tosh
6 months ago
discuss
334.
▲
T-Mac: Low-bit LLM inference on CPU/NPU with lookup table
(github.com/microsoft)
5 points
nateb2022
8 months ago
discuss
335.
▲
Show HN: gline-rs – an inference engine for GLiNER models, in Rust
(github.com/fbilhaut)
5 points
fbilhaut
a year ago
discuss
336.
▲
Fast LLM Inference in Rust
(github.com/EricLBuehler)
5 points
goranmoomin
2 years ago
discuss
337.
▲
Fast and hackable PyTorch native transformer inference
(github.com/pytorch-labs)
5 points
gavi
3 years ago
discuss
338.
▲
Lepton: An open-source library (Apache 2.0) for scaling model inference
(github.com/leptonai)
5 points
Jimmc414
3 years ago
discuss
339.
▲
Run LLaMA Inference on CPU, with Rust
(github.com/rustformers)
5 points
kristianpaul
3 years ago
discuss
340.
▲
Three-processor inference on AMD Ryzen AI 300
(github.com/Peterc3-dev)
4 points
peterc3dev
2 months ago
2 comments
341.
▲
LangPatrol: A static analyzer for LLM prompts that catches bugs before inference
(github.com/langpatrol)
4 points
mmarvin
6 months ago
2 comments
342.
▲
Show HN: Inference Mixtral 8x7B in pure Rust
(github.com/moritztng)
4 points
molli
2 years ago
2 comments
343.
▲
Show HN: Ggml.js – Serverless AI Inference on Browser with Web Assembly
(rahuldshetty.github.io)
4 points
anonymousd3vil
3 years ago
2 comments
344.
▲
TensorSharp: Open-Source Local LLM Inference Engine
(github.com/zhongkaifu)
4 points
zhongkaifu
3 days ago
1 comment
345.
▲
Train and inference GPT in 243 lines of pure, dependency-free Python by Karpathy
(gist.github.com)
4 points
itvision
4 months ago
1 comment
346.
▲
PasLLM: An Object Pascal inference engine for LLM models
(github.com/BeRo1985)
4 points
nor-and-or-not
6 months ago
1 comment
347.
▲
Distributed-Llama: Connect home devices into a cluster for LLM inference
(github.com/b4rtaz)
4 points
tosh
a year ago
1 comment
348.
▲
Practical Llama 3 inference in Java
(github.com/mukel)
4 points
mukel
2 years ago
1 comment
349.
▲
Llama.cpp speculative sampling: 2x faster inference for large models
(github.com/ggerganov)
4 points
bobivl
3 years ago
1 comment
350.
▲
Zig GPT-2 inference engine
(github.com/EugenHotaj)
4 points
eugenhotaj
3 years ago
1 comment
351.
▲
Stable Diffusion inference locally on iOS / macOS using MPSGraph
(github.com/mortenjust)
4 points
consumer451
4 years ago
1 comment
352.
▲
Pytype checks and infers types for your Python code
(github.com/google)
4 points
mkesper
7 years ago
1 comment
353.
▲
Inferential database seeding in Clojure
(michaeldrogalis.github.com)
4 points
MichaelDrogalis
13 years ago
discuss
354.
▲
Show HN: Static-allocation MLP inference in ANSI C using a 2-slot ring buffer
(github.com/GiorgosXou)
4 points
xou
9 days ago
discuss
355.
▲
Mtplx – 2.24x faster TPS – The native MTP inference engine for Apple Silicon
(github.com/youssofal)
4 points
youssof
a month ago
discuss
356.
▲
Show HN: Open-source GDPR router for LLMs detects PII, forces EU-only inference
(github.com/mahadillahm4di-cyber)
4 points
mahadillah-ai
2 months ago
discuss
357.
▲
Iris – a C inference pipeline for image synthesis models
(github.com/antirez)
4 points
nnx
2 months ago
discuss
358.
▲
Show HN: Our command line tool to transpile AI Inference from Python to C++
(github.com/muna-ai)
4 points
olokobayusuf
4 months ago
discuss
359.
▲
Show HN: I wrote inference for Qwen3 0.6B in C/CUDA
(github.com/asdf93074)
4 points
mk93074
8 months ago
discuss
360.
▲
Show HN: Klartraum, a neural rendering inference engine
(github.com/fortmeier)
4 points
fortmeier
a year ago
discuss