Heykuki News

Top | New | Best | Ask | Show | Jobs
361.
Show HN: Furnace – Rust and Burn inference server, zero Python, single binary
4 points
gilfeather
a year ago
discuss
362.
Fenic: The dataframe (re)built for LLM inference (github.com/typedef-ai)
4 points
asiramdas
a year ago
discuss
363.
Zorokee/ArtificialCast: Type-safe transformation powered by inference (github.com/Zorokee)
4 points
cratermoon
a year ago
discuss
364.
Bark.cpp: Port of Suno AI's Bark in C/C++ for fast inference (github.com/PABannier)
4 points
siraben
2 years ago
discuss
365.
Jetstream: New LLM Inference Engine (github.com/google)
4 points
gfortaine
2 years ago
discuss
366.
LLM Inference Endpoint Performance Benchmarking Tool (github.com/ray-project)
4 points
richardliaw
3 years ago
discuss
367.
Accelerating Inferencing Services with Kontain (github.com/kontainapp)
4 points
gnode1
3 years ago
discuss
368.
LLM-J: A pure Java implementation of a LLM inference engine (github.com/tjake)
4 points
mfiguiere
3 years ago
discuss
369.
Full GPU Inference of LLaMA on Apple Silicon Using Metal (github.com/ggerganov)
4 points
behnamoh
3 years ago
discuss
370.
Show HN: Deterministic objective Bayesian inference for spatial models [pdf] (buildingblock.ai)
4 points
rnburn
3 years ago
discuss
371.
Inference at the Edge (github.com/ggerganov)
4 points
Mizza
3 years ago
discuss
372.
Show HN: TypeScript query builder with full type inference (edgedb.com)
4 points
colinmcd
4 years ago
discuss
373.
What's new in Scala 2.8: type constructor inference (adriaanm.github.com)
4 points
DanielRibeiro
15 years ago
discuss
374.
Linux.Midrashim: x64 ELF infector virus (github.com/guitmz)
4 points
guitmz
6 years ago
discuss
375.
Show HN: Larq – Binarized Neural Network Inference with MLIR and TFLite (github.com/larq)
4 points
lgeiger
6 years ago
discuss
376.
Fast In-Browser Inference with ONNX.js, WebAssembly and WebGL (github.com/Microsoft)
4 points
0101111101
7 years ago
discuss
377.
Infer Clojure specs from sample data. Inspired by F#'s type providers (github.com/stathissideris)
4 points
tosh
9 years ago
discuss
378.
Show HN: oLLM – LLM Inference for large-context tasks on consumer GPUs (github.com/Mega4alik)
3 points
anuarsh
9 months ago
7 comments
379.
Show HN: A reasoning model that infers over whole tasks in 1ms in latent space (github.com/OrderOneAI)
3 points
orderone_ai
a year ago
6 comments
380.
Show HN: Standalone TurboQuant KV Cache Inference (github.com/g023)
3 points
g023
2 months ago
4 comments
381.
Ternative – C++/CUDA inference engine for ternary LLMs with runtime LoRA (github.com/michelangeloromerochisco)
3 points
michelangeloro
18 days ago
1 comment
382.
Xinity Runtime: Apache 2.0 LLM inference engine for on-premise deployment (github.com/xinity-ai)
3 points
xinity
2 months ago
1 comment
383.
Show HN: Dendrite – O(1) KV cache forking for tree-structured LLM inference (github.com/BioInfo)
3 points
RyeCatcher
2 months ago
1 comment
384.
Show HN: Kremis – Rust graph DB; every answer is fact, inference, or unknown (github.com/TyKolt)
3 points
TyKolt
3 months ago
1 comment
385.
vLLM-mlx – 65 tok/s LLM inference on Mac with tool calling and prompt caching (github.com/raullenchai)
3 points
raullen
3 months ago
1 comment
386.
A Distributed Inference Framework Enabling Running Models Exceeding Total Memory (github.com/firstbatchxyz)
3 points
driaforall
6 months ago
1 comment
387.
Metaphysical Priming reduces Gemini 3.0 Pro inference latency by 60% (github.com/Cactus-mp4)
3 points
cactus-jpg
6 months ago
1 comment
388.
SQLite AI – Local AI Inference, Powered by SQLite (github.com/sqliteai)
3 points
marcobambini
10 months ago
1 comment
389.
Neural Amp Modeler inference in web browsers using WebAssembly (TONE3000) (github.com/tone-3000)
3 points
woodybury
a year ago
1 comment
390.
Fastgen – SOTA LLM inference in 3k lines of Python (github.com/facebookresearch)
3 points
mpu
a year ago
1 comment