Heykuki News

511.
OMLX – Ollama for MLX (LLM Inference Server for Apple Silicon) (github.com/jundot)
2 points
fintechie
4 months ago
discuss
512.
Show HN: Omni-NLI – A multi-interface server for natural language inference
2 points
habedi0
4 months ago
discuss
513.
Show HN: EmbodIOS – AI Operating System with Kernel-Level Inference (github.com/dddimcha)
2 points
dddimcha
5 months ago
discuss
514.
Rig: Distributed LLM inference across machines in Rust (github.com/buyukakyuz)
2 points
corrode2711
5 months ago
discuss
515.
Tract: Self-contained, TensorFlow and ONNX inference (github.com/sonos)
2 points
vishnukvmd
5 months ago
discuss
516.
EmbodIOS - AI inference as the operating system (3.5s cold start) (github.com/dddimcha)
2 points
dddimcha
5 months ago
discuss
517.
HF-mem: CLI to estimate inference memory requirements for Hugging Face models (github.com/alvarobartt)
2 points
handfuloflight
5 months ago
discuss
518.
Mini-SGLang: A lightweight yet high-performance inference framework for LLM (github.com/sgl-project)
2 points
limoce
6 months ago
discuss
519.
Go apps can directly integrate llama.cpp for HW accelerated local inference (github.com/hybridgroup)
2 points
deadprogram
6 months ago
discuss
520.
Show HN: Olla – Lightweight LLM Proxy for Homelab and OnPrem AI Inference
2 points
thushanfernando
10 months ago
discuss
521.
WebAssembly binding for llama.cpp – Enabling on-browser LLM inference (github.com/ngxson)
2 points
selvan
a year ago
discuss
522.
Show HN: Dwani.ai – multimodal inference API for Indian languages (dwani.ai)
2 points
gaganyatri
a year ago
discuss
523.
GPT4Free: "educational project" for free LLM inference from various services (github.com/xtekky)
2 points
bobbiechen
a year ago
discuss
524.
OmniPainter: Training-Free Stylized Text-to-Image Generation with Fast Inference (github.com/maxin-cn)
2 points
dvrp
a year ago
discuss
525.
GPU-enabled Llama 3 inference in Java from scratch (github.com/beehive-lab)
2 points
mikepapadim
a year ago
discuss
526.
BitNet 1.58bit GPU Inference Kernel (github.com/microsoft)
2 points
galeos
a year ago
discuss
527.
Kubernetes-native distributed LLM inference framework (github.com/llm-d)
2 points
baijum
a year ago
discuss
528.
Show HN: Contextual AI Document Parser – Infer hierarchy for long, complex docs
2 points
ishan_sinha
a year ago
discuss
529.
Lambda calculus - compiler, type inference, and evaluator in less than 100 LOC (gist.github.com)
2 points
tearflake
a year ago
discuss
530.
Protobuf-ts-types: zero-codegen TypeScript type inference from protobuf messages (github.com/nathanhleung)
2 points
18nleung
a year ago
discuss
531.
Eagle-3 Speculative Decoding for LLM Inference (5.6x speedup) (github.com/SafeAILab)
2 points
summarity
a year ago
discuss
532.
Show HN: Kernel-level LLM inference via /dev/llm0 (github.com/randombk)
2 points
RandomBK
a year ago
discuss
533.
Rust Type Inference Broke with Update to Deranged Crate (github.com/jhpratt)
2 points
nethunters
a year ago
discuss
534.
DeepDive: In-Depth Decryption of LLMs Construction and Inference from Scratch (github.com/therealoliver)
2 points
therealoliver
a year ago
discuss
535.
Show HN: OptiLLMBench – Test how inference optimization tricks scale up LLMs
2 points
codelion
a year ago
discuss
536.
Deepseek.cpp: CPU inference for the DeepSeek family of LLMs in pure C++ (github.com/andrewkchan)
2 points
hedgehog0
a year ago
discuss
537.
Jlama: LLM Inference Engine for Java (github.com/tjake)
2 points
saikatsg
a year ago
discuss
538.
Show HN: EmbedAnything – Rust Powered Inference, Ingestion and Indexing (github.com/StarlightSearch)
2 points
Sonam_AI
a year ago
discuss
539.
JetStream: Throughput+memory optimized engine for LLM inference on XLA devices (github.com/google)
2 points
lnyan
2 years ago
discuss
540.
Duck-Lisp: optional free-form parenthesis inference (github.com/oitzujoey)
2 points
nemoniac
2 years ago
discuss