Heykuki News
Top
New
Best
Ask
Show
Jobs
271.
A complete Llama2 inference engine that fits in 1356 bytes of x86 assembly
(github.com/rdmsr)
27 points
monax
a month ago
discuss
272.
Peinjector: MITM PE file infector
(github.com/JonDoNym)
26 points
geographomics
11 years ago
4 comments
273.
CausalPy: A Python package for causal inference in quasi-experimental settings
(github.com/pymc-labs)
25 points
tplrbv
3 years ago
1 comment
274.
Copyleft license that infects dependent or operational software
(github.com/candiddev)
23 points
candiddevmike
3 years ago
4 comments
275.
Node9 – Inferno-Like Hosted OS Using LuaJIT
(github.com/jvburnes)
23 points
rcarmo
11 years ago
2 comments
276.
Show HN: KTransformers–236B Model and 1M Context LLM Inference on Local Machines
(github.com/kvcache-ai)
20 points
sssummer
2 years ago
3 comments
277.
OpenArc – Lightweight Inference Server for OpenVINO
(github.com/SearchSavior)
17 points
marban
a year ago
2 comments
278.
Show HN: Graphsignal – ML profiler to speed up training and inference
(github.com/graphsignal)
16 points
dmitrim
4 years ago
7 comments
279.
RTNeural – real-time neural network inferencing engine
(github.com/jatinchowdhury18)
16 points
ushakov
4 years ago
7 comments
280.
Speculative: PoC for speeding-up inference via speculative sampling by ggerganov
(github.com/ggerganov)
16 points
kristianp
3 years ago
1 comment
281.
RDMA-Powered Distributed Cache for Fast AI Training and Inference
(github.com/blackbird-io)
16 points
hackercat01012
9 months ago
discuss
282.
Show HN: An educational Local Qwen3 LLM Inference project written in Rust
(github.com/reinterpretcat)
15 points
eiskalt
a year ago
1 comment
283.
On-Device LLM Inference Powered by X-Bit Quantization
(github.com/Picovoice)
15 points
dynamix
2 years ago
discuss
284.
Show HN: An RDMA/Infiniband Distributed Cache for Fast Inference and Training
(github.com/blackbird-io)
13 points
hackercat0101
9 months ago
discuss
285.
Library for Machine Learning Security Evasion, Poisoning, Extraction, Inference
(github.com/Trusted-AI)
13 points
soheil
4 years ago
discuss
286.
CLIP inference in plain C/C++ with no extra dependencies
(github.com/monatis)
12 points
lawrencechen
3 years ago
2 comments
287.
Show HN: ClawRouter – Open-source LLM router that saves 78% on inference costs
(github.com/BlockRunAI)
12 points
vickyfu
4 months ago
1 comment
288.
DeepCamera: Local inference engine, Home Assistant intrusion detection AI camera
(github.com/SharpAI)
12 points
walterbell
4 years ago
1 comment
289.
Show HN: Lightweight Llama3 Inference Engine – CUDA C
(github.com/abhisheknair10)
12 points
abhisheknair10
a year ago
discuss
290.
LLM-D: Kubernetes-Native Distributed Inference at Scale
(github.com/llm-d)
10 points
bbzjk7
a year ago
2 comments
291.
Show HN: N0x – LLM inference, agents, RAG, Python exec in browser, no back end
(n0xth.vercel.app)
9 points
redhanuman
3 months ago
discuss
292.
Show HN: RDMA/Infiniband Distributed Cache for Fast Inference and Training
(github.com/blackbird-io)
9 points
hackercat010
9 months ago
discuss
293.
Show HN: NeuroFlow 55.8x video inference speedup for Vision Transformers PyTorch
(github.com/ynnk-research)
8 points
ynnk
11 days ago
2 comments
294.
BharatMLStack – Realtime Inference, MLOps
(github.com/Meesho)
8 points
shsethi
a year ago
1 comment
295.
CueTableReloader - Automatically infer animations for UITableView
(github.com/Cue)
8 points
asarazan
13 years ago
discuss
296.
Llama 2 inference from scratch in C++20 (No PyTorch/GGML, ARM NEON)
(github.com/farukalpay)
8 points
recoverydial
5 months ago
discuss
297.
Show HN: Bhumi–OSS Python Library w Rust Underhead for 2.5x Faster LLM Inference
(bhumi.trilok.ai)
8 points
rachpradhan
a year ago
discuss
298.
Show HN: ChainFactory – Run Structured LLM Inference with Easy Parallelism
(github.com/pankajgarkoti)
8 points
garkotipankaj
2 years ago
discuss
299.
gg: "M2 Ultra is the absolute best personal LLM inference node you can buy."
(github.com/ggerganov)
8 points
behnamoh
3 years ago
discuss
300.
LLaMA-rs: a Rust port of llama.cpp for fast LLaMA inference on CPU
(github.com/setzer22)
8 points
darthdeus
3 years ago
discuss