Heykuki News
Top
New
Best
Ask
Show
Jobs
211.
▲
Show HN: Tok/s on a 35B MoE model using a $100 AMD crypto APU and Vulkan
(github.com/akandr)
2 points
akandr
2 months ago
1 comment
212.
▲
Mistral: Light-weight library for mixture-of-experts (MoE) training
(github.com/mistralai)
2 points
georgehill
2 years ago
1 comment
213.
▲
Show HN: Ported Cerebras REAP to MLX – Prune MoE Experts on a MacBook
(github.com/egesabanci)
2 points
egesabanci
4 days ago
discuss
214.
▲
Live 204-node MoE visualization reveals emergent cognitive stratification
(github.com/eriirfos-eng)
2 points
rfi-irfos
15 days ago
discuss
215.
▲
Show HN: 35B MoE LLM and other models locally on an old AMD crypto APU (BC250)
(github.com/akandr)
2 points
akandr
3 months ago
discuss
216.
▲
Dots.llm1: open-source MoE LLM with 142B total and 14B active parameters
(github.com/rednote-hilab)
2 points
simonpure
a year ago
discuss
217.
▲
Every Flop Counts: Scaling 300B MoE LLMs Without Premium GPUs [pdf]
(github.com/inclusionAI)
2 points
mountainview
a year ago
discuss
218.
▲
Lamini Memory Tuning: near-perfect fact recall via 1M-way MoE [pdf]
(github.com/lamini-ai)
2 points
Bluestein
2 years ago
discuss
219.
▲
Show HN: SwiftLM – Qwen Chat on iPhone, 100B+ MoE on M5 Pro 64GB (Native Swift)
(github.com/SharpAI)
1 point
aegis_camera
2 months ago
2 comments
220.
▲
DirectStorage LLM Weight Streaming: 4x faster loading, MoE expert streaming
(github.com/kibbyd)
1 point
kibbyd1985
4 months ago
1 comment
221.
▲
Micro-Expert-Router: Running Mixtral-Class MoE Models on NVMe SSDs Without a GPU
(github.com/randyap8-wq)
1 point
randyap8
10 days ago
discuss
222.
▲
Why Gemma-4 26B MoE works in HuggingFace but breaks in prod inference engines
(github.com/maeddesg)
1 point
maeddesg
22 days ago
discuss
223.
▲
Has anyone else hit expert homogeneity collapse in small MoE models?
(github.com/eriirfos-eng)
1 point
rfi-irfos
a month ago
discuss
224.
▲
ARCHE3-7B – Sparse MoE with SmartRouter and Foundation Curriculum Training
1 point
OpenSynapseLabs
2 months ago
discuss
225.
▲
QuantumLeap: 2.3× faster MoE inference with intelligent expert caching
(github.com/MartinCrespoC)
1 point
ikharoz
2 months ago
discuss
226.
▲
Show HN: Adaptive-K – Cut MoE inference costs 30-50% with entropy-guided routing
(github.com/Gabrobals)
1 point
Gabrielebalsamo
5 months ago
discuss
227.
▲
Show HN: LLM Inference Performance Analytic Tool for MoE Models (DeepSeek/etc.)
(github.com/kevinyuan)
1 point
kevin-2025
6 months ago
discuss
228.
▲
DeepSeek-VL2: MoE Vision-Language Models for Advanced Multimodal Understanding [pdf]
(github.com/deepseek-ai)
1 point
limoce
a year ago
discuss
229.
▲
Aria: Open Multimodal Native MoE
(github.com/rhymes-ai)
1 point
simonpure
2 years ago
discuss
230.
▲
Yosoro – Moe Style Markdown NoteBook
(github.com/IceEnd)
1 point
tvvocold
8 years ago
discuss
231.
▲
Show HN: Terminal-Bench-RL: Training long-horizon terminal agents with RL
(github.com/Danau5tin)
125 points
Danau5tin
10 months ago
12 comments
232.
▲
Show HN: Pica – Rust-based agentic AI infrastructure (open-source)
(picaos.com)
63 points
moekatib
a year ago
44 comments
233.
▲
Show HN: LeanRL: Fast PyTorch RL with Torch.compile and CUDA Graphs
(github.com/pytorch-labs)
53 points
vmoens
2 years ago
5 comments
234.
▲
Show HN: KTransformers–236B Model and 1M Context LLM Inference on Local Machines
(github.com/kvcache-ai)
20 points
sssummer
2 years ago
3 comments
235.
▲
Show HN: Run 500B+ Parameter LLMs Locally on a Mac Mini
(github.com/opengraviton)
17 points
fatihturker
3 months ago
10 comments
236.
▲
Show HN: Lemonade: Run LLMs Locally with GPU and NPU Acceleration
(github.com/lemonade-sdk)
15 points
ramkrishna2910
10 months ago
discuss
237.
▲
Show HN: KTransformers: 671B DeepSeek-R1 on a Single Machine, 286 tokens/s Prefill
(github.com/kvcache-ai)
14 points
sssummer
a year ago
discuss
238.
▲
Show HN: OpenGraviton – Run 500B+ parameter models on a consumer Mac Mini
(opengraviton.github.io)
13 points
fatihturker
3 months ago
5 comments
239.
▲
Show HN: Trained an LLM to predict "What will Trump do?"
(huggingface.co)
10 points
bturtel
4 months ago
2 comments
240.
▲
Show HN: GhydraMCP – Agentic reverse engineering across multiple binaries
(github.com/teal-bauer)
5 points
moeffju
a year ago
discuss