Search: github.com/kmoe | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

211.

Show HN: Tok/s on a 35B MoE model using a $100 AMD crypto APU and Vulkan (github.com/akandr)

2 points

2 months ago

212.

Mistral: Light-weight library for mixture-of-experts (MoE) training (github.com/mistralai)

2 points

2 years ago

213.

Show HN: Ported Cerebras REAP to MLX – Prune MoE Experts on a MacBook (github.com/egesabanci)

2 points

4 days ago

214.

Live 204-node MoE visualization reveals emergent cognitive stratification (github.com/eriirfos-eng)

2 points

15 days ago

215.

Show HN: 35B MoE LLM and other models locally on an old AMD crypto APU (BC250) (github.com/akandr)

2 points

3 months ago

216.

Dots.llm1: open-source MoE LLM with 142B total and 14B active parameters (github.com/rednote-hilab)

2 points

a year ago

217.

Every Flop Counts: Scaling 300B MoE LLMs Without Premium GPUs [pdf] (github.com/inclusionAI)

2 points

a year ago

218.

Lamini Memory Tuning: near-perfect fact recall via 1M-way MoE [pdf] (github.com/lamini-ai)

2 points

2 years ago

219.

Show HN: SwiftLM – Qwen Chat on iPhone, 100B+ MoE on M5 Pro 64GB (Native Swift) (github.com/SharpAI)

1 point

2 months ago

220.

DirectStorage LLM Weight Streaming: 4x faster loading, MoE expert streaming (github.com/kibbyd)

1 point

4 months ago

221.

Micro-Expert-Router: Running Mixtral-Class MoE Models on NVMe SSDs Without a GPU (github.com/randyap8-wq)

1 point

10 days ago

222.

Why Gemma-4 26B MoE works in HuggingFace but breaks in prod inference engines (github.com/maeddesg)

1 point

22 days ago

223.

Has anyone else hit expert homogeneity collapse in small MoE models? (github.com/eriirfos-eng)

1 point

a month ago

224.

ARCHE3-7B – Sparse MoE with SmartRouter and Foundation Curriculum Training

1 point

OpenSynapseLabs

2 months ago

225.

QuantumLeap: 2.3× faster MoE inference with intelligent expert caching (github.com/MartinCrespoC)

1 point

2 months ago

226.

Show HN: Adaptive-K – Cut MoE inference costs 30-50% with entropy-guided routing (github.com/Gabrobals)

1 point

Gabrielebalsamo

5 months ago

227.

Show HN: LLM Inference Performance Analytic Tool for MoE Models (DeepSeek/etc.) (github.com/kevinyuan)

1 point

6 months ago

228.

DeepSeek-VL2: MoE Vision-Language Models for Advanced Multimodal Understanding [pdf] (github.com/deepseek-ai)

1 point

a year ago

229.

Aria: Open Multimodal Native MoE (github.com/rhymes-ai)

1 point

2 years ago

230.

Yosoro – Moe Style Markdown NoteBook (github.com/IceEnd)

1 point

8 years ago

231.

Show HN: Terminal-Bench-RL: Training long-horizon terminal agents with RL (github.com/Danau5tin)

125 points

10 months ago

232.

Show HN: Pica – Rust-based agentic AI infrastructure (open-source) (picaos.com)

63 points

a year ago

233.

Show HN: LeanRL: Fast PyTorch RL with Torch.compile and CUDA Graphs (github.com/pytorch-labs)

53 points

2 years ago

234.

Show HN: KTransformers–236B Model and 1M Context LLM Inference on Local Machines (github.com/kvcache-ai)

20 points

2 years ago

235.

Show HN: Run 500B+ Parameter LLMs Locally on a Mac Mini (github.com/opengraviton)

17 points

3 months ago

236.

Show HN: Lemonade: Run LLMs Locally with GPU and NPU Acceleration (github.com/lemonade-sdk)

15 points

10 months ago

237.

Show HN: KTransformers:671B DeepSeek-R1 on a Single Machine-286 tokens/s Prefill (github.com/kvcache-ai)

14 points

a year ago

238.

Show HN: OpenGraviton – Run 500B+ parameter models on a consumer Mac Mini (opengraviton.github.io)

13 points

3 months ago

239.

Show HN: Trained an LLM to predict "What will Trump do?" (huggingface.co)

10 points

4 months ago

240.

Show HN: GhydraMCP – Agentic reverse engineering across multiple binaries (github.com/teal-bauer)

5 points

a year ago