Search: github.com/tnfe | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

601.

ParaAttention: Speed Up Flux and Mochi Inference with Multiple GPUs (github.com/chengzeyi)

1 point

2 years ago

602.

Llama Deck: CLI for running multiple language implementations of LLM inference (github.com/xxxbf0222)

1 point

2 years ago

603.

Benchmarked Llama2 and Mistral across popular inference engines and precisions (github.com/premAI-io)

1 point

2 years ago

604.

Curated List of 50 Open-Source LLM Inference Tools: Seeking Contributions (github.com/vince-lam)

1 point

2 years ago

605.

Show HN: Fortran inference code for the Mamba state space language model (github.com/rbitr)

1 point

2 years ago

606.

GPT-Fast: Simple and efficient GPT inference in <1000 LOC of Python (github.com/pytorch-labs)

1 point

3 years ago

607.

Generate Nix packages from URLs with hash prefetching and dependency inference (github.com/nix-community)

1 point

3 years ago

608.

Show HN: Kylo – Simple FAQ Bot Built with Facebook's Infersent (github.com/avinassh)

1 point

7 years ago

609.

Clevr-Iep: Inferring and Executing Programs for Visual Reasoning (github.com/facebookresearch)

1 point

9 years ago

610.

XcodeGhost infected Apps List (github.com/zengyun-programmer)

1 point

11 years ago

611.

Configurable zombie infection simulation (github.com/Ellzord)

1 point

11 years ago

612.

Why Gemma-4 26B MoE works in HuggingFace but breaks in prod inference engines (github.com/maeddesg)

1 point

23 days ago

613.

Show HN: Gosd: High-performance Stable Diffusion inference in pure Go (no CGO) (github.com/l8bloom)

1 point

a month ago

614.

WebLLM is a high-performance in-browser LLM inference engine (github.com/mlc-ai)

1 point

a month ago

615.

Rcarmo/gte-go: Golang inference for the GTE Small embedding model (github.com/rcarmo)

1 point

a month ago

616.

Show HN: JibarOS, a shared inference runtime for Android (github.com/Jibar-OS)

1 point

2 months ago

617.

ORAC-NT MedChem Copilot that blocks synthetically infeasible molecules (github.com/Kretski)

1 point

2 months ago

618.

New ML inference language dropped today (github.com/m0at)

1 point

2 months ago

619.

QuantumLeap: 2.3× faster MoE inference with intelligent expert caching (github.com/MartinCrespoC)

1 point

2 months ago

620.

Show HN: Mamba SSM in Rust – training and inference with custom CUDA kernels (github.com/silvermpx)

1 point

3 months ago

621.

Show HN: Go LLM inference with a Vulkan GPU back end that beats Ollama's CUDA (github.com/computerex)

1 point

3 months ago

622.

Speculative Speculative Decoding: Really, Really Fast LLM Inference (github.com/tanishqkumar)

1 point

3 months ago

623.

Show HN: SAM 3 Inference on Modal in Under 10 Seconds (github.com/TheFloatingString)

1 point

3 months ago

624.

Show HN: oMLX – Native Mac inference server that persists KV cache to SSD (github.com/jundot)

1 point

4 months ago

625.

MicroGPT: train & inference in 243 lines of code (gist.github.com)

1 point

4 months ago

626.

MicroGPT - Train and inference a GPT in pure, dependency-free Python (200 lines) (gist.github.com)

1 point

4 months ago

627.

Show HN: ARIA Protocol – P2P distributed 1-bit LLM inference at 120 tok/s on CPU (github.com/spmfrance-cloud)

1 point

4 months ago

628.

Show HN: ARIA – P2P distributed inference protocol for 1-bit LLMs on CPU (github.com/spmfrance-cloud)

1 point

4 months ago

629.

Show HN: Weed – Minimalist AI/ML inference and backpropagation in the style of Qrack (github.com/vm6502q)

1 point

wrathfulspatula

4 months ago

630.

PowerInfer: Fast LLM Inference on a Consumer-Grade GPU (github.com/Tiiny-AI)

1 point

4 months ago