Search: github.com/tnfe | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

511.

OMLX – Ollama for MLX (LLM Inference Server for Apple Silicon) (github.com/jundot)

2 points

4 months ago

512.

Show HN: Omni-NLI – A multi-interface server for natural language inference

2 points

4 months ago

513.

Show HN: EmbodIOS – AI Operating System with Kernel-Level Inference (github.com/dddimcha)

2 points

5 months ago

514.

Rig: Distributed LLM inference across machines in Rust (github.com/buyukakyuz)

2 points

5 months ago

515.

Tract: Self-contained, TensorFlow and ONNX inference (github.com/sonos)

2 points

5 months ago

516.

EmbodIOS - AI inference as the operating system (3.5s cold start) (github.com/dddimcha)

2 points

5 months ago

517.

HF-mem: CLI to estimate inference memory requirements for Hugging Face models (github.com/alvarobartt)

2 points

5 months ago

518.

Mini-SGLang: A lightweight yet high-performance inference framework for LLM (github.com/sgl-project)

2 points

6 months ago

519.

Go apps can directly integrate llama.cpp for HW accelerated local inference (github.com/hybridgroup)

2 points

6 months ago

520.

Show HN: Olla – Lightweight LLM Proxy for Homelab and OnPrem AI Inference

2 points

thushanfernando

10 months ago

521.

WebAssembly binding for llama.cpp – Enabling on-browser LLM inference (github.com/ngxson)

2 points

a year ago

522.

Show HN: Dwani.ai – multimodal inference API for Indian languages (dwani.ai)

2 points

a year ago

523.

GPT4Free: "educational project" for free LLM inference from various services (github.com/xtekky)

2 points

a year ago

524.

OmniPainter: Training-Free Stylized Text-to-Image Generation with Fast Inference (github.com/maxin-cn)

2 points

a year ago

525.

GPU-enabled Llama 3 inference in Java from scratch (github.com/beehive-lab)

2 points

a year ago

526.

BitNet 1.58bit GPU Inference Kernel (github.com/microsoft)

2 points

a year ago

527.

Kubernetes-native distributed LLM inference framework (github.com/llm-d)

2 points

a year ago

528.

Show HN: Contextual AI Document Parser – Infer hierarchy for long, complex docs

2 points

a year ago

529.

Lambda calculus - compiler, type inference, and evaluator in less than 100 LOC (gist.github.com)

2 points

a year ago

530.

Protobuf-ts-types: zero-codegen TypeScript type inference from protobuf messages (github.com/nathanhleung)

2 points

a year ago

531.

Eagle-3 Speculative Decoding for LLM Inference (5.6x speedup) (github.com/SafeAILab)

2 points

a year ago

532.

Show HN: Kernel-level LLM inference via /dev/llm0 (github.com/randombk)

2 points

a year ago

533.

Rust Type Inference Broke with Update to Deranged Crate (github.com/jhpratt)

2 points

a year ago

534.

DeepDive: In-Depth Decryption of LLMs Construction and Inference from Scratch (github.com/therealoliver)

2 points

a year ago

535.

Show HN: OptiLLMBench – Test how inference optimization tricks scale up LLMs

2 points

a year ago

536.

Deepseek.cpp: CPU inference for the DeepSeek family of LLMs in pure C++ (github.com/andrewkchan)

2 points

a year ago

537.

Jlama: LLM Inference Engine for Java (github.com/tjake)

2 points

a year ago

538.

Show HN: EmbedAnything – Rust Powered Inference, Ingestion and Indexing (github.com/StarlightSearch)

2 points

a year ago

539.

JetStream: Throughput+memory optimized engine for LLM inference on XLA devices (github.com/google)

2 points

2 years ago

540.

Duck-Lisp: optional free-form parenthesis inference (github.com/oitzujoey)

2 points

2 years ago