Search: github.com/tnfe | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

631.

High Performance LLM Inference Operator Library from Tencent (github.com/Tencent)

1 point

4 months ago

632.

Show HN: ResourceAI – Local LLM inference optimized for consumer iGPUs

1 point

4 months ago

633.

Show HN: VelinScript 3.0 – a new language with bidirectional type inference (github.com/SkyliteDesign)

1 point

5 months ago

634.

Fast_topk_batched: High-performance batched Top-K selection for CPU inference (github.com/RAZZULLIX)

1 point

5 months ago

635.

Show HN: Adaptive-K – Cut MoE inference costs 30-50% with entropy-guided routing (github.com/Gabrobals)

1 point

Gabrielebalsamo

5 months ago

636.

Inference-Time Constitutional AI (github.com/mdiskint)

1 point

5 months ago

637.

WeDLM: Reconciling Diffusion Language Models with Standard Causal Attention for Fast Inference (github.com/Tencent)

1 point

5 months ago

638.

Show HN: Binfer, an experimental LLM inference engine in TypeScript and CUDA (github.com/bwasti)

1 point

6 months ago

639.

TileRT: Tile-Based Runtime for Ultra-Low-Latency LLM Inference (github.com/tile-ai)

1 point

7 months ago

640.

Pure Go hardware accelerated local inference on VLMs using llama.cpp (github.com/hybridgroup)

1 point

7 months ago

641.

Show HN: Serverless platform for inference of time-series foundation models (faim.it.com)

1 point

7 months ago

642.

LitServe: Build custom AI inference engines (github.com/Lightning-AI)

1 point

7 months ago

643.

Yzma = embedding+inference on VLM/LLM/SLM/TLM in pure Go using llama.cpp (github.com/hybridgroup)

1 point

8 months ago

644.

Build your own AI model inference engines (github.com/Lightning-AI)

1 point

8 months ago

645.

Open Retrieval-Based Inference Toolkit (github.com/schmitech)

1 point

10 months ago

646.

Pydantic/GenAI-prices – Calculate prices for calling LLM inference APIs (github.com/pydantic)

1 point

10 months ago

647.

Show HN: Pure CUDA C Inference for Qwen3 0.6B in One File, No Dependencies (github.com/gigit0000)

1 point

10 months ago

648.

Confidential AI Inference with Attestation: Run LLMs and Agents on TEEs (github.com/nearai)

1 point

a year ago

649.

Ask HN: What Inference Server do you use to host TTS Models?

1 point

a year ago

650.

ArtificialCast: Type-safe transformation powered by inference (github.com/Zorokee)

1 point

a year ago

651.

A collection of reproducible LLM inference engine benchmarks: SGLang vs. vLLM (github.com/Michaelvll)

1 point

a year ago

652.

The Path to Open-Sourcing the DeepSeek Inference Engine (github.com/deepseek-ai)

1 point

a year ago

653.

Show HN: SQL-based inference for Gradient Boosting Models (github.com/mattismegevand)

1 point

a year ago

654.

Show HN: Acord – A Daemon for AI Inference (github.com/alpaca-core)

1 point

a year ago

655.

Cost-efficient and pluggable Infrastructure components for GenAI inference (github.com/vllm-project)

1 point

a year ago

656.

Cost-efficient and pluggable Infrastructure components for GenAI inference (github.com/vllm-project)

1 point

a year ago

657.

Show HN: TokenFlow – Visualize LLM inference speed (dave.ly)

1 point

a year ago

658.

Show HN: Bodhi App – Local LLM Inference (getbodhi.app)

1 point

a year ago

659.

CUDA/Metal accelerated language model inference (github.com/zeux)

1 point

a year ago

660.

Computer vision models inference directly on mobile (github.com/software-mansion)

1 point

a year ago