Heykuki News

Top | New | Best | Ask | Show | Jobs
31.
Show HN: 1gbps Tokenizer written in Assembly. 20x faster than HuggingFace (github.com/dogmaticdev)
3 points
dogmaticdev
a month ago
2 comments
32.
Node.js Open Source LLM Tokenizer (github.com/jakecyr)
2 points
jakecyr
2 years ago
1 comment
33.
LLM Tokenization Demo (github.com/tokfan)
2 points
tokfan
10 months ago
discuss
34.
The Worst (But Only) Claude 3 Tokenizer (github.com/javirandor)
2 points
dpaleka
2 years ago
discuss
35.
Neural Tokenizer (github.com/Kyubyong)
2 points
kyubyong
9 years ago
discuss
36.
Show HN: CLI Tokenizer – A tiny tool for prompt engineers (github.com/ericciarla)
1 point
ericciarla
2 years ago
discuss
37.
Stripe on Apple watchOS 3 (github.com/appintheair)
4 points
Bayram
10 years ago
discuss
38.
LLM Tokenizer in Zig (github.com/Mario-SO)
1 point
mariodev__
9 months ago
discuss
39.
Very simple javascript highlighter that can be used in blog posts (github.com/fatih-erikli)
1 point
fatih-erikli
2 years ago
discuss
40.
Show HN: I'm writing a library to apply NLP techniques to StarCraft 2 (github.com/ZephyrBlu)
1 point
ZephyrBlu
5 years ago
discuss
41.
CmusicAi a New Cryptocurrency for Artists
2 points
robinair
2 years ago
discuss
42.
PRFI Protocol: Decentralized API Tokenization with Proof-of-Work Mining (github.com/sr-oliveiraa)
1 point
gustavudeoli
10 months ago
discuss
43.
Card Network Tokenization: A Savior or Hidden Menace (github.com/juspay)
1 point
manojr13
3 years ago
discuss
44.
Show HN: Tokenkit – Convert LLMs to new tokenizers (incl byte-level Llama/Gemma) (github.com/bminixhofer)
1 point
bminixhofer
a year ago
discuss
45.
Show HN: TokenDagger – A tokenizer faster than OpenAI's Tiktoken (github.com/M4THYOU)
281 points
matthewolfe
a year ago
73 comments
46.
Tiktoken: OpenAI’s Tokenizer (github.com/openai)
153 points
azhenley
3 years ago
74 comments
47.
Code for the Byte Pair Encoding algorithm, commonly used in LLM tokenization (github.com/karpathy)
81 points
magoghm
2 years ago
31 comments
48.
55x Speedup of Andrej Karpathy's Minbpe LLM Tokenizer with PyTorch/CUDA (github.com/kuprel)
19 points
kuprel
2 years ago
9 comments
49.
Show HN: Open-source card tokenization service in Rust (github.com/juspay)
14 points
thala
3 years ago
discuss
50.
XML Tokenizer that's 4x faster than stdlib's XML (github.com/muktihari)
10 points
todsacerdoti
2 years ago
1 comment
51.
Show HN: A Command-Line Sentence Tokenizer Written in Golang (github.com/neurosnap)
6 points
qudat
11 years ago
1 comment
52.
From Scratch GPT Built with NumPy (Tokenizer, Model, Adam) (github.com/codiceSpaghetti)
6 points
xnan
a year ago
discuss
53.
Show HN: Rust BPE tokenizer for Qwen models that's 12x faster than HuggingFace (github.com/sweepai)
5 points
williamzeng0
8 months ago
discuss
54.
Chiffon: A very small ECMAScript parser, tokenizer in JS (github.com/polygonplanet)
5 points
shawndumas
11 years ago
discuss
55.
Show HN: Tokenomics of a Reserve Currency (github.com/Intercoin)
4 points
EGreg
5 years ago
discuss
56.
Jargon: tokenizers and lemmatizers for Go (github.com/clipperhouse)
4 points
mwsherman
8 years ago
discuss
57.
Fast JSON parser in Rust that uses SIMD and avoids tokenisation (github.com/pikkr)
4 points
tambourine_man
9 years ago
discuss
58.
OpenAI's Tokenizer Page for OSS Models (github.com/1rgs)
3 points
rgs224
3 years ago
1 comment
59.
VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and Voice Cloning (github.com/OpenBMB)
3 points
chaosprint
6 months ago
discuss
60.
SSDD: Single-Step Diffusion Decoder for Efficient Image Tokenization (github.com/facebookresearch)
3 points
montyanderson
8 months ago
discuss