Heykuki News
31.
Show HN: 1gbps Tokenizer written in Assembly. 20x faster than HuggingFace
(github.com/dogmaticdev)
3 points
dogmaticdev
a month ago
2 comments
32.
Node.js Open Source LLM Tokenizer
(github.com/jakecyr)
2 points
jakecyr
2 years ago
1 comment
33.
LLM Tokenization Demo
(github.com/tokfan)
2 points
tokfan
10 months ago
discuss
34.
The Worst (But Only) Claude 3 Tokenizer
(github.com/javirandor)
2 points
dpaleka
2 years ago
discuss
35.
Neural Tokenizer
(github.com/Kyubyong)
2 points
kyubyong
9 years ago
discuss
36.
Show HN: CLI Tokenizer – A tiny tool for prompt engineers
(github.com/ericciarla)
1 point
ericciarla
2 years ago
discuss
37.
Stripe on Apple watchOS 3
(github.com/appintheair)
4 points
Bayram
10 years ago
discuss
38.
LLM Tokenizer in Zig
(github.com/Mario-SO)
1 point
mariodev__
9 months ago
discuss
39.
Very simple javascript highlighter that can be used in blog posts
(github.com/fatih-erikli)
1 point
fatih-erikli
2 years ago
discuss
40.
Show HN: I'm writing a library to apply NLP techniques to StarCraft 2
(github.com/ZephyrBlu)
1 point
ZephyrBlu
5 years ago
discuss
41.
CmusicAi a New Cryptocurrency for Artists
2 points
robinair
2 years ago
discuss
42.
PRFI Protocol: Decentralized API Tokenization with Proof-of-Work Mining
(github.com/sr-oliveiraa)
1 point
gustavudeoli
10 months ago
discuss
43.
Card Network Tokenization: A Savior or Hidden Menace
(github.com/juspay)
1 point
manojr13
3 years ago
discuss
44.
Show HN: Tokenkit – Convert LLMs to new tokenizers (incl byte-level Llama/Gemma)
(github.com/bminixhofer)
1 point
bminixhofer
a year ago
discuss
45.
Show HN: TokenDagger – A tokenizer faster than OpenAI's Tiktoken
(github.com/M4THYOU)
281 points
matthewolfe
a year ago
73 comments
46.
Tiktoken: OpenAI’s Tokenizer
(github.com/openai)
153 points
azhenley
3 years ago
74 comments
47.
Code for the Byte Pair Encoding algorithm, commonly used in LLM tokenization
(github.com/karpathy)
81 points
magoghm
2 years ago
31 comments
48.
55x Speedup of Andrej Karpathy's Minbpe LLM Tokenizer with PyTorch/CUDA
(github.com/kuprel)
19 points
kuprel
2 years ago
9 comments
49.
Show HN: Open-source card tokenization service in Rust
(github.com/juspay)
14 points
thala
3 years ago
discuss
50.
XML Tokenizer that's 4x faster than stdlib's XML
(github.com/muktihari)
10 points
todsacerdoti
2 years ago
1 comment
51.
Show HN: A Command-Line Sentence Tokenizer Written in Golang
(github.com/neurosnap)
6 points
qudat
11 years ago
1 comment
52.
From Scratch GPT Built with NumPy (Tokenizer, Model, Adam)
(github.com/codiceSpaghetti)
6 points
xnan
a year ago
discuss
53.
Show HN: Rust BPE tokenizer for Qwen models that's 12x faster than HuggingFace
(github.com/sweepai)
5 points
williamzeng0
8 months ago
discuss
54.
Chiffon: A very small ECMAScript parser, tokenizer in JS
(github.com/polygonplanet)
5 points
shawndumas
11 years ago
discuss
55.
Show HN: Tokenomics of a Reserve Currency
(github.com/Intercoin)
4 points
EGreg
5 years ago
discuss
56.
Jargon: tokenizers and lemmatizers for Go
(github.com/clipperhouse)
4 points
mwsherman
8 years ago
discuss
57.
Fast JSON parser in Rust that uses SIMD and avoids tokenisation
(github.com/pikkr)
4 points
tambourine_man
9 years ago
discuss
58.
OpenAI's Tokenizer Page for OSS Models
(github.com/1rgs)
3 points
rgs224
3 years ago
1 comment
59.
VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and Voice Cloning
(github.com/OpenBMB)
3 points
chaosprint
6 months ago
discuss
60.
SSDD: Single-Step Diffusion Decoder for Efficient Image Tokenization
(github.com/facebookresearch)
3 points
montyanderson
8 months ago
discuss