Heykuki News
Top
New
Best
Ask
Show
Jobs
31.
▲
Writing an LLM from scratch, part 12 – multi-head attention
(gilesthomas.com)
3 points
gpjt
a year ago
discuss
32.
▲
Getting MathML to render properly in Chrome-based browsers
(gilesthomas.com)
3 points
LorenDB
a year ago
discuss
33.
▲
How Do LLMs Work?
(gilesthomas.com)
2 points
gpjt
9 months ago
1 comment
34.
▲
Jax Back Ends and Devices
(gilesthomas.com)
2 points
gpjt
a day ago
discuss
35.
▲
Using Safetensors with Flax
(gilesthomas.com)
2 points
gpjt
2 days ago
discuss
36.
▲
First Looking into Jax
(gilesthomas.com)
2 points
ibobev
5 days ago
discuss
37.
▲
10Gb Ethernet: what I had to (re)learn
(gilesthomas.com)
2 points
ibobev
a month ago
discuss
38.
▲
LLM from scratch, part 32k – Interventions: gradient accumulation
(gilesthomas.com)
2 points
gpjt
2 months ago
discuss
39.
▲
LLM from scratch, part 32j – trying to train a better model in the cloud
(gilesthomas.com)
2 points
gpjt
2 months ago
discuss
40.
▲
Writing an LLM from scratch, part 32i – Interventions: what is in the noise?
(gilesthomas.com)
2 points
ibobev
2 months ago
discuss
41.
▲
Writing an LLM from scratch, part 32h – Interventions: full fat float32
(gilesthomas.com)
2 points
ibobev
2 months ago
discuss
42.
▲
Automating starting Lambda Labs instances
(gilesthomas.com)
2 points
ibobev
2 months ago
discuss
43.
▲
Writing an LLM from scratch, part 32g – Interventions: weight tying
(gilesthomas.com)
2 points
ibobev
2 months ago
discuss
44.
▲
Writing an LLM from scratch, part 32g – Interventions: weight tying
(gilesthomas.com)
2 points
gpjt
2 months ago
discuss
45.
▲
Writing an LLM from scratch, part 32B – Interventions: gradient clipping
(gilesthomas.com)
2 points
gpjt
4 months ago
discuss
46.
▲
Writing an LLM from scratch, part 31 – the models are now on Hugging Face
(gilesthomas.com)
2 points
gpjt
5 months ago
discuss
47.
▲
LLM from scratch, part 29 – using DDP to train a base model in the cloud
(gilesthomas.com)
2 points
gpjt
5 months ago
discuss
48.
▲
Why smart instruction-following makes prompt injection easier
(gilesthomas.com)
2 points
ibobev
7 months ago
discuss
49.
▲
Writing an LLM from scratch, part 25 – instruction fine-tuning
(gilesthomas.com)
2 points
gpjt
7 months ago
discuss
50.
▲
Revisiting Karpathy's 'Unreasonable Effectiveness of Recurrent Neural Networks'
(gilesthomas.com)
2 points
gpjt
8 months ago
discuss
51.
▲
What AI chatbots are doing under the hood
(gilesthomas.com)
2 points
gpjt
9 months ago
discuss
52.
▲
LLM from scratch, part 18 – residuals, shortcut connections, and the Talmud
(gilesthomas.com)
2 points
gpjt
10 months ago
discuss
53.
▲
Writing an LLM from scratch, part 11 – batches
(gilesthomas.com)
2 points
gpjt
a year ago
discuss
54.
▲
LLM Quantisation Weirdness
(gilesthomas.com)
2 points
gpjt
2 years ago
discuss
55.
▲
Fun with Google Books Ngram Viewer and the long S
(gilesthomas.com)
2 points
gpjt
15 years ago
discuss
56.
▲
How to bet on the bubble? (with list of 2010/11 YC startup hosting providers)
(gilesthomas.com)
1 point
gpjt
15 years ago
7 comments
57.
▲
How many python programmers are there in the World today?
(gilesthomas.com)
1 point
lifeisstillgood
12 years ago
2 comments
58.
▲
SNI-based Reverse Proxying for SSL connections
(gilesthomas.com)
1 point
chesh
13 years ago
1 comment
59.
▲
10Gb Ethernet: what I had to (re)learn
(gilesthomas.com)
1 point
gpjt
a month ago
1 comment
60.
▲
Do reasoning LLMs need their own Philosophical Language?
(gilesthomas.com)
1 point
gpjt
a year ago
1 comment