Heykuki News

Top New Best Ask Show Jobs
31.
Writing an LLM from scratch, part 12 – multi-head attention (gilesthomas.com)
3 points
gpjt
a year ago
discuss
32.
Getting MathML to render properly in Chrome-based browsers (gilesthomas.com)
3 points
LorenDB
a year ago
discuss
33.
How Do LLMs Work? (gilesthomas.com)
2 points
gpjt
9 months ago
1 comment
34.
Jax Back Ends and Devices (gilesthomas.com)
2 points
gpjt
a day ago
discuss
35.
Using Safetensors with Flax (gilesthomas.com)
2 points
gpjt
2 days ago
discuss
36.
First Looking into Jax (gilesthomas.com)
2 points
ibobev
5 days ago
discuss
37.
10Gb Ethernet: what I had to (re)learn (gilesthomas.com)
2 points
ibobev
a month ago
discuss
38.
LLM from scratch, part 32k – Interventions: gradient accumulation (gilesthomas.com)
2 points
gpjt
2 months ago
discuss
39.
LLM from scratch, part 32j – trying to train a better model in the cloud (gilesthomas.com)
2 points
gpjt
2 months ago
discuss
40.
Writing an LLM from scratch, part 32i – Interventions: what is in the noise? (gilesthomas.com)
2 points
ibobev
2 months ago
discuss
41.
Writing an LLM from scratch, part 32h – Interventions: full fat float32 (gilesthomas.com)
2 points
ibobev
2 months ago
discuss
42.
Automating starting Lambda Labs instances (gilesthomas.com)
2 points
ibobev
2 months ago
discuss
43.
Writing an LLM from scratch, part 32g – Interventions: weight tying (gilesthomas.com)
2 points
ibobev
2 months ago
discuss
44.
Writing an LLM from scratch, part 32g – Interventions: weight tying (gilesthomas.com)
2 points
gpjt
2 months ago
discuss
45.
Writing an LLM from scratch, part 32B – Interventions: gradient clipping (gilesthomas.com)
2 points
gpjt
4 months ago
discuss
46.
Writing an LLM from scratch, part 31 – the models are now on Hugging Face (gilesthomas.com)
2 points
gpjt
5 months ago
discuss
47.
LLM from scratch, part 29 – using DDP to train a base model in the cloud (gilesthomas.com)
2 points
gpjt
5 months ago
discuss
48.
Why smart instruction-following makes prompt injection easier (gilesthomas.com)
2 points
ibobev
7 months ago
discuss
49.
Writing an LLM from scratch, part 25 – instruction fine-tuning (gilesthomas.com)
2 points
gpjt
7 months ago
discuss
50.
Revisiting Karpathy's 'Unreasonable Effectiveness of Recurrent Neural Networks' (gilesthomas.com)
2 points
gpjt
8 months ago
discuss
51.
What AI chatbots are doing under the hood (gilesthomas.com)
2 points
gpjt
9 months ago
discuss
52.
LLM from scratch, part 18 – residuals, shortcut connections, and the Talmud (gilesthomas.com)
2 points
gpjt
10 months ago
discuss
53.
Writing an LLM from scratch, part 11 – batches (gilesthomas.com)
2 points
gpjt
a year ago
discuss
54.
LLM Quantisation Weirdness (gilesthomas.com)
2 points
gpjt
2 years ago
discuss
55.
Fun with Google Books Ngram Viewer and the long S (gilesthomas.com)
2 points
gpjt
15 years ago
discuss
56.
How to bet on the bubble? (with list of 2010/11 YC startup hosting providers) (gilesthomas.com)
1 point
gpjt
15 years ago
7 comments
57.
How many python programmers are there in the World today? (gilesthomas.com)
1 point
lifeisstillgood
12 years ago
2 comments
58.
SNI-based Reverse Proxying for SSL connections (gilesthomas.com)
1 point
chesh
13 years ago
1 comment
59.
10Gb Ethernet: what I had to (re)learn (gilesthomas.com)
1 point
gpjt
a month ago
1 comment
60.
Do reasoning LLMs need their own Philosophical Language? (gilesthomas.com)
1 point
gpjt
a year ago
1 comment