Search: gilesthomas.com | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

61.

Using Safetensors with Flax (gilesthomas.com)

1 point

2 days ago

62.

10Gb/s Ethernet: using mini-heatsinks with a 10GBASE-T SFP+ module (gilesthomas.com)

1 point

19 days ago

63.

LLM from scratch (32l) – Interventions: updated instruction fine-tuning results (gilesthomas.com)

1 point

2 months ago

64.

An LLM becomes more coherent as we train it (gilesthomas.com)

1 point

2 months ago

65.

Interventions: Trying to train a better model in the cloud (gilesthomas.com)

1 point

2 months ago

66.

Writing an LLM from scratch, part 32i – Interventions: what is in the noise? (gilesthomas.com)

1 point

2 months ago

67.

Writing an LLM from scratch, part 32b – Interventions: gradient clipping (gilesthomas.com)

1 point

4 months ago

68.

Writing an LLM from scratch, part 32c – Interventions: removing dropout (gilesthomas.com)

1 point

4 months ago

69.

Writing an LLM from scratch, part 32d – Interventions: adding attention bias (gilesthomas.com)

1 point

4 months ago

70.

Writing an LLM from scratch, part 32c – Interventions: removing dropout (gilesthomas.com)

1 point

4 months ago

71.

Writing an LLM from scratch, part 32a – Interventions: training a baseline model (gilesthomas.com)

1 point

4 months ago

72.

Getting a Custom PyTorch LLM onto the Hugging Face Hub (gilesthomas.com)

1 point

4 months ago

73.

Getting a Custom PyTorch LLM onto the Hugging Face Hub (gilesthomas.com)

1 point

4 months ago

74.

Writing an LLM from scratch, part 31 – the models are now on Hugging Face (gilesthomas.com)

1 point

5 months ago

75.

Digging into the LLM-as-a-Judge Results (gilesthomas.com)

1 point

5 months ago

76.

Digging into the LLM-as-a-Judge Results (gilesthomas.com)

1 point

5 months ago

77.

Writing an LLM from scratch, part 30 – digging into the LLM-as-a-judge results (gilesthomas.com)

1 point

5 months ago

78.

Writing an LLM from scratch, part 27 – what's left, and what's next? (gilesthomas.com)

1 point

7 months ago

79.

Writing an LLM from scratch, part 24 – the transcript hack (gilesthomas.com)

1 point

7 months ago

80.

Retro Language Models: Rebuilding Karpathy's RNN in PyTorch (gilesthomas.com)

1 point

7 months ago

81.

Writing an LLM from scratch, part 23 – fine-tuning for classification (gilesthomas.com)

1 point

7 months ago

82.

Writing an LLM from scratch, part 23 – fine-tuning for classification (gilesthomas.com)

1 point

7 months ago

83.

Revisiting Karpathy's 'The Unreasonable Effectiveness of RNNs' (gilesthomas.com)

1 point

8 months ago

84.

Writing an LLM from scratch, part 21 – perplexed by perplexity (gilesthomas.com)

1 point

8 months ago

85.

Writing an LLM from scratch, part 21 – perplexed by perplexity (gilesthomas.com)

1 point

8 months ago

86.

How Do LLMs Work? (gilesthomas.com)

1 point

9 months ago

87.

The fixed length bottleneck and the feed forward network (gilesthomas.com)

1 point

10 months ago

88.

Writing an LLM from scratch, part 16 – layer normalisation (gilesthomas.com)

1 point

a year ago

89.

Writing an LLM from scratch, part 14 – the complexity of self-attention at scale (gilesthomas.com)

1 point

a year ago

90.

Adding /Llms.txt (gilesthomas.com)

1 point

a year ago