Heykuki News
Top
New
Best
Ask
Show
Jobs
1.
▲
FP8 runs ~100 TFLOPS faster when the kernel name has "cutlass" in it
(github.com/triton-lang)
338 points
mmastrac
8 months ago
166 comments
2.
▲
Gluon: a GPU programming language based on the same compiler stack as Triton
(github.com/triton-lang)
83 points
matt_d
9 months ago
24 comments
3.
▲
FP8 runs ~100 TFLOPS faster when the kernel name has "cutlass" in it
(github.com/triton-lang)
4 points
mmastrac
a year ago
discuss
4.
▲
Triton Extensions: a framework for developing and building compiler extensions
(github.com/triton-lang)
2 points
matt_d
5 months ago
discuss
5.
▲
Triton Plugins
(github.com/triton-lang)
2 points
zer0zzz
6 months ago
discuss
6.
▲
Triton Support for Blackwell
(github.com/triton-lang)
2 points
elgatolopez
a year ago
discuss
7.
▲
I fixed a segfault in Triton that broke every RTX 5070/5080/5090
(github.com/triton-lang)
1 point
pat90000
3 months ago
discuss
8.
▲
Triton CUDA Tile IR Back End
(github.com/triton-lang)
1 point
my123
4 months ago
discuss
9.
▲
Automatic Warp Specialization in Triton
(github.com/triton-lang)
1 point
subharmonicon
a year ago
discuss
10.
▲
Triton Language and Compiler
(github.com/openai)
3 points
tosh
3 years ago
discuss
11.
▲
OpenAI Triton: language and compiler for highly efficient Deep-Learning
(github.com/openai)
1 point
tosh
2 years ago
discuss
12.
▲
Show HN: 80% faster, 50% less memory, 0% loss of accuracy Llama finetuning
(github.com/unslothai)
385 points
danielhanchen
3 years ago
119 comments
13.
▲
Show HN: Finetune Llama-3.1 2x faster in a Colab
(colab.research.google.com)
16 points
danielhanchen
2 years ago
2 comments
14.
▲
Finetune language models 30x faster
(unsloth.ai)
2 points
danielhanchen
3 years ago
discuss
15.
▲
Show HN: Efficient `torch.cdist` Using Triton
(github.com/jinensetpal)
1 point
codeinassembly
a year ago
discuss