Heykuki News

Top | New | Best | Ask | Show | Jobs
1.
FP8 runs ~100 TFLOPS faster when the kernel name has "cutlass" in it (github.com/triton-lang)
338 points
mmastrac
8 months ago
166 comments
2.
Gluon: a GPU programming language based on the same compiler stack as Triton (github.com/triton-lang)
83 points
matt_d
9 months ago
24 comments
3.
FP8 runs ~100 TFLOPS faster when the kernel name has "cutlass" in it (github.com/triton-lang)
4 points
mmastrac
a year ago
discuss
4.
Triton Extensions: a framework for developing and building compiler extensions (github.com/triton-lang)
2 points
matt_d
5 months ago
discuss
5.
Triton Plugins (github.com/triton-lang)
2 points
zer0zzz
6 months ago
discuss
6.
Triton Support for Blackwell (github.com/triton-lang)
2 points
elgatolopez
a year ago
discuss
7.
I fixed a segfault in Triton that broke every RTX 5070/5080/5090 (github.com/triton-lang)
1 point
pat90000
3 months ago
discuss
8.
Triton CUDA Tile IR Back End (github.com/triton-lang)
1 point
my123
4 months ago
discuss
9.
Automatic Warp Specialization in Triton (github.com/triton-lang)
1 point
subharmonicon
a year ago
discuss
10.
Triton Language and Compiler (github.com/openai)
3 points
tosh
3 years ago
discuss
11.
OpenAI Triton: language and compiler for highly efficient Deep-Learning (github.com/openai)
1 point
tosh
2 years ago
discuss
12.
Show HN: 80% faster, 50% less memory, 0% loss of accuracy Llama finetuning (github.com/unslothai)
385 points
danielhanchen
3 years ago
119 comments
13.
Show HN: Finetune Llama-3.1 2x faster in a Colab (colab.research.google.com)
16 points
danielhanchen
2 years ago
2 comments
14.
Finetune language models 30x faster (unsloth.ai)
2 points
danielhanchen
3 years ago
discuss
15.
Show HN: Efficient `torch.cdist` Using Triton (github.com/jinensetpal)
1 point
codeinassembly
a year ago
discuss