Heykuki News
JetStream: Throughput+memory optimized engine for LLM inference on XLA devices
github.com/google
2 points
lnyan
2 years ago
No comments yet