A high-throughput and memory-efficient inference and serving engine for LLMs
Topics: amd · cuda · inference · pytorch · transformer · llama · gpt · rocm · model-serving · tpu · hpu · mlops · xpu · llm · inferentia · llmops · llm-serving · qwen · deepseek · trainium
Updated Jun 27, 2025 · Python
Easy and lightning-fast training of 🤗 Transformers on the Habana Gaudi processor (HPU)
Computer vision for tuberculosis classification and lung segmentation.