AI Data Center Systems
Appendix
Reusable background notes for LLM inference topics that appear across multiple weeks.
Hardware Architectures for LLM Inference
LLM Inference
Transformer