Overview
Date | Topic | Readings |
---|---|---|
Wed 08/27 | Introduction: datacenter infra | |
Mon 09/01 | How to profile your system | |
Network core
Date | Topic | Readings |
---|---|---|
Wed 09/03 | Datacenter topology | Fat-Tree; Jupiter Rising. Optional: Facebook’s Data Center Fabric; VL2; Jellyfish |
Mon 09/08 | Congestion control I: Rate control | DCQCN; HPCC. Optional: DCTCP; TIMELY |
Wed 09/10 | Congestion control II: Admission control | Homa; dcPIM. Optional: NDP; FastPass; pHost |
Mon 09/15 | Load balancing | Hedera; CONGA. Optional: HULA; PLB; Conweave |
Wed 09/17 | Networking infra for ML | HPN; Distributed AI at Meta |
Mon 09/22 | Network communication for ML | TCCL; CRUX. Optional: TACCL; TE-CCL |
Wed 09/24 | Optical networking | Jupiter Evolving; SIRIUS. Optional: Lightwave Fabrics; Rotornet; ProjecToR |
Host HW
Date | Topic | Readings |
---|---|---|
Mon 09/29 | Interconnect I: PCIe | Understanding PCIe Perf; Routable PCIe. Optional: Low-latency compatible PCIe |
Wed 10/01 | Interconnect II: CXL | POND; CXL. Optional: TPP; CXL-virtual |
Mon 10/06 | Interconnect III: CXL II | Model CXL; Cord |
Wed 10/08 | Silicon Photonics | sip-ML; Lightning. Optional: Server-scale SP connectivity |
Mon 10/13 | No class | — |
Wed 10/15 | No class | Project Proposal Deadline |
Mon 10/20 | Host Network I | Understanding the Host Network; Manageable Host. Optional: Hostping; HostCC; Federated Cache |
Wed 10/22 | Host Network II | GH; AMD Exascale |
Mon 10/27 | FPGA-based computing/networking | Faery; ACCL+. Optional: OS abstraction; DUA |
OS layer
Date | Topic | Readings |
---|---|---|
Wed 10/29 | Software network stacks | Understanding Overheads; NetChannel. Optional: Shenango; eRPC |
Mon 11/03 | Hardware network stacks | ZeroNIC; FlexTOE. Optional: Tonic; BlueField |
Wed 11/05 | Storage stacks | XRP; blk-switch. Optional: FlashShare |
Mon 11/10 | Remote storage stacks | i10; nvme-tcp |
Wed 11/12 | CPU scheduling | ZygOS; Caladan. Optional: Arachne; Syrup |
Mon 11/17 | Memory management | Colloid; TMO. Optional: MEMTIS |
Wed 11/19 | I/O Memory protection | F&S; Scalable IOMMU. Optional: vIOMMU; V-PROBE |
Mon 11/24 | I/O Virtualization | PicNIC; FreeFlow. Optional: FVM |
ML Sys
Date | Topic | Readings |
---|---|---|
Wed 11/26 | No class | — |
Mon 12/01 | System for inference | vLLM; Orca. Optional: Parrot; AQUA |
Wed 12/03 | System for training | NCCL; DeepSeek-v3 |
Project presentations
Date | Topic | Readings |
---|---|---|
Mon 12/08 | Student project presentations (TBD) | — |