Projects
Open-source and research projects.
NanoPD
A from-scratch prefill/decode (PD) disaggregated inference engine for LLMs, covering the full stack from CUDA kernels to adaptive routers (about 2,000 lines of Python and 400 lines of CUDA C++).
HPC Games
Competitive high-performance computing challenges at Peking University (PKU).
More projects will appear here as I build them.