Projects

Open-source and research projects.

NanoPD

A from-scratch prefill/decode (PD) disaggregated inference engine for LLMs, covering the full stack from CUDA kernels to adaptive routers (about 2,000 lines of Python and 400 lines of CUDA C++).

Blog post · GitHub

HPC Games

Competitive high-performance computing challenges at PKU.

ABC write-up · DE write-up


More projects will be added as I build them.
