About Me

I am a fourth-year Ph.D. candidate in Computer Science at the University of Pennsylvania’s Distributed Systems Lab, advised by Prof. Boon Thau Loo and Prof. Vincent Liu. My research centers on resource-efficient large language model (LLM) systems:

  • Serving architectures for large-scale and multimodal LLMs,
  • Reinforcement learning frameworks for post-training reasoning models.

Prior to Penn, I was a Master’s student at Columbia University, where I conducted research on privacy-preserving model training with Prof. Asaf Cidon. I also collaborated closely with Prof. Ethan Katz-Bassett, Prof. Roxana Geambasu, Prof. Ryan Stutsman, and Prof. Mathias Lécuyer.

Before coming to the U.S., I developed quantitative investment algorithms and systems in the financial sector. I earned my B.S. in Financial Mathematics in 2015 from Southern University of Science and Technology, as a member of its founding cohort.

Highlighted Projects

GPU Multiplexing for Serving Multiple Heterogeneous LLMs

  • Identified a head-of-line blocking issue in existing systems; designed a novel vLLM/Ray microservice architecture.
  • Engineered efficient multi-model KV cache management and robust NCCL concurrency controls.
  • Optimized sharding, replication, placement, and scheduling strategies, validated via SimPy/Vidur simulation.
  • Designed and conducted experiments demonstrating a 1.6x throughput improvement.

Serving Multimodal LLMs via Shared Backbone

  • Proposed the system design and implementation details.
  • Contributed the core GPU multiplexing source code and adapted it for the shared backbone architecture.
  • Identified the applicability of Coflow scheduling during discussions, influencing the project’s scheduling approach.

Privacy Budget Scheduling in Machine Learning Training

  • Framed the research problem of scheduling differential privacy as a resource.
  • Proposed a dynamic algorithm, DPF (Dominant Private Block Fairness), based on DRF (Dominant Resource Fairness).
  • Developed rigorous proofs for the game-theoretic properties of the new algorithm.
  • Evaluated different scheduling strategies via discrete-event simulation using SimPy, demonstrating 2x more allocated jobs than the FCFS baseline.

I also researched fault-tolerant distributed storage and efficient software for high-performance storage devices.

Awards

  • Manjushri Fellowship, by University of Pennsylvania, 2021
  • Financial Risk Manager (FRM) Certification, by Global Association of Risk Professionals, 2015
  • China Merchants Bank Scholarship, by Southern University of Science and Technology, 2012-2014
  • Pioneering Undergraduate Fellowship, by Southern University of Science and Technology, 2011-2014
  • First Prize in China High School Biology Olympiad, by China Association for Science and Technology, 2010