Lei Gao
I am a PhD student at the Ming Hsieh Department of Electrical and Computer Engineering at USC working with Prof. Murali Annavaram. I obtained my Bachelor’s degree from UC Santa Barbara and my Master’s degree from USC.
My research interests are centered around efficient LLM fine-tuning and inference, cross-device federated learning systems, and responsible AI. Here is a copy of my CV.
News
- 05/19/2025: Joined Microsoft as a summer Research Intern in the Strategic Planning and Architecture (SPARC) group, working on LLM infrastructure for the Azure ecosystem. If you’re in the Bay Area too, hit me up; I’m always happy to hang out or chat!
- 05/15/2025: Our paper “KVPR: Efficient LLM Inference with I/O-Aware Partial KV Cache Recomputation” has been accepted by ACL Findings 2025. See you in Vienna, Austria!
- 12/10/2024: Our paper “KVPR: Efficient LLM Inference with I/O-Aware Partial KV Cache Recomputation” has been accepted by AAAI SEAS workshop 2025. See you in Pennsylvania, USA!
- 10/15/2024: Our paper “Enabling Resource-Efficient On-Device Fine-Tuning of LLMs Using Only Inference Engines” has been accepted by NeurIPS ENLSP workshop 2024. See you in Vancouver, Canada!
- 03/13/2024: Our paper “Ethos: Rectifying Language Models in Orthogonal Parameter Space” has been accepted by NAACL Findings 2024. See you in Mexico City, Mexico!
- 12/18/2023: Our paper “Ethos: Rectifying Language Models in Orthogonal Parameter Space” has been accepted for a spotlight presentation at the Responsible Language Model (ReLM) workshop at AAAI 2024. See you in Vancouver, Canada!
Teaching Assistant
- EE508/599: Systems for Machine Learning (Fall 2023, Spring 2024, Spring 2025). I created the slides and the final project for the class.
- EE109: Introduction to Embedded Systems (Fall 2024)
Professional Service
Reviewer:
- ARR (Jan 2025, Jan 2024), ICLR SCOPE Workshop (Feb 2025), NeurIPS ENLSP Workshop (Sep 2024), IEEE Computer Architecture Letters (Sep 2023)
Volunteer Service
Mentor:
- USC Undergraduate Research in Viterbi Engineering (CURVE) Program (Fall 2024, Spring 2025)
Talk:
- Invited Presentation at AMD on KVPR for Efficient LLM Inference, April 2025