Zilong Wang

I am a Research Scientist at Google DeepMind. I completed my Ph.D. in Computer Science at UC San Diego in 2025, advised by Professor Jingbo Shang. Before that, I received my B.S. in Computer Science from Peking University in 2020, where I worked with Professor Xiaojun Wan.

My research interests lie in agentic RL for LLMs and building effective and reliable LLM agents for code generation. If you'd like to discuss research—or just chat—feel free to reach out at zlwang.ucsd [at] gmail [dot] com.

X / GitHub / Scholar / LinkedIn

[profile photo]

Selected Publications

Learning to Optimize Multi-objective Alignment through Dynamic Reward Weighting
Yining Lu, Zilong Wang**, Shiyang Li, Xin Liu, Changlong Yu, Qingyu Yin, Zhan Shi, Zixuan Zhang, Meng Jiang (** corresponding author)
Preprint, 2025
arXiv

Use dynamic reward weighting for multi-objective RL, achieving SOTA on all individual rewards

Training Language Models to Generate Quality Code with Program Analysis Feedback
Feng Yao*, Zilong Wang*, Liyuan Liu, Junxia Cui, Li Zhong, Xiaohan Fu, Haohui Mai, Vish Krishnan, Jianfeng Gao, Jingbo Shang (* equal contribution)
NeurIPS, 2025
arXiv / code

Build effective and reliable coding LLMs with hybrid rewards combining program analysis and unit tests

RRO: LLM Agent Optimization Through Rising Reward Trajectories
Zilong Wang, Jingfeng Yang, Sreyashi Nag, Samarth Varshney, Xianfeng Tang, Haoming Jiang, Jingbo Shang, Sheikh Muhammad Sarwar
COLM, 2025
arXiv

Mine rising-reward trajectories for efficient process-reward data collection

Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step
Li Zhong, Zilong Wang, Jingbo Shang
ACL Findings, 2024
arXiv / code / featured: MarkTechPost / talk: BAAI / SOTA: HumanEval 98.2%

Enable runtime-verified, step-by-step reasoning over execution traces for precise LLM-based code debugging

Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding
Zilong Wang, Hao Zhang, Chun-Liang Li, Julian Martin Eisenschlos, Vincent Perot, Zifeng Wang, Lesly Miculicich, Yasuhisa Fujii, Jingbo Shang, Chen-Yu Lee, Tomas Pfister
ICLR, 2024
arXiv / code / featured: Google Research Blog

Introduce iterative table transformation to power the first tabular reasoning agent

Can ChatGPT replace StackOverflow? A Study on Robustness and Reliability of Large Language Model Code Generation
Li Zhong, Zilong Wang
AAAI, 2024
arXiv / code / featured: TheRegister

Evaluate the real-world API reliability of coding LLMs at scale, showing that LLMs still fell short of StackOverflow as of 2024

Experiences

Research Scientist, Google DeepMind
2025 - Present | Mountain View, California
Applied Scientist, Amazon
2025 | Palo Alto, California
Research Intern, Google Research
2022 - 2024 | Mountain View & Sunnyvale, California
Research Intern, Adobe Research
2021 | San Jose, California
Research Intern, Microsoft Research Asia
2020 - 2021 | Beijing, China

Last updated: January 2026 | Template by Jon Barron