|
Zilong Wang
I am a Research Scientist at Google DeepMind. I completed my Ph.D. in Computer Science at
UC San Diego in 2025,
advised by Professor Jingbo Shang.
Before that, I received my B.S. in Computer Science from
Peking University in 2020,
where I worked with Professor Xiaojun Wan.
My research interests lie in agentic RL for LLMs and building effective and reliable LLM agents for code generation.
If you'd like to discuss research—or just chat—feel free to reach out at
zlwang.ucsd [at] gmail [dot] com.
X /
GitHub /
Scholar /
LinkedIn
|
|
Learning to Optimize Multi-objective Alignment through Dynamic Reward Weighting
Yining Lu, Zilong Wang**, Shiyang Li, Xin Liu, Changlong Yu, Qingyu Yin, Zhan Shi, Zixuan Zhang, Meng Jiang (** corresponding author)
Preprint, 2025
arXiv
Use adaptive reward weighting for multi-objective RL, achieving SOTA on all individual rewards
|
Training Language Models to Generate Quality Code with Program Analysis Feedback
Feng Yao*, Zilong Wang*, Liyuan Liu, Junxia Cui, Li Zhong, Xiaohan Fu, Haohui Mai, Vish Krishnan, Jianfeng Gao, Jingbo Shang (* equal contribution)
NeurIPS, 2025
arXiv /
code
Build effective and reliable coding LLMs with hybrid rewards combining program analysis and unit tests
|
RRO: LLM Agent Optimization Through Rising Reward Trajectories
Zilong Wang, Jingfeng Yang, Sreyashi Nag, Samarth Varshney, Xianfeng Tang, Haoming Jiang, Jingbo Shang, Sheikh Muhammad Sarwar
COLM, 2025
arXiv
Mine rising-reward trajectories for efficient process-reward data collection
|
Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step
Li Zhong, Zilong Wang, Jingbo Shang
ACL Findings, 2024
arXiv /
code /
featured: MarkTechPost /
talk: BAAI /
sota: HumanEval 98.2%
Enable runtime-verified, stepwise reasoning via execution traces for precise LLM-based code debugging
|
Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding
Zilong Wang, Hao Zhang, Chun-Liang Li, Julian Martin Eisenschlos, Vincent Perot, Zifeng Wang, Lesly Miculicich, Yasuhisa Fujii, Jingbo Shang, Chen-Yu Lee, Tomas Pfister
ICLR, 2024
arXiv /
code /
featured: Google Research Blog
Introduce iterative table transformation to power the first tabular reasoning agent
|
Can ChatGPT replace StackOverflow? A Study on Robustness and Reliability of Large Language Model Code Generation
Li Zhong, Zilong Wang
AAAI, 2024
arXiv /
code /
featured: TheRegister
Evaluate real-world API reliability of coding LLMs at scale, showing they still lagged behind StackOverflow as of 2024
|
Research Scientist, Google DeepMind
2025 - Present | Mountain View, California
|
Applied Scientist, Amazon
2025 | Palo Alto, California
|
Research Intern, Google Research
2022 - 2024 | Mountain View & Sunnyvale, California
|
Research Intern, Adobe Research
2021 | San Jose, California
|
Research Intern, Microsoft Research Asia
2020 - 2021 | Beijing, China
|
|
Last updated: January 2026 | Template by Jon Barron