Zilong Wang
Ph.D. Candidate at UC San Diego

About Me

Welcome! I am a Ph.D. student at the University of California, San Diego, advised by Prof. Jingbo Shang.

My research focuses on Natural Language Processing and Data Mining. I am interested in leveraging rich information and unlocking the potential of multimodal and large language models.

Currently, my main focus is Visually-rich Document Understanding. I aim to extract the essential information from documents while reducing the human effort involved through weak, distant, or even no supervision.

Before joining UC San Diego, I received my B.S. in Computer Science from Peking University, where I was advised by Prof. Xiaojun Wan.


Education

  • University of California, San Diego, 2020.9 - present
  • Peking University, 2016.9 - 2020.7
    • B.S. in Computer Science
    • Advisor: Prof. Xiaojun Wan
    • Outstanding Graduate of Beijing City and Peking University


Experience

  • Google Cloud, Research Intern, 2023.4 - 2023.9
    • Advisor: Dr. Chen-Yu Lee
  • Google Research, Research Intern, 2022.6 - 2022.9
    • Advisor: Dr. Sandeep Tata
    • A benchmark for Visually-rich Document Understanding [GitHub Repo]
  • Adobe Research, Research Intern at Document Intelligence Group, 2021.6 - 2021.9
    • Advisor: Dr. Vlad Morariu
    • A multi-modal pre-trained language model leveraging the hierarchical structure of visually-rich documents
  • Microsoft Research Asia, Research Intern at NLC Group, 2020.9 - 2021.3
    • Advisor: Dr. Lei Cui
    • Extraction of reading order for document image understanding
    • Pre-training of a language model with a reading order dataset
  • University of Illinois, Urbana-Champaign, Research Assistant, 2019.6 - 2019.10
    • Advisor: Prof. Kevin Chang
    • Evaluation of semantic capacity for scientific terms
  • SenseTime, Research Intern at OCR Group, 2019.11 - 2020.9
    • Extraction of form structure for general form understanding
  • Peking University, Research Assistant at One Lab with Prof. Xiaojun Wan, 2017.9 - 2020.6
    • Sentiment analysis and emotion detection in multi-party dialogues
    • Emotion detection in multimodal scenarios

Selected Publications

  • VRDU: A Benchmark for Visually-rich Document Understanding
  • MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding
    • Zilong Wang, Jiuxiang Gu, Chris Tensmeyer, Nikolaos Barmpalios, Ani Nenkova, Tong Sun, Jingbo Shang and Vlad I. Morariu
    • EMNLP 2022 [Paper]
  • Formulating Few-shot Fine-tuning Towards Language Model Pre-training: A Pilot Study on Named Entity Recognition
    • Zihan Wang, Kewen Zhao, Zilong Wang, Jingbo Shang
    • EMNLP Findings 2022 [Paper]
  • Towards Few-shot Entity Recognition in Document Images: A Label-aware Sequence-to-Sequence Framework
  • LayoutReader: Pre-training of Text and Layout for Reading Order Detection
  • DocStruct: A Multimodal Method to Extract Hierarchy Structure in Document for General Form Understanding
    • Zilong Wang, Mingjie Zhan, Ding Liang
    • EMNLP Findings 2020 [Paper] [Code]
  • Exploring Semantic Capacity of Terms
    • Jie Huang*, Zilong Wang*, Kevin Chen-Chuan Chang, Wen-mei Hwu, Jinjun Xiong
    • * Asterisk indicates equal contribution.
    • EMNLP 2020 [Paper] [Code]


Awards

  • Jacobs School of Engineering Fellowship
  • Outstanding Graduate of Beijing City
  • Outstanding Graduate of Peking University
  • Junyuan Scholarship
  • Merit Student (top 5%)
  • Kwang-Hua Scholarship


I like all outdoor activities: camping, hiking, travelling… I am also a beginner in photography.