About Me
Welcome! I am a Ph.D. student at the University of California, San Diego, advised by Prof. Jingbo Shang.
My research focuses on Natural Language Processing and Data Mining. I am interested in leveraging rich information and unlocking the potential of multimodal and large language models.
Currently, my main focus is Visually-rich Document Understanding. I aim to extract the essential information from documents while reducing the human effort involved through weak, distant, or even no supervision.
Before joining UC San Diego, I received my B.S. in Computer Science from Peking University, where I was advised by Prof. Xiaojun Wan.
Education
- University of California, San Diego, 2020.9 - present
- Ph.D. in Computer Science
- Advisor: Prof. Jingbo Shang
- Peking University, 2016.9 - 2020.7
- B.S. in Computer Science
- Advisor: Prof. Xiaojun Wan
- Outstanding Graduate of Beijing City and Peking University
Experience
- Google Cloud, Research Intern, 2023.4 - 2023.9
- Advisor: Dr. Chen-Yu Lee
- Google Research, Research Intern, 2022.6 - 2022.9
- Advisor: Dr. Sandeep Tata
- A benchmark for Visually-rich Document Understanding [GitHub Repo]
- Adobe Research, Research Intern at Document Intelligence Group, 2021.6 - 2021.9
- Advisor: Dr. Vlad Morariu
- A multi-modal pre-trained language model leveraging the hierarchical structure of visually-rich documents
- Microsoft Research Asia, Research Intern at NLC Group, 2020.9 - 2021.3
- Advisor: Dr. Lei Cui
- Extraction of reading order for document image understanding
- Pre-training of language models with a reading order dataset
- University of Illinois, Urbana-Champaign, Research Assistant, 2019.6 - 2019.10
- Advisor: Prof. Kevin Chang
- Evaluation of semantic capacity for scientific terms
- SenseTime, Research Intern at OCR Group, 2019.11 - 2020.9
- Extraction of form structure for general form understanding
- Peking University, Research Assistant at One Lab with Prof. Xiaojun Wan, 2017.9 - 2020.6
- Sentiment analysis and emotion detection in multi-party dialogues
- Emotion detection in multimodal scenarios
Selected Publications
- VRDU: A Benchmark for Visually-rich Document Understanding
- MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding
- Zilong Wang, Jiuxiang Gu, Chris Tensmeyer, Nikolaos Barmpalios, Ani Nenkova, Tong Sun, Jingbo Shang and Vlad I. Morariu
- EMNLP 2022 [Paper]
- Formulating Few-shot Fine-tuning Towards Language Model Pre-training: A Pilot Study on Named Entity Recognition
- Zihan Wang, Kewen Zhao, Zilong Wang, Jingbo Shang
- EMNLP Findings 2022 [Paper]
- Towards Few-shot Entity Recognition in Document Images: A Label-aware Sequence-to-Sequence Framework
- LayoutReader: Pre-training of Text and Layout for Reading Order Detection
- DocStruct: A Multimodal Method to Extract Hierarchy Structure in Document for General Form Understanding
- Exploring Semantic Capacity of Terms
Awards
- Jacobs School of Engineering Fellowship
- Outstanding Graduate of Beijing City
- Outstanding Graduate of Peking University
- Junyuan Scholarship
- Merit Student (top 5%)
- Kwang-Hua Scholarship
Misc.
I like all outdoor activities! Camping, hiking, travelling… I am also a beginner in photography.