DONG Yiying's CV
Email: yiying.dong[at]connect.polyu.hk
Personal Website
Gender: Female | Age: 24 | Nationality: Chinese
Research Interests: Multimodal Retrieval and Reasoning, LLMs and Related Areas.
Education
The Hong Kong Polytechnic University – PhD Student
Sep. 2024 – Present
Advisor: Prof. Jiannong Cao
GPA: 3.65/4.0
Chongqing University of Posts and Telecommunications (CQUPT) – B.S. (Hons) in Computer Science and Technology
Sep. 2018 – Jun. 2022
GPA: 3.65/4.0 | Score: 88/100 | Rank: 2/139 | IELTS: 7.0
University of California, Santa Barbara (UCSB) – Summer School
Jul. 2021 – Aug. 2021
Topic: Building the Blockchain World: Technology, Society and Innovation
Nanyang Technological University (NTU), Singapore – AI Internship Programme
Jan. 2021 – Mar. 2021
NTU AI Lab under Dr. Teoh Teik Toe
Work Experience
Eastern Institute of Technology, Ningbo – Research Assistant, Multimodal AI Lab
Mar. 2024 – Aug. 2024
- Researched multimodal models for vertical domain applications.
- Worked on information extraction and video generation.
- Filed a patent based on research outcomes.
Publications
- REF-VLM: Triplet-Based Referring Paradigm for Unified Visual Decoding (Under Review)
- Fine-Grained Knowledge-Aware Compression for Large Language Models (Under Review)
- Think Twice Before Answering: Iterative Multi-hop KB-VQA with Memory Trace (In Progress)
Patents & Copyrights
- CN119091002A: Video & slideshow generation method/device/system
- CN117711505A: Enzyme kinetic parameter prediction method/device
- CN117457110A: Protein solubility prediction method/system
- CN112422946B: Smart yoga motion guidance system using 3D reconstruction
Research & Project Experience
Multimodal Retrieval Augmentation – Feb. 2025 – May 2025
- Built a multimodal KB with 2M Wikipedia articles.
- Integrated multi-level visual encoders for complex VQA.
- Hierarchical retrieval and reranking improved Recall@K and the quality of the LLM's answers.
Large Vision Language Model (EIT) – May 2024 – Dec. 2024
- Developed REF-VLM visual decoding using the Triplet-Based Referring Paradigm.
- Curated 100M+ multimodal dialogues across 25 tasks.
- Proposed VD-CoT and parameter-free mask-guided aggregation.
Honors & Awards
- PolyU Research Postgraduate Scholarship, 2024
- USYD-CSC Research Tuition Fee Scholarship, 2023
- Faculty of Engineering Research Scholarship, 2022
- China National Scholarship (Top 0.2%), 2021
- 1st Prize Scholarship & Merit Student at CQUPT (Top 1%), 2020
- 1st Prize Scholarship & Merit Student at CQUPT (Top 1%), 2019
Skills
- Programming: Python, C, C++, Java, R, LaTeX
- ML Libraries: PyTorch, TensorFlow, Keras, Caffe, OpenCV
