DONG Yiying's CV

📧 yiying.dong[at]connect.polyu.hk
🌐 Personal Website
👩 Gender: Female | 🎂 Age: 24 | 🌏 Nationality: Chinese
🔬 Research Interests: Multimodal Retrieval and Reasoning, LLMs and Related Areas

🎓 Education

The Hong Kong Polytechnic University — PhD Student
Sep. 2024 – Present
Advisor: Prof. Jiannong Cao
GPA: 3.65/4.0

Chongqing University of Posts and Telecommunications (CQUPT) — B.S. (Hons) in Computer Science and Technology
Sep. 2018 – Jun. 2022
GPA: 3.65/4.0 | Score: 88/100 | Rank: 2/139 | IELTS: 7.0

University of California, Santa Barbara (UCSB) — Summer School
Jul. 2021 – Aug. 2021
Topic: Building the Blockchain World: Technology, Society and Innovation

Nanyang Technological University (NTU), Singapore — AI Internship Programme
Jan. 2021 – Mar. 2021
NTU AI Lab under Dr. Teoh Teik Toe

🧑‍🔬 Work Experience

Eastern Institute of Technology, Ningbo — Research Assistant, Multimodal AI Lab
Mar. 2024 – Aug. 2024

Researched multimodal models for vertical domain applications.
Worked on information extraction and video generation.
Filed a patent based on research outcomes.

📚 Publications

REF-VLM: Triplet-Based Referring Paradigm for Unified Visual Decoding (Under Review)
Fine-Grained Knowledge-Aware Compression for Large Language Models (Under Review)
Think Twice Before Answering: Iterative Multi-hop KB-VQA with Memory Trace (In Progress)

🧾 Patents & Copyrights

CN119091002A: Video & slideshow generation method/device/system
CN117711505A: Enzyme kinetic parameter prediction method/device
CN117457110A: Protein solubility prediction method/system
CN112422946B: Smart yoga motion guidance system using 3D reconstruction

🔬 Research & Project Experience

Multimodal Retrieval Augmentation — Feb. 2025 – May 2025

Built a multimodal knowledge base (KB) from 2M Wikipedia articles.
Integrated multi-level visual encoders for complex VQA.
Improved Recall@K and LLM answer quality through hierarchical retrieval and reranking.

Large Vision-Language Model (EIT) — May 2024 – Dec. 2024

Developed REF-VLM for unified visual decoding using the Triplet-Based Referring Paradigm.
Curated 100M+ multimodal dialogues across 25 tasks.
Proposed VD-CoT and parameter-free mask-guided aggregation.

🏆 Honors & Awards

PolyU Research Postgraduate Scholarship, 2024
USYD-CSC Research Tuition Fee Scholarship, 2023
Faculty of Engineering Research Scholarship, 2022
China National Scholarship (Top 0.2%), 2021
1st Prize Scholarship & Merit Student at CQUPT (Top 1%), 2020
1st Prize Scholarship & Merit Student at CQUPT (Top 1%), 2019

🛠 Skills

Programming: Python, C, C++, Java, R, LaTeX
ML Libraries: PyTorch, TensorFlow, Keras, Caffe, OpenCV