AI Researcher

Hi, I'm Jie Qin

Senior Algorithm Researcher

Meituan | Beidou Talent Program

I am currently a Senior Algorithm Researcher at Meituan (Beidou Talent Program), focusing on unified multimodal large models. I received my Ph.D. in Pattern Recognition and Intelligent Systems from the Institute of Automation, Chinese Academy of Sciences (CASIA) in 2024. My research interests encompass multimodal perception, vision foundation models, and multi-agent collaborative generation systems.

Research Interests

Unified Multimodal Large Models (MLLMs)
Vision Foundation Models (VFMs)
Multimodal Perception
Generative Models & Agents

jayqinliu@gmail.com
Google Scholar

Latest News

  • 2026.02

    One paper was accepted to CVPR 2026.

  • 2026.01

    One paper was accepted to ICLR 2026.

  • 2025.09

    One paper was accepted to NeurIPS 2025.

  • 2024.07

    One paper was accepted to ECCV 2024.

  • 2024.06

    Received my Ph.D. from the Institute of Automation, Chinese Academy of Sciences!

  • 2024.02

    One paper was accepted to AAAI 2024.

  • 2023.09

    One paper was accepted to ICCV 2023.

  • 2023.03

    One paper was accepted to CVPR 2023.

  • 2022.09

    One paper was accepted to ECCV 2022.

  • 2022.02

    One paper was accepted to AAAI 2022.

Experience

2024.06 - Present

Senior Algorithm Researcher

Meituan | Beidou Talent Program | M17-MM Dept.

2021.06 - 2023.12

Algorithm Intern

ByteDance | Intelligent Creation AutoML Platform

2019.12 - 2020.12

Algorithm Intern

Horizon Robotics | Fundamental Vision Algorithm Dept.

Education

2019.09 - 2024.06

Institute of Automation, Chinese Academy of Sciences

Ph.D. (Direct) | Pattern Recognition & Intelligent Systems

2015.09 - 2019.06

Nanjing University of Aeronautics and Astronautics

B.E. | College of Automation

Ranked Top 3% in the major.

Research Focus

Multimodal Large Language Models (MLLMs)

Building unified end-to-end frameworks that integrate multimodal understanding and generation in a single model. Exploring architectures such as stacked autoregressive models.

Vision Foundation Models (VFMs)

Designing scalable and efficient vision transformers that handle native resolutions and dynamic sequence lengths. Investigating how to reach 100% codebook utilization in vector-quantized networks.

Multimodal Perception & Segmentation

Tackling open-vocabulary, weakly supervised, and semi-supervised image segmentation by aligning cross-modal features and discovering additional supervision signals for robust visual perception.

Generative Models & Embodied Agents

Developing LLM-driven multi-agent collaborative frameworks for complex image generation tasks, and vision-guided robotic systems for real-world automated measurement and interaction.

Selected Publications

STAR: STacked AutoRegressive Scheme for Unified Multimodal Learning

Jie Qin, Jiancheng Huang, Limeng Qiao, Lin Ma
arXiv

Proposed a 'Stacked AR Expansion + 4-Stage Progressive Training' scheme and an in-house FullVQ discretizer, achieving leading generation and editing performance without degrading understanding metrics.

UniViTAR: Unified Vision Transformer with Native Resolution

Limeng Qiao, Yiyang Gan, Bairui Wang, Jie Qin, Shuang Xu, Siqi Yang, Lin Ma
NeurIPS (CCF-A), 2025

Systematically upgraded ViT into a unified vision foundation model that supports native resolution and dynamic sequences, achieving state-of-the-art results on multiple tasks.

Scalable Training for Vector-Quantized Networks with 100% Codebook Utilization

Yifan Chang, Jie Qin, Limeng Qiao, Xiaofeng Wang, Zheng Zhu, Lin Ma, Xingang Wang
ICLR (CCF-A), 2026

Proposed FVQ (FullVQ) to optimize code vectors through a compress-process-recover pipeline, achieving 100% codebook utilization.

DiffusionAgent: Navigating Expert Models for Agentic Image Generation

Jie Qin, Jie Wu, Weifeng Chen, Yuxi Ren, Xuefeng Xiao, Rui Wang, Shilei Wen
arXiv

Designed a multi-agent collaborative generation framework driven by LLMs, implementing task decomposition, expert scheduling, and self-optimization via 'Plan-Execute-Reflect' cycles.

FreeSeg: Unified, Universal and Open-Vocabulary Image Segmentation

Jie Qin, Jie Wu, Pengxiang Yan, Ming Li, Yuxi Ren, Xuefeng Xiao, et al.
CVPR (CCF-A), 2023

Proposed a unified multimodal segmentation model that fuses cross-modal features to handle open-vocabulary semantic, instance, and panoptic segmentation within a single framework.

Multi-Granularity Distillation Scheme Towards Lightweight Semi-Supervised Semantic Segmentation

Jie Qin, Jie Wu, Ming Li, Xuefeng Xiao, Min Zheng, Xingang Wang
ECCV (CCF-B), 2022

Proposed a multi-granularity distillation scheme (MGD) to facilitate lightweight semi-supervised semantic segmentation.

ResizeMix: Mixing Data while Preserving Object Information and Label Validity

Jie Qin, Jiemin Fang, Qian Zhang, Wenyu Liu, Xingang Wang, Xinggang Wang
CVMJ (IF=6.9, SCI-I), 2023

Introduced ResizeMix, a data augmentation method that mixes images by resizing a source image into a patch and pasting it onto another image, preserving object information and label validity.

WPS-SAM: Towards Weakly-Supervised Part Segmentation with Foundation Models

Xin-Jian Wu, Ruisong Zhang, Jie Qin, Shijie Ma, Cheng-Lin Liu
ECCV (CCF-B), 2024

Explored weakly-supervised part segmentation by leveraging prior knowledge from vision foundation models.

Activation Modulation and Recalibration Scheme for Weakly Supervised Semantic Segmentation

Jie Qin, Jie Wu, Xuefeng Xiao, Lujun Li, Xingang Wang
AAAI (CCF-A), 2022

Proposed the AMR scheme to discover additional supervision and narrow the gap between full and weak supervision.

Honors & Awards

Champion, CVPR 2022 Instance Segmentation on Synthetic Data Challenge
5th Place, CVPR 2022 3rd Agriculture Vision Challenge
Merit Student, University of Chinese Academy of Sciences (2021, 2022)
Outstanding Student Leader, University of Chinese Academy of Sciences (2021)
National Encouragement Scholarship (2016, 2017)

Academic Services

Journal Reviewer

IEEE TPAMI, IEEE TIP, IJCV, TCSVT, PR

Conference Reviewer

CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, AAAI, ACM MM, IJCAI