Weifeng Lin (林炜丰)

I am currently a second-year Ph.D. student at the MMLab, The Chinese University of Hong Kong (CUHK), supervised by Hongsheng Li. I earned my Bachelor's and Master's degrees from the South China University of Technology in 2021 and 2024, respectively, where I was advised by Lianwen Jin.

Research Interests: Interactive Vision-Language Models, Large-scale Video Generation Models, and their intersection with Robotics. I am also deeply committed to contributing to open-source projects, which I believe are a cornerstone of the AI community's sustainable growth.

Email: wflin37@gmail.com

Selected Publications
PAM
Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos
Weifeng Lin, Xinyu Wei, Ruichuan An, Tianhe Ren, Tingwei Chen, Renrui Zhang, Ziyu Guo, Wentao Zhang, Lei Zhang, Hongsheng Li
NeurIPS, 2025
MINT-CoT
MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning
Xinyan Chen, Renrui Zhang, Dongzhi Jiang, Aojun Zhou, Shilin Yan, Weifeng Lin, Hongsheng Li
NeurIPS, 2025
UI-Genie
UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents
Han Xiao, Guozhi Wang, Yuxiang Chai, Zimu Lu, Weifeng Lin, Hao He, Lue Fan, Liuyang Bian, Rui Hu, Liang Liu, Shuai Ren, Yafei Wen, Xiaoxin Chen, Aojun Zhou, Hongsheng Li
NeurIPS, 2025
PixWizard
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
Weifeng Lin, Xinyu Wei, Renrui Zhang, Le Zhuo, Shitian Zhao, Siyuan Huang, Junlin Xie, Yu Qiao, Peng Gao, Hongsheng Li
ICLR, 2025
Draw-and-Understand
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
Weifeng Lin, Xinyu Wei, Renrui Zhang, Ruichuan An, Peng Gao, Bocheng Zou, Yulin Luo, Siyuan Huang, Shanghang Zhang, Hongsheng Li
ICLR, 2025
Lumina-mGPT
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Dongyang Liu*, Shitian Zhao*, Le Zhuo*, Weifeng Lin*, Yu Qiao, Hongsheng Li, Peng Gao
Report Preprint, 2024
SPHINX-X
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Dongyang Liu*, Renrui Zhang*, Longtian Qiu*, Siyuan Huang*, Weifeng Lin*, Shitian Zhao, Shijie Geng, Ziyi Lin, Peng Jin, Kaipeng Zhang, Wenqi Shao, Chao Xu, Conghui He, Junjun He, Hao Shao, Pan Lu, Hongsheng Li, Yu Qiao, Peng Gao
ICML, 2024
M2SD
M2SD: Multiple Mixing Self-Distillation for Few-Shot Class-Incremental Learning
Jinhao Lin, Ziheng Wu, Weifeng Lin, Jun Huang, RongHua Luo
AAAI, 2024
SMT
Scale-Aware Modulation Meet Transformer
Weifeng Lin, Ziheng Wu, Jiayu Chen, Jun Huang, Lianwen Jin
ICCV, 2023
Experience
Research Intern, Robotics
2025.09 - 2026.01
Shenzhen, China
Research Intern, Agent
2024.11 - 2025.06
Shenzhen, China
Research Intern, OpenGVLab
2024.01 - 2024.06
Shanghai, China
Research Intern, Machine Learning Platform for AI
2022.11 - 2023.12
Hangzhou, China
Reviewer Service
Conference on Computer Vision and Pattern Recognition (CVPR), 2025
International Conference on Computer Vision (ICCV), 2025
International Conference on Machine Learning (ICML), 2025
International Conference on Learning Representations (ICLR), 2024, 2025
Conference on Neural Information Processing Systems (NeurIPS), 2024, 2025
AAAI Conference on Artificial Intelligence (AAAI), 2025