Weifeng Lin (林炜丰)

I am currently a PhD candidate at the Multimedia Laboratory (MMLab), The Chinese University of Hong Kong, supervised by Prof. Hongsheng Li. I earned my Bachelor's and Master's degrees from the South China University of Technology in 2021 and 2024, respectively, where I had the privilege of being advised by Prof. Lianwen Jin during seven memorable years.

My research interests include Computer Vision (CV), Multimodal Large Language Models, and Generative Models. I am deeply committed to contributing to open-source projects, as I firmly believe they are a cornerstone for the sustainable growth of the AI community.

Email: wflin37@gmail.com  /  Google Scholar  /  GitHub

Preprints
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
Weifeng Lin, Xinyu Wei, Renrui Zhang, Le Zhuo, Shitian Zhao, Siyuan Huang, Junlin Xie, Yu Qiao, Peng Gao, Hongsheng Li

arXiv preprint, 2024
[Paper] | [Code]
Keywords: Image Generation, Image-to-Image, Instruction-based Visual Assistant

Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
Weifeng Lin, Xinyu Wei, Renrui Zhang, Ruichuan An, Peng Gao, Bocheng Zou, Yulin Luo, Siyuan Huang, Shanghang Zhang, Hongsheng Li

arXiv preprint, 2024
[Paper] | [Code]
Keywords: MLLM, Visual prompts

Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Dongyang Liu*, Shitian Zhao*, Le Zhuo*, Weifeng Lin*, Yu Qiao, Hongsheng Li, Peng Gao

arXiv preprint, 2024
[Paper] | [Code]
Keywords: Multimodal Autoregressive Generation Models

Hierarchical Side-Tuning for Vision Transformers
Weifeng Lin, Ziheng Wu, Jiayu Chen, Wentao Yang, Mingxin Huang, Jun Huang, Lianwen Jin

arXiv preprint, 2023
[Paper] | [Code]
Keywords: Parameter-efficient transfer learning for vision Transformers in various vision tasks (classification, detection, segmentation)

Publications
Scale-Aware Modulation Meet Transformer
Weifeng Lin, Ziheng Wu, Jiayu Chen, Jun Huang, Lianwen Jin

International Conference on Computer Vision (ICCV), 2023
[Paper] | [Code]
Keywords: Hybrid CNN-Transformer Network (Vision Transformer Backbone)

M2SD: Multiple Mixing Self-Distillation for Few-Shot Class-Incremental Learning
Jinhao Lin, Ziheng Wu, Weifeng Lin, Jun Huang, RongHua Luo

AAAI Conference on Artificial Intelligence (AAAI), 2024
[Paper]
Keywords: Few-shot Class-incremental learning

Rapid Diffusion: Building Domain-Specific Text-to-Image Synthesizers with Fast Inference Speed
Bingyan Liu*, Weifeng Lin*, Zhongjie Duan, Chengyu Wang, Ziheng Wu, Zipeng Zhang, Kui Jia, Lianwen Jin, Cen Chen, Jun Huang

Annual Meeting of the Association for Computational Linguistics (ACL), Industry Track, 2023
[Paper] | [Code]
Keywords: Text-to-image latent diffusion models with rich entity knowledge

Building A Mobile Text Recognizer via Truncated SVD-based Knowledge Distillation-Guided NAS
Weifeng Lin, Canyu Xie, Dezhi Peng, Jiapeng Wang, Lianwen Jin, Wei Ding, Cong Yao, Mengchao He

British Machine Vision Conference (BMVC), 2023
[Paper] | [Detection] | [Recognition] | [Demo]
Keywords: Mobile Text Recognizer

Experience
Shanghai AI Lab

Position: Research Intern, OpenGVLab
Location: Shanghai, China
Time: 2024.01 - 2024.06

Alibaba Cloud Intelligence

Position: Research Intern, Machine Learning - Deep Learning Algorithms
Location: Hangzhou, China
Time: 2022.11 - 2023.12

