|
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
Weifeng Lin, Xinyu Wei, Renrui Zhang, Le Zhuo, Shitian Zhao, Siyuan Huang, Junlin Xie, Yu Qiao, Peng Gao, Hongsheng Li
arXiv Preprint, 2024
[Paper] | [Code]
Keywords: Image Generation, Image-to-Image, Instruction-based Visual Assistant
|
|
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
Weifeng Lin, Xinyu Wei, Renrui Zhang, Ruichuan An, Peng Gao, Bocheng Zou, Yulin Luo, Siyuan Huang, Shanghang Zhang, Hongsheng Li
arXiv Preprint, 2024
[Paper] | [Code]
Keywords: MLLM, Visual prompts
|
|
Hierarchical Side-Tuning for Vision Transformers
Weifeng Lin, Ziheng Wu, Jiayu Chen, Wentao Yang, Mingxin Huang, Jun Huang, Lianwen Jin
arXiv Preprint, 2023
[Paper] | [Code]
Keywords: Parameter-efficient transfer learning for vision Transformers in various vision tasks (classification, detection, segmentation)
|
|
Rapid Diffusion: Building Domain-Specific Text-to-Image Synthesizers with Fast Inference Speed
Bingyan Liu*, Weifeng Lin*, Zhongjie Duan, Chengyu Wang, Ziheng Wu, Zipeng Zhang, Kui Jia, Lianwen Jin, Cen Chen, Jun Huang
Annual Meeting of the Association for Computational Linguistics (ACL-Industry), 2023
[Paper] | [Code]
Keywords: Text-to-image latent diffusion models with rich entity knowledge
|
|
Shanghai AI Lab
Position: Research Intern, OpenGVLab
Location: Shanghai, China
Time: 2024.01 - 2024.06
|
|
Alibaba Cloud Intelligent
Position: Research Intern, Machine Learning - Deep Learning Algorithms
Location: Hangzhou, China
Time: 2022.11 - 2023.12
|
|