Yue Cao 曹越

Technical Staff
Beijing Academy of Artificial Intelligence

caoyue10 AT gmail DOT com

Short Bio

Yue Cao is currently a technical staff member at the Beijing Academy of Artificial Intelligence (BAAI), focusing on foundation models, self-supervised learning, and multimodal learning.

Prior to that, he was a senior researcher at Microsoft Research Asia between 2019 and 2022, in a group headed by Baining Guo, where he collaborated closely with Han Hu, Zheng Zhang and Steve Lin. His work on Swin Transformer won the Best Paper Award (Marr Prize) at ICCV 2021. He received both his B.S. and Ph.D. degrees from the School of Software, Tsinghua University, with highest honors, in 2014 and 2019 respectively, under the supervision of Prof. Jianmin Wang and Prof. Mingsheng Long. During his Ph.D. studies, he was a research intern at MSRA between 2018 and 2019, mentored by Jifeng Dai.


2023.02 EVA-01, Painter, UViT, understanding/data-scaling on MIM, and iCLIP were accepted by CVPR 2023, congrats!

2023.01 SimMIM, Video-Swin, Swin-V2 were all selected among the top 15 most influential papers in CVPR 2022. [list]

2022.12 Invited to serve as Area Chair for ICCV 2023.

2022.12 A generalist Painter is created, capable of seven representative vision tasks via in-context inference. [pdf] [Code]

2022.11 EVA Unit-01 Launched, the best 1B Vision Foundation Model to date! All models and code are released. [pdf] [Code]

2022.10 Our study on frozen vision foundation models was accepted by NeurIPS 2022 as a spotlight, congrats! [pdf]

2022.06 Gave a talk on MIM pre-training at BAAI 2022. [slides]

2022.03 Our SimMIM, Video Swin, Swin V2 were accepted by CVPR 2022, congrats!

2021.10 Our Swin Transformer (a general-purpose vision backbone) won the Best Paper Award (Marr Prize) of ICCV 2021!!!

2021.10 Gave a tutorial on Vision Transformer at VALSE 2021. [slides]

2020.12 The extension of GCNet (Best Paper Award at ICCV 2019 Neural Architects Workshop) was accepted by TPAMI.

2019.11 Our work on multi-modality pre-training (VL-BERT) was reviewed by Bill Gates and accepted by ICLR 2020.

Selected Publications [Full List] [Google Scholar]

(Interns or Students, *Equal Contribution)

  1. Images Speak in Images: A Generalist Painter for In-Context Visual Learning
    Xinlong Wang*, Wen Wang*, Yue Cao*, Chunhua Shen, Tiejun Huang
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023 [pdf] [Code]

  2. EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
    Yuxin Fang, Wen Wang, Binhui Xie, Quan Sun, Ledell Wu, Xinggang Wang, Tiejun Huang, Xinlong Wang, Yue Cao
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023 [pdf] [Code]

  3. SimMIM: A Simple Framework for Masked Image Modeling
    Zhenda Xie*, Zheng Zhang*, Yue Cao*, Yutong Lin, Jianmin Bao, Zhuliang Yao, Qi Dai, Han Hu*
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
    [PDF] [Code] [Understanding] [Data Scaling] PaperDigest Most Influential Papers

  4. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
    Ze Liu*, Yutong Lin*, Yue Cao*, Han Hu*, Yixuan Wei, Zheng Zhang, Steve Lin, Baining Guo
    International Conference on Computer Vision (ICCV), 2021 Best Paper Award (Marr Prize)
    [PDF] [Code@Cls] [Code@Det] [Code@Seg] [Code@MoBY] [Code@Video]

  5. Parametric Instance Classification for Unsupervised Visual Feature Learning
    Yue Cao*, Zhenda Xie*, Bin Liu*, Yutong Lin, Zheng Zhang, Han Hu
    Neural Information Processing Systems (NeurIPS), 2020 [PDF] [Code@Github] [Post@Synced]

  6. VL-BERT: Pre-training of Generic Visual-Linguistic Representations
    Weijie Su*, Xizhou Zhu*, Yue Cao, Bin Li, Lewei Lu, Furu Wei, Jifeng Dai
    International Conference on Learning Representations (ICLR), 2020
    [PDF] [Code@Github] [Post@Synced] PaperDigest Most Influential Papers

  7. GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond
    Yue Cao*, Jiarui Xu*, Steve Lin, Fangyun Wei, Han Hu
    International Conference on Computer Vision Workshop (ICCVW), 2019
    [PDF] [Code@Github] [Code@mmdet] [Post@Synced] Best Paper Award
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020 [PDF]

  8. Deep Hashing Network for Efficient Similarity Retrieval
    Han Zhu, Mingsheng Long, Jianmin Wang, Yue Cao
    AAAI Conference on Artificial Intelligence (AAAI), 2016
    [PDF] [Code@Github] PaperDigest Most Influential Papers

  9. Learning Transferable Features with Deep Adaptation Networks
    Mingsheng Long, Yue Cao, Jianmin Wang, Michael I. Jordan
    International Conference on Machine Learning (ICML), 2015
    [PDF] [Code@Github] [Code@Github] PaperDigest Most Influential Papers
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018 [PDF]

Honors and Awards

Professional Services

PC Member | Reviewer