Yue Cao 曹越

Lightyear AI

caoyue10 AT gmail DOT com

Short Bio

Yue Cao is now working on a stealth startup. He co-founded Lightyear AI (光年之外 in Chinese), aiming to accelerate the process of making AGI beneficial for all of humanity.

Before this entrepreneurial venture, he was a technical staff member at the Beijing Academy of Artificial Intelligence (BAAI), where he led the multimodal and vision research center. Between 2019 and 2022, he was a senior researcher at Microsoft Research Asia (MSRA) in the group headed by Baining Guo, where he collaborated closely with Han Hu, Zheng Zhang and Steve Lin. His work on Swin Transformer won the Best Paper Award (Marr Prize) at ICCV 2021. Yue's academic journey began at Tsinghua University's School of Software, where he earned his B.S. and Ph.D. degrees with top honors in 2014 and 2019, respectively, under the supervision of Prof. Jianmin Wang and Prof. Mingsheng Long. During his Ph.D. studies, he was a research intern at MSRA between 2018 and 2019, mentored by Jifeng Dai.


2023.09 EVA is selected as the 7th most influential paper in CVPR 2023. [list]

2023.03 EVA-02, SegGPT, and EVA-CLIP (best-performing CLIP models) are launched.

2023.02 EVA-01, Painter, UViT, understanding/data-scaling on MIM, iCLIP are accepted by CVPR 2023, congrats!

2023.01 SimMIM, Video-Swin, Swin-V2 are all selected among the top 15 most influential papers in CVPR 2022. [list]

2022.12 Invited to serve as an Area Chair for ICCV 2023.

2022.12 A generalist Painter is created, capable of seven representative vision tasks via in-context inference. [pdf] [Code]

2022.11 EVA Unit-01 launched, the best 1B Vision Foundation Model to date! All models and code are released. [pdf] [Code]

2022.10 Our study on frozen vision foundation models is accepted by NeurIPS 2022 as a spotlight, congrats! [pdf]

2022.06 Gave a talk on MIM pre-training at BAAI 2022. [slides]

2022.03 Our SimMIM, Video Swin, Swin V2 are accepted by CVPR 2022, congrats!

2021.10 Our Swin Transformer (a general-purpose vision backbone) won the Best Paper Award (Marr Prize) of ICCV 2021!!!

2021.10 Gave a tutorial on Vision Transformer at VALSE 2021. [slides]

2020.12 The extension of GCNet (Best Paper Award at ICCV 2019 Neural Architects Workshop) was accepted by TPAMI.

2019.11 Our work on multi-modality pre-training (VL-BERT) was reviewed by Bill Gates and accepted by ICLR 2020.

Selected Publications [Full List] [Google Scholar]

(Interns or Students, *Equal Contribution)

  1. SegGPT: Segmenting Everything in Context
    Xinlong Wang*, Xiaosong Zhang*, Yue Cao*, Wen Wang, Chunhua Shen, Tiejun Huang
    International Conference on Computer Vision (ICCV), 2023 [pdf] [Painter] [Code]

  2. EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
    Yuxin Fang, Wen Wang, Binhui Xie, Quan Sun, Ledell Wu, Xinggang Wang, Tiejun Huang, Xinlong Wang, Yue Cao
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
    [pdf] [Code] PaperDigest Most Influential Papers

  3. SimMIM: A Simple Framework for Masked Image Modeling
    Zhenda Xie*, Zheng Zhang*, Yue Cao*, Yutong Lin, Jianmin Bao, Zhuliang Yao, Qi Dai, Han Hu*
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
    [PDF] [Code] [Understanding] [Data Scaling] PaperDigest Most Influential Papers

  4. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
    Ze Liu*, Yutong Lin*, Yue Cao*, Han Hu*, Yixuan Wei, Zheng Zhang, Steve Lin, Baining Guo
    International Conference on Computer Vision (ICCV), 2021 Best Paper Award (Marr Prize)
    [PDF] [Code@Cls] [Code@Det] [Code@Seg] [Code@MoBY] [Code@Video]

  5. Parametric Instance Classification for Unsupervised Visual Feature Learning
    Yue Cao*, Zhenda Xie*, Bin Liu*, Yutong Lin, Zheng Zhang, Han Hu
    Neural Information Processing Systems (NeurIPS), 2020 [PDF] [Code@Github] [Post@Synced]

  6. VL-BERT: Pre-training of Generic Visual-Linguistic Representations
    Weijie Su*, Xizhou Zhu*, Yue Cao, Bin Li, Lewei Lu, Furu Wei, Jifeng Dai
    International Conference on Learning Representations (ICLR), 2020
    [PDF] [Code@Github] [Post@Synced] PaperDigest Most Influential Papers

  7. GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond
    Yue Cao*, Jiarui Xu*, Steve Lin, Fangyun Wei, Han Hu
    International Conference on Computer Vision Workshop (ICCVW), 2019
    [PDF] [Code@Github] [Code@mmdet] [Post@Synced] Best Paper Award
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020 [PDF]

  8. Deep Hashing Network for Efficient Similarity Retrieval
    Han Zhu, Mingsheng Long, Jianmin Wang, Yue Cao
    AAAI Conference on Artificial Intelligence (AAAI), 2016
    [PDF] [Code@Github] PaperDigest Most Influential Papers

  9. Learning Transferable Features with Deep Adaptation Networks
    Mingsheng Long, Yue Cao, Jianmin Wang, Michael I. Jordan
    International Conference on Machine Learning (ICML), 2015
    [PDF] [Code@Github] [Code@Github] PaperDigest Most Influential Papers
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018 [PDF]

Honors and Awards

Professional Services

PC Member | Reviewer