Yue Cao is now working on a stealth startup. He co-founded Lightyear AI (光年之外 in Chinese), aiming to accelerate the process of making AGI beneficial for all of humanity.
Before this entrepreneurial venture, he was a member of the technical staff at the Beijing Academy of Artificial Intelligence (BAAI), where he led the multimodal and vision research center. Between 2019 and 2022, he was a senior researcher at Microsoft Research Asia (MSRA) in the group headed by Baining Guo, where he collaborated closely with Han Hu, Zheng Zhang and Steve Lin. His work on Swin Transformer won the Best Paper Award (Marr Prize) of ICCV 2021. Yue's academic journey began at Tsinghua University's School of Software, where he earned his B.S. and Ph.D. degrees with top honors in 2014 and 2019, respectively, under the supervision of Prof. Jianmin Wang and Prof. Mingsheng Long. During his Ph.D. study, he was a research intern at MSRA between 2018 and 2019, mentored by Jifeng Dai.
2023.09 EVA was selected as the 7th most influential paper in CVPR 2023. [list]
2023.03 EVA-02, SegGPT, and EVA-CLIP (best-performing CLIP models) are launched.
2023.02 EVA-01, Painter, UViT, understanding/data-scaling on MIM, and iCLIP were accepted by CVPR 2023, congrats!
2023.01 SimMIM, Video-Swin, and Swin-V2 were all selected among the top 15 most influential papers in CVPR 2022. [list]
2022.12 Invited to serve as Area Chair for ICCV 2023.
2022.12 A generalist Painter is created, capable of seven representative vision tasks via in-context inference. [pdf] [Code]
2022.11 EVA Unit-01 Launched, the best 1B Vision Foundation Model to date! All models and code are released. [pdf] [Code]
2022.10 Our study on frozen vision foundation models was accepted by NeurIPS 2022 as a spotlight, congrats! [pdf]
2022.06 Gave a talk on MIM pre-training at BAAI 2022. [slides]
2022.03 Our SimMIM, Video Swin, and Swin V2 were accepted by CVPR 2022, congrats!
2021.10 Our Swin Transformer (a general-purpose vision backbone) won the Best Paper Award (Marr Prize) of ICCV 2021!!!
2021.10 Gave a tutorial on Vision Transformer at VALSE 2021. [slides]
2020.12 The extension of GCNet (Best Paper Award at ICCV 2019 Neural Architects Workshop) was accepted by TPAMI.
2019.11 Our work on multi-modality pre-training (VL-BERT) was reviewed by Bill Gates and accepted by ICLR 2020.
(†Interns or Students, *Equal Contribution)
SegGPT: Segmenting Everything In Context
Xinlong Wang*, Xiaosong Zhang*†, Yue Cao*, Wen Wang, Chunhua Shen, Tiejun Huang
International Conference on Computer Vision (ICCV), 2023 [pdf] [Painter] [Code]
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
Yuxin Fang†, Wen Wang†, Binhui Xie†, Quan Sun, Ledell Wu, Xinggang Wang, Tiejun Huang, Xinlong Wang, Yue Cao
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[pdf] [Code] PaperDigest Most Influential Papers
SimMIM: A Simple Framework for Masked Image Modeling
Zhenda Xie*†, Zheng Zhang*, Yue Cao*, Yutong Lin†, Jianmin Bao, Zhuliang Yao†, Qi Dai, Han Hu*
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[PDF] [Code] [Understanding] [Data Scaling] PaperDigest Most Influential Papers
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu*†, Yutong Lin*†, Yue Cao*, Han Hu*, Yixuan Wei, Zheng Zhang, Steve Lin, Baining Guo
International Conference on Computer Vision (ICCV), 2021 Best Paper Award (Marr Prize)
[PDF] [Code@Cls] [Code@Det] [Code@Seg] [Code@MoBY] [Code@Video]
Parametric Instance Classification for Unsupervised Visual Feature Learning
Yue Cao*, Zhenda Xie*†, Bin Liu*†, Yutong Lin†, Zheng Zhang, Han Hu
Neural Information Processing Systems (NeurIPS), 2020 [PDF] [Code@Github] [Post@Synced]
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Weijie Su*†, Xizhou Zhu*†, Yue Cao, Bin Li, Lewei Lu, Furu Wei, Jifeng Dai
International Conference on Learning Representations (ICLR), 2020
[PDF] [Code@Github] [Post@Synced] PaperDigest Most Influential Papers
GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond
Yue Cao*, Jiarui Xu*, Steve Lin, Fangyun Wei, Han Hu
International Conference on Computer Vision Workshop (ICCVW), 2019
[PDF] [Code@Github] [Code@mmdet] [Post@Synced] Best Paper Award
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020 [PDF]
Deep Hashing Network for Efficient Similarity Retrieval
Han Zhu, Mingsheng Long, Jianmin Wang, Yue Cao
AAAI Conference on Artificial Intelligence (AAAI), 2016
[PDF] [Code@Github] PaperDigest Most Influential Papers
Learning Transferable Features with Deep Adaptation Networks
Mingsheng Long, Yue Cao, Jianmin Wang, Michael I. Jordan
International Conference on Machine Learning (ICML), 2015
[PDF] [Code@Github] [Code@Github] PaperDigest Most Influential Papers
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018 [PDF]
PC Member | Reviewer