Yue Cao 曹越

Senior Researcher
Visual Computing Group
Microsoft Research Asia

caoyue10 AT gmail DOT com

Short Bio

Yue Cao is currently a senior researcher at Microsoft Research Asia, headed by Baining Guo, where he collaborates closely with Han Hu, Zheng Zhang and Steve Lin. Prior to that, he received his B.S. and Ph.D. degrees from the School of Software, Tsinghua University, in 2014 and 2019 respectively, with highest honors, under the supervision of Prof. Jianmin Wang and Prof. Mingsheng Long. During his Ph.D. study, he was a research intern at MSRA between 2018 and 2019, mentored by Jifeng Dai. His work on Swin Transformer won the Best Paper Award (Marr Prize) at ICCV 2021.


If you are interested in an internship or the joint Ph.D. program on deep learning, please drop me an email.

2021.10 Our Swin Transformer (a general-purpose vision backbone) won the Best Paper Award (Marr Prize) of ICCV 2021!!!

2021.10 Gave a tutorial on Vision Transformer at VALSE 2021. [slides]

2021.06 Our Video Swin Transformer achieved SOTA on Kinetics-400, Kinetics-600 and Something-Something V2.

2021.03 Our PixPro and CBN were accepted to CVPR 2021, congrats!

2020.12 The extension of GCNet (Best Paper Award at ICCV 2019 Neural Architects Workshop) got accepted by TPAMI.

2019.11 Our work on multi-modality pre-training (VL-BERT) was reviewed by Bill Gates and accepted by ICLR 2020.

Selected Publications [Full List] [Google Scholar]

(Interns or Students, *Equal Contribution)

  1. SimMIM: A Simple Framework for Masked Image Modeling
    Zhenda Xie*, Zheng Zhang*, Yue Cao*, Yutong Lin, Jianmin Bao, Zhuliang Yao, Qi Dai, Han Hu*
    arXiv, 2021 [PDF] [Code]

  2. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
    Ze Liu*, Yutong Lin*, Yue Cao*, Han Hu*, Yixuan Wei, Zheng Zhang, Steve Lin, Baining Guo
    International Conference on Computer Vision (ICCV), 2021, Best Paper Award (Marr Prize)
    [PDF] [Code@Cls] [Code@Det] [Code@Seg] [Code@MoBY] [Code@Video]

  3. Parametric Instance Classification for Unsupervised Visual Feature Learning
    Yue Cao*, Zhenda Xie*, Bin Liu*, Yutong Lin, Zheng Zhang, Han Hu
    Neural Information Processing Systems (NeurIPS), 2020 [PDF] [Code@Github] [Post@Synced]

  4. VL-BERT: Pre-training of Generic Visual-Linguistic Representations
    Weijie Su*, Xizhou Zhu*, Yue Cao, Bin Li, Lewei Lu, Furu Wei, Jifeng Dai
    International Conference on Learning Representations (ICLR), 2020
    [PDF] [Code@Github] [Post@Synced] PaperDigest Most Influential Papers

  5. GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond
    Yue Cao*, Jiarui Xu*, Steve Lin, Fangyun Wei, Han Hu
    International Conference on Computer Vision Workshop (ICCVW), 2019
    [PDF] [Code@Github] [Code@mmdet] [Post@Synced] Best Paper Award
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020 [PDF]

  6. Deep Cauchy Hashing for Hamming Space Retrieval
    Yue Cao, Mingsheng Long, Bin Liu, Jianmin Wang
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 [PDF] [Code@Github]

  7. Deep Visual-Semantic Hashing for Cross-Modal Retrieval
    Yue Cao, Mingsheng Long, Jianmin Wang, Qiang Yang, Philip S. Yu
    ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2016 [PDF]

  8. Deep Hashing Network for Efficient Similarity Retrieval
    Han Zhu, Mingsheng Long, Jianmin Wang, Yue Cao
    AAAI Conference on Artificial Intelligence (AAAI), 2016
    [PDF] [Code@Github] PaperDigest Most Influential Papers

  9. Learning Transferable Features with Deep Adaptation Networks
    Mingsheng Long, Yue Cao, Jianmin Wang, Michael I. Jordan
    International Conference on Machine Learning (ICML), 2015
    [PDF] [Code@Github] [Code@Github] PaperDigest Most Influential Papers
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018 [PDF]

Honors and Awards

Professional Services

PC Member | Reviewer