(†Interns or Students, *Equal Contribution)
2022
Revealing the Dark Secrets of Masked Image Modeling
Zhenda Xie*†, Zigang Geng*†, Jingcheng Hu†, Zheng Zhang, Han Hu, Yue Cao*
arXiv 2022 [PDF]
On Data Scaling in Masked Image Modeling
Zhenda Xie*†, Zheng Zhang*, Yue Cao*, Yutong Lin†, Yixuan Wei†, Qi Dai, Han Hu
arXiv 2022 [PDF]
iCAR: Bridging Image Classification and Image-text Alignment for Visual Recognition
Yixuan Wei†, Yue Cao, Zheng Zhang, Zhuliang Yao, Zhenda Xie, Han Hu, Baining Guo
arXiv 2022 [PDF] [Code]
Could Giant Pre-trained Image Models Extract Universal Representations?
Yutong Lin*†, Ze Liu*†, Zheng Zhang, Han Hu, Nanning Zheng, Stephen Lin, Yue Cao
Neural Information Processing Systems (NeurIPS), 2022 [PDF]
A Simple Baseline for Zero-shot Semantic Segmentation with Pre-trained Vision-language Model
Mengde Xu*†, Zheng Zhang*, Fangyun Wei*, Yutong Lin†, Yue Cao, Han Hu, Xiang Bai
European Conference on Computer Vision (ECCV), 2022 [PDF]
A Simple Approach and Benchmark for 21,000-Category Object Detection
Yutong Lin†, Chen Li†, Yue Cao, Zheng Zhang, Jianfeng Wang, Lijuan Wang, Zicheng Liu, Han Hu
European Conference on Computer Vision (ECCV), 2022 [PDF]
SimMIM: A Simple Framework for Masked Image Modeling
Zhenda Xie*†, Zheng Zhang*, Yue Cao*, Yutong Lin†, Jianmin Bao, Zhuliang Yao†, Qi Dai, Han Hu*
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022 [PDF] [Code]
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu*†, Han Hu*, Yutong Lin†, Zhuliang Yao†, Zhenda Xie†, Yixuan Wei†, Jia Ning†, Yue Cao, Zheng Zhang, Li Dong, Furu Wei, Baining Guo
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022 [PDF]
Video Swin Transformer
Ze Liu*†, Jia Ning*†, Yue Cao*, Yixuan Wei†, Zheng Zhang, Steve Lin, Han Hu*
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022 [PDF] [Code@Github]
Correlation-Aware Deep Tracking
Fei Xie†, Chunyu Wang, Guangting Wang, Yue Cao, Wankou Yang, Wenjun Zeng
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
2021
Self-supervised Learning with Swin Transformers
Zhenda Xie*†, Yutong Lin*†, Zhuliang Yao†, Zheng Zhang, Qi Dai, Yue Cao, Han Hu
arXiv 2021 [PDF] [Code@Github]
Bootstrap Your Object Detector via Mixed Training
Mengde Xu*†, Zheng Zhang*, Fangyun Wei, Yutong Lin, Yue Cao, Stephen Lin, Han Hu, Xiang Bai
Neural Information Processing Systems (NeurIPS), 2021 [PDF]
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu*†, Yutong Lin*†, Yue Cao*, Han Hu*, Yixuan Wei†, Zheng Zhang, Steve Lin, Baining Guo
International Conference on Computer Vision (ICCV), 2021 (Oral) - 3 Strong Accepts [PDF] [Code@Github]
Group-Free 3D Object Detection via Transformers
Ze Liu†, Zheng Zhang, Yue Cao, Han Hu, Xin Tong
International Conference on Computer Vision (ICCV), 2021 [PDF] [Code@Github]
Leveraging Batch Normalization for Vision Transformers
Zhuliang Yao†, Yue Cao, Yutong Lin, Ze Liu, Zheng Zhang, Han Hu
International Conference on Computer Vision Workshop (ICCVW), 2021 [PDF]
Breaking Shortcut: Exploring Fully Convolutional Cycle-Consistency for Video Correspondence Learning
Yansong Tang*†, Zhenyu Jiang*†, Zhenda Xie*†, Yue Cao, Zheng Zhang, Philip H.S. Torr, Han Hu
International Conference on Computer Vision Workshop (ICCVW), 2021 [PDF]
Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning
Zhenda Xie*†, Yutong Lin*†, Zheng Zhang, Yue Cao, Steve Lin, Han Hu
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021 [PDF] [Code@Github]
Cross-Iteration Batch Normalization
Zhuliang Yao†, Yue Cao, Shuxin Zheng, Gao Huang, Steve Lin
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021 [PDF] [Code@Github]
2020
Parametric Instance Classification for Unsupervised Visual Feature Learning
Yue Cao*, Zhenda Xie*†, Bin Liu*†, Yutong Lin†, Zheng Zhang, Han Hu
Neural Information Processing Systems (NeurIPS), 2020 [PDF] [Code@Github] [Post@Synced]
RepPoints V2: Verification Meets Regression for Object Detection
Yihong Chen†, Zheng Zhang, Yue Cao, Liwei Wang, Steve Lin, Han Hu
Neural Information Processing Systems (NeurIPS), 2020 [PDF] [Code@Github] [Post@CVer]
A Closer Look at Local Aggregation Operators in Point Cloud Analysis
Ze Liu*†, Han Hu*, Yue Cao, Zheng Zhang, Xin Tong
European Conference on Computer Vision (ECCV), 2020 [PDF] [Code@Github]
Disentangled Non-Local Neural Networks
Minghao Yin*†, Zhuliang Yao*†, Yue Cao, Xiu Li, Zheng Zhang, Steve Lin, Han Hu
European Conference on Computer Vision (ECCV), 2020 [PDF] [Code@Seg] [Code@Det]
Negative Margin Matters: Understanding Margin in Few-shot Classification
Bin Liu†, Yue Cao, Yutong Lin, Qi Li, Zheng Zhang, Mingsheng Long, Han Hu
European Conference on Computer Vision (ECCV), 2020 (Spotlight) [PDF] [Code@Github]
Memory Enhanced Global-Local Aggregation for Video Object Detection
Yihong Chen†, Yue Cao, Han Hu, Liwei Wang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020 [PDF] [Code@Github] [Post@MSRA]
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Weijie Su*†, Xizhou Zhu*†, Yue Cao, Bin Li, Lewei Lu, Furu Wei, Jifeng Dai
International Conference on Learning Representations (ICLR), 2020 [PDF] [Code@Github] [Post@Synced]
2019
GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond
Yue Cao*, Jiarui Xu*, Steve Lin, Fangyun Wei, Han Hu
International Conference on Computer Vision Workshop on Neural Architects (ICCVW), 2019 (Best Paper Award) [PDF] [Code@Github] [Code@mmdet] [Post@Synced]
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020 [PDF]
Spatial-Temporal Relation Networks for Multi-Object Tracking
Jiarui Xu†, Yue Cao, Zheng Zhang, Han Hu
International Conference on Computer Vision (ICCV), 2019 [PDF]
Max-Margin Hamming Hashing
Rong Kang†, Yue Cao, Mingsheng Long, Jianmin Wang, Philip S. Yu
International Conference on Computer Vision (ICCV), 2019 [PDF] [Code@Github]
2018
Deep Triplet Quantization
Bin Liu†, Yue Cao, Mingsheng Long, Jianmin Wang
ACM Multimedia Conference (ACMMM), 2018 (Oral) [PDF] [Code@Github]
Cross-Modal Hamming Hashing
Yue Cao*, Bin Liu*, Mingsheng Long, Jianmin Wang
European Conference on Computer Vision (ECCV), 2018 [PDF]
Transferable Representation Learning with Deep Adaptation Networks
Mingsheng Long, Yue Cao, Zhangjie Cao, Jianmin Wang, Michael I. Jordan
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018 [PDF]
HashGAN: Deep Learning to Hash with Pair Conditional Wasserstein GAN
Yue Cao*, Bin Liu*, Mingsheng Long, Jianmin Wang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 [PDF] [Code@Github]
Deep Cauchy Hashing for Hamming Space Retrieval
Yue Cao, Mingsheng Long, Bin Liu, Jianmin Wang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 [PDF] [Code@Github]
Unsupervised Domain Adaptation with Distribution Matching Machines
Yue Cao, Mingsheng Long, Jianmin Wang
AAAI Conference on Artificial Intelligence (AAAI), 2018 [PDF]
2017
Correlation Hashing Network for Efficient Cross-Modal Retrieval
Yue Cao, Mingsheng Long, Jianmin Wang, Philip S. Yu
28th British Machine Vision Conference (BMVC), 2017 [PDF]
Deep Visual-Semantic Quantization for Efficient Image Retrieval
Yue Cao, Mingsheng Long, Jianmin Wang, Shichen Liu
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 [PDF] [Code@Github]
Collective Deep Quantization for Efficient Cross-Modal Retrieval
Yue Cao, Mingsheng Long, Jianmin Wang, Shichen Liu
AAAI Conference on Artificial Intelligence (AAAI), 2017 [PDF] [Code@Github]
2016
Deep Visual-Semantic Hashing for Cross-Modal Retrieval
Yue Cao, Mingsheng Long, Jianmin Wang, Qiang Yang, Philip S. Yu
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2016 [PDF]
Composite Correlation Quantization for Efficient Multimodal Retrieval
Mingsheng Long, Yue Cao, Jianmin Wang, Philip S. Yu
ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2016 [PDF] [Code]
Correlation Autoencoder Hashing for Supervised Cross-Modal Search
Yue Cao, Mingsheng Long, Jianmin Wang, Han Zhu
ACM International Conference on Multimedia Retrieval (ICMR), 2016 (Oral) [PDF] [Slides]
Deep Learning of Transferable Representation for Scalable Domain Adaptation
Mingsheng Long, Jianmin Wang, Yue Cao, Jiaguang Sun, Philip S. Yu
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2016 [PDF]
Deep Quantization Network for Efficient Image Retrieval
Yue Cao, Mingsheng Long, Jianmin Wang, Han Zhu, Qingfu Wen
AAAI Conference on Artificial Intelligence (AAAI), 2016 (Oral) [PDF] [Slides] [Code@Github]
Deep Hashing Network for Efficient Similarity Retrieval
Han Zhu, Mingsheng Long, Jianmin Wang, Yue Cao
AAAI Conference on Artificial Intelligence (AAAI), 2016 [PDF] [Code@Github]
Cleaning Timestamps with Temporal Constraints
Shaoxu Song, Yue Cao, Jianmin Wang
International Conference on Very Large Data Bases (VLDB), 2016 (Oral) [PDF]
2015
Learning Transferable Features with Deep Adaptation Networks
Mingsheng Long, Yue Cao, Jianmin Wang, Michael I. Jordan
International Conference on Machine Learning (ICML), 2015 [PDF] [Code@Github]