Enze Xie (谢恩泽)

CV / GitHub / Google Scholar / Zhihu / Email: Johnny_ez@163.com | xieenze@hku.hk

I have been a PhD student in the Department of Computer Science at The University of Hong Kong (HKU) since 2019, supervised by Prof. Ping Luo and co-supervised by Prof. Wenping Wang. I also work closely with my friend Wenhai Wang and Prof. Chunhua Shen. I obtained my B.S. from Nanjing University of Aeronautics and Astronautics (2016) and my M.S. from Tongji University (2019). Since 2018, I have collaborated with researchers in industry, including Face++ (Megvii), SenseTime, Facebook, Huawei, and NVIDIA.

My research interest is computer vision in 2D and 3D. I have worked on instance-level detection and on self-, semi-, and weakly supervised learning. I developed several well-known computer vision algorithms, including PolarMask, which was selected as one of the CVPR 2020 Top-10 Influential Papers. I also co-developed OpenSelfSup (1k+ stars), a popular self-supervised learning framework.

I am looking for a full-time job or a postdoctoral position. Please feel free to contact me by email.



Publications

(* indicates equal contribution)

Selected Papers

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers

Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo
Tech report, arXiv [paper] [code] [中文解读] [demo]
NVIDIA's first Vision Transformer work, used by several teams.


DetCo: Unsupervised Contrastive Learning for Object Detection

Enze Xie*, Jian Ding*, Wenhai Wang, Xiaohang Zhan, Hang Xu, Zhenguo Li, Ping Luo
ICCV2021 [paper] [code]
We introduce a detection-friendly unsupervised pre-training solution using large-scale unlabeled data.


PVTv2: Improved Baselines with Pyramid Vision Transformer

Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao
Tech report, arXiv [paper] [code]
A better version of PVT.


Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions

Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao
ICCV2021 (Oral) [paper] [code]
The first work to extend Vision Transformer for object detection and segmentation.


PolarMask: Single Shot Instance Segmentation with Polar Representation

Enze Xie*, Peize Sun*, Xiaoge Song*, Wenhai Wang, Ding Liang, Chunhua Shen, Ping Luo
CVPR2020 (Oral) [paper] [code] [中文解读] [talk] [CVPR20 Top-10 Influential Papers]
We introduced a new Polar Representation to reformulate instance segmentation.


PolarMask++: Enhanced Polar Representation for Single-Shot Instance Segmentation and Beyond

Enze Xie*, Wenhai Wang*, Mingyu Ding, Ruimao Zhang, Ping Luo
TPAMI2021 [paper] [code]
We extend PolarMask (CVPR'20) to several instance-level detection tasks.


PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text

Wenhai Wang*, Enze Xie*, Xiang Li, Xuebo Liu, Ding Liang, Zhibo Yang, Tong Lu, Chunhua Shen
TPAMI2021 [paper] [code]
We extend PSENet (CVPR'19) and PAN (ICCV'19) to a text spotting system.


Other Papers

Watch Only Once: An End-to-End Video Action Detection Framework

Shoufa Chen, Peize Sun, Enze Xie, Chongjian Ge, Jiannan Wu, Lan Ma, Jiajun Shen, Ping Luo
ICCV2021


What Makes for End-to-End Object Detection?

Peize Sun, Yi Jiang, Enze Xie, Wenqi Shao, Zehuan Yuan, Changhu Wang, Ping Luo
ICML2021


Segmenting Transparent Objects in the Wild with Transformer

Enze Xie, Wenjia Wang, Wenhai Wang, Peize Sun, Hang Xu, Ding Liang, Ping Luo
IJCAI2021 [paper] [code & dataset]


Segmenting Transparent Objects in the Wild

Enze Xie, Wenjia Wang, Wenhai Wang, Mingyu Ding, Chunhua Shen, Ping Luo
ECCV2020 [paper] [code & dataset]


Scene Text Image Super-Resolution in the Wild

Wenjia Wang*, Enze Xie*, Xuebo Liu, Wenhai Wang, Ding Liang, Chunhua Shen, Xiang Bai
ECCV2020 [paper] [code & dataset]


Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation

Sheng Jin, Wentao Liu, Enze Xie, Wenhai Wang, Chen Qian, Wanli Ouyang, Ping Luo
ECCV2020 [paper]


AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting

Wenhai Wang, Xuebo Liu, Xiaozhong Ji, Enze Xie, Ding Liang, Zhibo Yang, Tong Lu, Chunhua Shen, Ping Luo
ECCV2020 [paper] [Project Web]


Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Wenhai Wang*, Enze Xie*, Xiaoge Song, Yuhang Zang, Wenjia Wang, Tong Lu, Gang Yu, Chunhua Shen
ICCV 2019 [paper] [code]


Shape Robust Text Detection with Progressive Scale Expansion Network

Wenhai Wang*, Enze Xie*, Xiang Li, Wenbo Hou, Tong Lu, Gang Yu, Shuai Shao
CVPR 2019 [paper] [code]


Scene Text Detection with Supervised Pyramid Context Network

Enze Xie*, Yuhang Zang*, Shuai Shao, Gang Yu, Cong Yao, Guangyao Li
AAAI 2019 [paper]


Technical Report

CycleMLP: A MLP-like Architecture for Dense Prediction

Shoufa Chen, Enze Xie, Chongjian Ge, Ding Liang, Ping Luo
Tech report, arXiv [paper] [code]


Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization

Zhe Chen, Wenhai Wang, Enze Xie, Tong Lu, Ping Luo
Tech report, arXiv [paper] [code]


Unsupervised Pretraining for Object Detection by Patch Reidentification

Jian Ding*, Enze Xie*, Hang Xu, Chenhan Jiang, Zhenguo Li, Ping Luo, Gui-Song Xia
Tech report, arXiv [paper] [code]


TransTrack: Multiple-Object Tracking with Transformer

Peize Sun, Yi Jiang, Rufeng Zhang, Enze Xie, Jinkun Cao, Xinting Hu, Tao Kong, Zehuan Yuan, Changhu Wang, Ping Luo
Tech report, arXiv [paper] [code]


OneNet: Towards End-to-End One-Stage Object Detection

Peize Sun, Yi Jiang, Enze Xie, Zehuan Yuan, Changhu Wang, Ping Luo
Tech report, arXiv [paper] [code]


SelfText Beyond Polygon: Unconstrained Text Detection with Box Supervision and Dynamic Self-Training

Weijia Wu*, Enze Xie*, Ruimao Zhang, Wenhai Wang, Guan Pang, Zhen Li, Hong Zhou, Ping Luo
Tech report, arXiv [paper] [code]


1st Place Solutions for OpenImage2019--Object Detection and Instance Segmentation

Yu Liu, Guanglu Song, Yuhang Zang, Yan Gao, Enze Xie, Junjie Yan, Chen Change Loy, Xiaogang Wang
Tech report, arXiv [paper]


TextSR: Content-Aware Text Super-Resolution Guided by Recognition

Wenjia Wang*, Enze Xie*, Peize Sun, Wenhai Wang, Lixun Tian, Chunhua Shen, Ping Luo
Tech report, arXiv [paper] [code]
An improved version was accepted to ECCV2020.


Experience

NVIDIA Research
2021.03 – Now

Research Intern
working on 3D detection->tracking->forecasting in autonomous driving with Zhiding Yu, Jose M. Alvarez, Sanja Fidler, and Anima Anandkumar

AI Theory Group, HUAWEI Noah's Ark Lab
2020.06 – 2021.02

Research Intern
working on self-supervised learning and Transformer for dense prediction with Hang Xu and Zhenguo Li

Applied Machine Learning (AML) Team, Facebook AI
2020.05 – 2020.07

Research Intern -> Project Co-Operator (due to COVID-19)
working on weak and semi-supervised OCR with Guan Pang

General Model Team, SenseTime Research
2019.07 – 2020.03

Research Intern
working on instance-level detection with Ding Liang

Detection Team, Megvii(Face++) Research
2018.04 – 2019.07

Research Intern
working on OCR and instance-level detection with Gang Yu

Challenges

Rank 1 in National Artificial Intelligence Competition - Remote Sensing Segmentation (bonus 1,000,000 RMB)
2020

Rank 1 in Google Open Images 2019 - Instance Segmentation
2019

Rank 1 in ICDAR 2019 Arbitrary-Shaped Text Detection
2019

Rank 2 in ICDAR 2019 Large-scale Street View Text Detection
2019


Professional Activities

Reviewer for NeurIPS, CVPR, IJCAI, ICCV, T-MM, WACV, ACCV

SPC for IJCAI2021


Invited Talks

Huawei Noah's Ark Lab - AI Theory Group : "Instance Level Detection and Beyond"
2021

SenseTime : "Self-Supervised Learning for Classification and Beyond"
2020

Microsoft Research Asia (MSRA) VCG : "Polar Representation in Instance Segmentation"
2020

Hong Kong Computer Vision Workshop (HKCVW) : "Real-Time Scene Text Detection"
2019



Honours and Awards

Hong Kong and China Gas Company Limited Postgraduate Prize
2021

Outstanding Master Thesis Award, Tongji University
2019

Some of my Friends

Wenhai Wang (NJU), Wenjia Wang (SenseTime), Jingbo Wang (CUHK), Xiaohang Zhan (CUHK), Guan Pang (Facebook AI), Chunhua Shen (University of Adelaide)