Hengshuang Zhao

Assistant Professor

Rm 424, Chow Yei Ching Building
Department of Computer Science
The University of Hong Kong

Email: hszhao[at]cs.hku.hk


Biography

I am an Assistant Professor in the Department of Computer Science at The University of Hong Kong. My general research interests cover the broad area of computer vision, machine learning, and artificial intelligence, with a special emphasis on building intelligent visual systems. My research goal is to use artificial intelligence techniques to enable machines to perceive, understand, and interact with the surrounding environment, and ultimately to have a strong positive impact on various fields.

Previously, I spent wonderful times as a postdoctoral researcher at the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT, working with Prof. Antonio Torralba, and at the Torr Vision Group at the University of Oxford (beautiful Oxford), working with Prof. Philip Torr. I obtained my Ph.D. degree from the CSE Department at The Chinese University of Hong Kong, supervised by Prof. Jiaya Jia. During my Ph.D., I spent wonderful times as an intern with Dr. Xiaohui Shen, Dr. Zhe Lin, Dr. Kalyan Sunkavalli, and Dr. Brian Price at Adobe (San Jose), Prof. Raquel Urtasun at Uber (Toronto), and Dr. Vladlen Koltun at Intel (Santa Clara).

Our current research interests and focus: 1. visual scene understanding, perception, reconstruction, representation learning, multimodal learning; 2. visual content creation, generation, and manipulation (image/video/3D); 3. autonomous driving, embodied AI, robot learning, reinforcement learning, LLM applications, etc.

Prospective students: I am looking for self-motivated PhD students, postdocs, interns, and visiting scholars to work together on exciting, cutting-edge computer vision, machine learning, and artificial intelligence projects. If you are interested in working with me, please drop me an email with your resume. For summer internships, please apply to the HKU CS Research Internship Programme.

Pinned: Highly optimized codebase available for 3D scene understanding Pointcept (PTv1&PTv2&PTv3&MSC&PPT).

Highly optimized codebase available for semantic segmentation semseg (PSPNet&PSANet).

Unified raw operator for 2D image recognition SAN and 3D point cloud recognition PointTransformerV1, V2, V3.

Unified panoptic segmentation UPSNet (logit level), and PanopticFCN (representation level).

Unified modeling for joint 2D-3D scene recognition BPNet, tracking UniTrack, and multi-task learning MTFormer.

Unified open-world perception system for detection UniDetector and segmentation OPSNet.

Students

Publications [Google Scholar]

Experiences

Professional Activities

  • Program Committee:
      Area Chair for IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
      Area Chair for European Conference on Computer Vision (ECCV), 2024.
      Senior Program Committee for AAAI Conference on Artificial Intelligence (AAAI), 2024.
      Area Chair for ACM International Conference on Multimedia (ACMMM), 2024.
      Area Chair for Neural Information Processing Systems (NeurIPS), 2023.
      Area Chair for IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
      Area Chair for IEEE Winter Conference on Applications of Computer Vision (WACV), 2023.
      Senior Program Committee for AAAI Conference on Artificial Intelligence (AAAI), 2023.
  • Conference Reviewer:
      IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
      IEEE International Conference on Computer Vision (ICCV).
      European Conference on Computer Vision (ECCV).
      Neural Information Processing Systems (NeurIPS).
      International Conference on Machine Learning (ICML).
      International Conference on Learning Representations (ICLR).
      ACM Conference on SIGGRAPH (SIGGRAPH).
      ACM Conference on SIGGRAPH Asia (SIGGRAPH Asia).
      AAAI Conference on Artificial Intelligence (AAAI).
      IEEE Winter Conference on Applications of Computer Vision (WACV).
      British Machine Vision Conference (BMVC).
      IEEE Intelligent Vehicles Symposium (IV).
      Asian Conference on Computer Vision (ACCV).
      IEEE International Conference on Robotics and Automation (ICRA).
  • Journal Reviewer:
      IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).
      International Journal of Computer Vision (IJCV).
      ACM Transactions on Graphics (SIGGRAPH).
      IEEE Transactions on Image Processing (TIP).
      IEEE Transactions on Robotics (T-RO).
      IEEE Robotics and Automation Letters (RA-L).
      IEEE Transactions on Multimedia (TMM).
      IEEE Transactions on Neural Networks and Learning Systems (TNNLS).
      Transactions on Machine Learning Research (TMLR).
      Pattern Recognition Letters (PRLETTERS).
      Journal of Visual Communication and Image Representation (JVCI).

Talks & Presentations

  • International Digital Economy Academy (IDEA): "Towards Unified Scene Understanding: Representation, Operator and Framework", May 2022.
  • VALSE Webinar: "Scene Understanding in 3D and 2D-3D", Apr. 2022.
  • AI Time Young Scientist: "Towards Unified Scene Understanding: Representation, Operator and Framework", Apr. 2022.
  • MIT CSAIL: "Towards Unified Scene Understanding: Representation, Operator and Framework", Nov. 2021.
  • MIT CSAIL: "Advancing Visual Intelligence via Neural System Design", Oct. 2021.
  • ICCV VSP Workshop: "Towards Unified Scene Understanding: Representation, Operator and Framework", Oct. 2021.
  • University of Oxford, Apr. 2021.
  • Imperial College London, Mar. 2021.
  • University College London, Mar. 2021.
  • Max Planck Institute for Informatics, Mar. 2021.
  • King Abdullah University of Science and Technology, Mar. 2021.
  • University of Southern California, Mar. 2021.
  • Tsinghua University, Mar. 2021.
  • The Hong Kong University of Science and Technology, Guangzhou, Mar. 2021.
  • Microsoft Research: "Advancing Visual Intelligence via Neural System Design", Mar. 2021.
  • The University of Hong Kong, Feb. 2021.
  • The Chinese University of Hong Kong, Shenzhen, Feb. 2021.
  • National University of Singapore, Jan. 2021.
  • Nanyang Technological University, Jan. 2021.
  • Peking University, Dec. 2020.
  • Apple Research: "Pixel-Level Scene Understanding with Segmentation", Nov. 2020.
  • Intel Intelligent Systems Lab: "Point Transformer", Oct. 2020.
  • Huawei Research UK: "Exploring Self-attention for Image Recognition", Oct. 2020.
  • University of Oxford: "Pixel-Level Scene Understanding with Segmentation", Sep. 2020.
  • JIANGMEN: "Exploring Self-attention for Image Recognition", Jul. 2020.
  • Google Research: "Pixel-Level Scene Understanding with Segmentation", Feb. 2020.
  • MIT CSAIL: "Pixel-Level Scene Understanding with Segmentation", Jun. 2019.
  • UC Berkeley ICSI: "Pixel-Level Scene Understanding with Segmentation", Jun. 2019.
  • UC Berkeley BAIR: "Pixel-Level Scene Understanding with Segmentation", Jun. 2019.
  • VALSE Webinar: "Pixel-Level Image Understanding with Semantic Segmentation and Panoptic Segmentation", May 2019.
  • Intel Intelligent Systems Lab: "Self-attention Networks for Image Recognition", May 2019.
  • Intel Intelligent Systems Lab: "Image Segmentation with Application", Jan. 2019.
  • Uber ATG: "Unified Panoptic Segmentation Network (UPSNet): A Unified Framework for Image Understanding", Aug. 2018.
  • CVPR WAD Workshop: "IBN-PSANet: Winning WAD Drivable Area Challenge", Jun. 2018.
  • VALSE Webinar: "PSPNet and ICNet: Semantic Segmentation with High Accuracy and High Efficiency", Jul. 2017.
  • Adobe Bay Area Research Showcase: "Compositing-aware Image Search", Jul. 2017.
  • ECCV ILSVRC Workshop: "Understanding Scene in the Wild", Oct. 2016.

Honors & Awards

Patents

  • US16905478 (In process), "Image processing method, apparatus, electronic device, storage medium, program product".
  • US16385333 (In process), "Method and system for scene parsing and storage medium".
  • US16929429 (In process), "Compositing aware digital image search".
  • CN201810893153 (In process), "Image processing method, apparatus, electronic device, storage medium, program product".
  • CN201611097543 (In process), "Scene parsing method and system, electronic equipment".
  • US15986401 (Issued Aug. 18, 2020), "Compositing aware digital image search".
  • CN201611097445 (Issued Aug. 11, 2020), "Deep neural network training method and system, electronic equipment".
  • CN201310233990 (Issued Feb. 24, 2016), "GPU-based object 3D shape measurement method".
  • CN201220412358 (Issued Feb. 20, 2013), "Automatic fish tank".

Teaching

  • COMP3314A: Machine Learning
    Fall, 2022-2023
  • ENGG5103: Techniques for Data Mining
    Fall, 2018-2019
  • ENGG2601A: Technology, Society and Engineering Practice
    Spring, 2017-2018
  • ENGG5103: Techniques for Data Mining
    Fall, 2017-2018
  • CSCI2100B: Data Structures
    Spring, 2016-2017
  • CSCI3160: Design and Analysis of Algorithms
    Fall, 2016-2017
  • CSCI2520: Data Structures & Applications
    Spring, 2015-2016
  • CSCI1120: Introduction to Computing Using C++
    Fall, 2015-2016

© Hengshuang Zhao | Last updated: 01/01/2024