ICRA 2018 Tutorial Proposal

Creating Annotated Scene Meshes for Training and Testing Robot Systems

Binh-Son Hua1       Duc Thanh Nguyen2       Lap-Fai Yu3       Sai-Kit Yeung1       Daniela Rus4

1Singapore University of Technology and Design 2Deakin University 3University of Massachusetts Boston 4Massachusetts Institute of Technology

Capturing, reconstructing, and annotating real-world 3D scenes are widely regarded as daunting tasks when preparing a high-quality dataset for 3D scene understanding, despite recent advances in color and depth sensors. In contrast to 2D image datasets, which are readily and widely available, 3D scene mesh datasets for training and testing robotics algorithms remain scarce, since creating them requires considerable effort to build a robust 3D scene reconstruction and annotation pipeline.

This tutorial aims to equip its audience with a general understanding of state-of-the-art approaches to 3D scene reconstruction and annotation, as well as the technical and implementation knowledge needed to build a complete pipeline to reconstruct and annotate 3D scenes. The main stages of such a pipeline will be discussed in depth, including data capture, real-time and offline reconstruction, automatic and interactive annotation, quality control, and benchmarking metrics. A WebGL pipeline for 3D scene segmentation will also be demonstrated during the tutorial.



Monday, May 21, 2018

  • 08:30 AM: Welcome Speech and Overview
  • 08:45 AM: 3D Scene Reconstruction
  • 09:30 AM: 3D Scene Annotation
  • 10:15 AM: Break
  • 10:30 AM: Pipeline, Datasets, and Applications
  • 11:15 AM: WebGL Annotation Demo
  • 11:30 AM: Panel Discussion, Q&A


  1. 3D Scene Reconstruction
    • Dense surface reconstruction with depth sensors
    • Structure from motion

  2. 3D Scene Annotation
    • Automatic segmentation
      • Graphcut and Markov random field
      • Conditional random field
      • Convolutional neural network
      • Hybrid techniques
    • Interactive segmentation

  3. Putting it all together: Pipeline, Datasets, and Applications

  4. References
    Example code


Binh-Son Hua is currently a postdoctoral researcher at the Singapore University of Technology and Design. He received his PhD degree in Computer Science from the National University of Singapore in 2015. His research interests are 3D reconstruction, 3D scene understanding, and physically based rendering. His recent work has been published in both computer graphics and computer vision venues, including Eurographics, TVCG, 3DV, and CVPR.

Duc Thanh Nguyen received his Ph.D. degree in Computer Science from the University of Wollongong, Australia, in 2012. Currently, he is a lecturer at the School of Information Technology, Deakin University, Australia. His research interests include Computer Vision and Pattern Recognition. Dr. Nguyen has published his work in highly ranked publication venues in Computer Vision and Pattern Recognition such as the Journal of Pattern Recognition, CVPR, ICCV, and ECCV. He has also served as a technical program committee member of the IEEE Int. Conf. Image Process. (since 2012) and as a reviewer for the IEEE Trans. Intell. Transp. Syst., IEEE Trans. Image Process., IEEE Signal Processing Letters, and Image and Vision Computing.

Lap-Fai (Craig) Yu is an assistant professor at the University of Massachusetts at Boston. He obtained his PhD degree in computer science from UCLA in 2013. His research interests are in computer graphics and vision, especially synthesizing and analysing 3D models from the perspectives of functionality, physics, intentionality, and causality. He is the recipient of the Cisco Outstanding Graduate Research Award, the UCLA Dissertation Year Fellowship, the Sir Edward Youde Memorial Fellowship, and the Award of Excellence from Microsoft Research. His research has been featured in New Scientist, the UCLA Headlines, and newspapers internationally. He regularly serves on the program committees of Eurographics, Pacific Graphics, and IEEE Virtual Reality.

Sai-Kit Yeung is currently an Assistant Professor at the Singapore University of Technology and Design (SUTD), where he leads the Vision, Graphics and Computational Design (VGD) Group. He was also a Visiting Assistant Professor at Stanford University and MIT. Before joining SUTD, he was a Postdoctoral Scholar in the Department of Mathematics, University of California, Los Angeles (UCLA). He was also a visiting student at the Image Processing Research Group at UCLA in 2008 and at the Image Sciences Institute, University Medical Center Utrecht, the Netherlands, in 2007. He received his PhD in Electronic and Computer Engineering from the Hong Kong University of Science and Technology (HKUST) in 2009. He also received a BEng degree (First Class Honors) in Computer Engineering in 2003 and an MPhil degree in Bioengineering in 2005 from HKUST. His research interests include computer vision, computer graphics, and computational fabrication.

Daniela Rus is the Andrew (1956) and Erna Viterbi Professor of Electrical Engineering and Computer Science and Director of the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT. Rus’s research interests are in robotics, mobile computing, and data science. Rus is a Class of 2002 MacArthur Fellow, a fellow of ACM, AAAI, IEEE and RAS, and a member of the National Academy of Engineering and the American Academy of Arts and Sciences. She earned her PhD in Computer Science from Cornell University. Prior to joining MIT, Rus was a professor in the Computer Science Department at Dartmouth College.


Please email us at scenenn@gmail.com if you have any comments or feedback. Thank you!

Last update: Sep 14, 2017