Multi-Modal Perception and Control

Organizers: Filipe Veiga, Roberto Calandra, Aude Billard and Jan Peters


Humans perform their daily tasks by using and integrating the diverse information from all five senses. As robotics research tackles increasingly complex problems, robots are required to perform a variety of tasks in more diverse and unstructured environments, where interactions with the environment often cannot be accurately and completely modeled. To successfully complete tasks such as driving safely on a road, cleaning a cluttered kitchen or manipulating objects in-hand, the use of several sensing modalities can be beneficial or even necessary. Using multiple sensing modalities can lead to a better understanding of the robot's environment, with some modalities capturing information for which others are inaccurate or non-informative. However, the manner in which information is extracted from each modality and fused together can greatly influence the end result. In this workshop, we focus on both the theoretical understanding and the practical engineering of robotic systems that use multiple sensing modalities. This focus translates into questions such as:

  • How do we decide what sensor modalities are relevant for a specific task?
  • How do we efficiently make use of multiple sensing modalities?
  • How do we use learning to integrate high-dimensional sensory inputs into control?
  • Can we take inspiration from the way that humans solve multi-sensory integration?
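To make the integration question concrete, a common baseline is late fusion: each modality is encoded into a compact feature vector and the features are concatenated into a single state for the controller. The sketch below is purely illustrative, with toy hand-written encoders standing in for the learned ones (e.g. a CNN for vision, an MLP for touch); all function names and dimensions are assumptions, not a reference to any particular system.

```python
import numpy as np

# Hypothetical per-modality encoders; in a learned system these would be
# trained networks rather than fixed summary statistics.
def encode_vision(image: np.ndarray) -> np.ndarray:
    """Toy vision feature: per-channel mean of an HxWxC image."""
    return image.mean(axis=(0, 1))

def encode_touch(taxels: np.ndarray) -> np.ndarray:
    """Toy tactile feature: mean and peak pressure over a taxel array."""
    return np.array([taxels.mean(), taxels.max()])

def fuse(image: np.ndarray, taxels: np.ndarray) -> np.ndarray:
    """Late fusion: concatenate per-modality features into one state vector."""
    return np.concatenate([encode_vision(image), encode_touch(taxels)])

# A linear controller acting on the fused state (weights assumed given here;
# in practice they would be learned, e.g. by reinforcement learning).
rng = np.random.default_rng(0)
state = fuse(rng.random((8, 8, 3)), rng.random(16))  # 3 + 2 = 5 features
W = rng.standard_normal((2, 5))                      # 2 actions, 5 features
action = W @ state
```

The design choice of where to fuse (raw inputs, intermediate features, or decisions) is itself one of the open questions the workshop addresses.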

We gather experts in the fields of machine learning, robotics, autonomous driving and neuroscience who have worked across multiple sensing modalities such as touch, vision, depth, sound and LIDAR, and who have experienced both the benefits and the difficulties of using multiple modalities.