1st Multimodal Learning and Applications Workshop (MULA 2018)

In conjunction with ECCV 2018.

Munich, September 9th 2018

The exploitation of big data in the last few years has led to a big step forward in many applications of computer vision. However, most of the tasks tackled so far involve mainly the visual modality, because the number of labelled samples is unbalanced across modalities (e.g., there are many huge labelled datasets for images but far fewer for audio or IMU-based classification). This results in a large performance gap when algorithms are trained separately on each modality.

This workshop aims to bring together the machine learning and multimodal data fusion communities. We expect contributions involving video, audio, depth, IR, IMU, laser, text, drawings, synthetic data, etc. Position papers with feasibility studies and cross-modality issues with a strongly applied flavour are also encouraged; we therefore expect a positive response from both the academic and industrial communities.

This is an open call for papers, soliciting original contributions considering recent findings in theory, methodologies, and applications in the field of multimodal machine learning. Potential topics include, but are not limited to:

  • Multimodal learning
  • Cross-modal learning
  • Self-supervised learning for multimodal data
  • Multimodal data generation and sensors
  • Unsupervised learning on multimodal data
  • Cross-modal adaptation
  • Multimodal data fusion
  • Multimodal transfer learning
  • Multimodal applications (e.g. drone vision, autonomous driving, industrial inspection)


Papers are limited to 14 pages in the ECCV format (see the main conference author guidelines). All papers will be reviewed by at least two reviewers under a double-blind policy. Papers will be selected based on relevance, significance and novelty of results, technical merit, and clarity of presentation. Papers will be published in the ECCV 2018 proceedings.

All papers should be submitted through the CMT website.

Important Dates

  • Deadline for submission: July 13th, 2018 - 23:59 CET (extended from July 6th, 2018)
  • Notification of acceptance: August 1st, 2018
  • Camera Ready submission deadline: September 20th, 2018
  • Workshop date: September 9th, 2018 (morning)


The workshop received 28 valid submissions, of which 11 papers were accepted, 3 of them as orals.

08:20 - Initial remarks and workshop introduction
08:30 - Boosting LiDAR-based Semantic Labeling by Cross-Modal Training Data Generation - Florian Piewak; Peter Pinggera; Manuel Schäfer; David Peter; Beate Schwarz; Nick Schneider; Markus Enzweiler; David Pfeiffer; Marius Zöllner

08:50 - Invited Speaker: Daniel Cremers - Sensor Fusion and Multi-modal Learning for Direct visual SLAM

09:40 - Visually Indicated Sound Generation by Perceptually Optimized Classification - Kan Chen; Chuanxi Zhang; Chen Fang; Zhaowen Wang; Trung Bui; Ram Nevatia
10:00 - Learning to Learn from Web Data through Deep Semantic Embeddings - Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas

10:20 - Coffee Break

10:30 - Invited Speaker: Raquel Urtasun - A Future with Affordable Self-driving Vehicles

11:20 - Best Paper Award ceremony sponsored by Bosch
Visually Indicated Sound Generation by Perceptually Optimized Classification - Kan Chen; Chuanxi Zhang; Chen Fang; Zhaowen Wang; Trung Bui; Ram Nevatia

11:25 - Spotlight session (3-minute presentation for each poster)
12:00 - Poster Session

  • Learning from Barcelona Instagram data what Locals and Tourists post about its Neighbourhoods - Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas
  • A Structured Listwise Approach to Learning to Rank for Image Tagging - Jorge Sanchez; Franco Luque; Leandro Lichtenzstein
  • CentralNet: a Novel Multilayer Approach for Multimodal Fusion - Valentin Vielzeuf; Alexis Lechervy; Stephane Pateux; Frederic Jurie
  • Where and What Am I Eating? Image-based Food Menu Recognition - Marc Bolaños; Marc Valdivia; Petia Radeva
  • ThermalGAN: Multimodal Color-to-Thermal Image Translation for Person Re-Identification in Multispectral Dataset - Vladimir V Kniaz
  • Visual-Semantic Alignment Across Domains Using a Semi-Supervised Approach - Angelo Carraggi; Marcella Cornia; Lorenzo Baraldi; Rita Cucchiara
  • Generalized Bayesian Canonical Correlation Analysis with Missing Modalities - Toshihiko Matsuura; Kuniaki Saito; Yoshitaka Ushiku; Tatsuya Harada
  • Unpaired Thermal to Visible Spectrum Transfer using Adversarial Training - Adam Nyberg; Abdelrahman Eldesokey; David Gustafsson; David Bergström

    Invited Speakers

    Raquel Urtasun is the Head of Uber ATG Toronto. She is also an Associate Professor in the Department of Computer Science at the University of Toronto, a Canada Research Chair in Machine Learning and Computer Vision and a co-founder of the Vector Institute for AI. Her research interests include machine learning, computer vision, robotics and remote sensing. Her lab was selected as an NVIDIA NVAIL lab. She is a recipient of an NSERC EWR Steacie Award, an NVIDIA Pioneers of AI Award, a Ministry of Education and Innovation Early Researcher Award, three Google Faculty Research Awards, an Amazon Faculty Research Award, a Connaught New Researcher Award and a Best Paper Runner-up Prize awarded at the Conference on Computer Vision and Pattern Recognition (CVPR).

    Daniel Cremers is Managing Director of the TUM Department of Informatics. He serves as general chair for the European Conference on Computer Vision 2018 in Munich. In December 2010 he was listed among “Germany's top 40 researchers below 40” (Capital). On March 1st 2016, Prof. Cremers received the Gottfried Wilhelm Leibniz Award, the biggest award in German academia. His research interests include computer vision and mathematical image analysis (segmentation, motion estimation, multiview reconstruction, visual SLAM), autonomous quadrocopters, statistical shape analysis, variational methods and partial differential equations, convex and combinatorial optimization, and machine learning and statistical inference.


    Organizers

    Paolo Rota

    Università di Trento, Italy
    Istituto Italiano di Tecnologia, Italy

    Vittorio Murino

    Istituto Italiano di Tecnologia, Italy

    Michael Ying Yang

    University of Twente, Netherlands

    Bodo Rosenhahn

    Institut für Informationsverarbeitung, Leibniz-Universität Hannover, Germany

    Program Committee

    • Alper Yilmaz, Ohio State University, USA
    • Andrew Zisserman, University of Oxford, UK
    • Christian Heipke, Universität Hannover, Germany
    • Duc-Tien Dang-Nguyen, insight-centre, Ireland
    • Elisa Ricci, FBK, Italy
    • Jacopo Cavazza, Istituto Italiano di Tecnologia, Italy
    • Juhan Nam, KAIST, South Korea
    • Martin Kampel, TU Wien, Austria
    • Nicola Conci, Università di Trento, Italy
    • Pietro Morerio, Istituto Italiano di Tecnologia, Italy
    • Relja Arandjelović, DeepMind, UK
    • Francesco Setti, Università di Verona, Italy
    • Christoph Reinders, LU Hannover, Germany



    For additional info please contact us here