3DMV: Learning 3D with Multi-View Supervision

CVPR 2023 Workshop





Call for Papers Opens:   February 27th

Submission Deadline:   April 9th

Workshop Day:   June 19th


Learning 3D with Multi-View Supervision @ CVPR 2023 Workshop

This workshop focuses on multi-view deep learning. It covers core 3D understanding tasks (recognition, detection, segmentation) tackled with multi-view approaches, as well as methods that use posed or unposed multi-view images for 3D reconstruction and generation. Many recent advances in 3D vision have focused on the direct approach of applying deep learning to 3D data (e.g., 3D point clouds, meshes, and voxels). An alternative is to project the 3D data into multiple 2D images and apply 2D networks to process it indirectly. Tackling 3D vision tasks with such indirect approaches has two main advantages: (i) mature and transferable 2D computer vision models (CNNs, Transformers, diffusion models, etc.), and (ii) large and diverse labeled image datasets for pre-training (e.g., ImageNet). Furthermore, recent advances in differentiable rendering allow for end-to-end pipelines that render multi-view images of the 3D data and process them with CNNs, Transformers, or diffusion models to obtain a more descriptive representation of the 3D data. However, several challenges remain in this multi-view direction, including handling the intersection with other modalities such as point clouds and meshes, and addressing problems that affect 2D projections, such as occlusion and viewpoint selection. We aim to enhance the synergy between multi-view research across different tasks by inviting keynote speakers from across the spectrum of 3D understanding and generation, mixing essential 3D topics (like multi-view stereo) with modern generation techniques (like NeRFs). The topics covered in the workshop include the following (an illustrative code sketch of the multi-view approach appears after this list):

  • Multi-View for 3D Object Recognition
  • Multi-View for 3D Object Detection
  • Multi-View for 3D Segmentation
  • Deep Multi-View Stereo
  • Multi-View for 3D Generation and Novel View Synthesis
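
As a rough, illustrative sketch of the indirect approach described above (not part of the workshop program), the snippet below shows an MVCNN-style pipeline in PyTorch: a shared 2D backbone processes each rendered view of a 3D object, and the per-view features are pooled into a single shape descriptor. The class name, the ResNet-18 backbone, and the view/class counts are illustrative assumptions; rendering the views themselves (e.g., with a differentiable renderer) is assumed to happen upstream.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class MultiViewClassifier(nn.Module):
    """Illustrative MVCNN-style sketch: a shared 2D backbone processes each
    rendered view, and features are max-pooled across views before classification."""
    def __init__(self, num_classes: int = 40):  # 40 classes is an arbitrary example
        super().__init__()
        backbone = resnet18()           # any 2D model (ideally pre-trained) can be used
        backbone.fc = nn.Identity()     # keep the 512-d feature vector per view
        self.backbone = backbone
        self.classifier = nn.Linear(512, num_classes)

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (batch, num_views, 3, H, W) images rendered from each 3D object
        b, v, c, h, w = views.shape
        feats = self.backbone(views.reshape(b * v, c, h, w))   # (b*v, 512)
        feats = feats.reshape(b, v, -1).max(dim=1).values      # pool across views
        return self.classifier(feats)

# Usage: 8 rendered views per object, e.g. from a mesh or point cloud.
logits = MultiViewClassifier()(torch.randn(2, 8, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 40])
```

Max-pooling across views keeps the descriptor invariant to view order; attention over views is a common alternative when viewpoint selection matters.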

Submission Timeline

  • Paper submission opens: February 27th
  • Paper submission deadline: April 9th
  • Review period: April 8th - April 17th
  • Decisions sent to authors: April 20th
  • Camera-ready posters due: May 7th

Call for Papers

    We are soliciting papers that use multi-view deep learning to address problems in 3D understanding and 3D generation, including but not limited to the following topics:

  • Bird's-eye view for 3D object detection
  • Multi-view fusion for 3D object detection
  • Indoor/outdoor scene segmentation
  • 3D diffusion models for 3D generation
  • 3D object part segmentation
  • Language + 3D
  • Medical 3D segmentation and analysis
  • 3D shape generation
  • Deep multi-view stereo
  • Inverse graphics from multi-view images
  • Indoor/outdoor scene generation and reconstruction
  • Volumetric multi-view representations for 3D generation and novel view synthesis
  • NeRFs
  • 3D shape classification
  • 3D shape retrieval

Paper Submission Guidelines

  • We accept submissions of at most 8 pages (excluding references) on the aforementioned and related topics.
  • Submissions can be works previously published within the last two years or new works.
  • Accepted papers are non-archival and will not be included in the CVPR 2023 proceedings.
  • Submitted manuscripts should follow the CVPR 2023 paper template (unless they have been published previously).
  • All submissions will be peer-reviewed under a single-blind policy (authors should include their names in the submission).
  • PDFs must be submitted online through the submission link.
  • Authors of accepted papers will be notified to prepare camera-ready posters to be uploaded according to the schedule above.
  • A Best Poster Award with a sponsored cash prize will be announced during the workshop.

Schedule (June 19)

    Session | Speaker | Recording
    Opening Remarks | Abdullah Hamdi | link (6:00)
    Multi-view for 3D Generation | Ben Poole | link (19:07)
    3D Perception from Multi-X | Charles Qi | link (57:28)
    Break - Poster Session | - | -
    3D Recognition in the Wild With and Without 3D Supervision | Georgia Gkioxari | not available
    Photorealistic Reconstruction with Neural Parametric Models | Matthias Niessner | link (00:00)
    Panel Discussion (includes all speakers) | Abdullah Hamdi | -

    Posters

    NeRFocus: Neural Radiance Field for 3D Synthetic Defocus
    Yinhuai Wang (Peking University Shenzhen Graduate School); Shuzhou Yang (Peking University); Yujie Hu (Peking University Shenzhen Graduate School); Xinhua Cheng (Peking University); Jian Zhang (Peking University Shenzhen Graduate School)*
    OpenScene: 3D Scene Understanding with Open Vocabularies
    Songyou Peng (ETH Zurich and MPI-IS)*; Kyle Genova (Google Research); Chiyu Jiang (Waymo); Andrea Tagliasacchi (Google Brain and University of Toronto); Marc Pollefeys (ETH Zurich / Microsoft); Thomas Funkhouser (Google Research)
    CroCo: Self-Supervised Pre-training for 3D Vision Tasks by Cross-View Completion
    Philippe Weinzaepfel (NAVER LABS Europe)*; Vincent Leroy (Naver Labs Europe); Thomas Lucas (Naver); Romain Brégier (Naver Labs Europe); Yohann Cabon (Naver Labs Europe); Vaibhav Arora (Université Paris-Sud); Leonid Antsfeld (Naver Labs Europe); Boris Chidlovskii (Naver Labs Europe); Gabriela Csurka (Naver Labs Europe); Jerome Revaud (Naver Labs Europe)
    Federated Neural Radiance Fields
    Lachlan Holden (The University of Adelaide)*; Feras Dayoub (The University of Adelaide); David Harvey (The University of Adelaide); Tat-Jun Chin (The University of Adelaide)
    RePAST: Relative Pose Attention Scene Representation Transformer
    Aleksandr Safin (Skoltech)*; Daniel Duckworth (Google); Mehdi S. M. Sajjadi (Google Brain)
    HDR-Plenoxels: Self-Calibrating High Dynamic Range Radiance Fields
    Kim Jun-Seong (POSTECH)*; Kim Yu-Ji (POSTECH); Moon Ye-Bin (POSTECH); Tae-Hyun Oh (POSTECH)
    Multiface: A Dataset for Neural Face Rendering
    Cheng-hsin Wuu (Meta Reality Lab)*; Ningyuan Zheng (Meta); Scott Ardisson (Meta); Rohan Bali (University of Arizona); Eric Brockmeyer (Meta); Lucas Evans (Meta); Tim Godisart (Meta); Hyowon Ha (Meta); Xuhua Huang (Meta Reality Labs); Alexander T Hypes (Meta Reality Labs); Taylor Koska (Meta); Steven Krenn (Meta); Stephen Lombardi (Facebook Reality Labs); Xiaomin Luo (Meta); Laura Millerschoen (Meta); Kevyn A McPhail (Meta); Michal Perdoch (Meta); Mark Pitts (Meta); Alexander Richard (Facebook Reality Labs); Jason Saragih (Facebook); Junko Saragih (Meta); Takaaki Shiratori (Meta Reality Labs Research); Tomas Simon (Facebook Reality Lab Research); Matthew Stewart (Meta); Autumn Trimble (Facebook Reality Labs Pittsburgh); Xinshuo Weng (NVIDIA Research); David Whitewolf (Meta); Chenglei Wu (Facebook Reality Labs); Shoou-I Yu (Facebook Reality Labs); Yaser Sheikh (Facebook Reality Labs)
    Spatial-Language Attention Policies for Efficient Robot Learning
    Priyam Parashar (FAIR, Meta)*; Jay Vakil (FAIR, Meta); Samantha N Powers (Carnegie Mellon University); Chris Paxton (Meta AI)
    Spatio-Temporally Consistent Face Mesh Reconstruction on Videos
    Kim Youwang (POSTECH)*; Hyun Lee (POSTECH); Kim Sung-Bin (POSTECH); Suekyeong Nam (KRAFTON Inc.); Janghoon Ju (KRAFTON); Tae-Hyun Oh (POSTECH)
    Depth Field Networks for Generalizable Multi-view Scene Representation
    Vitor Guizilini (Toyota Research Institute); Igor Vasiljevic (Toyota Research Institute)*; Jiading Fang (Toyota Technological Institute at Chicago); Rareș A Ambruș (Toyota Research Institute); Greg Shakhnarovich (TTI-Chicago); Matthew R Walter (Toyota Technological Institute at Chicago); Adrien Gaidon (Toyota Research Institute)
    NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion
    Jiatao Gu (Apple)*; Alex Trevithick (University of California, San Diego); Kai-En Lin (University of California San Diego); Joshua M Susskind (Apple); Christian Theobalt (MPI Informatik); Lingjie Liu (University of Pennsylvania); Ravi Ramamoorthi (University of California San Diego)
    Learning to Predict Scene-Level Implicit 3D from Posed RGBD Data
    Nilesh Kulkarni (University of Michigan)*; Linyi Jin (University of Michigan); Justin Johnson (University of Michigan); David Fouhey (University of Michigan)
    VARS: Video Assistant Referee System for Automated Soccer Decision Making from Multiple Views
    Jan Held (University of Liege)*; Anthony Cioppa (University of Liège (ULiège)); Silvio Giancola (KAUST); Abdullah J Hamdi (KAUST); Bernard Ghanem (KAUST); Marc Van Droogenbroeck (University of Liège)
    LDM3D: Latent Diffusion Model for 3D
    Gabriela Ben Melech Stan (Intel)*; Diana Wofk (Intel Labs); Scottie Fox (Blockade Labs); Alexander H Redden (Blockade Labs); Will Saxton (Blockade Labs); Jean Yu (Intel Corporation); Estelle Guez Aflalo (Intel Corp); Shao-Yen Tseng (Intel); Fabio Nonato (Intel); Matthias Müller (Intel Labs); Vasudev Lal (Intel Corp)
    Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding
    Eslam Mohamed Abdelrahman (KAUST)*; Yasmeen Y Alsaedi (King Abdullah University of Science and Technology); Mohamed Elhoseiny (King Abdullah University of Science and Technology)

    Paper Awards

  • Best Paper Award:
    LDM3D: Latent Diffusion Model for 3D
    Gabriela Ben Melech Stan (Intel)*; Diana Wofk (Intel Labs); Scottie Fox (Blockade Labs); Alexander H Redden (Blockade Labs); Will Saxton (Blockade Labs); Jean Yu (Intel Corporation); Estelle Guez Aflalo (Intel Corp); Shao-Yen Tseng (Intel); Fabio Nonato (Intel); Matthias Müller (Intel Labs); Vasudev Lal (Intel Corp)
  • Best Paper Runner-up Award:
    Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding
    Eslam Mohamed Abdelrahman (KAUST)*; Yasmeen Y Alsaedi (King Abdullah University of Science and Technology); Mohamed Elhoseiny (King Abdullah University of Science and Technology)

Sponsors

    Contact: abdullah.hamdi@kaust.edu.sa
    CVPR 2023 Workshop ©2023