This year’s Computer Vision and Pattern Recognition (CVPR) conference started on June 19 and runs until June 24 in New Orleans, Louisiana, as well as virtually. Like every year, it is expected to attract more than 7,500 attendees and feature keynote speakers, presentations, tutorials, a panel session, and workshops.
All the major names in tech are present at CVPR 2022, conducting workshops, presenting papers, and discussing the innovations they are working on.
CV models are usually black boxes and do not provide explanations for their predictions. This lack of transparency can erode consumer trust and cause backlash when algorithms make mistakes.
This workshop aims to promote the proactive adoption of explainability in computer vision systems. The agenda centres on conversations about “building top-performing explainable computer vision systems,” with particular focus on providing human-like reasoning for why a model made its predictions.
Other research areas that will be talking points from Meta include:
- Ego4D: Around the World in 3,000 Hours of Egocentric Video
- HVH: Learning a Hybrid Neural Volumetric Representation for Dynamic Hair Performance Capture
- KeyTr: Keypoint Transporter for 3D Reconstruction of Deformable Objects in Videos
- Masked-attention Mask Transformer for Universal Image Segmentation
- Masked Autoencoders Are Scalable Vision Learners
- MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
- Neural 3D Video Synthesis from Multi-View Video
- Omnivore: A Single Model for Many Visual Modalities
- PONI: Potential Functions for ObjectGoal Navigation With Interaction-Free Learning
- Visual Acoustic Matching
To read more about Meta’s plans for CVPR 2022, click here.
Some of the accepted papers from Apple in CVPR 2022:
Critical Regularizations for Neural Surface Reconstruction in the Wild – The researchers introduce RegSDF, which shows that “proper point cloud supervisions and geometry regularizations are sufficient to produce high-quality and robust reconstruction results.”
Forward Compatible Training for Large-Scale Embedding Retrieval Systems – The researchers introduce forward-compatible training (FCT), a new learning paradigm for representation learning. When the old model is trained, the researchers also prepare for a future, unknown version of the model. They propose “learning side-information, an auxiliary feature for each sample that facilitates future updates of the model.”
Robust Joint Shape and Pose Optimization for Few-View Object Reconstruction – The researchers present FvOR, a learning-based object reconstruction method that predicts accurate 3D models given a few images with noisy input poses.
Efficient Multi-View Stereo via Attention-Driven 2D Convolutions – The researchers present MVS2D, a multi-view stereo algorithm that integrates multi-view constraints into single-view networks via an attention mechanism.
Apple will demo its RoomPlan technology, which lets users capture a room and its defining objects in a parametric format within minutes. It is supported on Apple devices equipped with LiDAR sensors.
To read more about Apple’s plans for CVPR 2022, click here.
Samsung Research will present around 20 papers at CVPR 2022. Two of the papers submitted by Samsung’s Toronto AI Center have been selected for oral presentations, a prestigious distinction.
- P3IV: Probabilistic Procedure Planning from Instructional Videos with Weak Supervision – This paper shows how to build AI systems capable of analysing and mimicking human behaviour. A growing research area in this field is procedure planning, which can assist humans with goal-directed tasks such as cooking or repairing gadgets.
- Day-to-Night Image Synthesis for Training Nighttime Neural ISPs – This paper shows how to synthesise the nighttime image data needed to train neural Image Signal Processors (ISPs) for Night Mode on smartphone cameras, making it possible to convert clear daytime images into day-and-night training pairs.
On the opening day, Waymo held a tutorial session on synthetic camera data generation for autonomous driving at the LatinX in CV (LXCV) Research workshop. The next day, it conducted a Workshop on Autonomous Driving, where the Waymo Research team also shared results from this year’s Waymo Open Dataset Challenges.
On June 22, a team from Waymo and Google Research will present Block-NeRF, a method for large-scale scene reconstruction from camera images. In a poster session, Waymo will also present RIDDLE (Range Image Deep Delta Encoding), a novel data-driven range-image compression algorithm.
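RIDDLE’s name points to delta encoding of lidar range images. As a generic illustration only (this is not Waymo’s method, which learns to predict the deltas with a deep model), plain delta encoding stores the first value and then the differences between neighbouring measurements; because adjacent range readings are usually close, the deltas are small and compress well:

```python
# Generic delta encoding sketch (illustrative, not Waymo's RIDDLE).
# Range values are assumed quantised to integers (e.g. millimetres).

def delta_encode(values):
    """Store the first value, then successive differences."""
    if not values:
        return []
    encoded = [values[0]]
    for prev, cur in zip(values, values[1:]):
        encoded.append(cur - prev)
    return encoded

def delta_decode(encoded):
    """Rebuild the original values by cumulative summation."""
    if not encoded:
        return []
    values = [encoded[0]]
    for d in encoded[1:]:
        values.append(values[-1] + d)
    return values

scan = [10000, 10210, 10190, 10400, 25000]  # made-up readings in mm
deltas = delta_encode(scan)                 # mostly small numbers
assert delta_decode(deltas) == scan         # lossless round trip
```

The small deltas can then be fed to an entropy coder; the deep part of an approach like RIDDLE lies in predicting each value from context so that the residuals to encode are even smaller.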