Everything You Need To Know About Facebook’s Deep Learning Library PyTorchVideo

Deep video understanding is one of the most challenging tasks in computer vision. As computing power and the volume of video data on the internet grow, so does the demand for new-age machine learning models and tools. According to Stanford University, technologies for object detection in videos are maturing rapidly.

Facebook AI recently unveiled a new deep learning library for video understanding called PyTorchVideo. The source code is available on GitHub.

With PyTorchVideo, Facebook aims to help researchers develop cutting-edge machine learning models and tools to enhance video understanding capabilities, alongside providing a unified repository of reproducible and efficient video understanding components for research and production applications. 

In addition, Facebook wants to standardise video-focused libraries by serving the various video use cases in one place. Referring to the current lack of standardisation, Facebook AI said, “This has created a barrier for developers looking to work with videos for the first time,” adding that it makes it difficult to collaborate and spur innovation.

In the coming months, Facebook will improve the PyTorchVideo library to enable and support more groundbreaking research in video understanding. “We welcome contributions from the entire community. All our efforts will be directed at supporting the rich open-source community committed to pushing the boundaries of video research,” said Facebook. 

PyTorchVideo: In a nutshell 

Today, the PyTorchVideo library supports components that can be used for various video understanding applications, including video classification, self-supervised learning, detection, and optical flow, among others.

Beyond video, the library also supports other modalities, including audio and text. Nor is it limited to desktop devices: its Accelerator package provides mobile hardware-specific optimization and a model deployment flow.

Some of the core features of PyTorchVideo include: 

  • Enables researchers to build new video architectures using its video models and pretrained weights with customizable components 
  • Covers a set of downstream tasks such as action classification, action detection, acoustic event detection and self-supervised learning (SSL)
  • Supports a wide range of datasets and tasks for benchmarking various video models under different evaluation protocols 
  • Promotes hardware-aware model design and full-speed on-device model execution using efficient building blocks and deployment flow optimized for inference on hardware like mobile devices, Intel NNPI, etc. 
  • Offers access to a growing toolkit of standard scripts for video processing, including tracking, decoding and optical flow extraction
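
As a rough illustration of the first feature, pretrained video models, here is a minimal Python sketch. It assumes the project's `torch.hub` entry point and the `slow_r50` model name described in the PyTorchVideo documentation at the time of writing. It also assumes network access, and it uses a random tensor as a stand-in for decoded and normalised video frames. The helper names `dummy_clip_shape` and `classify_dummy_clip` are hypothetical and not part of the library.

```python
def dummy_clip_shape(batch=1, channels=3, frames=8, height=256, width=256):
    """Return a clip shape in (batch, channels, time, height, width) layout.

    The default sizes are an assumption taken from the settings commonly
    used with the slow_r50 model; other models expect other sizes.
    """
    return (batch, channels, frames, height, width)


def classify_dummy_clip(model_name="slow_r50"):
    """Load a pretrained video model and run it on a random clip."""
    # Imported lazily so this file can be read without torch installed.
    import torch

    # Fetches the repository and weights from GitHub on first use.
    model = torch.hub.load(
        "facebookresearch/pytorchvideo", model_name, pretrained=True
    )
    model.eval()  # inference mode: disables dropout, freezes batch-norm stats

    # Random values stand in for real, decoded and normalised frames.
    clip = torch.randn(*dummy_clip_shape())

    with torch.no_grad():  # no gradients needed for inference
        logits = model(clip)

    # One row of class scores per clip in the batch.
    return logits.shape
```

Calling `classify_dummy_clip()` downloads the weights on first use and returns the shape of the score tensor. A real pipeline would replace the random clip with frames decoded from a video file and preprocessed the way the chosen model expects.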

At present, Facebook AI uses PyTorchVideo in several of its research projects.

It has also been used to fuel recent advances in video transformers and self-supervised learning.

There is a shortage of open-source code and libraries for developing video understanding tools and models, which makes it difficult for researchers to exchange ideas, compare notes and accelerate innovation in the space.

Facebook AI’s PyTorchVideo can bolster innovation in the video understanding space, going beyond the realm of deepfakes and synthetic media propaganda.

Amit Raja Naik
Amit Raja Naik is a seasoned technology journalist who covers everything from data science to machine learning and artificial intelligence for Analytics India Magazine, where he examines the trends, challenges, ideas, and transformations across the industry.
