
How To Use UCF101, The Largest Dataset Of Human Actions


The UCF101 dataset contains 13,320 video clips of human actions spanning 101 action classes, all collected from YouTube. It was introduced in 2012 by researchers Khurram Soomro, Amir Roshan Zamir and Mubarak Shah of the Center for Research in Computer Vision, Orlando, FL 32816, USA. The clips in each action class are divided into 25 groups, each containing 4-7 clips, and the clips in a group share common features such as the background or the actor.

UCF Sports, UCF11, UCF50 and UCF101 are the datasets released by UCF in sequence, each one incorporating its predecessor. UCF101 is the largest among them, with 101 classes. It offers the greatest diversity in terms of actions, with large variations in camera motion, object appearance and pose, object scale, viewpoint, cluttered background and illumination conditions.

Here, we will discuss the dataset and see how to load it using TensorFlow and PyTorch. Further, we will walk through a practical implementation on the UCF101 dataset in TensorFlow.

About the dataset

The dataset can be downloaded from the official UCF101 page. It includes web videos recorded in various lighting conditions and with low-quality frames. The 101 human action classes are divided into five types: Human-Object Interaction, Human-Human Interaction, Body-Motion Only, Playing Musical Instruments and Sports.


Load the dataset using different deep learning frameworks.

TensorFlow

import tensorflow as tf
import tensorflow_datasets as tfds

# Load the train split of UCF101, shuffled and batched into groups of 64 clips.
x_train = tfds.load('ucf101', split='train', shuffle_files=True, batch_size=64)
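As a quick sanity check, one example can be pulled out and inspected. The sketch below assumes the TFDS ucf101 builder exposes video and label features, and iterates unbatched because clips vary in frame count:

ds = tfds.load('ucf101', split='train', shuffle_files=True)
for example in ds.take(1):
  # Each example holds a decoded clip and an integer class id.
  video, label = example['video'], example['label']
  print(video.shape)    # (num_frames, height, width, 3); frame count varies per clip
  print(label.numpy())  # class id in [0, 101)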

PyTorch

import torch
import torchvision

ucf_data = torchvision.datasets.UCF101(root, annotation_path, frames_per_clip,
                                       step_between_clips=1, frame_rate=None,
                                       fold=1, train=True, transform=None)
data_loader = torch.utils.data.DataLoader(ucf_data,
                                          batch_size=4,
                                          shuffle=True,
                                          num_workers=2)

Let’s define the parameters of the UCF101 class (a short usage sketch follows the list):

·   root – The root directory of the UCF101 dataset.

·   annotation_path – Path to the folder containing the train/test split files.

·   frames_per_clip – Number of frames in each clip.

·   step_between_clips – Number of frames between each clip.

·   fold – Which fold to use. Should be between 1 and 3.

·   train – If true, creates a dataset from the train split; otherwise from the test split.
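Putting it together, the sketch below builds the dataset and reads one batch. The paths are hypothetical placeholders, and the custom collate_fn that drops the audio track is a common workaround, since default batching can fail when clips have mismatched audio shapes:

# Hypothetical paths; point them at the extracted videos and the split files.
root = "./UCF-101"
annotation_path = "./ucfTrainTestlist"

# Drop the audio tensor so batching only has to stack videos and labels.
def collate_without_audio(batch):
  videos, _, labels = zip(*batch)
  return torch.stack(videos), torch.tensor(labels)

ucf_train = torchvision.datasets.UCF101(root, annotation_path, frames_per_clip=16,
                                        step_between_clips=16, fold=1, train=True)
loader = torch.utils.data.DataLoader(ucf_train, batch_size=4, shuffle=True,
                                     collate_fn=collate_without_audio)

videos, labels = next(iter(loader))
print(videos.shape)  # (4, 16, H, W, C): one batch of four 16-frame clips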

Practical Implementation Using TensorFlow

#Import all the libraries required for this project
import tensorflow as tf
import tensorflow_hub as hub
import random
import os
import ssl
import cv2
import numpy as np
import imageio
from IPython import display
# embed is used below to display the generated GIF inline (pip install tensorflow-docs)
from tensorflow_docs.vis import embed
from urllib import request
import re
import tempfile

Reading the Video dataset

# fetch videos from UCF101 dataset
UCF_ROOT = "https://www.crcv.ucf.edu/THUMOS14/UCF101/UCF101/"
_VIDEO_LIST = None
_CACHE_DIR = tempfile.mkdtemp()
# crcv.ucf.edu doesn't use a certificate accepted by the default Colab
# environment, so requests go through an unverified SSL context.
unverified_context = ssl._create_unverified_context()
def list_ucf_videos():
  global _VIDEO_LIST
  if not _VIDEO_LIST:
    index = request.urlopen(UCF_ROOT, context=unverified_context).read().decode("utf-8")
    videos = re.findall(r"(v_[\w_]+\.avi)", index)
    _VIDEO_LIST = sorted(set(videos))
  return list(_VIDEO_LIST)
def fetch_ucf_video(video):
  cache_path = os.path.join(_CACHE_DIR, video)
  if not os.path.exists(cache_path):
    urlpath = request.urljoin(UCF_ROOT, video)
    print("Fetching %s => %s" % (urlpath, cache_path))
    data = request.urlopen(urlpath, context=unverified_context).read()
    with open(cache_path, "wb") as f:
      f.write(data)
  return cache_path
def crop_center_square(frame):
  y, x = frame.shape[0:2]
  min_dim = min(y, x)
  start_x = (x // 2) - (min_dim // 2)
  start_y = (y // 2) - (min_dim // 2)
  return frame[start_y:start_y+min_dim,start_x:start_x+min_dim]
def load_video(path, max_frames=0, resize=(224, 224)):
  cap = cv2.VideoCapture(path)
  frames = []
  try:
    while True:
      ret, frame = cap.read()
      if not ret:
        break
      frame = crop_center_square(frame)
      frame = cv2.resize(frame, resize)
      frame = frame[:, :, [2, 1, 0]]
      frames.append(frame)
      if len(frames) == max_frames:
        break
  finally:
    cap.release()
  return np.array(frames) / 255.0
def to_gif(images):
  converted_images = np.clip(images * 255, 0, 255).astype(np.uint8)
  imageio.mimsave('./animation.gif', converted_images, fps=25)
  return embed.embed_file('./animation.gif')
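
As a quick check, these helpers can be chained to preview a clip as an inline GIF. The file name below is illustrative; any name returned by list_ucf_videos() will do:

# Download one clip, decode up to 64 frames and render a GIF preview.
sample = load_video(fetch_ucf_video("v_ApplyEyeMakeup_g01_c01.avi"), max_frames=64)
to_gif(sample)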

Get the list of videos in the dataset

ucf_videos = list_ucf_videos()
categories = {}
for video in ucf_videos:
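  # File names follow the pattern v_<Category>_gXX_cXX.avi, so slicing off the
  # "v_" prefix and the "_gXX_cXX.avi" suffix leaves the category name.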
  category = video[2:-12]
  if category not in categories:
    categories[category] = []
  categories[category].append(video)
print("Found %d videos in %d categories." % (len(ucf_videos), len(categories)))
for category, sequences in categories.items():
  summary = ", ".join(sequences[:2])
  print("%-20s %4d videos (%s, ...)" % (category, len(sequences), summary))

Load a sample video

# Get a sample video (volleyball spiking).
video_path = fetch_ucf_video("v_VolleyballSpiking_g01_c01.avi")
sample_video = load_video(video_path)

# Load the I3D model, pretrained on Kinetics-400, from TF Hub.
i3d = hub.load("https://tfhub.dev/deepmind/i3d-kinetics-400/1").signatures['default']

Prediction on a sample video

# The I3D model predicts over the Kinetics-400 label set.
KINETICS_URL = "https://raw.githubusercontent.com/deepmind/kinetics-i3d/master/data/label_map.txt"
with request.urlopen(KINETICS_URL) as obj:
  labels = [line.decode("utf-8").strip() for line in obj.readlines()]

def predict(sample_video):
  # Add a batch axis to the sample video.
  model_input = tf.constant(sample_video, dtype=tf.float32)[tf.newaxis, ...]
  logits = i3d(model_input)['default'][0]
  probabilities = tf.nn.softmax(logits)
  print("Top 5 actions:")
  for i in np.argsort(probabilities)[::-1][:5]:
    print(f"  {labels[i]:22}: {probabilities[i] * 100:5.2f}%")

State of the Art

The current state of the art on the UCF101 dataset is R2+1D-BERT, which reports an accuracy of 98.69%. LGD-3D Two-stream and Two-Stream I3D also perform well on this dataset, with accuracies above 98%.

Final Thoughts

In this article, we presented UCF101, one of the most challenging datasets for action recognition compared to existing ones. It includes 101 action classes and over 13k clips. Research on it is still in progress, so further improvements in model accuracy can be expected. Hope this article is useful to you.

Ankit Das

A data analyst with expertise in statistical analysis and data visualization, ready to serve the industry using various analytical platforms. I look forward to gaining in-depth knowledge of machine learning and data science. Outside work, you can find me as a fun-loving person with hobbies such as sports and music.