Guide To FreeSound Datasets With Implementation In PyTorch

The FreeSound Dataset is a hierarchical collection of more than 600 sound classes, populated with 297,144 audio samples.

Populating these classes generated 685,403 candidate annotations that express the potential presence of sound sources in audio clips. The FreeSound Dataset covers a wide range of everyday sounds, from human and animal sounds to music and sounds made by things.
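
To make the idea of a hierarchical collection of classes more concrete, here is a minimal sketch of how a label can be expanded to its parent classes. The tiny hierarchy below and the helper name `ancestors` are hypothetical illustrations and are not taken from the FreeSound Dataset files.

```python
# Hypothetical child -> parent links for illustration only;
# the real hierarchy has more than 600 classes.
PARENT = {
    "Bark": "Dog",
    "Dog": "Animal",
    "Meow": "Cat",
    "Cat": "Animal",
    "Acoustic guitar": "Guitar",
    "Guitar": "Music",
}


def ancestors(label):
    """Return the chain of parent classes for a label, nearest parent first."""
    chain = []
    while label in PARENT:
        label = PARENT[label]
        chain.append(label)
    return chain


print(ancestors("Bark"))  # ['Dog', 'Animal']
```

In a hierarchy like this, a clip annotated with a specific class can also be counted as an example of each of its broader parent classes.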

Freesound is developed by the Music Technology Group at Pompeu Fabra University, Barcelona.

To download the FreeSound dataset for a research project, refer to the following links.

Datasets: https://github.com/MTG/freesound-datasets.

Research/Publication: https://www.upf.edu/web/mtg/research/publications.

FreeSound Github: https://github.com/MTG/freesound.

They collect data for the following areas:

1. Artistic creation

2. Cultural preservation

3. Education

4. Health and well-being

5. Sustainable development

The Music Technology Group is organized into four labs, each one led by a faculty member.

1. Audio Signal Processing Lab: Xavier Serra, Head of the lab.

The lab focuses on advancing the understanding of sound and music signals by combining signal processing and machine learning methods.

2. Music Information Research Lab: Emilia Gomez, Head of the lab.

The lab works on topics such as sound and music description, music information retrieval, vocalization synthesis, audio source separation, and music and audio processing.

3. Music and Multimodal Interaction Lab: Sergi Jorda, Head of the lab.

The lab focuses on multimodal interactive technologies and how to use them for music creation.

4. Music and Machine Learning Lab: Rafael Ramírez, Head of the lab.

The lab focuses on the intersection of music technology, AI, deep learning, and neuroscience, and on their applications.

Download Size: 20 GB

DataLoader:

Using PyTorch:
 import os
 import librosa
 import numpy as np
 import pandas as pd
 from torch import Tensor
 from scipy.io import wavfile
 from torchvision import transforms
 from torch.utils.data import DataLoader, Dataset
 class Freesound(Dataset):
     def __init__(self, transform=None, mode="train"):
         # setting directories for data
         data_root = "../input"
         self.mode = mode
         # compare strings with ==; `is` tests object identity, not equality
         if self.mode == "train":
             self.data_dir = os.path.join(data_root, "audio_train")
             self.csv_file = pd.read_csv(os.path.join(data_root, "train.csv"))
         elif self.mode == "test":
             self.data_dir = os.path.join(data_root, "audio_test")
             self.csv_file = pd.read_csv(os.path.join(data_root, "sample_submission.csv"))
         else:
             raise ValueError("mode must be 'train' or 'test'")
         # dict for mapping class names into indices. can be obtained by 
         # {cls_name:i for i, cls_name in enumerate(csv_file["label"].unique())}
         self.classes = {'Acoustic_guitar': 38, 'Applause': 37, 'Bark': 19, 'Bass_drum': 21, 
 'Burping_or_eructation': 28, 'Bus': 22, 'Cello': 4, 'Chime': 20, 'Clarinet': 7,'Computer_keyboard': 8, 'Cough': 17, 'Cowbell': 33, 'Double_bass': 29, 'Drawer_open_or_close': 36, 'Electric_piano': 34, 'Fart': 14, 'Finger_snapping': 40, 'Fireworks': 31, 'Flute': 16, 'Glockenspiel': 3, 'Gong': 26, 'Gunshot_or_gunfire': 6, 'Harmonica': 25, 'Hi-hat': 0, 'Keys_jangling': 9, 'Knock': 5, 'Laughter': 12, 'Meow': 35, 'Microwave_oven': 27, 'Oboe': 15, 'Saxophone': 1, 'Scissors': 24, 'Shatter': 30, 'Snare_drum': 10, 'Squeak': 23, 'Tambourine': 32, 'Tearing': 13, 'Telephone': 18, 'Trumpet': 2, 'Violin_or_fiddle': 39,  'Writing': 11}
         self.transform = transform
     def __len__(self):
         # one CSV row per audio clip
         return self.csv_file.shape[0]
     def __getitem__(self, idx):
         filename = self.csv_file["fname"][idx]
         # wavfile.read returns (sample_rate, samples)
         rate, data = wavfile.read(os.path.join(self.data_dir, filename))
         if self.transform is not None:
             data = self.transform(data)
         if self.mode == "train":
             label = self.classes[self.csv_file["label"][idx]]
             return data, label
         # test mode has no labels, so return only the audio
         return data
 if __name__ == '__main__':
     import matplotlib.pyplot as plt
     tsfm = transforms.Compose([
         lambda x: x.astype(np.float32),
         # rescale to -1 to 1 using the peak absolute value (guarded against silent clips)
         lambda x: x / max(np.abs(x).max(), 1e-8),
         # MFCC; recent librosa versions require the signal as the keyword argument y
         lambda x: librosa.feature.mfcc(y=x, sr=44100, n_mfcc=40),
         lambda x: Tensor(x)
         ])
     # todo: multiprocessing, padding data
     dataloader = DataLoader(
         Freesound(transform=tsfm, mode="train"),
         batch_size=1,
         shuffle=True,
         num_workers=0)
     for index, (data, label) in enumerate(dataloader):
         print(label.numpy())
         print(data.shape)
         plt.imshow(data.numpy()[0, :, :])
         plt.show()
         # stop after showing the first batch
         if index == 0:
             break
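
The todo comment in the script leaves padding open. Clips differ in length, so their MFCC matrices have different numbers of frames and cannot be stacked into a batch larger than one as they are. The following is a minimal sketch of one possible way to handle this, assuming each item is a (features, label) pair with features shaped (n_mfcc, frames). The helper name `pad_collate` is hypothetical and not part of PyTorch, and the random tensors stand in for real MFCC features.

```python
import torch
import torch.nn.functional as F


def pad_collate(batch):
    """Zero-pad (n_mfcc, frames) tensors along the time axis and stack them."""
    features, labels = zip(*batch)
    lengths = torch.tensor([f.shape[-1] for f in features])
    max_len = int(lengths.max())
    # F.pad with a 2-tuple pads the last dimension: (left, right)
    padded = torch.stack(
        [F.pad(f, (0, max_len - f.shape[-1])) for f in features]
    )
    return padded, torch.tensor(labels), lengths


# Synthetic stand-ins for MFCC matrices of two clips with different durations.
batch = [(torch.randn(40, 120), 3), (torch.randn(40, 87), 17)]
padded, labels, lengths = pad_collate(batch)
print(padded.shape)      # torch.Size([2, 40, 120])
print(lengths.tolist())  # [120, 87]
```

A function like this could be passed to the loader as `DataLoader(..., batch_size=8, collate_fn=pad_collate)`. Returning the original lengths lets a model ignore the padded frames if needed.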

Application:

1. Audio tagging system

Audio tagging is a technique for updating the metadata fields in MP3 and other compressed audio files. An audio tag detector can correct the metadata in individual files or apply a category to a group of files.

2. Emotion and theme recognition

This task involves predicting the moods and themes conveyed by a music track, given the raw audio.

3. Automatic assessment system for musical exercises

Music Critic is used to assess musical exercises sung by students and to give them meaningful feedback. It can be easily integrated into online applications and education platforms.

4. Animal sound recognition

Automatically recognizing a wide range of animal sounds makes it possible to analyze the habits and distribution of animals, and so to monitor and protect them effectively.

Conclusion:

We have learned about the Freesound dataset and how to download it from the source, the group that created it and its research labs, a PyTorch data loader implementation for audio tagging, and some applications of the FreeSound datasets.

Amit Singh
Amit Singh is a Data Scientist who graduated in Computer Science and Engineering, and a Data Science writer at Analytics India Magazine.
