Guide To FreeSound Datasets With Implementation In PyTorch

The FreeSound Dataset is a hierarchical collection of more than 600 sound classes, populated with 297,144 audio samples.

This process generated 685,403 candidate annotations that express the potential presence of sound sources in audio clips. The dataset covers everyday sounds, from human and animal sounds to music and sounds made by things.

FreeSound is developed by the Music Technology Research Group at Pompeu Fabra University, Barcelona.

To download the FreeSound dataset for a research project, refer to the following.


FreeSound Github:

They collect data for the following purposes:

1. Artistic creation

2. Cultural preservation

3. Education

4. Health and well-being

5. Sustainable development

The Music Technology Group is organized into four labs, each led by a faculty member.

1. Audio Signal Processing Lab: Xavier Serra, Head of the lab.

The lab focuses on advancing the understanding of sound and music signals by combining signal processing and machine learning methods.

2. Music Information Research Lab: Emilia Gomez, Head of the lab.

The lab works on topics such as sound and music description, music information retrieval, vocalization synthesis, audio source separation, and music and audio processing.

3. Music and Multimodal Interaction Lab: Sergi Jorda, Head of the lab.

The lab focuses on multimodal interactive technologies and how to use them for music creation.

4. Music and Machine Learning Lab: Rafael Ramírez, Head of the lab.

The lab focuses on the intersection of music technology, AI, deep learning, and neuroscience, and on their applications.

Download Size: 20 GB


Using PyTorch:
 import sys, os
 import torch
 import librosa
 import numpy as np
 import pandas as pd
 from torch import Tensor
 from scipy.io import wavfile
 from torchvision import transforms
 from torch.utils.data import DataLoader
 from torch.utils.data import Dataset
 class Freesound(Dataset):
     """Loads the clips listed in a CSV file from a local data directory."""
     def __init__(self, transform=None, mode="train"):
         # setting directories for data
         data_root = "../input"
         self.mode = mode
         # compare strings with ==, not "is" (identity checks on strings are unreliable)
         if self.mode == "train":
             self.data_dir = os.path.join(data_root, "audio_train")
             self.csv_file = pd.read_csv(os.path.join(data_root, "train.csv"))
         elif self.mode == "test":
             self.data_dir = os.path.join(data_root, "audio_test")
             self.csv_file = pd.read_csv(os.path.join(data_root, "sample_submission.csv"))
         else:
             raise ValueError("mode must be 'train' or 'test'")
         # dict for mapping class names into indices. can be obtained by 
         # {cls_name:i for i, cls_name in enumerate(csv_file["label"].unique())}
         self.classes = {
             'Acoustic_guitar': 38, 'Applause': 37, 'Bark': 19, 'Bass_drum': 21, 'Burping_or_eructation': 28,
             'Bus': 22, 'Cello': 4, 'Chime': 20, 'Clarinet': 7, 'Computer_keyboard': 8, 'Cough': 17,
             'Cowbell': 33, 'Double_bass': 29, 'Drawer_open_or_close': 36, 'Electric_piano': 34, 'Fart': 14,
             'Finger_snapping': 40, 'Fireworks': 31, 'Flute': 16, 'Glockenspiel': 3, 'Gong': 26,
             'Gunshot_or_gunfire': 6, 'Harmonica': 25, 'Hi-hat': 0, 'Keys_jangling': 9, 'Knock': 5,
             'Laughter': 12, 'Meow': 35, 'Microwave_oven': 27, 'Oboe': 15, 'Saxophone': 1, 'Scissors': 24,
             'Shatter': 30, 'Snare_drum': 10, 'Squeak': 23, 'Tambourine': 32, 'Tearing': 13, 'Telephone': 18,
             'Trumpet': 2, 'Violin_or_fiddle': 39, 'Writing': 11}
         self.transform = transform
     def __len__(self):
         return self.csv_file.shape[0]
     def __getitem__(self, idx):
         filename = self.csv_file["fname"][idx]
         # wavfile.read returns (sample_rate, samples)
         rate, data = wavfile.read(os.path.join(self.data_dir, filename))
         if self.transform is not None:
             data = self.transform(data)
         if self.mode == "train":
             label = self.classes[self.csv_file["label"][idx]]
             return data, label
         elif self.mode == "test":
             return data
 if __name__ == '__main__':
     import matplotlib.pyplot as plt
     tsfm = transforms.Compose([
         lambda x: x.astype(np.float32), # cast before scaling to avoid integer overflow
         lambda x: x / (np.max(np.abs(x)) + 1e-8), # rescale to -1 to 1 by peak amplitude
         lambda x: librosa.feature.mfcc(y=x, sr=44100, n_mfcc=40), # MFCC, shape (40, n_frames)
         lambda x: Tensor(x)
     ])
     # todo: multiprocessing, padding data
     # assumed settings: batch_size=1 because clips differ in length and are not padded
     dataloader = DataLoader(
         Freesound(transform=tsfm, mode="train"),
         batch_size=1, shuffle=True, num_workers=0)
     for index, (data, label) in enumerate(dataloader):
         plt.imshow(data.numpy()[0, :, :])
         plt.show()
         # assumed completion: display the first sample only, then stop
         if index == 0:
             break
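
The "padding data" todo in the loader above matters as soon as a batch holds more than one clip: clips have different durations, so their MFCC matrices have different numbers of frames and cannot be stacked directly. One common way to handle this is a custom collate function that zero-pads every item in a batch to the longest clip. The sketch below is an illustration under stated assumptions, not part of the original code: pad_collate and toy_items are hypothetical names, and random tensors stand in for real MFCC features.

```python
import torch
from torch.utils.data import DataLoader


def pad_collate(batch):
    """Zero-pad (n_mfcc, n_frames) tensors along the time axis and stack them."""
    features, labels = zip(*batch)
    max_frames = max(f.shape[1] for f in features)
    padded = torch.zeros(len(features), features[0].shape[0], max_frames)
    for i, f in enumerate(features):
        padded[i, :, : f.shape[1]] = f
    # Keep the true lengths so a model can ignore the padded frames later.
    lengths = torch.tensor([f.shape[1] for f in features])
    return padded, torch.tensor(labels), lengths


# Toy stand-in for the dataset: three "clips" with 40 coefficients and different lengths.
toy_items = [(torch.randn(40, n), label) for n, label in [(50, 3), (80, 0), (65, 7)]]
loader = DataLoader(toy_items, batch_size=3, collate_fn=pad_collate)
data, labels, lengths = next(iter(loader))
print(data.shape, labels.tolist(), lengths.tolist())
```

With the real dataset, the same function could be passed as collate_fn to the DataLoader in train mode. Test mode returns only the data, so it would need a variant without labels.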


Applications of FreeSound Datasets:

1. Audio Tagging System

Audio tagging is a technique to update the metadata fields in MP3 and other compressed audio files. An audio tag detector is used to correct the metadata in individual files or to apply a category to a group of files.

2. Emotion and theme recognition

This involves predicting the moods and themes conveyed by a music track, given the raw audio.

3. Automatic assessment system for musical exercises

Music Critic is used to assess musical exercises sung by students and to give meaningful feedback. It can be easily integrated into online applications and education platforms.

4. Animal Sound Recognition

Automatically recognizing a wide range of animal sounds makes it possible to analyze the habits and distributions of animals, and so to monitor and protect them effectively.
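
For the audio tagging use case, a model typically produces one score per class, and the class-to-index dictionary from the dataset class has to be inverted to turn the best-scoring indices back into tag names. The following is a minimal sketch, assuming the scores are already available as a plain list. top_k_tags is a hypothetical helper, the scores are made up, and only four of the classes are included for brevity.

```python
def top_k_tags(scores, class_to_index, k=3):
    """Return the k class names with the highest scores, best first."""
    index_to_class = {i: name for name, i in class_to_index.items()}
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [index_to_class[i] for i in ranked[:k]]


# Four entries taken from the class mapping used by the dataset class; scores are invented.
classes = {"Hi-hat": 0, "Saxophone": 1, "Trumpet": 2, "Glockenspiel": 3}
scores = [0.05, 0.60, 0.25, 0.10]
print(top_k_tags(scores, classes, k=2))  # ['Saxophone', 'Trumpet']
```

The choice of k is arbitrary here. A single tag (k=1) or a score threshold may suit other applications better.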


In this article, we learned about the FreeSound dataset and how to download it from the source, the group that created it and its researchers, a PyTorch data loader implementation for audio tagging, and some applications of FreeSound datasets.

Amit Singh
Amit Singh is a Data Scientist who graduated in Computer Science and Engineering. He is a data science writer at Analytics India Magazine.
