
Audioset - Google Research

sweetdata about a year ago 1.0.0 FREE


Files: evaluation.csv (1 MB), unbalanced.csv (94 MB), balanced.csv (1 MB)


# Content

AudioSet consists of an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labeled 10-second sound clips drawn from YouTube videos. The ontology is specified as a hierarchical graph of event categories, covering a wide range of human and animal sounds, musical instruments and genres, and common everyday environmental sounds.

# Dataset split

The dataset is divided into three disjoint sets: a balanced evaluation set, a balanced training set, and an unbalanced training set. In the balanced evaluation and training sets, we aimed for each class to have the same number of examples. The unbalanced training set contains the remainder of the annotated segments.

## Evaluation

20,383 segments from distinct videos, providing at least 59 examples for each of the 527 sound classes that are used. Because of label co-occurrence, many classes have more examples.

## Balanced train

22,176 segments from distinct videos chosen with the same criteria: providing at least 59 examples per class with the fewest total segments.

## Unbalanced train

2,042,985 segments from distinct videos, representing the remainder of the dataset.

# Source

[A large-scale dataset of manually annotated audio events](


Evaluation set (audioset/evaluation)

Columns: YouTube Video ID, Start Time (seconds), End Time (seconds), Classes

Balanced set (audioset/balanced)

Columns: YouTube Video ID, Start Time (seconds), End Time (seconds), Classes

Unbalanced set (audioset/unbalanced)

Columns: YouTube Video ID, Start Time (seconds), End Time (seconds), Classes
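The four columns listed for each set can be read with a few lines of Python. The sketch below is illustrative only and rests on assumptions this page does not confirm: that each row holds the video ID, start time, end time, and classes in that order, and that the classes sit in a single quoted field separated by commas. The helper names (`parse_segments`, `class_counts`, `under_represented`) and the sample rows are hypothetical. It also counts examples per class, which is one way to check the "at least 59 examples" property described for the balanced sets.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real files may differ in layout.
# Assumed order: video ID, start (s), end (s), quoted comma-separated classes.
SAMPLE = '''vid_a, 30.0, 40.0, "Speech,Music"
vid_b, 0.0, 10.0, "Music"
'''


def parse_segments(text):
    """Parse CSV text into (video_id, start, end, labels) tuples."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    segments = []
    for row in reader:
        # Skip blank lines, comment lines, and rows that are too short.
        if len(row) < 4 or row[0].startswith("#"):
            continue
        try:
            start, end = float(row[1]), float(row[2])
        except ValueError:
            # Non-numeric times: most likely a header row, so skip it.
            continue
        labels = [c.strip() for c in row[3].split(",") if c.strip()]
        segments.append((row[0], start, end, labels))
    return segments


def class_counts(segments):
    """Count how many segments carry each class label."""
    counts = Counter()
    for _, _, _, labels in segments:
        # A segment counts once per class even if a label is repeated.
        counts.update(set(labels))
    return counts


def under_represented(counts, minimum=59):
    """Return the classes with fewer than `minimum` segments, sorted."""
    return sorted(name for name, n in counts.items() if n < minimum)
```

To try this on one of the downloaded files, something like `parse_segments(open("balanced.csv", encoding="utf-8").read())` should work if the file matches the assumed layout. Because a segment can carry several classes, the per-class totals add up to more than the number of segments, which is the label co-occurrence effect noted in the description.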

