GTZAN Music Speech Dataset

Visualization of the GTZAN Music Speech dataset in the Deep Lake UI
The GTZAN Music Speech dataset was created for music/speech discrimination and is similar to the GTZAN Genre dataset. It consists of 120 tracks, each containing 30 seconds of audio. All tracks are 22050 Hz, mono, 16-bit audio files in .wav format. Each class (music and speech) has 60 samples.
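To make the figures above concrete, the following minimal sketch works out how many samples one track holds and roughly how large the raw audio is. It uses only the numbers quoted in the description (120 tracks, 30 seconds, 22050 Hz, mono, 16-bit); the constant names are illustrative, and the totals cover raw PCM data only, not WAV headers.

```python
# Back-of-the-envelope size of the dataset, using the figures quoted above.
SAMPLE_RATE_HZ = 22_050   # samples per second
BIT_DEPTH = 16            # bits per sample
CHANNELS = 1              # mono
TRACK_SECONDS = 30
NUM_TRACKS = 120

# Samples in one 30-second track.
samples_per_track = SAMPLE_RATE_HZ * TRACK_SECONDS

# Raw PCM bytes in one track (WAV header not included).
bytes_per_track = samples_per_track * CHANNELS * BIT_DEPTH // 8

# Raw PCM bytes and total playing time for all tracks.
total_bytes = bytes_per_track * NUM_TRACKS
total_minutes = NUM_TRACKS * TRACK_SECONDS / 60

print(f"{samples_per_track:,} samples per track")
print(f"{bytes_per_track / 1e6:.2f} MB per track")
print(f"{total_bytes / 1e6:.2f} MB in total, {total_minutes:.0f} minutes of audio")
```

At roughly 1.3 MB per track, the whole collection is small enough to fit comfortably in memory on most machines.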
Instead of downloading the GTZAN Music Speech dataset manually, you can load it in Python with a single line of code using our open-source package, Deep Lake.
import deeplake
ds = deeplake.load("hub://activeloop/gtzan-music-speech")
GTZAN Music Speech Data Fields
- audio: tensor containing the audio of each track in .wav format.
- label: tensor identifying each .wav file as music or speech.
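As an illustration of the audio format behind the audio field, the sketch below writes a one-second synthetic tone with the same encoding the description gives (22050 Hz, mono, 16-bit) and reads the header back using only the Python standard library. The helper names `make_tone_wav` and `describe_wav` are hypothetical and not part of Deep Lake; the tone merely stands in for a real track.

```python
import io
import math
import struct
import wave

SAMPLE_RATE_HZ = 22_050   # as stated in the dataset description
SAMPLE_WIDTH_BYTES = 2    # 16-bit samples
NUM_CHANNELS = 1          # mono


def make_tone_wav(seconds: float = 1.0, freq_hz: float = 440.0) -> bytes:
    """Return a synthetic sine tone encoded like the dataset's tracks."""
    n_frames = int(SAMPLE_RATE_HZ * seconds)
    # Pack each sample as a little-endian signed 16-bit integer at half scale.
    frames = b"".join(
        struct.pack(
            "<h",
            int(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE_HZ)),
        )
        for i in range(n_frames)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(NUM_CHANNELS)
        w.setsampwidth(SAMPLE_WIDTH_BYTES)
        w.setframerate(SAMPLE_RATE_HZ)
        w.writeframes(frames)
    return buf.getvalue()


def describe_wav(data: bytes) -> dict:
    """Read the WAV header and report the basic encoding parameters."""
    with wave.open(io.BytesIO(data), "rb") as w:
        return {
            "channels": w.getnchannels(),
            "bit_depth": 8 * w.getsampwidth(),
            "sample_rate_hz": w.getframerate(),
            "duration_s": w.getnframes() / w.getframerate(),
        }


info = describe_wav(make_tone_wav())
print(info)
```

Running the same `describe_wav` check on a file exported from the dataset is a quick way to confirm that it matches the documented encoding.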
GTZAN Music Speech Data Splits
- The GTZAN Music Speech dataset training set is composed of 128 audio files.
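Because only a training split is listed, a held-out set has to be carved out by the user if one is needed. The sketch below shows one possible way to do that with a class-balanced (stratified) split over sample indices. The function `stratified_split`, the 25% hold-out fraction, and the toy label list (128 entries divided evenly between two classes, with 0 and 1 as stand-in label values) are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, holdout_fraction=0.25, seed=0):
    """Split sample indices into train/hold-out lists, class by class."""
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, holdout = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_holdout = round(len(indices) * holdout_fraction)
        holdout.extend(indices[:n_holdout])
        train.extend(indices[n_holdout:])
    return sorted(train), sorted(holdout)


# Toy label list for illustration only: 0 and 1 stand in for the two classes.
labels = [0] * 64 + [1] * 64
train_idx, holdout_idx = stratified_split(labels)
print(len(train_idx), len(holdout_idx))
```

Fixing the seed keeps the split reproducible, so the same tracks stay in the hold-out set across runs.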
Train a model on GTZAN Music Speech dataset with PyTorch in Python
Let's use Deep Lake's built-in one-line PyTorch dataloader to connect the data to the compute:
dataloader = ds.pytorch(num_workers=0, batch_size=4, shuffle=False)
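Batching with `batch_size=4` generally requires every waveform in a batch to have the same shape. If track lengths differ by even a few samples, a small crop-or-pad step can help. The sketch below is one way to do this with NumPy; `fix_length` is a hypothetical helper, and it assumes the waveform arrives as an array whose first axis is time. Passing it through the dataloader's `transform` argument, as shown in the trailing comment, is an assumption about how you might wire it in and should be checked against the Deep Lake documentation.

```python
import numpy as np

TARGET_SAMPLES = 22_050 * 30  # 30 seconds at 22050 Hz


def fix_length(waveform, target=TARGET_SAMPLES):
    """Crop or zero-pad a waveform along its first (time) axis."""
    waveform = np.asarray(waveform, dtype=np.float32)
    n = waveform.shape[0]
    if n >= target:
        return waveform[:target]
    # Pad only the end of the time axis; leave any channel axis untouched.
    pad = [(0, target - n)] + [(0, 0)] * (waveform.ndim - 1)
    return np.pad(waveform, pad)


# Synthetic stand-ins for a too-short and a too-long mono track.
short = fix_length(np.ones((1000, 1)))
long = fix_length(np.ones((700_000, 1)))
print(short.shape, long.shape)

# Assumed wiring (not executed here):
# dataloader = ds.pytorch(batch_size=4, transform={"audio": fix_length, "label": None})
```

Zero-padding at the end is the simplest choice; cropping a random window instead of the first 30 seconds is a common alternative during training.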
Train a model on GTZAN Music Speech dataset with TensorFlow in Python
dataloader = ds.tensorflow()
- Homepage: http://marsyas.info/index.html
- Paper: Tzanetakis, George: GTZAN Music/Speech Collection
- Point of Contact: gtzan@cs.uvic.ca
GTZAN Music Speech Dataset Curators
GTZAN Music Speech Dataset Licensing Information
GTZAN Music Speech Dataset Citation Information
@ONLINE {Music Speech,
author = "Tzanetakis, George",
title = "GTZAN Music/Speech Collection",
year = "1999",
url = "http://marsyas.info/index.html"
}
What is the GTZAN Music Speech dataset for Python?
The GTZAN Music Speech dataset consists of 120 tracks, each containing 30 seconds of audio. The dataset was created for music/speech discrimination and is similar to the GTZAN Genre dataset. The tracks in the dataset are 22050Hz Mono 16-bit audio files and each class (music/speech) has 60 samples.
What is the GTZAN Music Speech dataset used for?
The GTZAN Music Speech dataset is used for music/speech discrimination, that is, for training and evaluating models that classify an audio clip as either music or speech.
How to download the GTZAN Music Speech dataset in Python?
You can load the GTZAN Music Speech dataset quickly with one line of code using the open-source Python package Activeloop Deep Lake. See the detailed instructions on how to load the GTZAN Music Speech dataset training subset in Python.
How can I use GTZAN Music Speech dataset in PyTorch or TensorFlow?
You can stream the GTZAN Music Speech dataset while training a model in PyTorch or TensorFlow with one line of code using the open-source Python package Activeloop Deep Lake. See the detailed instructions on how to train a model on the GTZAN Music Speech dataset with PyTorch in Python or with TensorFlow in Python.