This site contains information on reproducing the experiments in the paper
A Fully Convolutional Deep Auditory Model for Musical Chord Recognition
Korzeniowski, F. and Widmer, G.
In Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP), Salerno, Italy, 2016.
The pre-trained chord recognition model is available as part of the madmom audio processing framework. It differs only slightly from the model presented in the paper: instead of padding the input in the first few convolutional layers, we use a larger context window in the input layer, such that the feature map size after the first pooling layer is retained.
The model provided with madmom is trained on a variety of datasets: Isophonics, RWC Popular, Robbie Williams, and Billboard. If you use it in research, keep in mind that results on these datasets are likely to be optimistic, since the model has seen them during training. We plan to provide model files for individual folds in the future.
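As a rough illustration of applying the pre-trained model, the following sketch assumes a madmom release that provides `CNNChordFeatureProcessor` and `CRFChordRecognitionProcessor` in `madmom.features.chords`. The helper names `recognise_chords` and `segments_to_lab` are hypothetical and not part of madmom; the output follows the usual lab convention of one `start end label` triple per line.

```python
def segments_to_lab(segments):
    """Format (start, end, label) segments as lab-file text, one per line."""
    lines = []
    for seg in segments:
        # Index access works for plain tuples and for numpy record rows alike.
        lines.append("{:.3f}\t{:.3f}\t{}\n".format(
            float(seg[0]), float(seg[1]), seg[2]))
    return "".join(lines)


def recognise_chords(audio_path):
    """Run the pre-trained CNN features and CRF decoder on one audio file.

    Assumption: madmom is installed and exposes both processors under
    madmom.features.chords. The import is done lazily so that the
    formatting helper above can be used without madmom.
    """
    from madmom.features.chords import (
        CNNChordFeatureProcessor,
        CRFChordRecognitionProcessor,
    )
    features = CNNChordFeatureProcessor()(audio_path)
    return CRFChordRecognitionProcessor()(features)
```

Calling `segments_to_lab(recognise_chords("song.flac"))` would then give text that can be written to an annotation file, where `song.flac` is a placeholder for your own audio file.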
Reproducing the experiments
Although we tried to make reproducing our experiments as easy as possible, it is much more involved than applying the pre-trained model. You will have to re-create the experimental pipeline, install all necessary libraries, and prepare the audio and chord data.
Experimental Pipeline Setup
Install the chordrec framework by following the instructions in the README file.
Put all datasets into respective subdirectories under experiments/data,
namely beatles, queen, zweieck, robbie_williams, and rwc. The datasets have to
contain three types of data: audio files in .flac format, corresponding chord
annotations in lab format with the file extension .chords, and the
cross-validation split definitions. Audio and annotation files can be
organised in a directory structure, but do not need to be; the programs will
look for any .chords files in all directories recursively. However, the split
definition files must be in a splits sub-directory in each dataset directory
(e.g. beatles/splits). File names of audio and annotation files must
correspond to the names given in the split definition files. For more
information regarding the data, take a look at the Data section below, where
we provide an archive with the annotations and split definitions that you just
need to extract. The data directory should look like this, where the internal
structures of the queen, robbie_williams, rwc, and zweieck directories follow
that of the beatles directory:
    experiments
     +-- data
          +-- beatles
               +-- *.flac
               +-- *.chords
               +-- splits
                    +-- 8-fold_cv_album_distributed_*.fold
          +-- queen
          +-- robbie_williams
          +-- rwc
          +-- zweieck
Make sure the link chordrec/experiments/mlsp2016/data points to this
directory and is not broken.
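Before starting a long training run, it can help to check the data directory against the layout described above. The following is a minimal sketch, not part of chordrec: the function names are hypothetical, and it assumes that an audio file and its annotation share the same base name (e.g. `song.flac` and `song.chords`). It does not inspect the contents of the split definition files.

```python
import os


def find_files(root, extension):
    """Map base name -> path for all files with `extension` below `root`."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            stem, ext = os.path.splitext(name)
            if ext == extension:
                found[stem] = os.path.join(dirpath, name)
    return found


def check_dataset(dataset_dir):
    """Return a list of problems found in a single dataset directory."""
    problems = []
    if not os.path.isdir(os.path.join(dataset_dir, "splits")):
        problems.append("missing splits sub-directory")
    audio = find_files(dataset_dir, ".flac")
    chords = find_files(dataset_dir, ".chords")
    # Assumption: audio and annotation files are paired by base name.
    for stem in sorted(set(audio) - set(chords)):
        problems.append("no .chords file for " + stem + ".flac")
    for stem in sorted(set(chords) - set(audio)):
        problems.append("no .flac file for " + stem + ".chords")
    return problems


def check_data_dir(data_dir):
    """Check every dataset sub-directory of `data_dir`."""
    report = {}
    for name in sorted(os.listdir(data_dir)):
        path = os.path.join(data_dir, name)
        if os.path.isdir(path):
            report[name] = check_dataset(path)
    return report
```

Running `check_data_dir("experiments/data")` returns a dictionary that maps each dataset name to its list of problems; an empty list means that no issue was detected by these particular checks.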
Running the Experiment
Follow the instructions in the chordrec/experiments/mlsp2016/README file to run the experiments.
Training the CNN can take a while.
Results are stored in the results sub-directory, which is created when running
the experiment. The results of each experiment are stored in the
artifacts/results.yaml file in its subdirectory. To get a quick overview of
the results, you can simply use the grep command, e.g.:
grep majmin results/*/artifacts/results.yaml
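If you prefer to collect these lines programmatically, the grep call above can be mirrored in a few lines of Python. This is only a sketch using the standard library; the function name `grep_results` is hypothetical, and the code makes no assumption about the structure of the YAML files beyond matching a keyword line by line.

```python
import glob
import os


def grep_results(results_dir, keyword="majmin"):
    """Return (path, line) pairs for every line that contains `keyword`."""
    pattern = os.path.join(results_dir, "*", "artifacts", "results.yaml")
    hits = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            for line in handle:
                if keyword in line:
                    hits.append((path, line.rstrip("\n")))
    return hits
```

For example, `for path, line in grep_results("results"): print(path + ":" + line)` prints output similar to that of grep when given several files.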
Data
We trained the neural network on a compound dataset comprising the Beatles,
Queen, Zweieck, Robbie Williams and RWC popular music datasets. While we cannot
provide audio files due to copyright reasons, we do provide links with more
information about these datasets and to the chord annotations we used. Note
that our file naming scheme differs from the annotation archives you can
download on the respective sites. For convenience, we provide an archive
with annotations following our naming scheme further below.
- Beatles, Queen and Zweieck: See the isophonics website for chord annotations and information about audio files.
- Robbie Williams: The dataset description can be found in Bruno Di Giorgi et al., “Automatic chord recognition based on the probabilistic modeling of diatonic modal harmony”. Download the annotations from here.
- RWC: For information on obtaining the audio files, see the RWC website. Chord annotations are on GitHub.
For convenience, we provide an archive with the annotations renamed to our
naming scheme. The archive also includes the fold definitions for
cross-validation for each
dataset. You can extract this archive into the
experiments/data directory and
add the audio files to the respective directories. Then you should be ready to