An ML algorithm to distinguish Progressive Rock music from everything else.
- Run `save_feature.py`.
- Each subfolder in `tran_set\` has an `.ods` file listing the songs, and `.mp3` files for the songs.
  - `Progressive_Rock_Songs\`: {'songs': 142, 'others': ['prog_train.ods']}
  - `Not_Progressive_Rock\Top_Of_The_Pops\`: {'songs': 87, 'others': ['notprog_top_pops_train.ods']}
  - `Not_Progressive_Rock\Other_Songs\`: {'songs': 272, 'others': ['notprog_other_train.ods']}
- Load music files using `librosa`.
We build several baseline models based on last year's best project, as well as several new models.
conv1d(43,64) -> conv1d(64,64) -> linear(6848,600) -> linear(600,10) -> linear(10,2)
conv1d(43,64) -> conv1d(64,128) -> linear(13696,600) -> linear(600,30) -> linear(30,4) -> linear(4,2)
conv1d(43,64) -> conv1d(64,128) -> conv1d(128,256) -> conv1d(256,512) -> linear(13312,100) -> linear(100,10) -> linear(10,2)
conv1d(43,86) -> conv1d(86,172) -> conv1d(172,344) -> linear(18232,2)
conv1d(43,64) -> conv1d(64,128) -> conv1d(128,128) -> conv1d(128,64) -> conv1d(64,32) -> conv1d(32,32) -> linear(832,2)
- O’Brien, Tim. "Musical Structure Segmentation with Convolutional Neural Networks." 17th International Society for Music Information Retrieval Conference. 2016.
conv1d(43,128) -> conv1d(128,128) -> conv1d(128,256) -> conv1d(256,256) -> conv1d(256,256) -> conv1d(256,256) -> conv1d(256,512) -> conv1d(512,10) -> linear(40,2)
- J. Pons, O. Nieto, M. Prockup, E. Schmidt, A. Ehmann, and X. Serra, “End-to-end learning for music audio tagging at scale,” in Intl Society for Music Inf Retrieval Conf, 2018, pp. 1–8.
conv1d(43,128) -> res1d(128,128) -> res1d(128,256) -> … -> res1d(256,512) -> conv1d(512,10) -> linear(40,2)
- Allamy, Safaa, and Alessandro Lameiras Koerich. "1D CNN architectures for music genre classification." 2021 IEEE symposium series on computational intelligence (SSCI). IEEE, 2021.
Recurrent NN
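In each architecture above, the input size of the first `linear` layer equals the number of channels leaving the last `conv1d` multiplied by the number of time steps remaining at that point. The sketch below works through that arithmetic in plain Python. The helper names are hypothetical, and the kernel sizes, strides, and pooling that produce those step counts are not stated in this README, so the `kernel_size=3, padding=1` call at the end is only an illustrative assumption.

```python
def implied_time_steps(flat_features: int, channels: int) -> int:
    """Return the time steps left after the last conv layer.

    Flattening a (channels, steps) tensor yields channels * steps features.
    """
    steps, remainder = divmod(flat_features, channels)
    if remainder:
        raise ValueError("flattened size is not a multiple of the channel count")
    return steps


def conv1d_out_length(length, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1-D convolution (standard formula, as in PyTorch's Conv1d docs)."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# (channels leaving the last conv1d, input size of the first linear layer),
# copied from the architecture lines above, in order.
LAYER_SIZES = [
    (64, 6848),
    (128, 13696),
    (512, 13312),
    (344, 18232),
    (32, 832),
    (10, 40),
    (10, 40),
]

for channels, flat in LAYER_SIZES:
    steps = implied_time_steps(flat, channels)
    print(f"{channels:>3} channels x {steps:>3} steps = {flat}")

# Hypothetical settings: kernel_size=3 with padding=1 keeps the length unchanged.
print(conv1d_out_length(107, kernel_size=3, padding=1))
```

When changing a conv stack, recomputing the step count this way gives the value to pass as the first `linear` layer's input size.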
- Place the saved feature `json` files in a path relative to this repo at `../data/[feature.json]`. The default paths are set in `train.py` as:
  - `non_prog_other_path = "../data/Feature_Extraction_Other.json"`
  - `non_prog_pop_path = "../data/Feature_Extraction_Top_Pop.json"`
  - `prog_path = "../data/Feature_Extraction_Prog.json"`
- Run `main.py`.
- For new CNN models, just import them into `main.py` and add the corresponding `(model_name, model())` to `model_dict` in `main.py`.
- Feature plots are saved in the `output` folder as `.pdf` files.
- Model results are saved in the `output/model` folder, named after the model name. These include:
  - Confusion matrices for train/test snippets/songs (all in one file)
  - Average confusion matrix after multiple runs
  - Model test labeling result
  - Model pickle file
- A log file is generated at `output/log_file.log`.
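The step of registering a new model in `main.py` might look roughly like the sketch below. It assumes `model_dict` is a plain dictionary mapping a model name to a model instance, which this README does not confirm. `BaselineCNN` and `MyNewCNN` are hypothetical placeholder classes standing in for real network definitions.

```python
class BaselineCNN:
    """Hypothetical placeholder for an existing model class."""


class MyNewCNN:
    """Hypothetical placeholder for a newly written CNN class."""


# Assumption: model_dict maps a model name to a model instance.
model_dict = dict(
    [
        ("BaselineCNN", BaselineCNN()),
        ("MyNewCNN", MyNewCNN()),  # the new (model_name, model()) pair
    ]
)

# A training script could then loop over every registered model by name.
for model_name, model in model_dict.items():
    print(f"registered {model_name}: {type(model).__name__}")
```

Keying by name fits the way results are saved under the model name, but the actual container used in `main.py` may differ.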
- Increasing the cutoff value of binary classification in the last layer of neural network
- Increasing the cutoff value of the proportion of snippets to become a Prog Rock song
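The two cutoffs above can be combined into one song-level decision, sketched below in plain Python. This is an illustration under assumptions, not the project's actual code: `classify_song`, `prob_cutoff`, and `proportion_cutoff` are hypothetical names, the inputs are assumed to be per-snippet Prog Rock probabilities, and the strict `>` comparisons are a guess.

```python
from typing import Sequence


def classify_song(
    snippet_probs: Sequence[float],
    prob_cutoff: float = 0.5,
    proportion_cutoff: float = 0.5,
) -> bool:
    """Return True if the song is labeled Prog Rock.

    A snippet counts as Prog when its probability exceeds prob_cutoff;
    the song is Prog when the share of such snippets exceeds proportion_cutoff.
    """
    if not snippet_probs:
        raise ValueError("a song needs at least one snippet")
    prog_votes = sum(p > prob_cutoff for p in snippet_probs)
    return prog_votes / len(snippet_probs) > proportion_cutoff


# Made-up probabilities for five snippets of one song.
probs = [0.9, 0.6, 0.55, 0.4, 0.3]
print(classify_song(probs))                         # default cutoffs
print(classify_song(probs, prob_cutoff=0.7))        # stricter per-snippet cutoff
print(classify_song(probs, proportion_cutoff=0.7))  # stricter per-song cutoff
```

Raising either cutoff makes a Prog Rock label harder to obtain, which trades false positives for false negatives.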
- We try 8 different models (CNN, recurrent, and ResNet structures) with two types of snippets (non-overlapping and 50% overlapping).
- The ResNet structure with normal snippets provides the best prediction accuracy.
- With post-processing techniques, we further improve the model accuracy to 82.64%, a 3.5% improvement over the baseline model.
- Some types of music remain hard to classify under our criterion.
- For future work, we suggest:
  - More advanced post-processing classification techniques;
  - More advanced image classification techniques, such as Neural ODEs (Cui et al. 2023);
  - Extracting other features, such as lyrics.
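The two snippet types mentioned in the results (non-overlapping and 50% overlapping) differ only in the hop between consecutive snippet start positions. The sketch below shows one way to compute those start positions. `snippet_starts` is a hypothetical helper, the frame counts are made up, and dropping a trailing partial snippet is an assumption.

```python
from typing import List


def snippet_starts(n_frames: int, snippet_len: int, overlap: float = 0.0) -> List[int]:
    """Start indices of fixed-length snippets over a sequence of frames.

    overlap=0.0 gives back-to-back snippets; overlap=0.5 gives 50% overlap.
    A trailing partial snippet is dropped (an assumption of this sketch).
    """
    if snippet_len <= 0:
        raise ValueError("snippet_len must be positive")
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    hop = max(1, int(round(snippet_len * (1.0 - overlap))))
    return list(range(0, n_frames - snippet_len + 1, hop))


# Made-up sizes: 10 feature frames, snippets of 4 frames each.
print(snippet_starts(10, 4))               # non-overlapping snippets
print(snippet_starts(10, 4, overlap=0.5))  # 50% overlapping snippets
```

With 50% overlap the same song yields roughly twice as many snippets, so song-level votes are taken over more, but correlated, snippet predictions.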