Code snippets and scripts for the "Can machines play the piano?" presentation at PyData London 2024

Nospoko/midi-pydata-london-24

midi-pydata-london-24

Presentation, code snippets, and scripts for the "Can machines play the piano?" talk at PyData London 2024.

Presentation

To start the presentation, run:

streamlit run --server.port 4001 presentation.py

MIDI data

MIDI to dataframe

To load a MIDI file into a dataframe, we use:

import fortepyan as ff

piece = ff.MidiPiece.from_file(path="data/midi/piano.mid")
notes_df = piece.df  # pandas DataFrame with one row per note

Run the full example with:

python midi_basics/midi_to_dataframe.py
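
To make the "one row per note" idea concrete, here is a minimal sketch of how raw MIDI key events can be paired into notes with a pitch, velocity, start, and end. It is not fortepyan's implementation: the `Note` class, the `events_to_notes` helper, and the `(time, kind, pitch, velocity)` event tuples are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Note:
    pitch: int
    velocity: int
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def events_to_notes(events):
    """Pair note-on and note-off events into notes.

    `events` is a hypothetical list of (time_seconds, kind, pitch, velocity)
    tuples sorted by time, where kind is "on" or "off".
    """
    active = {}  # pitch -> (start time, velocity) of the key currently held
    notes = []
    for time, kind, pitch, velocity in events:
        if kind == "on" and velocity > 0:
            active[pitch] = (time, velocity)
        elif pitch in active:
            # A note-off (or a note-on with velocity 0) releases the key.
            start, on_velocity = active.pop(pitch)
            notes.append(Note(pitch, on_velocity, start, time))
    return sorted(notes, key=lambda note: note.start)


events = [
    (0.0, "on", 60, 80),
    (0.5, "on", 64, 70),
    (1.0, "off", 60, 0),
    (1.5, "off", 64, 0),
]
notes = events_to_notes(events)
for note in notes:
    print(note, note.duration)
```

Each resulting `Note` corresponds to one row of a notes table, which is the shape the rest of this README works with.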

Maestro dataset

The Maestro dataset, with "notes" and "source" columns, is available on Hugging Face:

from datasets import load_dataset

dataset = load_dataset("roszcz/maestro-sustain-v2")

| Split | Records | Duration (hours) | Number of notes (millions) |
|---|---|---|---|
| Train | 962 | 159.4174 | 5.6593 |
| Validation | 137 | 19.4627 | 0.6394 |
| Test | 177 | 20.0267 | 0.7414 |
| Total | 1,276 | 198.9068 | 7.0402 |

Average notes per second

We can calculate the average number of notes per second in the Maestro dataset by counting the notes in every record and dividing by the total playing time.

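total_notes, total_time = 0, 0.0
for split in dataset.values():  # dataset holds the train/validation/test splits
    for record in split:
        total_notes += len(record["notes"]["pitch"])
        total_time += max(record["notes"]["start"]) - min(record["notes"]["start"])
notes_per_second = total_notes / total_time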

Because the time span of each record is measured between its first and last "start" values, the result is the average number of notes pressed per second.
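
The same calculation can be tried without downloading anything. The sketch below wraps the loop in a hypothetical `notes_per_second` helper and runs it on two hand-made records that only mimic the `record["notes"]` layout; the values are invented for illustration.

```python
def notes_per_second(records):
    """Average number of note onsets per second across all records."""
    total_notes = 0
    total_time = 0.0
    for record in records:
        starts = record["notes"]["start"]
        total_notes += len(record["notes"]["pitch"])
        # Span between the first and the last key press in this record.
        total_time += max(starts) - min(starts)
    return total_notes / total_time


# Toy records shaped like the dataset rows (values are made up).
toy_records = [
    {"notes": {"pitch": [60, 64, 67, 72], "start": [0.0, 0.5, 1.0, 2.0]}},
    {"notes": {"pitch": [48, 55], "start": [10.0, 12.0]}},
]

rate = notes_per_second(toy_records)
print(rate)  # 6 notes over 4.0 seconds -> 1.5
```

Note that the span ends at the last key press, so the time the final note keeps sounding is not counted.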

python midi_basics/notes_per_second.py
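
As a rough cross-check, the note counts and durations from the Maestro table above can be divided directly. This is only an estimate: the table durations may be measured differently from the start-to-start spans used by the script, so the two numbers do not have to match exactly.

```python
# Figures copied from the Maestro table: (notes in millions, duration in hours).
splits = {
    "Train": (5.6593, 159.4174),
    "Validation": (0.6394, 19.4627),
    "Test": (0.7414, 20.0267),
    "Total": (7.0402, 198.9068),
}

estimates = {}
for name, (notes_millions, hours) in splits.items():
    # Convert millions of notes and hours into notes per second.
    estimates[name] = notes_millions * 1e6 / (hours * 3600)
    print(f"{name}: {estimates[name]:.2f} notes per second")
```

For the whole dataset this gives roughly 9.8 notes per second.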

streamlit-pianoroll

To visualize and listen to a MIDI file, we can use the streamlit-pianoroll component.

import fortepyan as ff
import streamlit_pianoroll
from datasets import load_dataset

dataset = load_dataset("roszcz/maestro-sustain-v2", split="test")
record = dataset[77]

piece = ff.MidiPiece.from_huggingface(record=record)
streamlit_pianoroll.from_fortepyan(piece=piece)

Run the example with:

streamlit run midi_basics/streamlit_piece.py

Comparing MIDI Pieces

Plotting Note Pitches Comparison and Note Time Difference Comparison

We can compare the distribution of note pitches between two MIDI pieces using matplotlib histograms. This can provide insights into the pitch range and distribution within each piece.

python -m streamlit run --server.port 4014 midi_basics/compare_pieces.py

[Figure: pitch comparison]
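
The counting behind such a histogram can be sketched with the standard library alone. The repository script plots with matplotlib; the `pitch_summary` helper and the two pitch lists below are hypothetical and only illustrate what the histogram bars represent.

```python
from collections import Counter


def pitch_summary(pitches):
    """Count how often each MIDI pitch occurs and report the pitch range."""
    counts = Counter(pitches)
    return {
        "lowest": min(pitches),
        "highest": max(pitches),
        "most_common": counts.most_common(1)[0][0],
        "counts": counts,
    }


# Made-up pitch sequences standing in for two MIDI pieces.
piece_a = [60, 62, 64, 65, 67, 67, 67, 72]
piece_b = [36, 43, 48, 48, 55, 60, 60, 60, 84]

summary_a = pitch_summary(piece_a)
summary_b = pitch_summary(piece_b)

# Text histogram over every pitch that appears in either piece.
for pitch in sorted(set(piece_a) | set(piece_b)):
    bar_a = "#" * summary_a["counts"][pitch]
    bar_b = "#" * summary_b["counts"][pitch]
    print(f"{pitch:3d} | A: {bar_a:<4} | B: {bar_b}")
```

In this toy data, piece B spans a much wider range (36 to 84) than piece A (60 to 72), which is the kind of difference the plotted histograms make visible.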

Duration Distribution Comparison

You can compare the distribution of note durations between MIDI pieces composed by different composers using histograms generated with matplotlib. This comparison helps in understanding the temporal characteristics of musical compositions and may reveal stylistic differences or compositional preferences.

python -m streamlit run --server.port 4015 midi_basics/compare_composers.py

[Figure: duration comparison]
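
A duration histogram boils down to subtracting each note's start from its end and counting the results in bins. The sketch below does this with plain Python; the `duration_histogram` helper, the bin settings, and the note times are assumptions for illustration, not the repository's code.

```python
def duration_histogram(starts, ends, bin_width=0.25, n_bins=4):
    """Count note durations in fixed-width bins; the last bin collects overflow."""
    counts = [0] * n_bins
    for start, end in zip(starts, ends):
        duration = end - start
        index = min(int(duration / bin_width), n_bins - 1)
        counts[index] += 1
    return counts


# Made-up note times (in seconds) for two pieces with different characters.
staccato = duration_histogram(
    starts=[0.0, 0.5, 1.0, 1.5],
    ends=[0.2, 0.7, 1.2, 1.6],
)
sustained = duration_histogram(
    starts=[0.0, 1.0, 2.0, 4.0],
    ends=[0.75, 1.5, 3.0, 6.0],
)

print("staccato: ", staccato)
print("sustained:", sustained)
```

All notes of the first piece fall into the shortest bin, while the second piece is concentrated in the longer bins; the matplotlib plots show the same kind of contrast between real performances.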

Modelling

Augmentation

To review the augmentations interactively, run:

python -m streamlit run --server.port 4016 modelling/augmentation.py
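
This README does not list which augmentations the script reviews. As an assumption, the sketch below shows two augmentations commonly applied to note data: transposing pitches and changing playback speed. The helper names and the dictionary-based note format are hypothetical.

```python
PIANO_LOW, PIANO_HIGH = 21, 108  # MIDI pitch range of an 88-key piano


def pitch_shift(notes, semitones):
    """Transpose notes, dropping any that would leave the piano range."""
    shifted = []
    for note in notes:
        pitch = note["pitch"] + semitones
        if PIANO_LOW <= pitch <= PIANO_HIGH:
            shifted.append({**note, "pitch": pitch})
    return shifted


def change_speed(notes, factor):
    """Play notes `factor` times faster by dividing all times by `factor`."""
    return [
        {**note, "start": note["start"] / factor, "end": note["end"] / factor}
        for note in notes
    ]


# Made-up notes for illustration.
notes = [
    {"pitch": 60, "start": 0.0, "end": 1.0},
    {"pitch": 107, "start": 1.0, "end": 2.0},
]

transposed = pitch_shift(notes, semitones=2)
faster = change_speed(notes, factor=2.0)
print(transposed)
print(faster)
```

Both functions return new lists and leave the input untouched, so several augmented variants can be derived from one original sequence.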

Note range extraction

Predicting a sub-sequence of notes within a defined pitch range, given the rest of the sequence with those notes removed, is an interesting downstream task and a possible benchmark. The generated notes are worth inspecting, because they reflect the model's understanding of musical structure and harmony.

To review the sub-sequence extraction, run:

python -m streamlit run --server.port 4017 modelling/extract_notes.py

| Voice | Range (MIDI pitch) |
|---|---|
| BASS | 21-48 |
| TENOR | 43-81 |
| ALTO | 53-84 |
| SOPRANO | 60-96 |
| TREBLE | 60-108 |
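
The extraction step can be sketched as a simple split by pitch. The ranges below are copied from the table above; treating both bounds as inclusive, the `extract_voice` helper, and the note format are assumptions made for illustration.

```python
# Pitch ranges from the table; bounds assumed to be inclusive.
VOICE_RANGES = {
    "BASS": (21, 48),
    "TENOR": (43, 81),
    "ALTO": (53, 84),
    "SOPRANO": (60, 96),
    "TREBLE": (60, 108),
}


def extract_voice(notes, voice):
    """Split notes into (remaining, extracted) by the voice's pitch range."""
    low, high = VOICE_RANGES[voice]
    extracted = [note for note in notes if low <= note["pitch"] <= high]
    remaining = [note for note in notes if not low <= note["pitch"] <= high]
    return remaining, extracted


# Made-up notes for illustration.
notes = [
    {"pitch": 36, "start": 0.0},
    {"pitch": 48, "start": 0.5},
    {"pitch": 60, "start": 1.0},
    {"pitch": 72, "start": 1.5},
]

source, target = extract_voice(notes, "BASS")
print("model input: ", [note["pitch"] for note in source])
print("model target:", [note["pitch"] for note in target])
```

In the prediction task, the remaining notes would serve as the model input and the extracted notes as the target it has to generate.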

Important Links

The Maestro dataset used in the experiments can be found here:
https://huggingface.co/datasets/roszcz/maestro-sustain-v2

You can also check out our organization GitHub with tools and experiments:
https://github.com/your-organization-link

For questions, reach out to Wojtek Matejuk at:
[email protected]

Explore and play with MIDI and share your compositions on:
https://pianoroll.io

If you play the piano and want to help source training data, track your practice sessions there.

Deployment

Run with docker:

docker build -t pydata-london-24 .
docker run -p 4334:4334 pydata-london-24

Code Style

This repository uses pre-commit hooks with enforced Python formatting (black, flake8, and isort):

pip install pre-commit
pre-commit install

Whenever you run git commit, the files altered or added in the commit are checked and corrected. black and isort can modify files locally; if that happens, you have to git add them again. You might also be prompted to make some fixes manually.

To run the hooks against all files without running git commit:

pre-commit run --all-files
