Experiments in music creation using RNNs.
The "classic" technic to generate music with a RNN is to aggregate songs in ABC notations and train a model on those. Since ABC notation is "just" a succession of letters and symbols we can get a lot of musical data in a text file weighting only a few Mb. This allow us to quickly train some models even on a computer without GPU(s).
I've run different kinds of experiments, starting with simple ABC files containing one track and uncomplicated melodies, then moving to more complex structures.
When trained on simple ABC files (one track, basic melodies), the RNN picks up the structure very quickly and generates nice tunes without much effort. For example, here are early samples produced by a network trained on religious hymns:
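Generation itself is just sampling the next character over and over. Continuing the sketch above, it looks something like this (the temperature value and the blank-line stop condition are illustrative choices):

```python
# Continuing the training sketch: seed with "X:" (the header every ABC
# tune starts with) and draw characters from the softmax until the
# model emits a blank line, which usually marks the end of a tune.
itos = {i: c for c, i in stoi.items()}

def sample(model, seed="X:", max_len=600, temperature=0.8):
    model.eval()
    out, state = list(seed), None
    x = torch.tensor([[stoi[c] for c in seed]])
    with torch.no_grad():
        for _ in range(max_len):
            logits, state = model(x, state)
            probs = torch.softmax(logits[0, -1] / temperature, dim=0)
            nxt = torch.multinomial(probs, 1).item()
            out.append(itos[nxt])
            if out[-2:] == ["\n", "\n"]:  # blank line = end of tune
                break
            x = torch.tensor([[nxt]])
    return "".join(out)

print(sample(model))
```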
Another example is this set produced by a model trained on national anthems. Check out the 2nd track, "Sheetmusic + NatAnthems - Jirnni", where the model improvised on the US national anthem "à la Jimi Hendrix"!
Using the same logic, I tried training a model on songs with multiple instruments and tracks, but the results were mediocre. When feeding all tracks together, the output was very often quite weird and not very usable out of the box. For example, the jazz songs below really sound like they are played by a jazz band with 2 neurons, literally...
There are many ABC files available, but they are mostly folk songs, so to train new models I had to get some MIDI files and convert them into ABC notation. Even though not many people use MIDI files anymore, you can still find a lot of them online. This one in particular is quite a 💎. Once I had enough files, I split the tracks by instrument and trained a different model for each type of instrument. I wrote a small Python script for this that you can get here. I'll write a longer post on the process later.
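For the curious, the splitting logic boils down to something like this rough sketch (not the actual script linked above; it uses the mido library, and the instrument families and file names are illustrative):

```python
# Rough sketch of the track-splitting idea: classify each track of a
# MIDI file by the instrument it plays and save one file per track.
import mido

GUITARS = range(24, 32)  # General MIDI program numbers 24-31
BASSES = range(32, 40)   # General MIDI program numbers 32-39

def family(track):
    """Classify a track by its channel/program (channel 9 = percussion in GM)."""
    for msg in track:
        if hasattr(msg, "channel") and msg.channel == 9:
            return "drums"
        if msg.type == "program_change":
            if msg.program in GUITARS:
                return "guitar"
            if msg.program in BASSES:
                return "bass"
            return "other"
    return "other"

def split_by_instrument(path):
    mid = mido.MidiFile(path)
    for i, track in enumerate(mid.tracks):
        out = mido.MidiFile(ticks_per_beat=mid.ticks_per_beat)
        out.tracks.append(track)
        out.save(f"{family(track)}_{i}.mid")

split_by_instrument("song.mid")
```

Each per-instrument file can then be converted to ABC with a tool such as midi2abc before training.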
After training models on different instruments, I got some pretty decent samples and could combine drums + guitars + bass much more easily. Here are some of the latest samples, trained on heavy metal songs.
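Recombining the parts is straightforward since each model generates its own instrument. A hedged sketch of the idea (file names are placeholders, and the parts are assumed to share the same tempo and resolution):

```python
# Merge one generated MIDI file per instrument into a single
# multi-track file. Assumes all parts share tempo and resolution.
import mido

parts = ["drums.mid", "guitar.mid", "bass.mid"]
first = mido.MidiFile(parts[0])
song = mido.MidiFile(type=1, ticks_per_beat=first.ticks_per_beat)
for path in parts:
    for track in mido.MidiFile(path).tracks:
        song.tracks.append(track)
song.save("combined.mid")
```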
While the first models (hymns, national anthems, etc.) produced only maybe 1/3rd of "ready to use" samples (probably because the training sets were small and the data was fed "as is" to the RNN), the last models, trained separately on different instruments, produce samples that are nearly always spot on and ready to be used in a MIDI file.
What's next:
- Create a metal machine that can generate metal tracks endlessly 🤘
- Add voice (??)