Is it a bard, or is it a bot? An AI-music survey.
At the time of writing, we are experiencing an unprecedented surge in Artificial Intelligence (AI)-related applications, platforms and ideas, including AI music. As humans, we find ourselves awash with novel notions concerning machine learning and creativity.
And as composers and musical professionals, we are not exempt from these challenges. Some aspects of the onslaught pertain to the business end of music, while others pertain to the creative process itself. In this article, I survey AI music, seeking some insights from a production music perspective.
Almost artificial
For a start, it’s worth noting that many elements of what we now call Musical AI have been with us for a while. For example, the phonograph (1877) and tape recorder (1948) emulated musical performance by allowing repeatable renditions of it.
In 1982, MIDI (Musical Instrument Digital Interface) changed the way we were able to represent music: as data. Modern sampling technology can realistically emulate the most refined real-world instruments, and is adept at generating non-repeating tonal soundscapes. And software DAWs have emulated almost all functions of even the most sophisticated studio, except perhaps the mythical tea-making role for juniors.
But these stop short of what we now call AI.
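To make the "music as data" idea concrete, here is a minimal sketch of how a single note can be written as MIDI channel messages. The helper names (`note_on`, `note_off`) are made up for this illustration and do not come from any particular library; the byte layout (a status byte, a note number, a velocity) follows the standard MIDI note-on/note-off format.

```python
# Illustrative sketch: one note expressed as raw MIDI bytes.
# Function names are hypothetical helpers, not a library API.

NOTE_ON = 0x90   # status nibble for "note on"
NOTE_OFF = 0x80  # status nibble for "note off"
MIDDLE_C = 60    # MIDI note number for middle C


def note_on(note: int, velocity: int, channel: int = 0) -> bytes:
    """Build a 3-byte note-on message: status, note number, velocity."""
    if not (0 <= note <= 127 and 0 <= velocity <= 127 and 0 <= channel <= 15):
        raise ValueError("note and velocity must be 0-127, channel 0-15")
    return bytes([NOTE_ON | channel, note, velocity])


def note_off(note: int, channel: int = 0) -> bytes:
    """Build a 3-byte note-off message with a release velocity of 0."""
    if not (0 <= note <= 127 and 0 <= channel <= 15):
        raise ValueError("note must be 0-127, channel 0-15")
    return bytes([NOTE_OFF | channel, note, 0])


message = note_on(MIDDLE_C, velocity=100)
print(message.hex(" "))  # 90 3c 64
```

Three bytes say "play middle C, fairly loud, on channel 1". Nothing about the sound itself is stored, which is why the same MIDI data can drive a piano sample, a synth, or a string library.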
Genuinely Artificial
Moving beyond timbral (tonal) generators takes us closer to the heart of the artistic process. By “training” against huge sets of musical material, the new generation of sound technology focuses on harmonic and melodic creation. From spicing up simple chords, to generating new words and melodies, to fully mimicking great artists, AI encroaches on the inner workflows of our musicality.
A good deal of the current furor surrounds voice emulation technology. Algorithms trained on famous singers and rappers are approaching near-perfect stylistic mimicry, and the question of who owns (and therefore earns from) their output tests all aesthetic and legal frameworks.
Musical YouTuber Rick Beato discusses deep-fake music using Drake’s artificial voice and asks “Does Drake own the sound of his voice, or does the record label that he’s signed to (UMG) own the sound of his voice, or is this an original composition that’s fair use?” And he predicts, “many … will like the AI versions better than the original artists … AI is here to stay”.
The flow of musical work
Those of us deeply involved in musical composition and production in one of its forms have well-honed workflows that have proved their worth. I maintain that the core capital of today’s composer is not in their tools, but rather in their aesthetic – the ineffable combination of taste, experience, and musical intuition that is uniquely theirs.
Many of our workflows are internal, a matter between the brain, the fingers and the body. Once learned, they become innate and subconscious, and it is from this combination that beautiful or useful musical output results.
Other aspects of the total workflow are more “external” – our work-spaces, instruments, technical kit, musical collaborators, business connections, technology processes and delivery platforms.
The musical AI solutions industry has analyzed these workflows and implemented them as algorithms. Human musical intelligence, as observed by psychologists, neuroscientists and musicologists, has been turned into an externalized artificial model of our artistry. And business – affordability – plays an enormous role in how these new systems work.
Platforms and functions
Platforms for AI music include:
- Desktop/laptop/studio Applications
- DAW plugins (e.g. Scaler, Magenta Studio)
- Web Apps (e.g. Soundful)
- Phone Apps (iPhone’s Amadeus Code, Spotify’s Soundtrap)
- Social media music generators (See TikTok’s list)
- Dynamically created sample libraries (Splice)
- Content platform “artists” like Shutterstock’s Amper
Musical functions being modeled include:
- Audio track renders including variations and stems
- Lyric and chord creation (ChatGPT)
- Sung or spoken voice generators and voice artist modelling (UberDuck)
- Chord/melody MIDI generation (AudioCipher)
- Full songwriting solutions (Amadeus Code)
- Auto-soundtracking services (Amper)
- Whole-style emulation (such as OpenAI’s Jukebox – see below)
AI composition workflows
Many of the more consumer-oriented apps involve simple “wizard” style front-ends, gathering a few parameters such as genre, tempo and mood, before rendering an audio output.
Genres that have evolved since computers became mainstream tools (like techno, EDM, general electronica) are favored, and the limited choice of categories provided by front-ends limits output variety. I am consistently disappointed that the less generic genres I specialize in (world and acoustic music, for example) are often not on the menu at all. Maybe that is a good thing.
Other systems like Ecrett use grid-like front ends, breaking an arrangement into units, or clips (in the mold of Ableton Live’s session view). The boast of many such apps/sites is that “no musicianship is needed”, but to me as a lifelong playing musician this remains bizarre.
Other AIs can be far more challenging. They are integrated into sophisticated DAWs as plugins (Scaler), and are best learned and integrated with some musical experience.
Go deeper, Dave
Of course, sites such as Soundraw and Boomy represent the shallows of the deep AI music complex. To get a view of the behind-the-scenes complexity – the science – you might want to peruse OpenAI’s platforms. These include MuseNet, a deep neural network that can generate 4-minute musical compositions.
And peruse the white paper on Jukebox, which generates complex music by learned style in the raw audio domain (i.e. not “symbolically” via MIDI or score, but direct to sound). I warn you all – the science might just blind you!
But it shows just how abstracted from traditional musical reality AI music is. It reaches the outer limits of left-brain analytical thinking to understand the mysteries of the right.
Decide for yourself: listen to Jukebox’s sample playlist on SoundCloud.
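For a feel of the gap between "symbolic" and "raw audio" representations mentioned above, here is a small sketch. It is emphatically not how Jukebox works; it only compares the amount of data needed to describe one note each way. The note dictionary, the function names and the 44,100 samples-per-second rate are all assumptions chosen for illustration.

```python
import math

SAMPLE_RATE = 44_100  # samples per second; assumed for this sketch

# Symbolic view: one note described by a handful of numbers.
symbolic_note = {"pitch": 69, "start_s": 0.0, "duration_s": 1.0, "velocity": 100}


def midi_to_hz(pitch: int) -> float:
    """Convert a MIDI note number to frequency (note 69 = A at 440 Hz)."""
    return 440.0 * 2 ** ((pitch - 69) / 12)


def render_sine(note: dict, sample_rate: int = SAMPLE_RATE) -> list[float]:
    """Render the note as a plain sine wave, one float per audio sample."""
    freq = midi_to_hz(note["pitch"])
    amplitude = note["velocity"] / 127
    n_samples = int(note["duration_s"] * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * freq * i / sample_rate)
        for i in range(n_samples)
    ]


raw_audio = render_sine(symbolic_note)
print(len(symbolic_note), "values describe the note symbolically")
print(len(raw_audio), "samples describe one second of it as raw audio")
```

A system working in the raw audio domain has to predict every one of those samples, for every instrument and voice at once, rather than a few tidy note events. That goes some way to explaining why the science gets so heavy.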
Final questions
So is AI-generated music able to move us like we know music can?
At this point, not really … for me, nothing I have heard has come close to creating an emotional connection. But I speak as a fussy composer, and there are many who just need to fill silence or pace a video with a generic bed.
And how should we as composers position ourselves in this onslaught?
I say don’t outsource your musical intelligence; cherish it and grow it. But we are probably already entangled with some degree of AI, so if tools come along that are compatible with your approach, treat them like you would any collaborator – riff off them, jam with them, and allow them to grow your musical sensibilities.
Of course, the question of demand for human-created music when cheaper artificial processes are so available is key. Not to mention the continuing threat of “royalty free” or buy-out models generally driving the price of music downwards.
Ah well, one tsunami at a time.
Nic Paton
Check the topic AI and Music Creation on MLR.