AI and the Fast Food-ification of Art
I fear the end is near, fellow artists: Dall-E, ChatGPT, and more are coming for our jobs.
Over the past year, there seems to have been an explosion of high-quality generative AI tools built on machine learning, like Dall-E and ChatGPT. If you haven’t been following it (or even if you have, really), I strongly recommend this article from The New Yorker as a summary. This explosion follows a decade of similar growth in other machine-learning fields, which produced AlphaGo and AlphaStar, the modern-day counterparts to the technological miracle of Deep Blue. The reaction to these tools has been mixed, with opinions like “it’s incredible”, “it’s awful”, and “lol I made a silly meme.” But when it comes to the discussion about AI and artistic jobs, I learned from my time as a film composer that something doesn’t have to be good to be put into widespread use. And the consequences will be far worse than silly memes: these tools will inevitably degrade art and people’s ability to appreciate quality in artistic expression. Here’s how technology ruined music production in only twenty years.
In almost all television and many films (particularly non-Hollywood films), the instrumental music you hear is not performed by real musicians. The soundtracks use virtual instruments: computer software that allows composers to produce “realistic” sounding mockups of their orchestral arrangements. The underlying technology, the digital sampling synthesizer, rose to prominence in the ’80s. It enables a keyboard or drum machine to record a “sample” of a sound (a note on a trumpet, say) and then play that sound across the range of the keyboard.
In addition to performing artists, sampled instruments found a new home with composers who had to provide mockups of their work to film producers and directors. Below is an example of Danny Elfman’s mockup from The Nightmare Before Christmas, followed by the final recording.
While it didn’t sound like the real thing, this was a significant technological leap. Being able to show someone what an orchestral piece would sound like before going into a recording studio with a 60-piece orchestra was a foreign concept until the ’80s. And orchestral recording sessions are expensive, costing upwards of $10,000 per hour. So, if the director didn’t like what they heard in the recording session, the composer had to improvise something on the fly to make them happy, a terrifying scenario for all involved. Producing an orchestral mockup seemed like a win-win for everyone.
But over time, the mockup went from being something that gave a director an idea of what the music would sound like to being the final product. If the budget wasn’t there, the mockup would replace a recorded version. At best, the end product was a hybrid, with a few live instruments mixed in with the virtual ones. And then, as producers looked for more corners to cut, they would (presumably) say to composers, “You can just use the computer to make the music, can’t you?” Slowly but surely, throughout the ’90s, digital production became the norm.
Lower- and higher-budget shows alike, such as Buffy the Vampire Slayer and 24, used virtual instruments almost exclusively. Gradually, the live musicians who used to record for TV were no longer needed. There were a few holdouts, like the Star Trek series, The Simpsons, Family Guy, and Lost. But even there, production budgets started to be slashed. When former Star Trek composer Paul Baillargeon spoke to my film-composing class at the University of Montreal in 2011, he told us how the orchestra shrank over his time working on Deep Space Nine, Voyager, and Enterprise, from 60 pieces, to 50, to 40, and so on.
As virtual instruments got better and cheaper, the technology became even more widespread. To some extent, it democratized orchestral music production: you didn’t need twenty years of classical training and a huge budget to produce “orchestral” music. All you needed was a modest computer and some relatively cheap software. As more and more people started writing music digitally first, with no intention of recording live musicians, the market was inundated with an inferior product that is now considered normal because it’s what is heard everywhere. To be clear, it’s not that the quality of the composition itself is inferior; rather, it is the sonic quality, the production, that sounds inferior when the goal is traditional orchestral music.[1]
I don’t think virtual instrument technology is bad or shouldn’t exist, but we should remember that real musicians make real performances. So when a relatively big TV series like Schitt’s Creek comes out, and its minimal amount of music sounds so terrible, with a fake tuba and trumpet, it says to me that we as a society have lowered our cultural expectations around the art we consume. Recording real musicians wouldn’t necessarily have cost much for a show that likely had a budget of millions of dollars per season: under $10,000 with full buyout rights, by my estimate. But instead, this was the final result:
This brings me back to AI tools for art and writing: Dall-E and ChatGPT might not be perfect, but they don’t have to be; they only have to be good enough. Given where music technology was 40 years ago and where it is now, I have no doubt that writing and art jobs will be lost. These tools will also devalue the work of future artists and writers, making it harder to earn a living as clients say, “You can just use the computer to generate a draft, can’t you?” In fact, video game developer Polydin recently mused on what a successful integration of a tool like ChatGPT in games could look like, much to the ire of at least one professional game writer.
And where does it end? Do we just let AI programs generate voice performances of our favourite actors? Do we let them generate their facial expressions? Will audiobooks now be read exclusively by text-to-speech bots? Ultimately, I think of it like fast food: no one will argue that fast food is great, but millions of people are OK with eating it regularly. Of course, certain people will have the money to eat at the best restaurants (or commission real art), but many of our cultural products will slowly become like fast food: cheap, barely palatable, and unmemorable.
[1] I believe there is significant room for creativity and expression with virtual instruments and software. What I, and many other music professionals I know, deem inferior is the use of software to replace what would otherwise be a real musician’s performance.