AI and the Human Factor

Earlier this month, I went to the Melbourne premiere of “Eno”, a documentary by Gary Hustwit, described as the world’s first generative feature film. Each time the film is shown, the choice and sequencing of scenes are different – no two versions are ever the same, and some content may never be screened at all.

I’ll leave readers to explore the director’s rationale for this approach (and its implications for film-making, cinema and streaming). But during a Q&A following the screening, Hustwit was at pains to explain that this is NOT a film generated by AI. He was also guarded about the proprietary software and hardware system he co-developed to compile and present the film, and refrained from revealing too much.

However, the director did want to stress that he didn’t simply tell an AI bot to scour the internet, scrape any content by, about or featuring Brian Eno, and then assemble it into a compilation of clips. This documentary is presented according to a series of rules-based algorithms, and is a content-led venture curated by its creator. Yes, he had to review hours and hours of archive footage from which to draw key themes, but he also had to shoot new interview footage of Eno that would help to frame the context and support the narrative, while avoiding a banal biopic or a series of talking heads. The result is a skilful balance between linear storytelling, intriguing juxtaposition, traditional interviews, critical analysis, and deep exploration of the subject. The point is, for all its powerful capabilities, AI could not have created this film. It needed to start with human elements: innate curiosity on the part of the director; intelligent and empathetic interaction between filmmaker and subject; and expert judgement in editing the content – as well as an element of risk-taking in allowing the algorithm to make the final choices for each screened version.
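None of this is Hustwit’s actual system, of course (which remains under wraps), but as a thought experiment, a rules-based sequencer of the kind he describes might be sketched in a few lines of Python – the scene names and rules below are entirely my own invention:

```python
import random

# Hypothetical scene pool -- illustrative only, not the film's
# actual (proprietary) content or rules.
SCENES = [
    {"id": "opening_titles", "type": "anchor"},
    {"id": "studio_footage", "type": "archive"},
    {"id": "oblique_strategies", "type": "archive"},
    {"id": "new_interview_1", "type": "interview"},
    {"id": "new_interview_2", "type": "interview"},
    {"id": "generative_music_demo", "type": "archive"},
]

def build_screening(seed, length=4):
    """Assemble one unique cut: always open with the anchor scene,
    then fill the running order from a shuffled pool, subject to a
    simple adjacency rule (no two interviews back to back)."""
    rng = random.Random(seed)  # each screening gets its own seed
    pool = [s for s in SCENES if s["type"] != "anchor"]
    rng.shuffle(pool)
    cut = [next(s for s in SCENES if s["type"] == "anchor")]
    for scene in pool:
        if len(cut) > length:
            break
        if scene["type"] == "interview" and cut[-1]["type"] == "interview":
            continue  # rule: avoid a run of talking heads
        cut.append(scene)
    return [s["id"] for s in cut]

# Different seeds (i.e. different screenings) yield different running orders.
print(build_screening(seed=1))
print(build_screening(seed=2))
```

The point of the sketch is that the “generative” element is constrained randomness: the curator sets the pool and the rules, and the algorithm only makes the final selection for each screening.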

That the subject of this documentary is Eno should not be surprising, either. He has a reputation for being a modern polymath, interested in science and technology as well as art. His use of Oblique Strategies in his creative work, his fascination with systems, his development of generative music, and his adoption of technology all point to someone who resists categorisation, and for whom work is play (and vice versa). In fact, imagination and play are the two key activities that define what it is to be human, as Eno explored in an essay for the BBC a few years ago. Again, AI does not yet have the power of imagination (and probably has no sense of play).

Sure, AI can conjure up all sorts of text, images, video, sound, music and other outputs. But in truth, it can only regurgitate what it has been trained on, even when extrapolating from data with which it has been supplied, and the human prompts it is given. This process of creation is more akin to plagiarism – taking source materials created by other people, blending and configuring them into some sort of “new” artefact, and passing the results off as the AI’s own work.

Plagiarism is neither new, nor exclusive to AI, of course. In fact, it’s a very natural human response to our environment: we all copy and transform the images and sounds around us, as a form of tribute, homage, mimicry, creative engagement, pastiche, parody, satire, criticism, acknowledgement or denouncement. Leaving aside issues of attribution, permitted use, fair comment, IP rights, (mis)appropriation and deep fakes, some would argue that it is inevitable (and even a duty) for artists and creatives to “steal” ideas from their sources of inspiration – notably Robert Shore, in his book about “originality”. The music industry is especially adept at all forms of “copying” – sampling, interpolation, remixes, mash-ups, cover versions – something that AI has been capable of for many years. See for example this (limited) app from Google released a few years ago. Whether the results could be regarded as the works of J.S. Bach or the creation of Google’s algorithm trained on Bach’s music would be a question for Bach scholars, musicologists, IP lawyers and software analysts.

Finally, for the last word on AI and the human condition, I refer you to the closing scene from John Carpenter’s cult SciFi film, “Dark Star”, where an “intelligent” bomb outsmarts its human interlocutor. Enjoy!

Next week: AI hallucinations and the law

Whose side is AI on?

At the risk of coming across as some sort of Luddite, I would say that recent commentary on Artificial Intelligence suggests it is only natural to have concerns and misgivings about its rapid development and widespread deployment. Of course, at its heart, it’s just another technology at our disposal – but by its very definition, generative AI is not passive, and is likely to impact all areas of our lives, whether we invite it in or not.

Over the next few weeks, I will be discussing some non-technical themes relating to AI – creativity and AI, legal implications of AI, and form over substance when it comes to AI itself.

To start with, these are a few of the questions that I have been mulling over:

– Is AI working for us, as a tool that we control and manage? Or is AI working with us, in a partnership of equals? Or, more likely, is AI working against us, in the sense that it is happening to us, whether we like it or not, let alone whether we are actually aware of it?

– Is AI being wielded by a bunch of tech bros, who feed it with all their own prejudices, unconscious bias and cognitive limitations?

– Who decides what the Large Language Models (LLMs) that power AI are trained on?

– How does AI get permission to create derived content from our own Intellectual Property? Even if our content is on the web, being “publicly available” is not the same as being “in the public domain”.

– Who is responsible for what AI publishes, and are AI agents accountable for their actions? In the event of false, incorrect, misleading or inappropriate content created by AI, how do we get to clarify the record, or seek a right of reply?

– Why are AI tools adding increased caveats? (“This is not financial advice, this is not to be relied on in a court of law, this is only based on information available as at a certain point in time, this is not a recommendation, etc.”) And is this only going to increase, as in the recent example of changes to Google’s AI-generated search results? (But really, do we need to be told that eating rocks or adding glue to pizza are bad ideas?)

– From my own experience, tools like ChatGPT return “deliberate” factual errors. Why? Is it to keep us on our toes (“Gotcha!”)? Is it to use our responses (or lack thereof) to train the model to be more accurate? Is it to underline the caveat emptor principle (“What, you relied on Otter to write your college essay? What were you thinking?”)? Or is it to counter plagiarism (“You could only have got that false information from our AI engine”)? If you think the latter is far-fetched, I refer you to the notion of “trap streets” in maps and directories.

– Should AI tools contain better attribution (sources and acknowledgments) in their results? Should they disclose the list of “ingredients” used (like food labelling)? Should they provide verifiable citations for their references? (It’s an idea that is gaining some attention.)

– Finally, the increased use of cloud-based services and crowd-sourced content (not just in AI tools) means there is the potential for overreach in end user licensing agreements from the likes of ChatGPT, Otter, Adobe Firefly, Gemini and Midjourney. Only recently, Adobe had to clarify the latest changes to its service agreement, in response to some social media criticism.

Next week: AI and the Human Factor

State of the Music Industry…

Depending on your perspective, the music industry is in fine health. 2023 saw a record year for sales (physical, digital and streaming), and touring artists are generating more income from ticket sales and merchandising than the GDPs of many countries. Even vinyl records, CDs and cassettes are achieving better sales than in recent years!

On the other hand, only a small number of musicians are making huge bucks from touring; while smaller venues are closing down, meaning fewer opportunities for artists to perform.

And despite the growth in streaming, relatively few musicians are minting it from these subscription-based services, which typically pay very little in royalties to the vast majority of artists. (In fact, some content can be zero-rated unless it achieves a minimum number of plays.)

Aside from the impact of streaming services, there are two other related challenges that exercise the music industry: the growing use of Artificial Intelligence, and the need for musicians to be recognised and compensated more fairly for their work and their Intellectual Property.

With AI, a key issue is whether the software developers are being sufficiently transparent about the content sources used to train their models, and whether the authors and rights owners are being fairly recompensed in return for the use of their IP. Then there are questions of artistic “creativity”, authorial ownership, authenticity, fakes and passing-off when we are presented with AI-generated music. Generative music software has been around for some time, and anyone with a smart phone or laptop can access millions of tools and samples to compose, assemble and record their own music – and many people do just that, given the thousands of new songs that are being uploaded every day. Now, with the likes of Suno, it’s possible to “create” a 2-minute song (complete with lyrics) from just a short text prompt. Rolling Stone magazine recently did just that, and the result was both astonishing and dispiriting.

I played around with Suno myself (using the free version), and the brief prompt I submitted returned these two tracks, called “Midnight Shadows”:

Version 1

Version 2

The output is OK, not terrible, but displays very little in the way of compositional depth, melodic development, or harmonic structure. Both tracks sound as if a set of ready-made loops and samples had simply been cobbled together in the same key and tempo, and left to run for 2 minutes. Suno also generated two quite different compositions with lyrics, voiced by a male and a female singer/bot respectively. The lyrics were nonsensical attempts to riff verbally on the text prompt. The vocals sounded both disembodied (synthetic, auto-tuned and one-dimensional) and exactly like the sort of vocal stylings favoured by so many contemporary pop singers, and featured on karaoke talent shows like The Voice and Idol. As for Suno’s attempt to remix the tracks at my further prompting, the less said the better.

While content attribution can be addressed through IP rights and commercial licensing, the issue of “likeness” is harder to enforce. Artists can usually protect their image (and merchandising) against passing off, but can they protect the tone and timbre of their voice? A new law in Tennessee attempts to do just that, by protecting a singer’s vocal likeness from unauthorised use. (I’m curious to know whether this protection will be extended to Jimmy Page’s guitar sound and playing style, or an electronic musician’s computer processing and programming techniques.)

I follow a number of industry commentators who, very broadly speaking, represent the positive (Rob Abelow), negative (Damon Krukowski) and neutral (Shawn Reynaldo) stances on streaming, AI and musicians’ livelihood. For every positive opportunity that new technology presents, there is an equal (and sometimes greater) threat or challenge that musicians face. I was particularly struck by Shawn Reynaldo’s recent article on Rolling Stone’s Suno piece, entitled “A Music Industry That Doesn’t Sell Music”. The dystopian vision he presents is millions of consumers spending $10 a month to access music AI tools, so they can “create” and upload their content to streaming services, in the hope of covering their subscription fees… Sounds ghastly, if you ask me.

Add to the mix the demise of music publications (for which AI and streaming are also to blame…), and it’s easy to see how the landscape for discovering, exploring and engaging with music has become highly concentrated via streaming platforms and their recommender engines (plus marketing budgets spent on behalf of major artists).

In the 1970s and 1980s, I would hear about new music from the radio (John Peel), TV (OGWT, The Tube, Revolver, So It Goes, Something Else), the print weeklies (NME, Sounds, Melody Maker), as well as word of mouth from friends, and by going to see live music and turning up early enough to watch the support acts. Now, most of my music information comes from the few remaining print magazines such as Mojo and Uncut (which largely focus on legacy acts), The Wire (but probably too esoteric for its own good), and Electronic Sound (mainly because that’s the genre that most interests me); plus Bandcamp, BBC Radio 6’s “Freak Zone”, Twitter, and newsletters from artists, labels and retailers.

The overall consequence of streaming and up/downloading is that there is too much music to listen to (but how much of it is worth the effort?), and multiple invitations to “follow”, “like”, “subscribe” and “sign up” for direct content (but again, how much of it is worth the effort?). For better or worse, the music media at least provided an editorial filter to help address quality vs quantity (even if much of it ended up being quite tribal).

In the past, the music industry operated as a network of vertically integrated businesses: they sourced the musical talent, they managed the recording, manufacturing and distribution of the content (including the hardware on which to play it), and they ran publishing and licensing divisions. When done well, this meant careful curation, the exercise of quality control, and a willingness to invest in nurturing new artists for several albums and for the duration of their career. But at times, record companies have self-sabotaged, by engaging in format wars (e.g., over CD, DCC and MiniDisc standards), by denying the existence of on-line and streaming platforms (until Apple and Spotify came along), and by becoming so bloated that by the mid-1980s, the major labels had to merge and consolidate to survive – largely because they almost abandoned the sustainable development of new talent. They also ignored their lucrative back catalogues, until specialist and independent labels and curators showed them how to do it properly. Now, they risk overloading the reissue market, because they lack proper curation and quality control.

The music industry really only does three things:

1) A&R (sourcing and developing new talent)

2) Marketing (promotion, media and public relations)

3) Distribution & Licensing (commercialisation).

Now, #1 and #2 have largely been outsourced to social media platforms (and inevitably, to AI and recommender algorithms), and #3 is going to be outsourced to web3 (micro-payments for streaming subscriptions, distribution of NFTs, and licensing via smart contracts). Whether we like it or not, and taking their lead from Apple and Spotify, the music businesses of the future will increasingly resemble tech companies. The problem is, tech rarely understands content from the perspective of aesthetics – so expect to hear increasingly bland AI-generated music from avatars and bots that only exist in the metaverse.

Meanwhile, I go to as many live gigs as I can justify, and brace my wallet for the next edition of Record Store Day later this month…

Next week: Reclaim The Night

More on Music Streaming

A coda to my recent post on music streaming:

Despite the growth in Spotify’s subscribers (and an apparent shift from free to paid-for services), it seems that the company still managed to make a loss. Over-paying for high-profile projects can’t have helped the balance sheet either…

Why is it so hard for Spotify to make money? In part, it’s because streaming has decimated the price point for content. This price erosion began with downloads, and has accelerated with streaming – premium subscribers don’t stop to think about how little they are paying each time they stream a song; they have simply got used to paying comparatively little for their music, wherever and whenever they want it. They don’t even have to leave their screen or device to consume content – whereas, in the past, fixed weekly budgets and the need to visit a bricks and mortar shop meant record buyers were probably more discerning about their choices.
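To put that price erosion into rough numbers – the figures below are my own illustrative assumptions, not Spotify’s actual rates – a quick back-of-the-envelope calculation:

```python
# Illustrative, assumed figures only -- not official Spotify pricing or payouts.
subscription = 12.00        # monthly premium fee (assumed)
streams_per_month = 600     # roughly 20 songs a day (assumed)
royalty_per_stream = 0.004  # often-quoted ballpark payout per stream (assumed)

# What the listener effectively pays per play.
cost_per_stream = subscription / streams_per_month
print(f"Listener pays ~${cost_per_stream:.3f} per stream")

# How many streams an artist would need to match one $25 album sale.
album_price = 25.00
print(f"Streams to equal one album sale: {album_price / royalty_per_stream:.0f}")
```

On these assumptions, a listener pays about two cents per play, while an artist needs thousands of streams to replace a single physical sale – which is the asymmetry driving the complaints about streaming royalties.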

Paradoxically, the reduced cost of music production (thanks to cheaper recording and distribution technology) means there is more music being released than ever before. But there is a built-in expectation that the consumer price must also come down – and of course, with so much available content, there has to be a law of diminishing returns – both in terms of quality, and the amount of new content subscribers can listen to. (It would be interesting to know how many different songs or artists the average Spotify subscriber streams.)

While some artists continue to be financially successful in the streaming age (albeit backed up by concert revenue and merchandising sales), it means there is an awfully long tail of content that is rarely or never heard. Even Spotify has to manage and shift that inventory somehow, so that means marketing budgets and customer acquisition costs have to grow accordingly (even though some of the promotion expenses can be offloaded on to artists and their labels).

Not only is streaming eroding content price points; in some cases it is also at risk of eroding copyright. Recently it was disclosed that Twitter (now X) is being sued by music companies for breach of copyright.

You may recall that just over 10 years ago, a service called Twitter Music was launched with much anticipation (if not much fanfare…). Interestingly, part of the idea was that Twitter Music users could “integrate” their Spotify, iTunes or Rdio (who…?) accounts. It was also seen as a way for artists to engage more directly with their audience, and enable fans to discover new music. Less than a year later, Twitter pulled the plug.

One conclusion from all of this is that often, even successful tech companies don’t really understand content. The classic case study in this area is probably Microsoft and Encarta, but you could include Kodak and KODAKOne – by contrast, I would cite News Corp and MySpace (a successful content business that failed to understand tech). I suppose Netflix (which started as a mail-order DVD rental business) is an example of a tech business (it gained patents for its early subscription tech) that has managed to get content creation right – and its recent drive to shut down password sharing looks like it is paying dividends.

Of all its contemporaries, Apple is probably the most vertically integrated tech and content company – it manufactures the platform devices, manages streaming services, and even produces film and TV content (but not yet music?). In this context, I would say Google is a close second (devices, streaming, dominance of on-line advertising, but no original content production), with Amazon some way behind (although it has had a patchy experience with devices, it has a reasonable handle on streaming and content creation).

All of which makes it all the more surprising that Spotify is running at a loss.

Next week: Digital Identity – Wallets are the key?