More on AI Hallucinations…

The mainstream adoption of AI continues to reveal the precarious balance between the benefits and the pitfalls.

Yes, AI tools can reduce the time it takes to research information, to draft documents and to complete repetitive tasks.

But AI is not so good at navigating nuance, interpreting specific context, or understanding satire and irony. In short, AI cannot “read the room” based on a few prompts and a collection of databases.

And then there is the issue of copyright licensing and other IP rights associated with the original content that large language models are trained on.

One of the biggest challenges to AI’s credibility is the frequent generation of “hallucinations” – false or misleading results that can populate even the most benign of search queries. I have commented previously on whether these errors are deliberate mistakes, an attempt at risk limitation (disclaimers), or a way of training AI tools on human users (“Spot the deliberate mistake!”) – or a get-out clause if we are foolish enough to rely on a dodgy AI summary.

With the proliferation of AI-generated results (“overviews”) in basic search queries, there is a tendency for AI tools to conflate or synthesize multiple sources and perspectives into a single “true” definition – often without authority or verified citations.

A recent example involved a senior criminal barrister in Australia who submitted fake case citations and imaginary speeches in support of a client’s case.

Leaving aside the blatant dereliction of professional standards and the lapse in duty of care towards a client, this example of AI hallucinations within the context of legal proceedings is remarkable on a number of levels.

First, legal documents (statutes, law reports, secondary legislation, precedents, pleadings, contracts, witness statements, court transcripts, etc.) are highly structured and very specific as to their formal citations. (Having obtained an LLB degree, served as a paralegal for 5 years, and worked in legal publishing for more than 10 years, I am very aware of the risks of an incorrect citation or the use of an inappropriate decision in support of a legal argument!)

Second, the legal profession has traditionally been at the forefront in the adoption and implementation of new technology. Whether in the early use of online searches for case reports, the creation of databases for managing document precedents, the use of practice and case management software, or the development of decision trees to evaluate the potential success of client pleadings, lawyers have been at the vanguard of these innovations.

Third, a simple document review process (akin to a spell-check) should have exposed the erroneous case citations. The failure to do so reveals a level of laziness or disregard that in another profession (e.g., medical, electrical, engineering) could give rise to a claim for negligence. (There are several established resources in this field, so this apparent omission or oversight is frankly embarrassing: https://libraryguides.griffith.edu.au/Law/case-citators, https://guides.sl.nsw.gov.au/case_law/case-citators, https://deakin.libguides.com/case-law/case-citators)
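To illustrate just how basic such a review could be: a few lines of Python can flag citations that don’t even parse. This is only a sketch of my own (the pattern below covers only the Australian medium-neutral format, e.g. [2017] HCA 43, and the function name is invented for illustration). Crucially, a well-formed citation can still be fictitious, so a format check like this must be paired with a lookup against an authoritative citator such as those linked above.

```python
import re

# Medium-neutral Australian citation, e.g. "[2017] HCA 43":
# [year] <court abbreviation> <judgment number>.
# Illustrative only; real citators recognise many more citation formats.
MEDIUM_NEUTRAL = re.compile(r"^\[(?:19|20)\d{2}\]\s+[A-Z][A-Za-z]{1,9}\s+\d+$")

def flag_malformed(citations):
    """Return the citations that fail the basic format check.

    Note: passing this check does NOT prove a case exists -- a
    hallucinated citation can be perfectly well formed.
    """
    return [c for c in citations if not MEDIUM_NEUTRAL.match(c.strip())]
```

Run over a draft submission’s citation list, anything flagged warrants immediate attention; anything that passes still needs verifying against a citator database.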

In short, as we continue to rely on AI tools, unless we apply due diligence and remain vigilant to their fallibility, we use them at our peril.

 

AI hallucinations and the law

Several years ago, I blogged about the role of technology within the legal profession. One development I noted was the nascent use of AI to help test the merits of a case before it goes to trial, and to assess the likelihood of winning. Not only might this prevent potentially frivolous matters coming to trial, it would also reduce court time and legal costs.

More recently, there has been some caution (if not outright scepticism) about the efficacy of using AI in support of legal research and case preparation. This current debate has been triggered by an academic paper from Stanford University that compared leading legal research tools (which claim to have been “enhanced” by AI) with ChatGPT. The results were sobering, with a staggering number of apparent “hallucinations” being generated, even by the specialist legal research tools. AI hallucinations are not unique to legal research tools, nor to the underlying Large Language Models (LLMs) they are built on, as Stanford has previously reported. While the academic paper is awaiting formal publication, there has been some to-and-fro between the research authors and at least one of the named legal tools. The latter’s rebuttal rightly points out that any AI tool (especially a legal research and professional practice platform) has to be fit for purpose and trained on appropriate data.

Aside from the Stanford research, some lawyers have been found to have relied upon AI tools such as ChatGPT and Google Bard to draft their submissions, only to discover that the results have cited non-existent precedents and cases – including in at least one high-profile prosecution. The latest research suggests that not only do AI tools “imagine” fictitious case reports, they can also fail to spot “bad” law (e.g., cases that have been overturned, or laws that have been repealed), offer inappropriate advice, or provide inaccurate or incorrect legal interpretation.

What if AI hallucinations resulted in the generation of invidious content about a living person – content which, in many circumstances, would be deemed libel or slander? If a series of AI prompts gives rise to libellous content, who would be held responsible? Can AI itself be sued for libel? (Of course, under common law, it is impossible to libel the dead, as only a living person can sue for libel.)

I found an interesting discussion of this topic here, which concludes that while AI tools such as ChatGPT may appear to have some degree of autonomy (depending on their programming and training), they certainly don’t have true agency, and their output cannot in itself be regarded in the same way as other forms of speech or text when it comes to legal liability or protection. The article identified three groups of actors who might be deemed responsible for AI results: AI software developers (companies like OpenAI), content hosts (such as search engines), and publishers (authors, journalists, news networks). It concluded that of the three, publishers, authors and journalists face the most responsibility and accountability for their content, even if they claimed “AI said this was true”.

Interestingly, the above discussion referenced news from early 2023 that a mayor in Australia was planning to sue OpenAI (the owners of ChatGPT) for defamation unless they corrected the record about false claims made about him. Thankfully, OpenAI appear to have heeded the letter of concern, and the mayor has since dropped his case (or the false claim was simply over-written by a subsequent version of ChatGPT). However, the original Reuters link, above, which I sourced for this blog, makes no mention of the subsequent discontinuation, either as a footnote or an update – which just goes to show how complex it is to correct the record, since the reference to his initial claim is still valid (it happened), even though it did not proceed (he chose not to pursue it). Even actual criminal convictions can be deemed “spent” after a given period of time, such that they no longer appear on an individual’s criminal record. By contrast, someone found not guilty of a crime (or, in the mayor’s case, falsely labelled with a conviction) cannot guarantee that references to the alleged events will be expunged from the internet, even with the evolution of the “right to be forgotten”.

Perhaps we’ll need to train AI tools to retrospectively correct or delete any false information about us; although conversely, AI is accelerating the proliferation of fake content – benign, humorous or malicious – thus setting the scene for the next blog in this series.

Next week: AI and Deep (and not so deep…) Fakes