Since the launch of OpenAI’s ChatGPT beta in November 2022, the LLM (Large Language Model) chatbot race has heated up in Q1 of 2023. We have seen developments in Microsoft’s ChatGPT-Bing integration, Google Bard, Meta’s LLaMA, and the ChatGPT API. Last week, the UK government set out plans to regulate AI with new guidelines on “responsible use”.
The latest epicentre of this whirlwind is the data privacy debate.
In January, we made some initial predictions in a piece titled Virtual lawyers: the future of the legal sector with ChatGPT. Since then, a Goldman Sachs report has predicted that 44% of tasks in legal professions could be automated by AI. With so many processes set to be automated, lawyers will need to find a way to bill for this!
To continue in this vein, this article focuses specifically on the legal sector. First, we bring you up to speed with the latest news.
Then, we discuss the implications for Personal Identifiable Information (PII) and privileged information.
Finally, we cover how building a bespoke program, plugged into the API, can alleviate data privacy concerns.
The latest news: data leaks and governmental action
Over the weekend, Italy became the first Western country to block ChatGPT over privacy concerns. The Italian data-protection authority is investigating whether it complies with GDPR, and has given OpenAI 20 days to address the watchdog’s concerns, under penalty of a fine of €20 million. It appears that Germany may be following suit, while other EU nations have been inspired to investigate if harsher measures may be necessary.
This follows a ChatGPT data leak two weeks ago, which revealed the bot’s growing pains in a significant setback for the firm’s reputation and trustworthiness. A major bug exposed private conversations of other users, including credit card details, first and last names and emails. Taking to Twitter, OpenAI CEO Sam Altman announced a “technical postmortem” to analyse the bug that led to people’s private conversations being leaked.
Disruption has already been visible in the legal sector. Some firms, including PwC and Allen & Overy, have announced strategic alliances with Harvey AI, which is powered by the ChatGPT API, backed by the OpenAI Startup Fund, and carries the tagline “Generative AI for Elite Law Firms”. It is currently unknown exactly what Harvey will do, but we do know that the program assists lawyers with research, drafting, analysis, and communication.
OpenAI’s ChatGPT API data policy
Last month’s data leak revealed weak points in the web version. But what about the API?
On 1st March, alongside the release of the API, OpenAI announced two changes to its data usage and retention policies:
OpenAI will not use data submitted by customers via our API to train or improve our models, unless you explicitly decide to share your data with us for this purpose. You can opt-in to share data. (we don’t recommend doing this! – Springbok)
Any data sent through the API will be retained for abuse and misuse monitoring purposes for a maximum of 30 days, after which it will be deleted (unless otherwise required by law).
The API, in contrast to the web interface, has far fewer holes in its design, safeguarding it against unauthorised access. OpenAI’s policy will satisfy many lawyers, but we expect some to still be unwilling to take the risk. In the next sections, we explore the reasons for this, and some solutions.
The threat of ChatGPT to data privacy in the legal sector: privileged information & PII
The two main types of at-risk data in the legal sector are:
Privileged information: communications between attorney and client, which attorney-client privilege allows, and requires, to stay confidential.
Personal Identifiable Information (PII): data that permits the identity of an individual to be reasonably inferred.
Lawyers can never (with rare exceptions) divulge the client’s secrets without the client’s permission. This privilege still applies in the world of AI. It remains illegal for lawyers to snitch on their clients to OpenAI through ChatGPT’s web interface.
Considering the recent bug which exposed users’ conversations, imagine if Donald Trump’s lawyer in his ongoing lawsuit had entered a confidential email, say, instructing ChatGPT to change the tone – and then the glitch exposed it. It would have been an absolute scandal!
Another issue is that the data could be used to train, say, GPT-5. The next generation chatbot could then be able to answer questions about sensitive cases.
Personal Identifiable Information (PII)
For most law firms’ purposes, using the API should alleviate most GDPR concerns regarding PII – the API has much more robust data privacy terms than the browser product.
Entering PII into ChatGPT or through the API is not illegal. However, a problem lies in the uncertainty as to whether or not ChatGPT complies with the EU’s GDPR or the UK’s Data Protection Act (DPA).
Some may deem OpenAI’s 30-day data deletion policy insufficient for their confidentiality policies. There is an option to opt out of the 30-day rule of data retention. However, it is not a guaranteed opt-out. There is an application process in order to avoid malicious agents getting approval.
Nonetheless, the API data privacy terms should satisfy the needs of most companies on the GDPR front. The liability falls on OpenAI if they mess up, and not your firm, since you have contractually protected yourself.
An additional type of confidential information that should not be entered into ChatGPT (which, remember, is essentially disclosing it to OpenAI) is high commercial value news that has not yet been broken and could facilitate insider trading.
How law firms can bolster data privacy with a technology partner
We have explained why the API is far lower risk. Now, we cover how law firms can take their security to the next level when they implement the API.
Many companies are reaching out to technology partners like Springbok for guidance on how to build bespoke ChatGPT solutions which, through the API, can live securely in the firm’s intranet.
This allows the implementation of the firm’s own security in the following ways:
The company’s single sign-on: an additional layer of security protection to the frontend.
Trackable/loggable requests: so the CIO has full visibility of what employees input to the program to ensure compliance. This contrasts with the web interface version, with which the CIO cannot know who sent what.
Anonymisation: de-identify text, so that sensitive information never reaches OpenAI in the first place. For example, legal clients’ names and other PII can automatically be replaced with placeholders.
With these three layers of protection – alongside additional customisation designed through conversation with a technology partner – a law firm should feel confident in leveraging the API.
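To make the logging and anonymisation layers more concrete, here is a minimal Python sketch of what such an in-house gateway might look like. It is an illustration under stated assumptions, not a production design: the names Anonymiser, handle_request and echo_stub are hypothetical, the list of client names is assumed to come from the firm’s own records, the email pattern is deliberately simplistic, and echo_stub merely stands in for whatever call the firm would make to the API.

```python
import re
from datetime import datetime, timezone

# Deliberately simple email pattern; a real deployment would need broader PII detection.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class Anonymiser:
    """Hypothetical helper: swaps known client names and emails for placeholders."""

    def __init__(self, client_names):
        # Longest names first, so a longer name is replaced before any shorter one it contains.
        self.client_names = sorted(client_names, key=len, reverse=True)
        self.placeholders = {}  # original value -> placeholder
        self.counts = {}        # placeholder kind -> number issued so far

    def _placeholder(self, value, kind):
        # Reuse the same placeholder whenever the same value appears again.
        if value not in self.placeholders:
            self.counts[kind] = self.counts.get(kind, 0) + 1
            self.placeholders[value] = f"[{kind}_{self.counts[kind]}]"
        return self.placeholders[value]

    def anonymise(self, text):
        for name in self.client_names:
            if name in text:
                text = text.replace(name, self._placeholder(name, "CLIENT"))
        return EMAIL_RE.sub(lambda m: self._placeholder(m.group(0), "EMAIL"), text)

    def restore(self, text):
        # Put the original values back into the model's reply, inside the firm's network.
        for value, placeholder in self.placeholders.items():
            text = text.replace(placeholder, value)
        return text


def handle_request(user_id, prompt, anonymiser, audit_log, send_to_llm):
    """Anonymise the prompt, record who sent it, forward it, then restore the reply."""
    safe_prompt = anonymiser.anonymise(prompt)
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user_id,        # assumed to come from the firm's single sign-on
        "prompt": safe_prompt,  # only the anonymised text is logged
    })
    reply = send_to_llm(safe_prompt)
    return anonymiser.restore(reply)


def echo_stub(prompt):
    # Stand-in for the real API call (an assumption for this sketch).
    return f"Rewritten: {prompt}"


audit_log = []
anonymiser = Anonymiser(client_names=["Jane Doe"])
answer = handle_request(
    "a.lawyer",
    "Soften this email to Jane Doe (jane@example.com).",
    anonymiser,
    audit_log,
    echo_stub,
)
```

In this sketch, the text that leaves the firm and the text written to the audit log both contain only placeholders such as [CLIENT_1], while the lawyer still receives a reply with the real names restored. Logging the anonymised prompt rather than the raw one is a design choice; a firm that wants the CIO to see the original input could store it separately under stricter access controls.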
This article is part of our series exploring ChatGPT. Others in the series include a discussion on the legal and customer experience (CX) sectors, how a human-in-the-loop can mitigate risks, and the race between Google and Microsoft to LLM-ify the search engine.
Springbok have also written the ChatGPT Best Practices Policy Handbook in response to popular client demand. Reach out or comment if you'd like a copy.
If you’re interested in anything you’ve heard about in this article, reach out at [email protected]!
Chief Data Scientist
Ben is a co-founder and the Chief Data Scientist at Springbok AI. He has over 10 years of experience leading teams in commercial applications of machine learning and data science, focusing in particular on conversational AI, and over 6 years of academic experience in data science and machine learning. At Springbok AI, Ben is leading AI research and machine learning projects.
Chief of Product & Solutions
Jason is a co-founder and the Chief of Product & Solutions at Springbok. His previous experience includes working as a Data Science/Software Engineer at Jaguar Land Rover. Jason has led the software engineering delivery of 4 of Springbok’s most significant chatbot projects to date.