
How to Safely Get Started with Large Language Models

Written by Synaptiq | Jul 22, 2024 6:20:19 AM

Just as a skydiver never wishes they’d left their parachute behind, no business leader ever thinks, "I wish we had done less research before integrating a large language model into our operations and exposing our sensitive data." This guide is designed to help you understand the risks associated with adopting a large language model (LLM) and to choose an approach to LLM adoption that aligns with your risk appetite.

1. Consider the Benefits

First and foremost, we must answer two questions: What is an LLM, and why adopt one? Large language models are computational models that predict and generate natural language, which is the kind of language humans use to communicate with each other. They can generate value through a variety of use cases, including the following:

Automated or Assisted Task Completion

LLMs can automate or assist humans in tasks involving natural language, such as text generation, summarization, and translation. A Harvard Business School study of Boston Consulting Group consultants found that those with access to the LLM GPT-4 completed tasks more quickly and with higher-quality results than those without. [1]

Information Management

LLMs can retrieve information from datasets that are too large or complex for humans to navigate manually. 
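In practice, this is often done with retrieval-augmented generation: documents are ranked against the user's question, and only the top matches are passed to the model as context. The sketch below uses simple word-overlap scoring purely for illustration; the `score` and `top_documents` helpers and the sample documents are hypothetical, and production systems typically rank with vector embeddings instead:

```python
def score(query: str, doc: str) -> int:
    """Count words shared between query and document (a toy relevance score)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def top_documents(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most relevant to the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

# Hypothetical internal documents; a real corpus would be far larger.
docs = [
    "Refund policy allows returns within 30 days of purchase.",
    "Shipping times vary by region and carrier.",
    "Warehouse holiday schedule and staffing notes.",
]

# The top-ranked documents would be prepended to the LLM prompt as context.
context = top_documents("refund policy returns", docs)
print(context[0])
```

The point of the pattern is that the model never has to "know" the whole dataset; it only sees the handful of passages most relevant to each question.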

2. Understand the Risks

While LLMs can generate value, they also introduce risks that must be recognized and managed:

External Leaks

LLMs may share sensitive data with unauthorized parties. For instance, OpenAI’s LLM-driven chatbot, ChatGPT, captures users' chat history to train its underlying model, which can potentially cause one user's data to appear as output for another. [2] In May 2023, Samsung banned staff from using generative AI tools after its software engineers leaked internal code by uploading it to ChatGPT. [3] The risk is not limited to any one vendor: researchers at Robust Intelligence, a startup specializing in AI model stress-testing, discovered that Nvidia’s NeMo Framework, which lets developers work with a variety of LLMs, could be manipulated into revealing sensitive data. [4]
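One practical safeguard is to scrub obviously sensitive patterns from text before it ever reaches an external LLM. The sketch below is a minimal, hypothetical example: the regex patterns and the `redact` helper are illustrative assumptions, not a substitute for a vetted data-loss-prevention tool:

```python
import re

# Illustrative patterns only -- a real deployment would rely on a vetted
# data-loss-prevention product, not a handful of regexes.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "API_KEY": re.compile(r"\b(?:sk|pk)-[A-Za-z0-9]{16,}\b"),
}

def redact(text: str) -> str:
    """Replace likely-sensitive substrings with labeled placeholders."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

prompt = "Summarize this ticket from jane.doe@example.com (key sk-abcdef1234567890ABCD)."
print(redact(prompt))
```

Running the prompt through `redact` before calling the provider's API means that even if the provider logs or trains on the input, the sensitive values never leave your systems.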

Hallucinations

LLMs can produce incorrect information—a behavior known as "hallucination." Well-meaning users can mistake hallucinations for credible information, and malicious actors can exploit them to produce harmful results. For instance, the security firm Vulcan Cyber found that URLs, references, and code libraries hallucinated by ChatGPT could serve as a vector for attackers to introduce malicious code into development environments. [5]
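Teams can blunt this particular attack vector by checking every LLM-suggested dependency against a list of packages they have already vetted, rather than installing suggestions blindly. The sketch below is a minimal illustration; the allowlist contents and the `vet_packages` helper are hypothetical:

```python
# Hypothetical allowlist of dependencies the team has already vetted.
APPROVED_PACKAGES = {"requests", "numpy", "pandas", "flask"}

def vet_packages(suggested: list[str]) -> tuple[list[str], list[str]]:
    """Split LLM-suggested package names into approved and flagged lists."""
    approved = [p for p in suggested if p.lower() in APPROVED_PACKAGES]
    flagged = [p for p in suggested if p.lower() not in APPROVED_PACKAGES]
    return approved, flagged

# An LLM might suggest a mix of real and hallucinated package names.
ok, suspect = vet_packages(["requests", "arangodb-helper-utils"])
print("install:", ok)
print("review manually:", suspect)
```

Anything flagged gets a human review before installation, which is exactly the due-diligence step a hallucinated package name is designed to slip past.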

Internal Exposure & Misuse

LLMs can expose sensitive data to unauthorized users. For instance, if an LLM is trained on a company's internal documents, it might reveal confidential information from performance evaluations when queried by an employee. Additionally, using LLMs without training or guidelines can lead employees to become overly reliant on AI, sidelining common sense and due diligence. In one such case, in May 2023, a federal judge imposed $5,000 fines on two lawyers and a law firm who submitted fictitious legal research and later blamed ChatGPT for the error. [6]
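A common safeguard against the first risk is to enforce document-level access control before any retrieved content reaches the model, so the LLM can only answer from documents the querying employee is already cleared to see. The sketch below assumes a simple role-to-label mapping; the roles, labels, and `retrievable_docs` helper are all illustrative:

```python
# Illustrative access labels on internal documents.
DOCUMENTS = {
    "employee-handbook.pdf": "all-staff",
    "q3-roadmap.docx": "all-staff",
    "performance-reviews-2024.xlsx": "hr-only",
}

# Which labels each role may read; a hypothetical mapping.
ROLE_CLEARANCE = {
    "engineer": {"all-staff"},
    "hr-manager": {"all-staff", "hr-only"},
}

def retrievable_docs(role: str) -> list[str]:
    """Return only the documents a role is cleared to feed into the LLM."""
    allowed = ROLE_CLEARANCE.get(role, set())
    return [doc for doc, label in DOCUMENTS.items() if label in allowed]

print(retrievable_docs("engineer"))  # performance reviews are excluded
```

Filtering at retrieval time, rather than trusting the model to withhold information, keeps confidential documents out of the prompt entirely.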

3. Choose an Approach to LLM Adoption with Risk Management in Mind

It is possible (but not easy) to safely integrate an LLM into your business operations.

Depending on your risk appetite, you might consider one of the following approaches to LLM adoption:

Option 1: FOMO — Outsource the Risk

If your business wants to start using an LLM as quickly as possible without building an internal team, consider engaging a third-party LLM provider like Microsoft (Azure Cognitive Services) or OpenAI (GPT). You can outsource some of the associated risks by ensuring that the Master Service Agreement (MSA) governing your engagement includes provisions that require the third-party LLM provider to take responsibility for risk management.

However, MSAs often include ambiguous terms or exclusions that limit provider liability. For instance, an MSA might state that the provider will ensure "appropriate security measures" without defining what "appropriate" means. This vagueness can be exploited by the provider to avoid assuming full responsibility for a security lapse.

Moreover, relying on a provider makes you dependent on their reliability, expertise, and stability. If the provider fails to manage and mitigate risks such as cyberattacks or operational failures, the consequences can directly impact your business. For instance, OpenAI has disclosed that a bug in ChatGPT’s source code may have caused the “unintentional visibility of payment-related information” for premium users between 1 and 10 a.m. PST on March 20, 2023. [7] This incident shows how issues with an LLM provider can trickle down to affect its users.

Option 2: YOLO — Embrace the Risk

If your business prefers to 'own' the risks associated with LLM adoption, consider in-house development. You can either build an LLM from scratch (expensive) or customize a pre-trained model to meet your specific needs.

However, your business should not attempt in-house LLM development without substantial time, substantial resources, and a pre-existing team experienced in the field. The tasks involved, from data acquisition and preprocessing to model training and deployment, are significant in complexity, scale, and cost.

It's also important to consider that maintaining an in-house LLM may strain your internal resources. Without continuous support from a third-party provider, your business will be solely responsible for keeping its model secure and operational, and any lapse in maintenance can lead to security breaches or operational disruptions. Practically speaking, very few businesses have what it takes to succeed in this approach to LLM adoption.

Option 3: Phone a Friend

If your business wants to start using LLMs quickly, safely, and affordably, consider a temporary engagement with a team of consultants. This approach to LLM adoption combines the benefits of outsourcing and in-house development: You get third-party support while maintaining ownership over decision-making and outcomes.

For most businesses, a consulting engagement is the best approach to LLM adoption. It offers greater control than working directly with an LLM provider and costs significantly less than in-house development. Your internal teams cultivate LLM expertise through collaboration with seasoned practitioners while staying focused on their core activities, without taking on the full burden of LLM development and maintenance.

Photo by Dylan Gillis on Unsplash

About Synaptiq

Synaptiq is an AI and data science consultancy based in Portland, Oregon. We collaborate with our clients to develop human-centered products and solutions. We uphold a strong commitment to ethics and innovation. 

Contact us if you have a problem to solve, a process to refine, or a question to ask.

You can learn more about our story through our past projects, blog, or podcast.