Want to advertise in this newsletter?
If you’re looking for an opportunity to advertise your product, course, or service in the domain of artificial intelligence (or coffee!), please visit the advertising page (link) or respond to this email to inquire about advertising and sponsorship opportunities.
Large language models (LLMs) such as ChatGPT seem to be incredibly smart. In fact, ChatGPT has passed a collection of difficult exams, including the bar exam and top-tier medical exams (link).
Despite their vast knowledge and articulate responses, LLMs are notorious for hallucinating: a phenomenon in which the model responds with false or nonsensical information.
Whether you’re in academia, marketing, or even software engineering, you’ll most likely come across this phenomenon (if you haven’t yet) when messing around with generative AI applications. Let’s talk about it further:
What is a hallucination?
A hallucination is when the model presents its output as correct even though it’s false and/or incoherent. One of the big reasons why ChatGPT took off back in early 2023 was the “knowledge” it had.
I decided to test ChatGPT’s knowledge on one of the bands I listen to, Gloryhammer. I fed it the opening lyrics to one of the songs minus the character name to see if it could guess the lyrics correctly:
I fed the same prompt to Claude 2:
They’re both factually wrong. ChatGPT attributes the song’s character to Macbeth, while according to Claude 2, Prince Hubert is a character in Camelot. After some googling, I don’t think Hubert is in Camelot!
In the public’s eyes, hallucinating poses a credibility risk not only for the models themselves, but also for the companies that make them.
Why is hallucination bad? Credibility, for one.
Imagine a scenario where a barista tells you multiple times they don’t have sugar, when it’s clearly available on the condiment stand next to the register. You may start to question their reliability.
Over time, repeated false claims like this one undermine trust and build up to the point where the credibility of the coffee shop is damaged.
The same can be said about users of an LLM. When an LLM hallucinates across multiple iterations of a single task or a multitude of tasks, its credibility is damaged by the pattern of false information. This compounds over time, which in turn can damage the reputation of the company or the technology, as it comes to be perceived as untrustworthy and unreliable.
Hallucinations have real-world impacts. For instance, a lawyer was recently sanctioned (link) by a judge for citing non-existent court cases that GPT generated. In addition, a paper by Hussam Alkaissi and Samy I McFarlane (link) discusses hallucinations and how they impact scientific writing.
Now that I know about this, how can I prevent ChatGPT from hallucinating?
Preventing hallucination entirely is difficult and not something you can fully avoid, but there are measures you can put in place to minimize the probability of the LLM hallucinating:
Context is your friend. By providing the LLM with as much context about the problem at hand as possible, it doesn’t have to make assumptions when generating the output.
Use example outputs, if applicable. For instance, if you want the output to be formatted a certain way, prompt the LLM with the formatting style.
Be specific with your prompts. I discuss in Edition 1 (link) how to craft “the perfect prompt”. Here’s a snippet from the edition that’s relevant to the discussion:
Let’s break down what I like to call “the 3 components of prompt crafting”, which are clarity, completeness, and specificity. In an ideal scenario, you’re going to end up [using all 3]…
… For instance, the prompt “Discuss how Python handles memory management”, leaves out the “completeness” component of the trio, as we didn’t define the target audience, which can lead to a vague output.
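To make the tips above concrete, here’s a minimal sketch of how you might assemble a prompt that covers all three components before sending it to an LLM. The helper function, parameter names, and example facts below are my own illustration, not part of any official API:

```python
def build_prompt(context, task, audience=None, example_output=None):
    """Assemble a prompt covering clarity, completeness, and specificity.

    context        -- background facts so the model doesn't have to guess
    task           -- the specific question or instruction (clarity/specificity)
    audience       -- who the answer is for (the "completeness" component)
    example_output -- a sample of the desired output format, if applicable
    """
    parts = [f"Context:\n{context}", f"Task:\n{task}"]
    if audience:
        parts.append(f"Target audience: {audience}")
    if example_output:
        parts.append(f"Format the answer like this example:\n{example_output}")
    return "\n\n".join(parts)


# Reworking the vague prompt from the snippet above:
prompt = build_prompt(
    context="Python uses reference counting plus a cyclic garbage collector.",
    task="Discuss how Python handles memory management.",
    audience="junior developers new to Python",
    example_output="- Bullet point 1\n- Bullet point 2",
)
print(prompt)
```

The point isn’t the helper itself; it’s that each argument forces you to spell out something the model would otherwise have to assume.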
Just remember to always verify that the output is factually correct to avoid running into conflicts. Also keep in mind that this isn’t limited to ChatGPT - this is an issue that spans multiple LLMs.
Enjoy this edition so far? Here are 2 things you can do:
1) Respond to this email and let me know your thoughts. I aim to respond to all emails.
2) Share Bytes and Brew with a friend.
Click the button below to share your referral code!