Detecting AI Generated Content: Can It Be Done?
The Turing test - 2023 edition!
Edition #4. Approximate read time: 15 minutes
Hey there!
This edition I’m calling Watch What You Write. I discuss how we can try to detect AI-generated content and the impact it has on academia.
Here’s what’s in store:
The Writing: I like to stay around 500 words per edition, but I decided to up it to just under 1,000 words. It’s a long one!
AI Tools: As usual, I’ve included 3 tools you can use right now. This collection I’ve picked out is a bit interesting and not really on topic…
AI Generated Images: I decided to use the same prompt across 3 different image generators: Midjourney, Stable Diffusion, and DALL-E. The prompt for all of these images is “Imagine a world full of surveillance, looking to detect any anomalous behavior.”
I’m working on setting up a referral program. Your referral link to send to family and friends is found at the bottom of this email or by logging into your account on the website.
I’m also working on something no other AI newsletter is doing. You’ll get a sneak peek if I decide to go through with it in either an August or September edition.
Grab your cup of coffee. Let’s discuss how we can detect AI.
New here? Grab a cup of coffee - we’re talking AI here. This newsletter is curated around the integration and applications of AI in our lives, so if this sounds like your kind of thing, hit that “subscribe” button below!
There’s been a lot of buzz about ChatGPT. In case you’re not aware of what it is, it’s an AI chatbot built on a Large Language Model (LLM) that you can converse with in a conversational manner. People have found lots of uses for it, ranging from creating social media posts to generating code.
How ChatGPT Generates Content
I mentioned in Edition 1 how ChatGPT-like systems work under the hood, but I’ll reiterate it here for the purposes of this edition:
1. It reads and translates the prompt we provide into tokens, which are then mapped to numerical values. A token can be as small as a character or as large as a word.
2. It passes these values through a neural network that has been trained on a large amount of text data.
3. It predicts the next token based on its training, the context, and the preceding tokens. Each candidate token is assigned a probability, and generally the token with the highest probability gets chosen.
Point #3 is important: ChatGPT looks super knowledgeable, but in reality it has no clue whether it’s being truthful or not. It’s only picking likely tokens.
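To make that concrete, here’s a toy Python sketch of that last step. The candidate tokens and scores below are invented for illustration; a real model scores tens of thousands of tokens at every step.

```python
import math

# Hypothetical raw scores ("logits") for a few candidate next tokens
# following the prompt "I take my coffee with". Invented for illustration.
logits = {"milk": 4.1, "sugar": 3.7, "cream": 3.2, "regret": 0.5}

# Softmax turns raw scores into probabilities that sum to 1.
total = sum(math.exp(v) for v in logits.values())
probs = {tok: math.exp(v) / total for tok, v in logits.items()}

# Greedy decoding: pick the most probable token. Nothing here checks
# whether the continuation is *true* -- only whether it's likely.
next_token = max(probs, key=probs.get)
print(next_token, round(probs[next_token], 2))  # milk 0.48
```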
The Problem With AI Generated Content
AI generated content is making its rounds on the internet. One place where it’s really shown its face is academia. I pointed out in Edition 3 how generative AI can help students and instructors. While it can benefit the industry, it can also cause a few problems, namely:
Cheating - ChatGPT has made it much easier to cheat because of how much information it “knows”; students can shortcut assignments without doing the intellectual work themselves.
Plagiarism - AI generated content can make it more difficult to detect plagiarized content.
This, in turn, disrupts the development of critical thinking skills. We want students to think deeply about problems and thoroughly analyze complex issues from multiple perspectives. Generative AI could do this kind of work on behalf of the student.
I had originally planned for this edition to cover plagiarism detection with AI, but I want to do more research and talk to people in the industry first. I still liked the idea of detecting AI generated content, so I asked myself: is there a way to determine whether content was written by AI?
Detecting AI Generated Content
I tested 10 different tools (both paid and free) that I found on the internet to see if they can detect AI output. I wrote a 150-word paragraph about coffee, then prompted each model shown below to do the same thing.
I fed each output (including my own writing) through each tool. I would’ve expected the left-most column to be green (detected as human writing) and the remaining columns to be red (detected as AI writing).
Turns out that’s not the case. Here are the results:
I’ve redacted the names of the tools because I’m not trying to judge any single tool’s correctness; I’m looking at the variation between the tools and how each one handles each LLM’s output (plus mine).
Let’s look at how many of these tools work. Each is slightly different under the hood, but generally speaking:
Each tool has an internal model trained to identify patterns in the outputs of various LLMs.
Using what that internal model has learned, it estimates the likelihood that a given piece of text is human or AI generated (a rough sketch of this follows below).
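Here’s a minimal, hypothetical sketch of that idea using scikit-learn: a classifier trained on labeled human and AI text that returns a probability. None of these tools publish their internals, and the training texts below are placeholders, so treat this as the general shape rather than any vendor’s actual method.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder training data -- a real detector would train on large
# corpora of known-human and known-LLM writing.
texts = [
    "brewed a pour-over this morning and honestly? life changing",
    "my espresso machine died so it's instant coffee week, send help",
    "Coffee is a widely consumed beverage enjoyed throughout the world.",
    "There are many factors to consider when brewing the perfect cup.",
]
labels = [0, 0, 1, 1]  # 0 = human-written, 1 = AI-generated

# TF-IDF features + logistic regression: a crude stand-in for the
# pattern-matching these tools do internally.
detector = make_pipeline(TfidfVectorizer(), LogisticRegression())
detector.fit(texts, labels)

# The output is a probability, not a verdict.
sample = ["In conclusion, coffee offers numerous benefits to consumers."]
prob_ai = detector.predict_proba(sample)[0][1]
print(f"P(AI-generated) = {prob_ai:.2f}")
```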
It’s important to remember that the numbers these tools produce are probabilities, not a “this or that” verdict. Think of them like a polygraph.
The polygraph doesn’t actually know if you’re telling the truth; it measures physiological attributes. Ultimately, this cues the person reading the results in on whether the subject is lying or not.
For the purposes of this diagram, I’ve grouped each tool’s predicted probability that the text was AI generated (a code version of this grouping follows the list). Specifically, text that the tools predict to have:
A greater than 60% probability of being AI generated is colored red.
A less than 40% probability of being AI generated is colored green.
A 40-60% probability of being AI generated is colored yellow.
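In code form, that grouping is just two thresholds (the 40/60 cutoffs are my choice for this diagram, not any standard):

```python
def color_for(prob_ai: float) -> str:
    """Map a tool's predicted P(AI-generated) to a diagram color."""
    if prob_ai > 0.60:
        return "red"      # likely AI generated
    if prob_ai < 0.40:
        return "green"    # likely human written
    return "yellow"       # too close to call

print(color_for(0.82))  # red
print(color_for(0.15))  # green
print(color_for(0.50))  # yellow
```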
Is there a solution?
Yes, but no.
I don’t think it’s possible to accurately pin content as AI or human generated in a “this or that” scenario. There are a number of reasons for this:
The tool may not be up to date with the latest training data. The AI landscape, especially generative AI, is evolving extremely quickly.
There are dozens of LLMs on machine learning sharing platforms such as Hugging Face. Detection tools may not be trained to recognize output from these models, since each model’s output has its own intricacies and patterns.
LLM and human generated content are converging; they’re becoming more and more similar, which makes AI output harder to identify in the first place.
False positives are tough to avoid. Across all 10 tools analyzed, they collectively showed a 20% false positive rate (identifying human writing as AI) and a 40% false negative rate (identifying AI writing as human). Turnitin is a great example of how false positives have impacted detection software.
As the article points out, false positives cause problems within academia; you wouldn’t want to fail a student for using AI generated content if you can’t prove it. This could also push students to figure out methods of evading detection in the first place by modifying some words or clauses in generated content.
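For clarity on those two error rates from the list above, here’s how they’re computed. The counts below are illustrative only (chosen to match the aggregate rates), not the actual tallies from my test:

```python
def error_rates(human_flagged_ai, total_human, ai_flagged_human, total_ai):
    """False positive rate: share of human-written samples flagged as AI.
    False negative rate: share of AI-written samples flagged as human."""
    return human_flagged_ai / total_human, ai_flagged_human / total_ai

# Illustrative counts only -- not the actual results from my test.
fpr, fnr = error_rates(human_flagged_ai=2, total_human=10,
                       ai_flagged_human=4, total_ai=10)
print(f"false positive rate = {fpr:.0%}, false negative rate = {fnr:.0%}")
```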
On the flip side, I think that by noticing trends and patterns across multiple students over time, you can gather enough of an idea to say whether they’re fully relying on generative AI, using it to support their work, or not using it at all.
To wrap up, relying fully on AI to detect AI-generated content is like a game of cat and mouse. Just as the mouse constantly finds new ways to evade the cat, AI continually evolves to outpace the detection tools.
Enjoy this edition? I’d love to hear from you! Simply respond to this email and it’ll appear in my inbox!
Any links with an asterisk (*) are affiliate links
What is Midjourney? Midjourney is an independent research lab exploring new mediums of thought and expanding the imaginative powers of the human species. To generate images, you’ll need a Discord account to join their server.
In this section, I’ve included tools you can use right here, right now to help you out with things.
1. Originality.ai
OK, fine, I’ll spill the beans on one of the tools used in The Writing. Originality.ai is a tool that helps detect whether content was written by AI. I played around with it and I will admit, it’s pretty accurate.
You’re able to use 50 free credits before they make you pay monthly. Sign up for an account and play around with it on their website!
2. Reword
Reword is an application that lets you write your own content with the help of AI. It’s not designed to dodge AI detection systems, but you’ll be able to write better! Check out the software on their website!
3. Cape Privacy
When it comes to generative AI, you’re probably using ChatGPT. One huge issue with ChatGPT is that OpenAI collects all of the data you provide it. Cape Privacy addresses this by adding a layer of protection: it encrypts your data and redacts anything sensitive.
What is DALL-E? DALL-E is OpenAI’s image generator. Simply input a prompt using natural language and out comes an image. Generate images using their website!
The polygraph was originally designed as a medical device. It measures core vitals such as heart rate, blood pressure, and perspiration, and was intended to monitor patients during surgery and diagnose cardiac anomalies.
What is Stable Diffusion? Stable Diffusion is a product from Stability.ai, a company that open-sources its models. You can generate images for free using DreamStudio.
I truly enjoy hearing from my audience - please reach out and let me know what you like, don’t like, and want to see!
Did you like this edition of the newsletter?
☕️ See you next edition 😃
The next edition will be delivered to your inbox on August 7th, 2023