Shy Girl, Mob Rules, and the AI Detection Problem
The publishing world had its first real AI blow-up this week, and the fallout is horrendous (and sad, honestly).
On March 19, Hachette Book Group cancelled the US release of Shy Girl, a horror novel by Mia Ballard, and discontinued the UK edition. The reason? Hachette suspected large portions of the book were AI-generated.
Ballard denies writing the book with AI. She told the New York Times that an unnamed acquaintance she'd hired to edit the original self-published version had used AI without her knowledge.
I genuinely don't know whether that's true; how can anyone know, unless someone confesses? But the way this has unfolded deserves a closer look, because the mob mentality surrounding it, and some of the arguments being thrown around, are just as concerning.
How we got here
Ballard self-published Shy Girl in early 2025 on Kindle Unlimited. It quickly gained traction and got thousands of ratings on Goodreads. Hachette picked it up, published a UK edition in November 2025 through its Wildfire imprint, and had a US release scheduled for April 2026 through Orbit. While not everyone liked it, which is hardly uncommon, the early reviews were largely positive.
Then the tide turned. A Reddit post in January 2026 from someone claiming to be a book editor flagged the writing as having the hallmarks of large language model output (LLMs are a specific type of generative AI model). A YouTuber published a bloated (over two-and-a-half-hour) video analysing the text, which went viral with over 1.2 million views.
And that's when things really went off.
Bad reviews and AI accusations started piling up on Goodreads. But I’d love to know how many of those one-star reviews came from people who'd actually read the book before the controversy, and how many showed up after the Reddit threads and the YouTube detective work framed it as “AI slop”? I’m not saying it was all of them, but the answer still matters. Because we all know what happens when a narrative takes hold online…
The 78 per cent figure nobody is questioning
One of the most cited pieces of 'evidence' in this whole saga is the claim that Shy Girl is 78 per cent AI-generated. That number has been repeated in headline after headline. But almost nobody is talking about where it came from or why.
It seems the founder and CEO of Pangram, an AI detection software company, saw the online controversy, ran the book through his own product, and posted the results publicly on X. As far as I can see, nobody asked him to do that. Not Hachette, not the New York Times. So it appears he inserted himself and his commercial product into a delicate, trending story, then published the dramatic number.
I'm not saying the test result is necessarily wrong (though AI detectors are broadly known to underperform). But I am saying that a CEO running his own product on a viral controversy and publicly sharing the results doesn’t sound like an independent investigation.
[Note: It came to my attention (and I checked for myself) that the copy run through Pangram had the name oceanofpdf.com embedded in it. It’s a book pirating website. There are a number of possible issues with that but I would begin at: how do you know a pirated copy is exactly the same as the one published by Hachette? Or even that it was the same one originally self-published? I also found out two other AI Detectors tested the novel, but that still doesn’t take away from my overall points.]
Why has the obvious conflict of interest gone so unquestioned? Or have I missed something? The 78 per cent figure has been treated as near-fact, when it's one output from a commercial tool. And don’t get me wrong, we need these tools. It’s the confidence that concerns me, because I do not believe it’s possible to trust in these tools right now.
Enter James Frey
Some commentators are raising the name James Frey. Let’s take a quick look at his story.
Frey was hugely shamed in the industry when his 2003 memoir A Million Little Pieces was exposed as partly fabricated. An investigation was published in January 2006, and Oprah Winfrey, who'd championed the book through her book club, hauled him over the coals on her show, telling him to his face that he'd betrayed millions of readers. He was dropped by his literary agent, and his publisher settled a huge class-action lawsuit.
But Frey didn't just disappear. By late 2007 he had signed a large new deal with HarperCollins. He kept on writing and then in 2023, he openly discussed in an interview having used generative AI in a manuscript he was working on called FOURSEVENTYSIX. He said he'd included a footnote telling readers he'd used AI, that the AI had been instructed to mimic his style, and that he considered every word his own. That book was never published, though.
Then, Next to Heaven came out in June 2025. His publisher told Book of the Month that Frey had written every word. Frey himself told the New York Times he hadn't used AI on it, though he'd experimented with it on an earlier project he'd abandoned. But by July 2025, only a month after publication, he had apparently confirmed AI was involved after all.
It’s a rollicking ride, but now to the critical way his story is distinct from Ballard’s.
The race argument
I've seen people on Bookstagram arguing that the difference in treatment between Frey and Ballard is a race issue. Ballard is a Black woman. Frey is white. Frey kept his career. Ballard lost her deal.
I understand why people reach for that framework, and there is another Black author who has received some accusations of AI use in Goodreads reviews. She’s also an indie author who recently got a publishing deal, which sounds a bit suspicious, doesn’t it? But AI is mentioned in just 46 of the nearly 10,000 reviews she’s received, and it has not affected her contract. So, while I see concerns there (for other reasons), I don't find it a compelling argument in relation to Frey, nor evidence that this is specifically a racial discrimination issue at this point. The reason seems simpler: money.
Frey was already a bestseller when his first scandal broke. A Million Little Pieces had sold over five million copies. He had commercial leverage, and that would make publishers more willing to take another bet on him despite the controversy. Even then, he was still dropped by his agent, still faced the class-action lawsuit, and was still publicly humiliated on national TV. The idea that he sailed through unscathed just isn't true. He clawed his way back, and because he had an existing readership and enough capital built within the industry, he was able to get another deal.
Ballard was a debut self-published author: early career, without a big money-making track record. She didn't have the same buffer, and ultimately publishing houses have to make enough money to stay afloat, so of course some decisions will be financial.
I imagine when Hachette weighed the risk of standing behind Ballard versus the cost of letting her go, it seemed easier to let her go. It’s brutal but I think it would apply to most new authors caught in a controversy like this.
Hellish social media and comments sections
Meanwhile, and sadly, Bookstagram is burning away (and I’d hate to see BookTok). There are people asking if they should rip up their copies, and piling on with a confidence the evidence doesn't support.
One comment in particular stopped me cold. A woman argued that Ballard must have used AI on Shy Girl because, she claimed, the cover of Ballard's other book, Sugar, was AI-generated. But I think this commenter was confusing what happened with the original Shy Girl cover, where Ballard, unfortunately, used an image she wasn’t entitled to on the cover (later fixing that ‘situation’ from what I can tell).
But even if the earlier novel’s cover was AI-generated, an AI book cover does not prove an AI-written book. The leap in logic there is as frightening as it is ridiculous.
I occasionally use generative AI to make images and short videos for Instagram to promote my novels (a still video screenshot example is below). That doesn't mean my novels were written with generative AI. If that logic held up, half the authors with a Canva account would be under investigation.
Assistive vs generative: the gap nobody wants to learn about
Another claim I saw today was that behind the scenes publishers are actively asking new authors to use AI. This wouldn’t surprise me but, if true, I’d assume they’re talking about assistive AI tools. Those are the ones that do grammar checks and find repeated words or phrases in a manuscript. That's a world away from asking ChatGPT to write your chapters.
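To make the "assistive" category concrete, here is a minimal sketch of the kind of check such a tool might run: flagging word sequences an author repeats across a manuscript. This is an illustrative toy of my own, not any real product's method:

```python
from collections import Counter
import re

def repeated_phrases(text: str, n: int = 3, min_count: int = 2) -> dict:
    """Count n-word phrases that occur at least min_count times."""
    words = re.findall(r"[a-z']+", text.lower())
    ngrams = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    counts = Counter(ngrams)
    return {phrase: c for phrase, c in counts.items() if c >= min_count}

sample = ("She turned back to the door. The hallway was silent. "
          "She turned back to the door and waited.")
print(repeated_phrases(sample))
# flags "she turned back", "turned back to", etc., each appearing twice
```

A check like this surfaces tics for the author to fix; it generates no prose at all, which is exactly the line between assistive and generative.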
The distinction between assistive and generative AI is one I wish more readers, writers, and commentators would take the time to understand. I also wish writing organisations and publishers would come out to state explicitly where they stand on assistive AI tools (please🙏🏻). There seems to be a quiet acceptance of them, but the vagueness around it is awkward.
I wonder if they're scared, though, and that is relatable. Scared that defining the line too clearly will invite the same mob energy currently aimed at Ballard. But then, leaving the ruling on it so vague is creating confusion, and confusion can give fuel to accusations based on “vibes” rather than evidence. In the meantime, I suspect (not all, but) a majority of writers are using the spell and grammar checking abilities of assistive AI products.
The inconvenient truth about AI detection
Now, maybe Shy Girl really was written mainly by generative AI. Maybe it was the editor. Maybe it was Ballard herself. But here's the uncomfortable reality: right now, I don’t believe there’s a reliable way to prove it without a confession.
I've tested AI detectors myself. I've run fully AI-written text through them, part-AI text, and 100 per cent human-written work. The results were all over the shop. Across multiple detection tools, I got correct results along with wildly wrong ones. Every category. Every tool I could find, including full versions. The technology isn't there and I honestly can’t tell you if it ever will be. It’s unfortunate, but anyone claiming certainty based on an AI detection program should be met with some scepticism.
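For anyone wanting to repeat that experiment, here is a toy version of the test harness: texts with known provenance run through a detector, verdicts tallied against the labels. The `fake_detector` below is a deliberately crude stand-in (real detectors are external, proprietary tools), included only so the sketch runs; note how easily a simple heuristic misfires:

```python
def fake_detector(text: str) -> str:
    """Crude stand-in detector: calls text 'ai' if its sentence
    lengths are suspiciously uniform. Illustrative only."""
    sentences = [s for s in text.split(".") if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    spread = max(lengths) - min(lengths) if lengths else 0
    return "ai" if spread <= 2 else "human"

# Texts whose provenance you know in advance, paired with labels.
labelled_samples = [
    ("The rain fell. The street emptied. A door slammed somewhere far off.",
     "human"),
    ("The system processes input. The model generates output. The user reads text.",
     "ai"),
]

correct = sum(fake_detector(text) == label for text, label in labelled_samples)
print(f"{correct}/{len(labelled_samples)} correct")
```

Run enough known-human and known-AI samples through any real detector this way and the error rate speaks for itself; a human who happens to write evenly paced sentences would trip this heuristic every time.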
Given that, and that the public surely does not know all of the details, I find what happened to Ballard horrendous.
A note for the 'we told you so' crowd
If it eventually comes out, through some actual admission, that Shy Girl was predominantly AI-generated and that Ballard was responsible, I suspect some people will read this post with huffs of 'we told you so' and 'it was obvious.'
But in the broader sense? The actual charge, that Shy Girl was written with AI, is not obvious or “sure”. And what I've said here is worth saying regardless of the eventual outcome, because this story raises questions about mob justice, flawed detection tools, the problem of those with a commercial interest testing the work, and how little many people understand about the difference between assistive and generative AI. Most of these issues won’t disappear any time soon.
So I'm not afraid to be wrong here. I just prefer to be cautious about this rather than reckless in my accusations of an early career author.
Morgan x
(Mia, should you read this I feel for you right now. Hang tight. Regardless of what happened, this will pass. x)
You might be interested in reading my other AI post, AI Is Already Changing How You Write — Even If You Don't Use It or my AI Use Statement.

