We Tested 30k+ Human Essays on ZeroGPT - Here's What We Found

Independent study reveals ZeroGPT flagged 9,987 out of 37,874 guaranteed human essays from 2010-2021 as AI-generated.

Daniel Anderson

Last updated: Sep 18, 2025

While gathering training data for Phrasly's latest detection models, we decided to test something interesting. We collected 37,874 student essays written between 2010 and 2021 - years before ChatGPT existed - and ran them through ZeroGPT's API.

Every essay was 100% human-written. The results? ZeroGPT flagged 9,987 of these genuine essays as AI-generated. That's 26.4% of confirmed human writing getting marked as artificial.
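For readers who want to run a similar check, the test boils down to counting how often a detector flags text that is known to be human-written. The snippet below is a minimal sketch of that loop, not the code used in this study. `count_false_positives` and `toy_detector` are hypothetical names, and the toy detector is a stand-in for a real API call; it makes no assumptions about ZeroGPT's actual API.

```python
from typing import Callable, Iterable, Tuple


def count_false_positives(
    essays: Iterable[str],
    flags_as_ai: Callable[[str], bool],
) -> Tuple[int, int]:
    """Return (flagged, total) for a corpus where every essay is human-written.

    Because every input is human-written, each flag is a false positive.
    """
    flagged = 0
    total = 0
    for text in essays:
        total += 1
        if flags_as_ai(text):
            flagged += 1
    return flagged, total


def toy_detector(text: str) -> bool:
    # Hypothetical stand-in for a real detector call, for demonstration only:
    # it flags any essay that contains the word "furthermore".
    return "furthermore" in text.lower()


sample_essays = [
    "My summer job taught me more than any class.",
    "Furthermore, the evidence suggests a clear trend.",
    "I disagree with the author's main point.",
]

flagged, total = count_false_positives(sample_essays, toy_detector)
print(f"{flagged} of {total} human essays flagged ({flagged / total:.1%})")
```

In a real run, `flags_as_ai` would wrap whatever detector is being tested, and the corpus would need to predate widely available AI writing tools so that every flag can be counted as a false positive.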

Figure: Detection results overview - 9,987 false positives out of 37,874 human essays

The Numbers Don't Lie

Out of 37,874 guaranteed human essays:

27,887 correctly identified as human (73.6%)
9,987 incorrectly flagged as AI (26.4%)
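The percentages above follow directly from the two reported counts. As a quick check, here is a short sketch of the arithmetic; the variable names are illustrative, and the only inputs are the figures stated in this article.

```python
# Counts reported in the article.
total_essays = 37_874
flagged_as_ai = 9_987

# Every essay is human-written, so each flag is a false positive
# and every unflagged essay was correctly identified as human.
identified_as_human = total_essays - flagged_as_ai
false_positive_rate = flagged_as_ai / total_essays
correct_rate = identified_as_human / total_essays

print(f"Correctly identified as human: {identified_as_human:,} ({correct_rate:.1%})")
print(f"Incorrectly flagged as AI: {flagged_as_ai:,} ({false_positive_rate:.1%})")
print(f"About 1 in {1 / false_positive_rate:.1f} essays flagged")
```

The last line works out to about 1 in 3.8, which is where the "roughly 1 in 4" figure comes from.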

This means roughly 1 in 4 human-written essays was wrongly flagged. For context, that's nearly 10,000 false accusations in our dataset alone.

Why This Happens

ZeroGPT appears to rely on older detection algorithms that struggle with formal academic writing. The structured language students naturally use in essays resembles the patterns these algorithms associate with AI, so it sets off alerts that were meant to catch machine-generated text.

It's also worth noting that ZeroGPT has partnerships with AI humanizer services. When students get falsely flagged, they often panic and purchase humanizer tools to "fix" their perfectly legitimate writing. The business incentive is clear - more false positives potentially drive more sales to partner services.

What We Built Instead

Seeing these accuracy problems firsthand shaped our approach at Phrasly. Rather than flagging aggressively at the expense of innocent writers, we focused on genuine accuracy. Our detection accounts for natural academic writing patterns without generating mass false accusations.

Figure: Timeline showing consistent false positive rates across 2010-2021

The Business Model Problem

Here's the uncomfortable reality: ZeroGPT's partnerships with humanizer services create a conflict of interest. The more false positives generated, the more panicked students seek out humanization tools. Whether intentional or not, there's little business incentive to dramatically improve accuracy when inaccuracy drives partner revenue.

What Students Can Do

If ZeroGPT flagged your human writing, you're part of a documented pattern affecting thousands of legitimate students. Keep records of your writing process, understand your school's appeal procedures, and know that detection technology has proven limitations.

Don't let false accusations define your academic experience.

Conclusion

Our analysis reveals a fundamental problem with current AI detection - in our test, ZeroGPT wrongly flagged 26.4% of genuine human essays as artificial. This isn't just a technical glitch; it's a systemic issue that can damage real academic careers.

Until the industry prioritizes accuracy over aggressive detection, students need protection from tools that falsely accuse 1 in 4 human writers. The data speaks for itself: current detection technology isn't ready for high-stakes academic decisions without human oversight.

This study analyzed 37,874 essays from 2010-2021 using ZeroGPT's API to measure detection accuracy on guaranteed human content.
