GPTZero works by analyzing a piece of text and determining whether there is a high or low indication that a bot wrote it. It looks for two hallmarks: “perplexity” and “burstiness.” “Perplexity” measures how likely each word is to be suggested by a bot; a human’s word choices would be more random. “Burstiness” measures the spikes in how perplexing each sentence is. A bot will likely show a similar degree of perplexity from sentence to sentence, but a human is going to write with spikes: maybe one long, complex sentence followed by a shorter one. Like this.
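GPTZero’s actual model isn’t public, but the two ideas are easy to sketch. The toy Python below is purely illustrative (the tiny corpus, the unigram model, and the max-minus-min “burstiness” measure are all my assumptions, not Tian’s method): it scores each sentence’s perplexity against a simple word-frequency model, then treats the spread of those scores as burstiness.

```python
import math
from collections import Counter

def unigram_perplexity(sentence, counts, total, vocab):
    # Perplexity = exp(average negative log-probability per word).
    # Lower means the model finds the sentence more predictable.
    words = sentence.lower().split()
    # Laplace smoothing so unseen words don't get zero probability.
    nll = sum(-math.log((counts[w] + 1) / (total + vocab)) for w in words)
    return math.exp(nll / len(words))

# Toy "training" corpus standing in for a real language model.
corpus = "the cat sat on the mat the dog sat on the rug".split()
counts = Counter(corpus)
total, vocab = len(corpus), len(counts)

sentences = [
    "the cat sat on the mat",      # predictable: low perplexity
    "zebras juggle quantum kazoos" # surprising: high perplexity
]
scores = [unigram_perplexity(s, counts, total, vocab) for s in sentences]

# A crude burstiness proxy: how much perplexity varies across sentences.
# Uniformly low scores with little spread would suggest a bot.
burstiness = max(scores) - min(scores)
```

A real detector would use a neural language model rather than word counts, but the shape of the computation is the same: score predictability per sentence, then look at how much that score jumps around.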
To test out Tian’s creation, I fed it a short essay written by ChatGPT using a prompt that a would-be high school cheater might try: Describe the main theme of Hamlet. (“The main theme of Shakespeare’s play ‘Hamlet’ is the struggle of the main character, Hamlet, to come to terms with the fact that his uncle has murdered his father and taken the throne…” blah blah and so on.)
GPTZero gave the essay a perplexity score of 10 and a burstiness score of 19 (these are pretty low scores, Tian explained, meaning the writer was more likely to be a bot). It correctly detected this was likely written by AI.
For comparison, I entered the first half of this article, which I wrote myself, into the tool. Perplexity: 39; burstiness: 387. (Ironically, it determined the sentence with the highest perplexity was “I want people to use ChatGPT,” he said.) Ultimately, GPTZero deemed the text likely to be human-written. Correct!
However, the exact success rate of GPTZero is unclear. At least one Twitter user said it failed to catch a few of their AI-written samples. Elsewhere on the platform, the reaction has been mixed: adults are praising the effort, while others, mostly teens, are calling Tian a “narc.”
Tian told the Daily Beast in an interview that after his tweet about it, his DMs were blowing up from venture capital interest. For now, though, he plans on keeping his creation free and accessible. “I want to support freshman English teachers everywhere,” he said.