Senior Editor, Innovation & Tech
There’s a quote about humor that’s often attributed to writer E.B. White: “Explaining a joke is like dissecting a frog. You understand it better but the frog dies in the process.” While that adage has shown itself to be true time and again, that hasn’t stopped one of the world’s most powerful chatbots from doing exactly that.
Last week, OpenAI launched GPT-4—the latest edition of its large language model (LLM)—to the public. The powerful chatbot seems capable of some truly impressive feats, including passing the bar exam and LSAT, developing code for entire video games, and even turning a photograph of a napkin sketch into a working website.
Along with the new model, OpenAI also released an accompanying 98-page technical report showcasing some of GPT-4’s abilities and limitations. Interestingly, this included several sections that showed that GPT-4 could also explain why exactly certain images and memes were funny—including a breakdown of a picture of a novelty phone charger and a meme of chicken nuggets arranged to look like a map of the world.
GPT-4 manages to do this with startling accuracy, laying out exactly what makes these images humorous in language so plain and technical it becomes—dare we say—borderline funny.
“This meme is a joke that combines two unrelated things: pictures of the earth from space and chicken nuggets,” one description reads. “The text of the meme suggests that the image below is a beautiful picture of the earth from space. However, the image is actually of chicken nuggets arranged to vaguely resemble a map of the world.”
While the inclusion of these frog-dissection descriptions was likely meant to show off GPT-4’s multimodal capabilities (meaning it can use images as inputs as well as text), it’s also one of the most prominent examples yet of an LLM that seems to understand humor—at least, somewhat. If it can understand humor, though, that raises the question: Can ChatGPT actually be funny?
Humor is complex—to say the least. Anyone who has ever dabbled in improv or pulled together a tight five-minute routine to try out at a local open mic night can tell you that being funny is much, much harder than you think. There’s a reason that professional comedians like Jerry Seinfeld or Chris Rock are famous for agonizing over the precise word choice and cadence of their jokes for literally years.
This is something that Thomas Winters is very familiar with. For nearly a decade, he’s been performing improv comedy and helped grow the scene in his native Belgium. When he’s not on stage or hosting improv workshops, though, he’s also a PhD student at KU Leuven in Belgium researching AI and humor—a coupling of two of his great passions.
While many might balk at the idea of a chatbot writing or even performing jokes, Winters takes the opposite approach. He’s researched the ability of OpenAI’s previous models like GPT-2 and GPT-3 to craft jokes, and believes the technology could be an incredible tool for comedians to help them with their craft.
“This is a fascinating time for computational humor,” Winters told The Daily Beast. “We've been talking about it for decades. Now, in the last couple of years, we finally have these models that have these linguistic or reasoning capabilities.”
Winters believes that GPT-4 represents yet another big step in the quest to build joke-writing bots. According to him, the latest version is a marked improvement over predecessors like GPT-2, which was “pretty shitty” at making decent jokes even with a lot of fine-tuning of its prompts. While GPT-3 could produce funny material at a higher rate, it was largely limited to “punny riddles,” such as “Why did the chicken cross the road?”-type constructions.
Now, with GPT-4, the model is a whole lot more sophisticated. Not only does it produce more realistic responses, but it also takes far less time and effort to coax out decent-quality jokes at a higher rate.
“Sure, when you look at it, it’s just a next-word prediction analysis right? But it’s amazing, like how much capabilities are unlocked once you scale these things up,” Winters added. “That’s pretty fascinating to see. It’s a world of difference.”
As Winters types a prompt into GPT-4 for a potential joke about former President Donald Trump’s looming indictment, it’s tough for him to shake the feeling that he’s sitting in a late-night talk show writers’ room of the future.
I had asked him to give me a demonstration of the chatbot’s joke-crafting prowess, and he walked us through several examples. The first was inspired by an improv game made famous on the show Whose Line Is It Anyway? called “Scenes from a Hat,” where players are given prompts and scenarios to riff off of.
The prompt: Write five short jokes about “Things you can say to your computer but not to your partner.” Meanwhile, the bot was also instructed to act as though it were a “world-renowned expert in writing jokes.”
The results—while somewhat anodyne—were impressive:
With the right prompts, GPT-4 is capable of creating decent "Scenes from a Hat"-style jokes.
According to Winters, this level of sophistication and coherence in the jokes would have been fairly difficult to achieve with past models like GPT-2 and GPT-3. However, it still requires a bit of prompt engineering—the process of giving a precise description of the task you want a chatbot to perform so you get the outcome you want.
For example, if you just ask ChatGPT to tell you a joke about computers, it might just spit out one that you find in a children’s joke book (“Why did the computer go to the doctor? Because it had a virus!”). However, if you want it to tell you a specific kind of joke about computers—say, a “Scenes From a Hat”-style joke about “things you can say to your computer, but not to your partner”—then you need to be much more specific in your prompt.
In this example, Winters needed to include the stipulations that the bot was a “world-renowned expert in writing jokes” and an “expert improvisational comedian who can respond to ‘Scenes from a Hat’ suggestions.” Only with this level of specificity is the chatbot capable of producing a response that resembles what you might be looking for.
Moreover, the chatbot needs a rigid formula to follow. The “Scenes from a Hat” prompt we used had a clear structure: find two different things and find the surprising link between them.
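As an illustration of the kind of prompt engineering Winters describes, here’s a minimal Python sketch of how such a role-plus-task prompt might be assembled in the message format used by chat-style LLM APIs such as OpenAI’s. The role phrasing is taken from the article; the function name and exact wording of the task line are our own assumptions, since Winters’s full prompt isn’t public.

```python
def build_joke_prompt(suggestion: str, n_jokes: int = 5) -> list[dict]:
    """Assemble a role-plus-task prompt in the chat-message format
    used by chat-style LLM APIs (hypothetical sketch, not Winters's
    actual prompt)."""
    # The "role" stipulation the article says is needed for decent output.
    system = (
        "You are a world-renowned expert in writing jokes and an expert "
        "improvisational comedian who can respond to 'Scenes from a Hat' "
        "suggestions."
    )
    # The concrete, rigidly structured task.
    user = (
        f"Write {n_jokes} short jokes for the 'Scenes from a Hat' "
        f"suggestion: {suggestion!r}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_joke_prompt(
    "Things you can say to your computer but not to your partner"
)
# These messages would then be sent to a chat completion endpoint, e.g.:
# response = client.chat.completions.create(model="gpt-4", messages=messages)
```

The API call itself is shown only as a comment, since it requires an account and key; the point is the split between the “expert” system role and the tightly specified task.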
There’s a kind of beautiful irony in that: In order to get a well-crafted joke out of ChatGPT or any other LLM, you need to break the joke down to its most basic elements and hold the chatbot’s hand through the intricate process of telling it. Or, to put it another way: You need to dissect the dead frog.
What would it look like if the joke were a little more complex—like, say, a joke in a late-night TV show’s opening monologue?
For this, Winters engineered a very precise formula that he uses to prompt monologue jokes about virtually any news topic. He drew inspiration from a structure for monologue joke construction he found in Comedy Writing for Late Night TV by veteran comedy writer Joe Toplyn. The formula includes five steps.
Then ChatGPT puts it all together and, voilà: you have a joke ready for Jimmy Fallon’s cue cards.
Large language models like GPT-4 require a fair bit of prompt engineering to churn out decent quality jokes—something that is often more art than science.
For our example, we chose a headline about Donald Trump’s upcoming indictment. Winters entered the article headline into the prompt, pressed “submit,” and soon we had a joke.
“So I heard that Donald Trump faces several investigations, and we finally know where they stand. It’s funny how his years as a reality TV star never prepared him for the most dramatic plot twist of all. Looks like Trump’s next reality TV project will be called ‘Keeping Up with the Tax Evasions.’”
Just give ChatGPT the Emmy now.
Though the joke might not land on The Tonight Show any time soon, the response is still fairly impressive. It’s clear that ChatGPT can mimic the cadence and basic structure of a monologue joke. While the punchline isn’t laugh-out-loud hilarious, it’s not not funny. It’s more groan-inducing dad joke than late-night fodder—but the humor is there.
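The headline-to-joke pipeline Winters describes can be sketched in the same chat-message style, assuming a chat API like OpenAI’s. The five steps come from Toplyn’s book and aren’t reproduced in the article, so the sketch below takes them as caller-supplied input rather than inventing them; the function name and prompt wording are hypothetical.

```python
def build_monologue_prompt(headline: str, steps: list[str]) -> list[dict]:
    """Assemble a chat prompt that walks a model through a step-by-step
    monologue-joke formula for a given news headline. The caller supplies
    the steps (e.g. the five Winters adapted from Joe Toplyn's 'Comedy
    Writing for Late Night TV', which aren't reproduced here)."""
    # Number the steps so the model can follow them in order.
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(steps, start=1))
    system = (
        "You are an expert late-night TV monologue writer. Follow the "
        "joke-construction steps exactly, then output the finished joke."
    )
    user = f"Headline: {headline}\n\nSteps:\n{numbered}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
```

As with the earlier example, the resulting messages would be sent to a chat completion endpoint; the structure—a rigid, numbered formula the model must follow—is what does the comedic heavy lifting.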
Winters believes this joke-writing ability will only grow more sophisticated as these LLMs evolve with each iteration. Sure, the technology might not be replacing comedians, improvisers, and writers yet—but with each new model, it gets closer to sussing out what makes something funny and how to make humans laugh.
Winters doesn’t believe that comedy writers should be afraid of the bots, either. In fact, he thinks comedians would be doing themselves a disservice if they didn’t embrace GPT-4 as a tool to uplift their work—more of a sounding board for inspiration than a scary robot coming to take their jobs.
In that way, it could actually give comedians an edge on their material—if only they can learn to stop worrying and love the bot.
“Artists feel threatened by it,” Winters said. “But I feel that these kinds of tools are also most powerful in their exact hands.”