Scientifically Speaking: When machines become poets
The simplicity of the AI poems, free from messy metaphors or complex allusions common in human poetry, seemed to appeal to many readers.
I want to start off this science column in a slightly different way this week, with a poem. Please bear with me, because we will get to science very soon. Here’s a short poem that seems like it was written by the American poet Sylvia Plath.
The air is thick with tension,
My mind a tangled mess.
The weight of my emotions
Is heavy on my chest.
The darkness creeps upon me,
A suffocating cloak.
The world outside is cruel and cold,
And I’m a fragile, broken yolk.
My thoughts are spinning wildly,
A cyclone in my brain.
I try to grasp at something solid,
But all is lost in vain.
The voices in my head,
They never cease to scream.
And though I try to shut them out,
They haunt me like a dream.
So here I am, alone and lost,
A ship without a sail.
In this world of pain and sorrow,
I am but a mere wail.
The poem is written in Plath’s characteristic confessional style with striking imagery. For example, “a fragile, broken yolk” conveys vulnerability with an unexpected, startling intimacy reminiscent of Plath’s poetry.
You wouldn’t be in the minority if you thought the poem was written by Plath. But here’s the twist: this poem wasn’t written by Sylvia Plath or any other human poet. It was created by OpenAI’s GPT-3.5, an artificial intelligence model that can mimic language and tone, including Plath’s emotionally charged voice.
Researchers Brian Porter and Edouard Machery from the University of Pittsburgh used AI to mimic Plath, William Shakespeare, Walt Whitman, and other poets to test whether people could tell the difference between poems written by people and those written by AI imitating them. Their findings appeared in a recent study in Scientific Reports.
Given that I’m writing about this paper in my science column, you might have guessed the outcome already. People couldn’t reliably tell the difference between poems written by human poets and those created by AI. What’s more, they often believed that the AI-generated poems were human-authored, and rated them as more beautiful and rhythmic than poems from real poets!
So, what did Porter and Machery actually test? First, they presented over 1,600 participants with a mix of poems from ten notable poets. Half of the poems were genuine; the other half were generated by GPT-3.5, designed to capture each poet’s distinctive style. The participants were tasked with identifying which poems were human-authored and which were AI-generated. Surprisingly, they identified the source correctly only 46.6% of the time, which is slightly worse than the 50% expected from random guessing. Many participants found the AI poems so convincingly “human” that they assumed they were real, particularly because these verses tended to be simpler and easier to understand.
In a second experiment, the researchers had another group of around 700 participants rate the poems on qualities such as rhythm, imagery, emotional depth, and sound. Some participants were told the poems were human-written, others were told they were AI-generated, and a third group was given no information on authorship. Predictably, participants rated poems labeled as human higher, even if they were actually written by AI.
But when the participants didn’t know the origin, AI poems often outscored their human counterparts, especially on qualities like rhythm and accessibility. The simplicity of the AI poems, free from messy metaphors or complex allusions common in human poetry, seemed to appeal to many readers.
While human poets, like Plath, pour personal struggles and life experiences into their art, AI doesn’t feel the weight of sorrow or joy, or anything at all. It reassembles language patterns based on probabilities.
So, is it game over for poets, philosophers, artists, and their ilk? Certainly, the ability of AI to produce convincing poetry raises philosophical questions about creativity and authenticity. But Plath didn’t choose her words for maximum popularity; she wrote to express her pain and her doubts. Every poem she penned was shaped by a lifetime of experiences, emotional turmoil, and creative struggle. In contrast, the AI-generated poem above is an arrangement of learned patterns, an echo without a soul.
This study was conducted with GPT-3.5, and we’re now well into the era of GPT-4, with even more advanced models on the horizon. AI will keep improving at capturing the subtleties of human tone, rhythm, and style. But for human poets, the struggle with language will remain. And that’s kind of the point. We will continue to wrestle with “the clay of language,” shaping raw emotion into words through a process that is vulnerable and unpredictable.
I’m a scientist, but I also read and write poetry. I recall John Keating’s memorable line in Dead Poets Society, “We don’t read and write poetry because it’s cute. We read and write poetry because we are members of the human race. And the human race is filled with passion. And medicine, law, business, engineering, these are noble pursuits and necessary to sustain life. But poetry, beauty, romance, love, these are what we stay alive for.”
Human poetry is irreplaceable because art can never be separated from the artist.
Anirban Mahapatra is a scientist and author, most recently of the popular science book, When The Drugs Don’t Work: The Hidden Pandemic That Could End Medicine. The views expressed are personal.