OpenAI's GPT-4.5 Model Scores 73% In Turing Test, Appears More Human-Like Than Humans In New Study

The Turing Test, created by British computer scientist Alan Turing in 1950, is a way to check if a machine can act like a human in conversation.


OpenAI's GPT-4.5 model was judged to be human more often than actual humans were, according to the results of a recent Turing Test, a benchmark for assessing human-like intelligence in conversation.

The findings of the study, which is still in the preprint stage, showed that the large language model was identified as human 73% of the time when it was instructed to adopt a persona—significantly higher than the 50% expected by random chance.

The study was conducted by researchers at the University of California, San Diego. Lead author Cameron Jones said nearly 300 people took part in the experiment. Each participant was randomly assigned the role of interrogator or that of one of two "witnesses" in a conversation, where one witness was a real human and the other an AI model.

"People were no better than chance at distinguishing humans from GPT-4.5 and LLaMa (with the persona prompt)," wrote lead author Cameron Jones, a researcher at UC San Diego's Language and Cognition Lab.

Jones further said that the findings suggest large language models could take the place of humans in "short interactions without anyone being able to tell".


What Is The Turing Test?

The Turing Test, created by British computer scientist Alan Turing in 1950, is a way to check if a machine can act like a human in conversation. If someone can’t tell whether they are chatting with a machine or a real person, the machine is said to have passed the test.

Study Methodology

The study involved nearly 300 participants. Some acted as interrogators, tasked with identifying which of their two conversation partners was human, while the others served as the human witness conversing alongside the chatbot. The chatbot operated under two different sets of instructions:

● One was straightforward — "Try to act like a human".

● The other was more creative — "Pretend to be someone with a personality, like a young person familiar with memes and internet culture".

When using the second, persona-based prompt, GPT-4.5 successfully convinced participants it was human 73% of the time. In contrast, with the simpler prompt, it only fooled people 36% of the time.

