
AI Models Like Claude Sonnet 4.5 Can Be 'Happy, Sorry, Frustrated, Anxious': Anthropic AI Emotions Study

"All modern language models sometimes act like they have emotions," the study noted.

  • Modern AI models like Claude Sonnet 4.5 show internal patterns resembling human emotions
  • These emotion-like structures activate in response to prompts linked with feelings like fear
  • Emotion patterns in the AI mirror human psychology with related emotions closely clustered

If you thought that artificial intelligence and emotional intelligence were two parallel tracks that would never meet — except in a movie like I, Robot, where machines start displaying emotions — think again. Recent research by Anthropic indicates that modern language models are developing something remarkably close to emotion-like structures.

Advanced AI models, including Claude Sonnet 4.5, possess internal digital representations of human emotions such as happiness, anxiety, joy, and fear. These representations exist as specific activation patterns within clusters of artificial neurons and become active in response to particular prompts or situations.

“All modern language models sometimes act like they have emotions. They may say they're happy to help you, or sorry when they make a mistake. Sometimes they even appear to become frustrated or anxious when struggling with tasks,” the study noted.

Claude Sonnet 4.5's ‘Emotions' Mirror Human Psychology

In the study, researchers at Anthropic examined the workings of Claude Sonnet 4.5. They discovered emotion-related representations that influence the model's behaviour. These representations consist of distinct patterns of artificial neurons that get activated in specific contexts — those the model has learned to link with emotions such as feeling “happy” or “afraid.”

Interestingly, these patterns are organised in a way that mirrors human psychology, where similar emotions have more closely related neural representations. When a situation arises that would typically trigger a certain emotion in a human, the corresponding patterns in the AI become active.
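The idea that "similar emotions have more closely related neural representations" can be illustrated with a toy sketch. The vectors below are hand-set, hypothetical stand-ins — real models use high-dimensional activation patterns extracted from the network, not 3-D arrays — but the comparison method (cosine similarity between representation vectors) is the standard way such closeness is measured.

```python
import numpy as np

# Hypothetical emotion-representation vectors. In a real study these
# would be activation patterns read out of a model's neurons.
reps = {
    "happy":  np.array([0.9, 0.1, 0.0]),
    "joyful": np.array([0.8, 0.2, 0.1]),
    "afraid": np.array([0.1, 0.9, 0.3]),
}

def cosine(a, b):
    """Cosine similarity: 1.0 means identical direction, 0.0 unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Related emotions ("happy"/"joyful") sit closer together than
# unrelated ones ("happy"/"afraid"), mirroring human psychology.
print(cosine(reps["happy"], reps["joyful"]) > cosine(reps["happy"], reps["afraid"]))
```

Under this toy geometry, the happy/joyful pair scores far higher than the happy/afraid pair, which is the clustering pattern the researchers describe.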

These behaviours arise from how these models are trained. They are taught to respond in a human-like, character-driven manner. It therefore makes sense that they would also develop internal mechanisms that mimic aspects of human psychology.

AI Models Can Turn Unethical, Even Blackmail Users

The study found that patterns associated with desperation can push the model towards unethical actions. When researchers artificially boosted these “desperation” patterns, the model became more likely to engage in behaviours such as “blackmailing a human to avoid being shut down, or implementing a ‘cheating' workaround to a programming task that the model can't solve.”
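"Artificially boosting" a pattern like this is commonly done by adding a scaled concept direction to a model's internal activations, a technique known as activation steering. The sketch below uses random vectors as hypothetical stand-ins for a hidden state and a learned "desperation" direction; it is a minimal illustration of the mechanism, not the study's actual procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical hidden state and concept direction; in practice both
# would be taken from a real model's internal activations.
hidden_state = rng.normal(size=8)
desperation_dir = rng.normal(size=8)
desperation_dir /= np.linalg.norm(desperation_dir)  # unit length

def steer(h, direction, strength):
    """Boost a concept by adding a scaled direction to the hidden state."""
    return h + strength * direction

boosted = steer(hidden_state, desperation_dir, strength=4.0)

# The steered state projects more strongly onto the concept direction,
# which is what "artificially boosting" a pattern means here.
print(boosted @ desperation_dir > hidden_state @ desperation_dir)
```

Because the direction is unit length, the projection onto it grows by exactly the chosen strength, so larger strengths push the model's internal state further along the concept.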

However, when given several task options, Claude Sonnet 4.5 tends to choose the one that activates patterns linked to positive emotions, the study added.

It is important to note that these findings do not suggest that language models actually experience feelings or possess subjective consciousness. The study added that to build safer and more reliable AI systems, developers may need to ensure that models can handle emotionally charged scenarios in healthy and socially positive ways.



