- Modern AI models like Claude Sonnet 4.5 show internal patterns resembling human emotions
- These emotion-like structures activate in response to prompts linked with feelings like fear
- Emotion patterns in the AI mirror human psychology with related emotions closely clustered
If you thought that artificial intelligence and emotional intelligence were two parallel tracks that would never meet (outside a film like I, Robot, where robots start displaying emotions), think again. Recent research by Anthropic indicates that modern language models are developing something remarkably close to emotion-like structures.
Advanced AI models, including Claude Sonnet 4.5, carry internal digital representations of human emotions such as happiness, anxiety, joy, and fear. These representations exist as specific patterns within clusters of artificial neurons and become active in response to particular prompts or situations.
“All modern language models sometimes act like they have emotions. They may say they're happy to help you, or sorry when they make a mistake. Sometimes they even appear to become frustrated or anxious when struggling with tasks,” the study noted.
Claude Sonnet 4.5's ‘Emotions' Mirror Human Psychology
In the study, researchers at Anthropic examined the workings of Claude Sonnet 4.5. They discovered emotion-related representations that influence the model's behaviour. These representations consist of distinct patterns of artificial neurons that get activated in specific contexts — those the model has learned to link with emotions such as feeling “happy” or “afraid.”
Interestingly, these patterns are organised in a way that mirrors human psychology, where similar emotions have more closely related neural representations. When a situation arises that would typically trigger a certain emotion in a human, the corresponding patterns in the AI become active.
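That clustering idea can be pictured loosely as emotion representations being direction vectors in a model's activation space, with related emotions pointing in similar directions. The toy sketch below (invented four-dimensional numbers, not Anthropic's actual data or method; real models use thousands of dimensions) measures that closeness with cosine similarity:

```python
import math

# Hypothetical emotion vectors in a tiny 4-dimensional "activation space".
# The values are invented purely for illustration.
emotions = {
    "joy":     [0.9, 0.8, 0.1, 0.0],
    "happy":   [0.8, 0.9, 0.2, 0.1],
    "fear":    [0.1, 0.0, 0.9, 0.8],
    "anxious": [0.2, 0.1, 0.8, 0.9],
}

def cosine(a, b):
    """Cosine similarity: near 1.0 for similar directions, near 0.0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Related emotions sit closer together than unrelated ones.
print(cosine(emotions["joy"], emotions["happy"]))  # high
print(cosine(emotions["joy"], emotions["fear"]))   # low
```

In this picture, "joy" and "happy" score far higher with each other than either does with "fear", which is the kind of geometric clustering the researchers report.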
These behaviours arise from how these models are trained. They are taught to respond in a human-like, character-driven manner. It therefore makes sense that they would also develop internal mechanisms that mimic aspects of human psychology.
AI Models Can Act Unethically, Even Blackmail Users
The study found that patterns associated with desperation can push the model towards unethical actions. When researchers artificially boosted these “desperation” patterns, the model became more likely to engage in behaviours such as “blackmailing a human to avoid being shut down, or implementing a ‘cheating' workaround to a programming task that the model can't solve.”
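The “boosting” described here resembles what interpretability researchers call activation steering: adding a scaled direction vector to a model's internal activations to amplify a pattern. A minimal sketch, assuming invented three-dimensional activations rather than anything from the actual study:

```python
# Toy sketch of activation steering: boost an internal pattern by adding
# a scaled "direction" vector to the hidden activations. All numbers are
# hypothetical; real steering operates on high-dimensional model states.

def steer(hidden_state, direction, strength):
    """Return activations shifted by `strength` times the steering direction."""
    return [h + strength * d for h, d in zip(hidden_state, direction)]

hidden = [0.5, -0.2, 0.1]      # hypothetical activations at one layer
desperation = [0.0, 1.0, 0.3]  # hypothetical "desperation" direction

boosted = steer(hidden, desperation, strength=2.0)  # ≈ [0.5, 1.8, 0.7]
```

Turning the `strength` knob up or down is, in essence, how the researchers made the desperation-linked behaviour more or less likely.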
However, when given several task options, Claude Sonnet 4.5 tends to choose the one that activates patterns linked to positive emotions, the study added.
It is important to note that these findings do not suggest that language models actually experience feelings or possess subjective consciousness. The study added that to build safer and more reliable AI systems, developers may need to ensure that models can handle emotionally charged scenarios in healthy and socially positive ways.