Smartest AI? Musk's New Grok 4.1 Model Beats ChatGPT, Gemini Pro

For the first time since its launch, the new Grok 4.1 has dethroned Gemini 2.5 Pro on the LMArena leaderboard for text-related tasks.

Grok tops the EQ benchmark, beats peers. (Source: xAI)

The new 'Grok 4.1 Thinking' launched by Elon Musk's xAI has stolen the spotlight from AI contemporaries like ChatGPT and Gemini after surpassing them on various metrics.

For the first time since its launch, the new Grok 4.1 has dethroned Gemini 2.5 Pro on the LMArena leaderboard for text-related tasks. Grok 4.1 Thinking and Grok 4.1 have taken the number one and number two spots, respectively.

LMArena is a crowdsourced leaderboard for large language models. It provides a community-driven assessment of LLM performance, based on user votes, across categories such as text generation, coding and search.

Grok 4.1 Thinking has also surpassed big names such as OpenAI's ChatGPT and Anthropic's Claude.

Additionally, it has once again topped the EQ (Emotional Quotient) Bench of the LMArena leaderboard, which assesses emotional intelligence abilities, understanding, insight, empathy and interpersonal skills.

Grok 4.1 Thinking and Grok 4.1 were followed by Kimi K2 in third place, with Gemini 2.5 Pro and GPT 5 in fifth and sixth place, respectively.

On the Creative Writing v3 benchmark, the Grok models took the second and third spots.

The leaderboard for that benchmark, which analyses language models' responses to 32 distinct writing prompts across three iterations, was topped by an early version of ChatGPT 5.1, also known as Polaris Alpha. OpenAI's o3 ranked fourth.

The Beginning Of Grok 4.1 Thinking 

The new Grok update began with a silent rollout at the beginning of November. By Nov. 14, the new LLM had been added to the Grok website, X and the Grok mobile apps. It is now available to all users.

According to xAI, the new model has a hallucination rate of 4.22%, down from 12.09% for Grok 4.0.

Hallucinations in this context refer to false, nonsensical or inaccurate information generated by AI models.

WRITTEN BY
Khushi Maheshwari