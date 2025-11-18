Smartest AI? Musk's New Grok 4.1 Model Beats ChatGPT, Gemini Pro
For the first time since its launch, the new Grok 4.1 has dethroned Gemini 2.5 Pro on the LMArena leaderboard for text-related tasks.
The new 'Grok 4.1 Thinking' launched by Elon Musk's xAI has stolen the spotlight from AI contemporaries like ChatGPT and Gemini after surpassing them on various metrics.
Grok 4.1 Thinking and Grok 4.1 now hold the number one and number two spots, respectively, on the LMArena leaderboard for text-related tasks.
LMArena is a crowdsourced leaderboard for large language models (LLMs) that provides community-driven assessments of model performance across categories such as text generation, coding, and search, based on votes by users.
Grok 4.1 Thinking has also surpassed big names such as OpenAI's ChatGPT and Anthropic's Claude.
Additionally, it has topped the leaderboard for EQ-Bench (Emotional Quotient Bench), which assesses emotional intelligence abilities, understanding, insight, empathy, and interpersonal skills.
Grok 4.1 Thinking and Grok 4.1 were followed by Kimi K2 in the third spot, with Gemini 2.5 Pro and GPT-5 in the fifth and sixth spots, respectively.
On the Creative Writing v3 benchmark, the Grok models took the second and third spots.
That leaderboard, which analyses language models' responses to 32 distinct writing prompts across three iterations, was topped by an early version of ChatGPT 5.1, also known as Polaris Alpha. OpenAI's o3 ranked fourth.
The Beginning Of Grok 4.1 Thinking
The new Grok update started with a silent rollout at the beginning of November. By Nov. 14, the new LLM had been added to the Grok website, X, and the Grok mobile apps. It is now available to all users.
According to xAI, the new model has a hallucination rate of 4.22%, down from 12.09% for Grok 4.0.
Hallucinations in this context refer to false, nonsensical, or inaccurate information generated by AI models.