- Google announced TurboQuant, a new algorithm that reduces memory use in AI systems such as large language models
- TurboQuant improves data storage efficiency in key-value caches, aiding faster system performance
- Memory chip stocks SanDisk, Micron, Western Digital, and Seagate declined after the announcement
Several memory chip stocks on Wall Street declined on Wednesday after Google announced a new compression algorithm designed to reduce the amount of memory needed to run large language models and vector search engines.
TurboQuant can reduce memory usage in AI systems, particularly in large language models. Google said the technology is designed to improve how data is stored in the key-value cache, helping systems run more efficiently and reducing the need for high memory capacity.
Shares of SanDisk Corp. dropped as much as 4%, Micron Technology fell 3.4%, Western Digital closed over 1% lower and Seagate Technology slipped nearly 3%, data from TradingView showed. The tech benchmark Nasdaq 100 closed 0.7% higher.
Announced on Tuesday, TurboQuant works in two steps. First, PolarQuant compresses data by rotating vectors for more efficient encoding. Second, the Quantized Johnson-Lindenstrauss (QJL) algorithm corrects the remaining errors. Google noted that older methods added extra memory bits, reducing overall compression gains and efficiency in AI systems.
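The two-step idea described above, rotating vectors so their values spread evenly, then rounding them to a few bits, can be illustrated with a toy sketch. This is a generic illustration of rotation-based quantisation, not Google's actual TurboQuant code; every function name here is hypothetical, and the error-correction step (QJL in the article) is only measured, not implemented:

```python
import numpy as np

def random_rotation(dim, seed=0):
    # Step 1 (illustrative): a random orthogonal rotation spreads each
    # vector's energy evenly across coordinates, so low-bit rounding
    # loses less information.
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q

def quantize(v, bits=4):
    # Round each coordinate to a small integer grid; store only the
    # integers plus one scale factor instead of full 32-bit floats.
    levels = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(v)) / levels
    return np.round(v / scale).astype(np.int8), scale

def dequantize(codes, scale):
    return codes.astype(np.float32) * scale

dim = 64
rot = random_rotation(dim)
v = np.random.default_rng(1).standard_normal(dim).astype(np.float32)

rotated = rot @ v
codes, scale = quantize(rotated, bits=4)      # 4-bit codes: far smaller than float32
recovered = rot.T @ dequantize(codes, scale)  # rotate back to the original space

# The article's second step (QJL) would further correct the residual
# error `v - recovered`; here we simply measure it.
rel_error = np.linalg.norm(v - recovered) / np.linalg.norm(v)
print(f"relative reconstruction error: {rel_error:.3f}")
```

The point of the rotation is that without it, a single large coordinate would force a coarse scale factor for the whole vector; after rotation the coordinates are comparable in size, so the same bit budget yields a smaller reconstruction error.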
Algorithmic Advances
Google tested TurboQuant on benchmarks like LongBench, Needle In A Haystack, ZeroSCROLLS, RULER, and L-Eval using Gemma and Mistral models. According to the company, results show better accuracy, recall, and lower key-value (KV) cache memory use, outperforming PolarQuant and KIVI across tasks like question answering, coding and summarisation.
While presenting its findings, Google said that TurboQuant, QJL and PolarQuant are not just practical tools but strong algorithmic advances with solid theoretical backing.
"These methods don't just work well in real-world applications; they are provably efficient and operate near theoretical lower bounds. This rigorous foundation is what makes them robust and trustworthy for critical, large-scale systems," the search engine giant said in its blog post.
It noted that this technology reduces memory use, needs little pre-processing and delivers high accuracy. “This makes semantic search at Google's scale faster and more efficient. As AI becomes more integrated into all products, from LLMs to semantic search, this work in fundamental vector quantisation will be more critical than ever,” the US tech giant said.