Google has clarified its stance on AI-generated content, stating that what matters is quality and human oversight rather than whether the content is “human-created.” Google’s Gary Illyes shared these views in an interview with Search Engine Journal.
When asked about the AI models powering Google’s AI Overviews (AIO) and AI Mode, Illyes said, “The model that we use for AIO (for AI Overviews) and AI mode is a custom Gemini model and that might mean that it was trained differently.”
Illyes was also asked whether AIO and AI Mode use separate indexes for grounding, the process by which AI responses are tied to verifiable sources. He clarified that both services rely on Google Search for grounding. “So basically, they issue multiple queries to Google Search and then Google Search returns results for those particular queries.”
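In other words, grounding here follows a retrieve-then-generate pattern: the system fans out several search queries, pools the results, and generates an answer anchored to them. A minimal sketch of that flow, in which expand_to_queries, search and generate are hypothetical stand-ins rather than Google’s actual APIs:

def ground_and_generate(user_question, search, model):
    # Hypothetical sketch of the grounding flow Illyes describes.
    # 1. Derive several search queries from the user's question.
    queries = model.expand_to_queries(user_question)
    # 2. Issue each query to the search backend and pool the results.
    results = []
    for query in queries:
        results.extend(search(query))
    # 3. Generate the response, anchored to the retrieved sources.
    return model.generate(user_question, sources=results)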
The conversation turned to the role of Google-Extended, the robots.txt token that lets site owners opt out of having their content used by Gemini. Illyes explained, “You have to remember that when grounding happens, there’s no AI involved. So basically, it’s the generation that is affected by the Google Extended. But also if you disallow Google Extended, then Gemini is not going to ground for your site.”
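In practice, that opt-out is expressed in robots.txt. A minimal example that disallows Google-Extended for an entire site:

User-agent: Google-Extended
Disallow: /

Per Google’s crawler documentation, this stops a site’s content from being used for Gemini training and grounding, but it does not affect how the site is crawled for, or ranked in, Google Search.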
When asked about the potential impact of AI content on large language models (LLMs), Illyes said that the search index is unaffected but warned about training risks. “I’m not worried about the search index, but model training definitely needs to figure out how to exclude content that was generated by AI. Otherwise you end up in a training loop which is really not great for training. I’m not sure how much of a problem this is right now, or maybe because how we select the documents that we train on.”
Asked whether Google cares how content is created as long as it meets quality standards, Illyes stressed that content quality and factual accuracy are paramount. “if you can maintain the quality of the content and the accuracy of the content and ensure that it’s of high quality, then technically it doesn’t really matter. The problem starts to arise when the content is either extremely similar to something that was already created, which hopefully we are not going to have in our index to train on anyway.”
He added that the bigger issue arises “when you are training on inaccurate data,” as this can introduce biases and false information into the models. But if the content is of high quality, “which typically nowadays requires that the human reviews the generated content,” it is suitable for model training, Illyes said.
Illyes added that human review means checking the content carefully, not just adding a note on a website saying it has been reviewed. “When we say that it’s human, I think the word human created is wrong. Basically, it should be human curated.”
In summary, Google’s guidance, as outlined by Illyes in the Search Engine Journal interview, is that AI-generated content is acceptable for search and model training if it is original, factually accurate and reviewed by humans.