Google AI Overviews: About last week and improvements

Google Blog:

A couple of weeks ago at Google I/O, we announced that we’d be bringing AI Overviews to everyone in the U.S.

User feedback shows that with AI Overviews, people have higher satisfaction with their search results, and they’re asking longer, more complex questions that they know Google can now help with. They use AI Overviews as a jumping-off point to visit web content, and we see that the clicks to webpages are higher quality — people are more likely to stay on that page, because we’ve done a better job of finding the right info and helpful webpages for them.

In the last week, people on social media have shared some odd and erroneous overviews (along with a very large number of faked screenshots). We know that people trust Google Search to provide accurate information, and they’ve never been shy about pointing out oddities or errors when they come across them — in our rankings or in other Search features. We hold ourselves to a high standard, as do our users, so we expect and appreciate the feedback, and take it seriously.

Given the attention AI Overviews received, we wanted to explain what happened and the steps we’ve taken.

How AI Overviews work

For many years we’ve built features in Search that make it easier for people to find the information they’re looking for as quickly as possible. AI Overviews are designed to take that a step further, helping with more complex questions that might have previously taken multiple searches or follow-ups, while prominently including links to learn more.

AI Overviews work very differently from chatbots and other LLM products that people may have tried out. They’re not simply generating an output based on training data. While AI Overviews are powered by a customized language model, the model is integrated with our core web ranking systems and designed to carry out traditional “search” tasks, like identifying relevant, high-quality results from our index. That’s why AI Overviews don’t just provide text output, but include relevant links so people can explore further. Because accuracy is paramount in Search, AI Overviews are built to only show information that is backed up by top web results.
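To make the distinction concrete, here is a minimal, hypothetical sketch of retrieval-grounded generation — the general technique the paragraph above describes, not Google’s actual pipeline. The WebResult type, search_top_results, and generate_grounded are invented toy stand-ins for a ranking system and a customized model:

```python
# A minimal, hypothetical sketch of retrieval-grounded generation --
# NOT Google's actual pipeline. The ranking system and the language
# model are faked with toy stand-ins so the flow runs end to end.

from dataclasses import dataclass


@dataclass
class WebResult:
    url: str
    snippet: str


def search_top_results(query: str, k: int = 3) -> list[WebResult]:
    # Stand-in for a core web ranking system; a real one would return
    # relevant, high-quality pages from a web index.
    corpus = [
        WebResult("https://example.com/a", "Toy snippet relevant to the query."),
        WebResult("https://example.com/b", "Another relevant toy snippet."),
    ]
    return corpus[:k]


def generate_grounded(results: list[WebResult]) -> str:
    # Stand-in for a customized language model constrained to the
    # retrieved snippets (the grounding step). Here we just stitch the
    # snippets together; a real model would summarize them.
    return " ".join(r.snippet for r in results)


def ai_overview(query: str) -> dict:
    results = search_top_results(query)
    if not results:
        # Nothing to ground on (a "data void"): better to show no
        # overview than to let a model answer from training data alone.
        return {}
    return {
        "overview": generate_grounded(results),
        "links": [r.url for r in results],  # links so people can explore further
    }


print(ai_overview("how do ai overviews work"))
```

The order of operations is the point of the sketch: retrieval happens first, and the model only sees (and links to) what retrieval returned, which is why failures tend to come from bad retrieval or misread sources rather than free-form invention.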

This means that AI Overviews generally don't “hallucinate” or make things up in the ways that other LLM products might. When AI Overviews get it wrong, it’s usually for other reasons: misinterpreting queries, misinterpreting a nuance of language on the web, or not having a lot of great information available. (These are challenges that occur with other Search features too.)

This approach is highly effective. Overall, our tests show that our accuracy rate for AI Overviews is on par with another popular feature in Search — featured snippets — which also uses AI systems to identify and show key info with links to web content.

About those odd results

In addition to designing AI Overviews to optimize for accuracy, we tested the feature extensively before launch. This included robust red-teaming efforts, evaluations with samples of typical user queries and tests on a proportion of search traffic to see how it performed. But there’s nothing quite like having millions of people using the feature with many novel searches. We’ve also seen nonsensical new searches, seemingly aimed at producing erroneous results.

Separately, there have been a large number of faked screenshots shared widely. Some of these faked results have been obvious and silly. Others have implied that we returned dangerous results for topics like leaving dogs in cars, smoking while pregnant, and depression. Those AI Overviews never appeared. So we’d encourage anyone encountering these screenshots to do a search themselves to check.

But some odd, inaccurate or unhelpful AI Overviews certainly did show up. And while these generally appeared for queries that people don’t commonly run, they highlighted some specific areas that we needed to improve.

One area we identified was our ability to interpret nonsensical queries and satirical content. Let’s take a look at an example: “How many rocks should I eat?” Prior to these screenshots going viral, practically no one asked Google that question. You can see that yourself on Google Trends.

There isn't much web content that seriously contemplates that question, either. This is what is often called a “data void” or “information gap,” where there’s a limited amount of high quality content about a topic. However, in this case, there is satirical content on this topic … that also happened to be republished on a geological software provider’s website. So when someone put that question into Search, an AI Overview appeared that faithfully linked to one of the only websites that tackled the question.

In other examples, we saw AI Overviews that featured sarcastic or troll-y content from discussion forums. Forums are often a great source of authentic, first-hand information, but in some cases can lead to less-than-helpful advice, like using glue to get cheese to stick to pizza.

In a small number of cases, we have seen AI Overviews misinterpret language on webpages and present inaccurate information. We worked quickly to address these issues, either through improvements to our algorithms or through established processes to remove responses that don't comply with our policies.

Improvements we've made

As is always the case when we make improvements to Search, we don’t simply “fix” queries one by one; instead, we work on updates that can help broad sets of queries, including new ones that we haven’t seen yet.

From looking at examples from the past couple of weeks, we were able to determine patterns where we didn’t get it right, and we made more than a dozen technical improvements to our systems. Here’s a sample of what we’ve done so far:
  • We built better detection mechanisms for nonsensical queries that shouldn’t show an AI Overview, and limited the inclusion of satire and humor content (a hypothetical sketch of this kind of gating follows this list).
  • We updated our systems to limit the use of user-generated content in responses that could offer misleading advice.
  • We added triggering restrictions for queries where AI Overviews were not proving to be as helpful as intended.
  • For topics like news and health, we already have strong guardrails in place. For example, we aim to not show AI Overviews for hard news topics, where freshness and factuality are important. In the case of health, we launched additional triggering refinements to enhance our quality protections.
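As an illustration only — a hypothetical sketch, not Google’s implementation — “triggering restrictions” of the kind listed above can be thought of as cheap classifier gates that run before any overview is generated. Every function name here is an invented stand-in for what would really be learned classifiers:

```python
# Hypothetical sketch of pre-generation "triggering" gates -- not
# Google's implementation. Every function here is an invented toy
# stand-in for what would really be learned classifiers.


def looks_nonsensical(query: str) -> bool:
    # Stand-in for a nonsense/satire-bait detector.
    return len(query.split()) < 2


def topic_of(query: str) -> str:
    # Stand-in for a topic classifier ("news", "health", "other").
    q = query.lower()
    if "election" in q:
        return "news"
    if "dosage" in q:
        return "health"
    return "other"


def passes_health_quality_bar(query: str) -> bool:
    # Stand-in for additional health-specific quality protections;
    # conservative default: when unsure, show nothing.
    return False


def should_show_overview(query: str) -> bool:
    if looks_nonsensical(query):
        return False  # nonsensical queries get no AI Overview
    topic = topic_of(query)
    if topic == "news":
        return False  # hard news: freshness and factuality come first
    if topic == "health":
        return passes_health_quality_bar(query)
    return True


for q in ["rocks", "election results today", "aspirin dosage",
          "fix wifi on windows 11"]:
    print(f"{q!r} -> overview shown: {should_show_overview(q)}")
```

The conservative defaults are the design choice that matters: when a gate is unsure, the safe behavior is to show no overview and fall back to standard results.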
In addition to these improvements, we’ve been vigilant in monitoring feedback and external reports, and taking action on the small number of AI Overviews that violate content policies. This means overviews that contain information that’s potentially harmful, obscene, or otherwise violative. We found a content policy violation on less than one in every 7 million unique queries on which AI Overviews appeared.

At the scale of the web, with billions of queries coming in every day, there are bound to be some oddities and errors. We’ve learned a lot over the past 25 years about how to build and maintain a high-quality search experience, including how to learn from these errors to make Search better for everyone. We’ll keep improving when and how we show AI Overviews and strengthening our protections, including for edge cases, and we’re very grateful for the ongoing feedback.


Source:

Yeah, I agree with this assessment. Lots of the things I saw were faked, and the question about eating rocks makes total sense. I don't see this as a giant fumble of AI like some did. I saw a lot of fake screenshots that I think were used to damage Google Search's reputation.

Now, it is very true that Google Search has become less useful as ads and other things have been making their way in more and more, but I never see those in my browser. So the search is still fine for me. And that's all I will say on that ;)
 
