Fun fact: pages hit by Site Reputation Abuse manual actions are still eligible to rank prominently in AI Overviews, including for extremely high-volume, valuable "money" keywords. I am even seeing a penalized article (hit on November 19) that moved into position #1 in AIO just today.
It's not new information that Google ignores manual actions in AIO. What is new for me is that these pages can NEWLY appear in AIO citations, despite being penalized, nearly two months later.
My theory is the same as it was when this launched and I saw penalized sites:
The AI Overview is only using the first step of ranking, what we called Linear. It's not running the "twiddlers" and advanced search systems that require ML or more computation, and manual penalties are part of that secondary stage of ranking.
I'm guessing that because they aren't using links, domain authority, or any of those signals in ranking for AI, they don't need to penalize the stuff in there that abuses those signals.
I believe this is just straight relevance signals (cosine similarity, BM25, whatever) for whatever multi-word passage it extracts from the AI answer and runs a search for, which is basically how the first layers of ranking in web search work too.
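To make the theory concrete, here is a toy sketch of a two-stage ranker. Everything in it is hypothetical: the `Page` fields, the `demote_penalized` twiddler, and the example URLs are made up for illustration and are not Google's actual systems. It only shows what happens if a consumer reads the first-stage order and skips the second stage.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    relevance: float     # stand-in for a first-stage relevance score
    manual_action: bool  # True if the page carries a manual penalty


def first_stage(pages):
    """Order candidates by base relevance only (no penalties applied)."""
    return sorted(pages, key=lambda p: p.relevance, reverse=True)


def demote_penalized(ranked):
    """Hypothetical twiddler: push pages with a manual action to the end.

    sorted() is stable, so the relative order inside each group is kept.
    """
    return sorted(ranked, key=lambda p: p.manual_action)


pages = [
    Page("penalized.example/best-cards", 0.92, True),
    Page("clean.example/card-guide", 0.88, False),
    Page("other.example/cards", 0.75, False),
]

# What a first-stage-only consumer (the AIO, per this theory) would see.
aio_order = [p.url for p in first_stage(pages)]

# What regular web search would see after the second stage runs.
web_order = [p.url for p in demote_penalized(first_stage(pages))]
```

In this toy setup the penalized page leads `aio_order` but drops to the bottom of `web_order`, which is the behavior described above.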
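For anyone who hasn't seen BM25 spelled out, here is a minimal pure-Python sketch of the standard Okapi BM25 formula. The documents, the query, and the `k1`/`b` defaults are illustrative assumptions, and whether Google uses anything like this for AIO citations is my guess, not a confirmed fact. The point is that the score depends only on term overlap, with no link or penalty input anywhere.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in tokenized for term in set(d))

    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1; b controls length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "best travel credit cards with no annual fee",
    "how to bake sourdough bread at home",
    "credit score basics for beginners",
]
scores = bm25_scores("best travel credit cards", docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

Here `best` points at the first document because it contains every query term, the third gets a small score for sharing "credit", and the bread page scores zero.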
I wonder if another issue is that it pulls in the information first and finds 85% relevance to the site, but can't realistically link to any other site, and it would take too long to rewrite the answer without those sites (and they aren't pre-processing to exclude those sites and data).
So they backdoor into it, because they don't really know which sites fed the AI. It doesn't work like that. So, according to the patent:
The AI generates an answer. The website snippets/text that rank are part of the prompt, and so are websites that rank for queries related to the user query. Then, once an answer is generated, they use tech similar to passage BERT to extract "passages" from the generated summary. They then find sites that rank for those passages and cite them.
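Here is a rough sketch of that "generate first, cite afterward" flow. It is heavily simplified and the details are my assumptions: sentences stand in for extracted passages, bag-of-words cosine similarity stands in for whatever learned passage model is actually used, and `cite_passages`, the `threshold` value, and the example pages are all hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (a stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cite_passages(summary, pages, threshold=0.3):
    """For each sentence of the generated summary, cite the best-matching page."""
    passages = [s.strip() for s in re.split(r"(?<=[.!?])\s+", summary) if s.strip()]
    citations = []
    for passage in passages:
        pv = vectorize(passage)
        url, score = max(
            ((u, cosine(pv, vectorize(text))) for u, text in pages.items()),
            key=lambda pair: pair[1],
        )
        # Note: nothing here looks at penalties, only at text similarity.
        if score >= threshold:
            citations.append((passage, url))
    return citations


pages = {
    "penalized.example/cards": "The best travel cards waive the annual fee in the first year.",
    "clean.example/bread": "Sourdough bread needs a long, slow fermentation.",
}
summary = (
    "The best travel cards waive the annual fee in year one. "
    "Sourdough needs slow fermentation."
)
citations = cite_passages(summary, pages)
cited_urls = [url for _, url in citations]
```

In this sketch the penalized page gets cited for the first sentence simply because its text matches best. A citation step built this way has no place where a manual action could intervene unless someone adds a filter on purpose.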