
The Best Of


Noah
Dec 25, 2024, 7:05 AM
Forwarded from another channel:
Ryan Mendenhall
Dec 17, 2024, 2:10 PM
How are our traditional web analytics preparing for (or already prepped for) the explosion of AI agents of all sorts that are now crawling the web, including LLMs for training and live search? Can we filter for this somehow? See its impact on site traffic?
Dana DiTomaso
Dec 17, 2024, 2:50 PM
I'm working on some JavaScript to try to detect AI agents, hopefully will have something to report in January.
Ryan Mendenhall
Dec 17, 2024, 4:55 PM
So curious to see this @Dana DiTomaso! I'm even more curious to see how we're going to identify agents that aren't operating above board and are TRYING to hide their tracks and mimic authentic user behavior with legit and varied user-agents, devices, locations, etc. In the past I've had to filter out based on the most random kind of patterns, like screen resolution, time on page, or browser version.
Mika Lepistö
Dec 17, 2024, 5:20 PM
Super curious about this, especially given the speed at which they may interact with the site. That could also have firewall and other implications. Seems like log analysis might be more reliable than GA, at least from a pageview/interaction speed perspective.
Dana DiTomaso
Dec 17, 2024, 7:35 PM
This is JS that actually runs on the site and detects use patterns, not GA. This is what to use to track AI tool traffic in GA4:
Wondering how much traffic AI tools like ChatGPT or Claude bring to your website? Learn how to create a custom channel in GA4 to highlight AI-driven traffic and uncover hidden insights in your data.
Kick Point Playbook: How to Report on Traffic from AI Tools in GA4
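To make the custom channel idea concrete, here is a minimal Python sketch of the same kind of rule applied outside GA4: match the referrer host against a pattern and label the session. It is not the method from the linked playbook and not a GA4 API. The host list, the channel names, and the function name `classify_referrer` are assumptions for illustration, so swap in whatever referrers actually show up in your own reports.

```python
import re
from urllib.parse import urlparse

# Assumed list of AI tool referrer hosts (illustrative, not exhaustive).
AI_REFERRER_PATTERN = re.compile(
    r"(^|\.)(chatgpt\.com|chat\.openai\.com|claude\.ai|perplexity\.ai"
    r"|gemini\.google\.com|copilot\.microsoft\.com)$"
)


def classify_referrer(referrer_url: str) -> str:
    """Return a coarse channel label for a session's referrer URL."""
    host = (urlparse(referrer_url).hostname or "").lower()
    if not host:
        # No referrer at all: treat as direct traffic.
        return "Direct"
    if AI_REFERRER_PATTERN.search(host):
        # Host is one of the assumed AI tools, or a subdomain of one.
        return "AI Tools"
    return "Other"
```

The pattern is anchored on a leading dot or start of string so that a look-alike host such as `notclaude.ai` is not counted as an AI tool.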
Nico Brooks
Dec 17, 2024, 9:33 PM
I'm seeing more bot traffic in general. Some of it might be agents, but I also think there's increased activity because anyone with access to an AI prompt can write a crawler. I think Cloudflare Bot Management, Google reCAPTCHA and the like are about to get a LOT more popular.
Alex Wilson
Dec 18, 2024, 9:01 AM
The real question is, does this traffic skew the data?
MIchael Buckbee
Dec 18, 2024, 9:39 AM
I've done a ton of work on this from both sides and:
• In general for crawling, almost all the AI bots are still doing HTML scraping: they are not executing JavaScript and not loading things like the image "pixels" that are used for tracking. They aren't going to show up in most marketing analytics tools like PostHog, GA, Plausible, etc., in the same way that you don't see the Ahrefs, Semrush, and Majestic site crawlers in those tools.
• There has been a big uptick in bot traffic (scrapers) in many cases, as the AI companies are in a frenzied land grab, trying to get unadulterated, pre-regurgitated data to build their foundational models on.
• While you _can_ compare referrals in GA for things like AI traffic, that tends to really under-report the impact of AI search ranking in ChatGPT, etc. Behaviorally, people use the AI for research and to decide what to do, then do a navigational or brand search in Chrome, so the visit is reported as "organic".
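Since crawlers that skip JavaScript never reach the analytics tools, the server log is where they can be counted. The sketch below is one rough way to do that in Python, assuming a combined-format access log where the user-agent is the last double-quoted field. The token list is an assumption (a handful of self-declared AI crawler names) and will miss anything that hides its identity; `count_ai_crawler_hits` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed user-agent tokens for some self-identifying AI crawlers.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot", "CCBot", "Bytespider")

# In the combined log format, the user-agent is the last quoted field on the line.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')


def count_ai_crawler_hits(log_lines):
    """Count requests per AI crawler token across an iterable of log lines."""
    counts = Counter()
    for line in log_lines:
        match = UA_FIELD.search(line)
        if not match:
            continue  # line is not in the expected format
        user_agent = match.group(1).lower()
        for token in AI_CRAWLER_TOKENS:
            if token.lower() in user_agent:
                counts[token] += 1
                break  # count each request once
    return counts
```

Typical use would be `count_ai_crawler_hits(open("access.log"))`, then comparing the totals with overall request volume for the same period.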
Alex Wilson
Dec 18, 2024, 10:32 AM
Generally I want to look at the historical data. Are we seeing a significant issue that may indicate sessions are being inflated, causing the conversion numbers and such to drop? I am not sure how useful it is to spend a bunch of time trying to fix an issue that may or may not cause problems. I could be convinced otherwise though. It's the same issue with internal traffic. If you have a very small website, internal traffic may be an issue. If you have a site getting 10k sessions per day, odds are internal traffic isn't causing any skew in the data.
MIchael Buckbee
Dec 18, 2024, 10:39 AM
Well, if you're concerned that AI bots are inflating numbers but not converting, I'd put that risk as very low.
Nico Brooks
Dec 18, 2024, 10:39 AM
@Alex Wilson absolutely, yes. For example, with one client I did a meticulous job of analyzing and scrubbing traffic that was categorized as direct and more than half was non-human in my estimation. What's important and thought-provoking about @Ryan Mendenhall’s question is that it is not about the AI platforms themselves, which are following protocols and generally filtered from analytics tools due to being on the , and generally not executing tracking scripts as @MIchael Buckbee says. What Ryan is pointing out is that with a minimal amount of coding knowledge, it is now possible to build an agent to monitor content on websites and deliver insights about pricing, content, availabilities or whatever else you or your clients care about.
MIchael Buckbee
Dec 18, 2024, 10:39 AM
The flip side of trying to filter them out is that excluding them is like blocking Googlebot: if you do that, you're not going to get your site indexed by them, and then you're really hurting your chances of ranking.
Nico Brooks
Dec 18, 2024, 10:42 AM
Ryan's question jibes with what I'm seeing - low-key bots that are coming from GCP or AWS addresses, not behaving maliciously, just scraping data for some purpose.
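One way to flag traffic from cloud provider addresses is a CIDR membership check, sketched below with Python's standard `ipaddress` module. The ranges here are placeholders from the reserved documentation blocks, not real GCP or AWS ranges; AWS and Google Cloud each publish machine-readable lists of their address ranges, and you would load those instead. The provider labels and `cloud_provider_for` are hypothetical names.

```python
import ipaddress

# PLACEHOLDER ranges (reserved documentation blocks), not real cloud ranges.
# Replace with the ranges published by the providers you care about.
CLOUD_RANGES = {
    "example-cloud-a": ["192.0.2.0/24"],
    "example-cloud-b": ["198.51.100.0/24", "2001:db8::/32"],
}


def cloud_provider_for(ip, ranges=CLOUD_RANGES):
    """Return the provider label whose range contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for provider, cidrs in ranges.items():
        for cidr in cidrs:
            network = ipaddress.ip_network(cidr)
            # Only compare addresses and networks of the same IP version.
            if addr.version == network.version and addr in network:
                return provider
    return None
```

A hit only says the request came from hosted infrastructure. It does not say whether the client was a bot, a VPN user, or a corporate proxy, so it works better as one signal among several than as a filter on its own.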
MIchael Buckbee
Dec 18, 2024, 10:42 AM
So +1 to that @Nico Brooks - I've definitely seen a rise in weird one-off scrapers and things that look like either homegrown scripts, academic research or hacky stuff. _Mostly_ those also don't execute JS, but that may change.
Nico Brooks
Dec 18, 2024, 10:54 AM
@MIchael Buckbee yeah, I see both. This is where I think they may just be bots written by inexperienced developers in some cases and not agents. I see a fair amount from slightly outdated Chrome browsers running on Windows 10 (likely a virtual machine), which makes me think it's just Puppeteer or something similar that is requesting all assets associated with a page. I mostly care about what makes it into analytics, but when I _have_ looked at the web server log files it makes me think I'm just seeing the tip of the iceberg in analytics - sometimes it's hard to find requests that look human in the log files.
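The "slightly outdated Chrome on Windows 10" pattern can be turned into a crude heuristic. The Python sketch below flags user-agents that claim Windows 10 and a Chrome major version lagging the current one by more than a threshold. The function name, the `max_lag` default, and the idea of passing `current_major` in by hand are all assumptions, and plenty of real people run old browsers, so treat a match as a hint and not proof.

```python
import re

# Captures the major version from a "Chrome/NNN." token in the user-agent.
CHROME_VERSION = re.compile(r"Chrome/(\d+)\.")


def looks_like_stale_chrome_on_windows(user_agent, current_major, max_lag=6):
    """True if the UA claims Windows 10 and a Chrome major version that lags
    current_major by more than max_lag releases."""
    # Note: Windows 11 also reports "Windows NT 10.0", and Chromium-based
    # browsers such as Edge include a Chrome/ token too.
    if "Windows NT 10.0" not in user_agent:
        return False
    match = CHROME_VERSION.search(user_agent)
    if not match:
        return False
    return current_major - int(match.group(1)) > max_lag
```

Here `current_major` is whatever the latest stable Chrome major version is when you run the analysis; it is left as a parameter so the sketch does not go stale.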
Arnout Hellemans
Dec 18, 2024, 12:54 PM
ChatGPT is adding UTMs from their app, nice way to create awareness imho.
Ryan Mendenhall
Dec 18, 2024, 2:24 PM
@Arnout Hellemans yeah, I noticed that yesterday too actually. ?utm_source= - agreed, I think when people start seeing visits from that they'll start to get much more interested.
Mika Lepistö
Dec 19, 2024, 2:17 PM
I'm not seeing UTMs in the mobile app or the web app. Where have you seen them?
Dana DiTomaso
Dec 19, 2024, 10:11 PM
Bugs me that they only send a source and not a medium, so it ends up in Unassigned unless you fix it in a custom channel.
Mika Lepistö
Dec 19, 2024, 10:22 PM
I am seeing it - sometimes. In chat it seems more consistent, but in the map with a link to the website it isn't there, for example.
Nico Brooks
Dec 19, 2024, 10:42 PM
Could you fix the empty medium with an event modification?
Dana DiTomaso
Dec 20, 2024, 10:22 PM
You can, but it's probably easier to set a custom channel anyway!
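The source-without-medium problem can be illustrated with a small Python sketch that mimics the rule order: check the source against a list of AI tools first, and only then fall back to a bucket for hits that have a source but no medium. This is a simplified stand-in for channel grouping logic, not GA4's actual implementation. The function name, the channel labels, and the example source value `example-ai.test` are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def channel_for_landing_url(url, ai_sources):
    """Assign a coarse channel from the UTM parameters on a landing URL."""
    params = parse_qs(urlparse(url).query)
    source = params.get("utm_source", [""])[0].lower()
    medium = params.get("utm_medium", [""])[0].lower()
    if source in ai_sources:
        # A custom rule on source alone catches the hit even with no medium.
        return "AI Tools"
    if source and not medium:
        # Source without medium and no matching rule: lands in the catch-all.
        return "Unassigned"
    return "Other"
```

For example, `channel_for_landing_url("https://example.com/?utm_source=example-ai.test", {"example-ai.test"})` lands in the "AI Tools" bucket, while the same URL checked against an empty set of sources falls through to "Unassigned".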
John Mueller
Dec 24, 2024, 8:15 AM
I think it's too early for tracking AI agents. And for things like Mariner, what should it really disclose? It's basically autocomplete clicking around on the web for you - isn't that just Chrome? It's a fascinating space, there aren't a lot of "best practices" yet. Public feedback from folks who run websites is super-helpful.
Mika Lepistö
Dec 24, 2024, 8:43 AM
What should it disclose?
Overall, it would be nice to know when something user-initiated is working in autonomous mode, and how its engagement is perceiving value.
Ex: let's say I want to have 🤖 research the best wedding venues and get their prices. Those sites might need forms submitted with details, and venue owners might need to respond with pricing or questions, since it's potentially a more complex product.
Wouldn't it be useful for the business/website owners to understand if they're getting a user-initiated autonomous engagement vs data mining? I can see data mining growing a ton with tools like this, especially in scenarios where getting that information at scale is hard and publishing it provides unique value.
Also, why the 🤖 chose my site as meeting the criteria to engage. I suspect it may be trying to emulate user signals for decision making, but automation could now allow it to go 20 pages deep in the SERP, as one example.
Ryan Mendenhall
Dec 24, 2024, 11:48 AM
Basically, I just want bot/agent/user disambiguation in my analytics. If agents actually start auto navigating and purchasing that could become interesting, but until then the user clicks are the ones I need to optimize for conversions.
John Mueller
Dec 24, 2024, 2:11 PM
My mental model is that if an agent does something within its own app, then it would use its own user-agent; but if an agent does it within the user's app (like Chrome), then it would use the user's user-agent. The line is blurry though - if I use Chrome to open an agent, and the agent visibly opens pages, does that count as Chrome or as the agent? Is the difference whether a page is visible or not? What if Chrome does it in a background / hidden tab? Or does it matter how direct the request was - "find cool hiking trails nearby" (research'y, more like a search engine) vs "open , check for deals on the homepage, buy my product1 if it's more than 20% off" (navigational'y, more like a user)? I find the whole space super-interesting.
Mika Lepistö
Dec 24, 2024, 2:15 PM
The background tab is an interesting thought. I already use an extension that does that to monitor various things, so it would look like a direct Chrome hit-and-run.
Mika Lepistö
Dec 24, 2024, 2:21 PM
Theoretically, my single page monitor should be able to be profiled, right? 1st party cookies, browser fingerprint, IP, time on page, etc.
