The Best Of

Noah
Jan 19, 2025, 10:08 AM
Forwarded thread from another channel:
Mohamed Natheem
Oct 26, 2024, 9:02 AM
I’m learning more about the BigQuery GSC integration and its cost optimization. Does partitioning a table by date help optimize cost? I read in an article that it is best to partition frequently used data to reduce costs.
Noah
Oct 26, 2024, 9:06 AM
It is very important to partition GSC data. I do it by the date column. You can also cluster your data (and I recommend it) to make it cheaper to query.
The reason it’s important is that without it, you run a full table scan each time you run a query, whereas when you partition by the date column and filter on it, the query only scans the matching partitions.
Noah
Oct 26, 2024, 9:08 AM
So if your entire table has 20 GB in the columns you’re querying, then without partitioning, you’re querying it all.
With partitioning, if you only want data between two dates, you only incur costs for the data between those dates (which could be 20 MB or whatever the case may be).
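To put rough numbers on the 20 GB versus 20 MB comparison, here is a minimal sketch of the arithmetic. The function name, the 1,000 daily partitions, and the 20 MB per partition are assumptions chosen for illustration; it models bytes read only, not BigQuery's actual billing rules.

```python
MB = 1000 ** 2  # decimal megabytes, for round numbers


def scanned_bytes(bytes_per_partition, partitions_in_table, partitions_in_range, partitioned):
    """Bytes a query reads from the selected columns.

    With a date-partitioned table and a filter on the date column, only the
    partitions inside the date range are read. Otherwise every partition is.
    """
    in_range = min(partitions_in_range, partitions_in_table)
    read = in_range if partitioned else partitions_in_table
    return bytes_per_partition * read


# Hypothetical table: 1,000 daily partitions of 20 MB each (20 GB in total).
full_scan = scanned_bytes(20 * MB, 1000, 1, partitioned=False)
pruned_scan = scanned_bytes(20 * MB, 1000, 1, partitioned=True)

print(full_scan // MB, "MB read without partitioning")
print(pruned_scan // MB, "MB read for one day with partitioning")
```

The same query for a single day reads the whole table in the first case and one partition in the second.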
Mohamed Natheem
Oct 26, 2024, 9:08 AM
Awesome. Thanks!
Do you suggest any resources to learn more about BigQuery, GSC, and GA4 integration? (I found a few resources on Google, but wanted to ask whether you know of any really good ones.)
Noah
Oct 26, 2024, 9:11 AM
Are you using the bulk export for either GSC or GA4?
Mohamed Natheem
Oct 26, 2024, 9:12 AM
Using bulk export for GSC. But didn’t try GA4. So kinda newbie here
Noah
Oct 26, 2024, 9:13 AM
you can use this free script I built to set up a pretty rad data viz of GSC data
Noah
Oct 26, 2024, 9:14 AM
It will take your bulk data, transform it to make it easier / faster to query, and provide a Looker Studio front end (with a number of views) to analyze your data.
Mohamed Natheem
Oct 26, 2024, 9:14 AM
Looks interesting, I’ll check it out. Thanks, Noah!
Noah
Oct 26, 2024, 9:14 AM
It’ll take 30 minutes to set up.
Noah
Oct 26, 2024, 9:15 AM
🤌🏼
Derek Perkins
Oct 26, 2024, 1:52 PM
Marco has a course if you're into that. His free newsletter is great
Derek Perkins
Oct 26, 2024, 1:56 PM
@Noah is definitely the expert in GSC, and his free script is awesome
Noah
Oct 26, 2024, 6:05 PM
I’m taking Marco’s course too.
Derek Perkins
Oct 26, 2024, 6:32 PM
Curious to hear your opinion on it
Mohamed Natheem
Oct 30, 2024, 5:26 AM
@Noah
Another amateur question incoming:
So the script you shared pulls data from the partitioned BigQuery tables, transforms it into a single holistic table, and connects that to Looker Studio for visualization. Am I right?
And by easier/faster to query, do you mean that by using Looker Studio we query only the data we want? Is that how queries get faster and we save some money?
One more question: would the Looker Studio report run whenever a client refreshes it?
Noah
Oct 30, 2024, 7:55 AM
Yes, you are right.
Partitioning is what saves time and money.
It refreshes every 12 hours behind the scenes and also when someone uses it.
Noah
Oct 30, 2024, 12:47 PM
Did you get the pipeline set up?
Mohamed Natheem
Oct 30, 2024, 9:48 PM
Not yet. I’m still going through the BigQuery console to learn more about it. I’ve never used it before, and it’s quite overwhelming.
I also have questions about the pipeline, mainly about its purpose. Would it create a new table in addition to the tables created by the bulk export, and if so, would that increase storage and cost? And if a client or a team member refreshes the Looker Studio report multiple times, we pay for each refresh, right? (Sorry Noah, I know it’s a lot of questions, but it’s overwhelming to learn this.)
I know how it works since I have watched the video, and the Looker Studio report was amazing. But these questions keep popping up.
Derek Perkins
Oct 30, 2024, 9:51 PM
Storage is pretty cheap in BigQuery, querying is not. By creating aggregate tables, you only pay to scan the full data once, then Looker Studio only has to read the 10-100x smaller preaggregated table, making it cheaper and faster
Derek Perkins
Oct 30, 2024, 9:52 PM
If you know what you're reporting, precomputing tables is always the right choice
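A minimal sketch of the pre-aggregation idea, using Python's built-in `sqlite3` as a stand-in for BigQuery so it runs anywhere. The table names (`raw_export`, `daily_url_totals`), the columns, and the sample rows are all hypothetical; in BigQuery the equivalent `CREATE TABLE ... AS SELECT` would typically run as a scheduled query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical raw export: one row per (date, url, query).
conn.execute(
    "CREATE TABLE raw_export ("
    "data_date TEXT, url TEXT, query TEXT, clicks INTEGER, impressions INTEGER)"
)
conn.executemany(
    "INSERT INTO raw_export VALUES (?, ?, ?, ?, ?)",
    [
        ("2024-10-01", "/a", "q1", 1, 10),
        ("2024-10-01", "/a", "q2", 0, 5),
        ("2024-10-01", "/b", "q1", 2, 20),
        ("2024-10-02", "/a", "q1", 3, 30),
        ("2024-10-02", "/a", "q3", 1, 7),
        ("2024-10-02", "/a", "q4", 0, 3),
    ],
)

# Scan the raw data once and store the totals the report actually needs.
conn.execute(
    """
    CREATE TABLE daily_url_totals AS
    SELECT data_date, url,
           SUM(clicks) AS clicks,
           SUM(impressions) AS impressions
    FROM raw_export
    GROUP BY data_date, url
    """
)

raw_rows = conn.execute("SELECT COUNT(*) FROM raw_export").fetchone()[0]
agg_rows = conn.execute("SELECT COUNT(*) FROM daily_url_totals").fetchone()[0]
print(raw_rows, "raw rows ->", agg_rows, "aggregated rows")
```

The dashboard then reads only `daily_url_totals`, which is where the saving comes from: the expensive scan of the raw table happens once per refresh of the aggregate, not once per report view.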
Mohamed Natheem
Oct 30, 2024, 9:57 PM
Okay, so Noah’s pipeline creates an aggregated table from all the partitioned tables of the bulk export and then connects that aggregated table to Looker Studio. With filters in Looker Studio, we only query what we want from the aggregated table. Also, with a scheduled query, the pipeline keeps appending refreshed data to the aggregated table.
So we will be paying:
1. When the scheduled query runs to append the new data to the aggregated table.
2. When we query the data from Looker Studio.
3. When someone refreshes data from Looker Studio. (Here I have one question: if someone refreshes Looker Studio without changing any filters, wouldn’t BQ serve the cached data?)
Is that how it works?
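On the caching question in point 3, here is a toy model, assuming the commonly documented behaviour that an identical query against an unchanged table can be answered from cached results without being billed again. The class and its numbers are hypothetical, and it does not model Looker Studio's own report cache or any cache expiry.

```python
class ToyQueryCache:
    """Toy result cache: identical query text is billed once until the table changes."""

    def __init__(self):
        self._results = {}
        self.bytes_billed = 0

    def run(self, sql, bytes_scanned):
        # Same query text and unchanged table: serve the stored result for free.
        if sql in self._results:
            return self._results[sql]
        self.bytes_billed += bytes_scanned
        self._results[sql] = f"rows for: {sql}"
        return self._results[sql]

    def table_changed(self):
        # New data was appended, so earlier results are no longer valid.
        self._results.clear()


cache = ToyQueryCache()
report_sql = "SELECT url, clicks FROM daily_url_totals WHERE data_date >= '2024-10-01'"

cache.run(report_sql, bytes_scanned=5_000_000)  # first load: billed
cache.run(report_sql, bytes_scanned=5_000_000)  # refresh with same filters: cache hit
billed_before_append = cache.bytes_billed

cache.table_changed()                           # scheduled query appended new data
cache.run(report_sql, bytes_scanned=5_000_000)  # billed again
billed_after_append = cache.bytes_billed

print(billed_before_append, billed_after_append)
```

In this model a refresh with unchanged filters costs nothing until the scheduled append invalidates the cached result.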
Derek Perkins
Oct 30, 2024, 10:09 PM
And if you enable BI Engine, that can also be a good speed + cost improvement
Noah
Oct 30, 2024, 10:29 PM
Look in BigQuery and let me know how many partitions are in the URL_impression table and how much data there is in GBs or MBs.
Mohamed Natheem
Oct 30, 2024, 10:31 PM
This is a very small site. I'm testing this out to learn about BQ and am going to use it for a client.
Noah
Oct 30, 2024, 10:33 PM
Okay. You likely won’t ever incur costs running the pipeline on this site.
Mohamed Natheem
Oct 30, 2024, 10:35 PM
Thanks man. You were really helpful. It's just that everything is a bit overwhelming when looking at the console. I'm taking baby steps to learn one thing at a time.
Noah
Oct 30, 2024, 10:44 PM
Play around with the calculator on my tools setup page and throw lots of different values into the starting table size and the MBs of growth (table size / # of partitions), and you’ll see how long it takes to incur any storage cost.
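A rough sketch of that kind of calculation, not the calculator itself. The function names are hypothetical, the 10 GB free storage allowance is an assumed default you should check against current BigQuery pricing, and growth is treated as a constant number of MB per daily partition.

```python
import math


def daily_growth_mb(table_size_mb, partitions):
    """Average MB added per daily partition (table size / # of partitions)."""
    return table_size_mb / partitions


def days_until_storage_cost(start_mb, growth_mb_per_day, free_mb=10_000):
    """Days until the table outgrows an assumed free storage allowance.

    Returns 0 if it is already over the allowance, and None if it never
    grows past it.
    """
    if start_mb >= free_mb:
        return 0
    if growth_mb_per_day <= 0:
        return None
    return math.ceil((free_mb - start_mb) / growth_mb_per_day)


# Hypothetical small site: 500 MB spread over 100 daily partitions.
growth = daily_growth_mb(500, 100)
days = days_until_storage_cost(500, growth)
print(growth, "MB/day ->", days, "days before storage exceeds the assumed allowance")
```

Trying a few starting sizes and growth rates shows how slowly a small site approaches any storage charge.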
Noah
Oct 30, 2024, 10:44 PM
And then the free querying allowance is a pretty generous 1 TB / month.
Noah
Oct 30, 2024, 10:45 PM
And when you query data, try to use narrower date ranges, i.e. don’t grab all of the data when you have two years’ worth in the database, because it’ll potentially be more expensive.
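One way to keep date ranges tight is to build every query with explicit date bounds on the partition column. This is a sketch only: the table name is hypothetical, `data_date` is assumed to be the partition column, and production code should pass dates as query parameters instead of formatting them into the SQL string.

```python
from datetime import date


def bounded_query(table, start, end):
    """Return SQL that reads only the rows between two dates (inclusive)."""
    if start > end:
        raise ValueError("start date must not be after end date")
    return (
        "SELECT url, SUM(clicks) AS clicks\n"
        f"FROM `{table}`\n"
        f"WHERE data_date BETWEEN DATE '{start.isoformat()}' "
        f"AND DATE '{end.isoformat()}'\n"
        "GROUP BY url"
    )


# Hypothetical table name; one week of data, not the full two years.
sql = bounded_query(
    "my_project.searchconsole.searchdata_url_impression",
    date(2024, 10, 1),
    date(2024, 10, 7),
)
print(sql)
```

With a date-partitioned table, the `WHERE` clause on the partition column is what lets the engine skip every partition outside that week.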
