Why are there data discrepancies between Search Console & Looker Studio?
There are 4 main causes for when the data doesn’t match up.
Full disclosure, you can’t always fix this easily, but we can at least understand it in those cases.
Also, if you do want help getting your Search Console numbers consistent & accurate (as well as extracting as much data as possible), please get in touch. It's what we do.
Are you a video person?
We've got a video version of this blog post.
(Since we made this video we found reason number 4: regex_match filter bug, so you'll have to do that one in text.)
The 4 main reasons for the discrepancy between Search Console & Looker Studio
- Do you have the right profile?
- Site vs URL level
- Search Console sampling level
- Bug with regex_match
And then one bonus one:
- Are you doing a period comparison?
Problem 1: Do you have the right profile?
It’s probably not this, but we should just check. In Search Console you’ll see this:
Domain property at the top and then exact URLs below.
In Looker Studio, the domain property is shown as sc-domain.
Make sure you’ve got the same ones selected on both sides.
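If you'd rather check this in code than by eye, here's a minimal sketch. It assumes the identifiers look like the ones the Search Console API uses: `sc-domain:example.com` for a domain property and a full URL such as `https://www.example.com/` for an exact URL property. The function names and example.com values are ours, purely for illustration.

```python
def property_kind(identifier: str) -> str:
    """Classify a Search Console property identifier."""
    if identifier.startswith("sc-domain:"):
        return "domain"
    if identifier.startswith(("http://", "https://")):
        return "url-prefix"
    raise ValueError(f"Unrecognised property identifier: {identifier!r}")


def same_property(a: str, b: str) -> bool:
    """True only if both sides point at exactly the same property."""
    return a.strip() == b.strip()


# Hypothetical values: what each side of the comparison has selected.
looker_side = "sc-domain:example.com"
console_side = "https://www.example.com/"

print(property_kind(looker_side))                # domain
print(property_kind(console_side))               # url-prefix
print(same_property(looker_side, console_side))  # False
```

If the last line prints False, you're comparing two different properties and the numbers were never going to match.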
Right, it wasn’t that? Onwards.
Sidebar: Naming bits in the Search Console interface
To understand the next two we need to know what numbers we’re comparing against in the Search Console interface.
We’ll define them as follows:
Top summary numbers
Table drill-down
Problem 2: URL vs Site level
Search Console has two different ways of reporting numbers. We’ve talked about that separately here: difference between site level and URL level.
If you don’t understand that, read that post quickly then come back. (The TL;DR version is: site level treats every URL as one and URL level treats each URL individually.)
Great, so the problem here is:
- The Search Console interface mixes site & URL level. Numbers change depending on where you look.
By default everything is site level unless it has to be URL level because you’ve included pages.
That means everything is site level except for the pages table in the table drill-down.
But if we apply a filter in the interface, such as filtering to a content subfolder, all of our numbers become URL level.
How do we fix it? — Compare apples to apples.
So if you’re totalling URL-level numbers in Looker Studio and comparing them against the site-level numbers in Search Console, they won’t add up, because those numbers aren’t measured in the same way.
Which one is right?
They both are. They’re just measuring different things. Pick the one you want to report on and carry on, safe in the knowledge that your report does technically show the same data as you see in the Search Console interface.
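To make the difference concrete, here's a toy sketch in Python. It's a simplified model with invented data: each tuple is one of our URLs appearing in one search. Site level counts a search once no matter how many of our URLs appeared in it; URL level counts every URL.

```python
# Each tuple: (search_id, url shown in that search). Invented sample data.
appearances = [
    ("search-1", "/blog/post-a"),
    ("search-1", "/blog/post-b"),  # two of our URLs in the same search
    ("search-2", "/blog/post-a"),
    ("search-3", "/pricing"),
]

# Site level: every URL is treated as one, so count distinct searches.
site_level_impressions = len({search_id for search_id, _ in appearances})

# URL level: each URL is treated individually, so count every appearance.
url_level_impressions = len(appearances)

print(site_level_impressions)  # 3
print(url_level_impressions)   # 4
```

Same searches, two different totals, and neither is wrong.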
But I want to see site level data for a subfolder
Time to go and register that subfolder as a separate property in Search Console.
Problem 3: Search Console sampling
Still got issues? Well, we’ve reached sampling.
Sampling pretty much always looks like this:
Some quick maths will tell you that those devices don’t add up to the total.
This issue actually exists in Search Console. It’s just hard to see:
We typically assume these don’t add up because we’re limited to 1,000 keywords.
But then we start digging in Looker Studio: we add the query dimension to a table, download it, and find some issues. When we add up all the clicks:
- We want: 219,078 clicks
- We get: 127,180 clicks.
That’s only about 58% of what we see in Looker Studio.
What’s happening here? The problem is Google won’t actually give you all the queries your site ranks for.
They roll many of them up into a single hidden row which is never shown anywhere, but is still used in totals (as you can see in Looker Studio).
Then when you download the file, it drops the hidden row and voilà, you’re missing data.
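You can work out how big the hidden row is with some simple subtraction. A quick sketch using the two click totals from our example (the variable names are just ours):

```python
total_clicks = 219_078     # total shown in Looker Studio
exported_clicks = 127_180  # sum of clicks across the downloaded query rows

# Whatever is in the total but not in the rows is sitting in the hidden row.
hidden_clicks = total_clicks - exported_clicks
hidden_share = hidden_clicks / total_clicks

print(hidden_clicks)          # 91898
print(f"{hidden_share:.1%}")  # 41.9%
```

So in this case a little over 40% of the clicks have no query attached to them at all.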
How do we fix it? — Load Search Console data into a warehouse.
We’ve got two separate problems:
- Make our totals add up to what’s shown in the rows.
- Minimize our data loss.
Both of these solutions involve data warehousing. The solution is quite long and we’ll end up writing more posts about the different parts of it, but in a nutshell it is:
- Download all of your Search Console data from the API
- Store it in a data warehouse
- Enable Search Console to BigQuery
- Tie the two together
- Build dashboards off your new base table.
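For the download step, the API hands rows back in pages, so you need a loop. Here's a minimal sketch of that loop. It assumes the Search Analytics query fields `rowLimit` (up to 25,000 rows per request) and `startRow`; `fetch_page` is a hypothetical stand-in for whatever function actually calls the API, and `fake_fetch` is a made-up fake so the example runs without credentials.

```python
from typing import Callable, Dict, Iterator, List

ROW_LIMIT = 25_000  # maximum rows per Search Analytics request

Row = Dict[str, object]
Body = Dict[str, object]


def build_request(start_date: str, end_date: str, start_row: int) -> Body:
    """Request body for one page of date + query + page level data."""
    return {
        "startDate": start_date,
        "endDate": end_date,
        "dimensions": ["date", "query", "page"],
        "rowLimit": ROW_LIMIT,
        "startRow": start_row,
    }


def download_all(
    fetch_page: Callable[[Body], List[Row]],
    start_date: str,
    end_date: str,
) -> Iterator[Row]:
    """Keep requesting pages until one comes back empty."""
    start_row = 0
    while True:
        rows = fetch_page(build_request(start_date, end_date, start_row))
        if not rows:
            break
        yield from rows
        start_row += ROW_LIMIT


# Hypothetical fake API holding 60,000 rows, standing in for the real call.
def fake_fetch(body: Body) -> List[Row]:
    start, limit = int(body["startRow"]), int(body["rowLimit"])
    return [{"clicks": 1} for _ in range(start, min(start + limit, 60_000))]


rows = list(download_all(fake_fetch, "2024-01-01", "2024-01-31"))
print(len(rows))  # 60000
```

In a real pipeline you'd swap `fake_fetch` for a function that sends the body to the API and returns the rows from the response, then write each batch into your warehouse table.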
Because we’re now always building off a single table, we’ll no longer have totals with weird missing rows, and because we’re using the integrations, we’ll maximise our data output.
There are a couple of additional steps you can take here:
- Combine Search Console subfolder properties together to maximize data output.
- Add the missing hidden row back into your Search Console table as an explicit row so you can keep track of what percentage of data is being lost.
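The second step can be sketched roughly like this. It assumes you've pulled two things for the same dates: daily click totals (queried without the query dimension) and the per-query rows. The `(hidden)` label, the function name and the sample numbers are all invented for illustration.

```python
from typing import Dict, List

Row = Dict[str, object]


def add_hidden_rows(daily_totals: Dict[str, int], query_rows: List[Row]) -> List[Row]:
    """Append one '(hidden)' row per date so the rows sum to the daily total."""
    visible: Dict[str, int] = {}
    for row in query_rows:
        day = str(row["date"])
        visible[day] = visible.get(day, 0) + int(row["clicks"])

    result = list(query_rows)
    for day, total in sorted(daily_totals.items()):
        missing = total - visible.get(day, 0)
        if missing > 0:
            result.append({"date": day, "query": "(hidden)", "clicks": missing})
    return result


# Invented sample numbers.
daily_totals = {"2024-01-01": 100, "2024-01-02": 80}
query_rows = [
    {"date": "2024-01-01", "query": "seo reporting", "clicks": 40},
    {"date": "2024-01-01", "query": "looker studio", "clicks": 25},
    {"date": "2024-01-02", "query": "seo reporting", "clicks": 80},
]

table = add_hidden_rows(daily_totals, query_rows)
print(table[-1])  # {'date': '2024-01-01', 'query': '(hidden)', 'clicks': 35}
```

Once that row exists, totals built from the table match the real totals, and the share of clicks in `(hidden)` rows tells you how much is being lost.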
Problem 4: Regex match filter bug
There is currently a bug if you're using multiple filter conditions and regex_match with the default connector.
If you have more than one regex_match based filter on an item then one of the filters will sometimes fail.
How do we fix it?
The solution to this one is nice and simple. Use regex_contains for your filters and you won't have issues.
If you want to match the beginning or end of the string, use ^ or $, and as always test in Regex101 (it's our personal favourite regex tester).
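The difference between the two functions is that regex_match has to match the whole value, while regex_contains only needs to match part of it. Python's `re` module isn't the same regex engine Looker Studio uses, but `re.fullmatch` and `re.search` behave in an analogous way, so they make a rough stand-in for seeing the idea (the URL is made up):

```python
import re

url = "https://www.example.com/blog/seo-reporting"

# Analogue of regex_match: the pattern must cover the whole value.
print(bool(re.fullmatch(r"/blog/", url)))      # False
print(bool(re.fullmatch(r".*/blog/.*", url)))  # True

# Analogue of regex_contains: the pattern can match anywhere.
print(bool(re.search(r"/blog/", url)))         # True

# Anchors give a contains-style match its strictness back when you need it.
print(bool(re.search(r"^https://www\.", url)))  # True: starts with
print(bool(re.search(r"seo-reporting$", url)))  # True: ends with
print(bool(re.search(r"^/blog/$", url)))        # False: not the whole value
```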
Problem 5: You're doing a period comparison
It's worth checking the date period is the same if you're doing period comparison with Looker Studio.
Looker Studio will pull through finalised Search Console data (i.e. data which is usually missing the past couple of days).
If you've done a period comparison ending today or yesterday, your primary period will be missing a couple of days while your previous period will have data for all of its days, so the two periods won't be comparing like for like.
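One way round this is to end the primary period a few days back so both periods only contain finalised days. A minimal sketch, assuming a lag of 3 days (that number is our assumption, so check how far behind your own data actually is); the function name is ours:

```python
from datetime import date, timedelta
from typing import Tuple

Period = Tuple[date, date]


def comparison_periods(today: date, days: int, lag_days: int = 3) -> Tuple[Period, Period]:
    """Return (current, previous) periods of equal length that skip unfinalised days."""
    current_end = today - timedelta(days=lag_days)
    current_start = current_end - timedelta(days=days - 1)
    previous_end = current_start - timedelta(days=1)
    previous_start = previous_end - timedelta(days=days - 1)
    return (current_start, current_end), (previous_start, previous_end)


current, previous = comparison_periods(date(2024, 3, 31), days=28)
print(current)   # (datetime.date(2024, 3, 1), datetime.date(2024, 3, 28))
print(previous)  # (datetime.date(2024, 2, 2), datetime.date(2024, 2, 29))
```

Both periods are 28 days long and neither one touches the days that haven't been finalised yet.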
Questions
Hopefully that’s helped you better understand why the data you’re seeing in the Search Console interface doesn’t match what you see in your Looker Studio report.
If you’d like help with this: that’s what we do! Please reach out and get in touch. We help run and maintain data warehouses for SEO (& Marketing) teams to help you get more out of your analysis & reporting.