Notes on building the search console data downloader
The only thing better than Search Console is getting all your data from Search Console.
We're talking about the search analytics/performance report. Unfortunately, there isn't a coverage report API (at least at the moment!).
By default in the interface we can only get a thousand rows. There is so much more data than that. Let's go and get it.
Jump me straight to the free tool. I only care for notes when I'm stuck.
What is the search analytics traffic?
Let's briefly lay out the problem.
When someone clicks on your website in Google, Google records that click and the query they searched for to see it.
You can see all that traffic in your search analytics report.
You can't, however, download all of it easily.
You can only get 1,000 rows at a time and none of it is broken down over time, which is often where it gets most interesting.
Let's solve that.
What are our options?
And then lay out the possible solutions.
1. Use code to download it from the Search Console API
Excellent option. The big downside is you're going to need to know a programming language, or at least enough to be dangerous and run someone else's code.
There are plenty of scripts which can help you do this. I wrote an ancient blog post on Moz about this.
It's got some beginner's detail, but there are also better, more up-to-date libraries for this stuff now, like this excellent one by Josh Carty.
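To give a feel for what option 1 involves, here is a minimal sketch of paging through the Search Analytics API for a single day. It is not the code from the Moz post or from Josh Carty's library. The function names are made up for illustration, and `siteUrl` and `accessToken` are placeholders you would have to supply yourself after going through Google's OAuth flow.

```javascript
// Sketch only: pages through the Search Analytics API for one day.
// Function names are illustrative; siteUrl and accessToken are placeholders.
const ROW_LIMIT = 25000; // maximum rows per request in Google's API docs

// Build the JSON body for one searchanalytics.query request.
function buildQueryBody(date, dimensions, startRow) {
  return {
    startDate: date,
    endDate: date,
    dimensions,
    rowLimit: ROW_LIMIT,
    startRow,
  };
}

// Keep requesting pages until one comes back short, then return all rows.
async function fetchAllRowsForDay(siteUrl, accessToken, date, dimensions) {
  const endpoint =
    "https://www.googleapis.com/webmasters/v3/sites/" +
    encodeURIComponent(siteUrl) +
    "/searchAnalytics/query";
  const rows = [];
  for (let startRow = 0; ; startRow += ROW_LIMIT) {
    const response = await fetch(endpoint, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${accessToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(buildQueryBody(date, dimensions, startRow)),
    });
    if (!response.ok) throw new Error(`Request failed: ${response.status}`);
    // The API omits `rows` entirely when there is no data.
    const page = (await response.json()).rows || [];
    for (const row of page) rows.push(row);
    if (page.length < ROW_LIMIT) return rows;
  }
}
```

The `startRow` offset is what gets you past the row cap: each request asks for the next slice until a page comes back with fewer rows than the limit.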
2. Use a paid tool to download it and store it
Shameless plug incoming. A tool like Piped Out is designed to go a couple of steps further than that.
It downloads all the traffic and then stores it in a data warehouse so it's easy to get and manipulate in the future. Really useful if you've got lots of websites, particularly large ones.
3. Use a free tool to download all of it
Method 1 is a lot of work. Method 2 costs money and is overkill if you just need it once. So we made a free tool to help you download all your data in the browser.
How do I use this free tool?
- Open up the Search Console API downloader
- Authorize
- Select the property you want to download data for
- Select the dimensions
- Select the date range
- Download!
Instructions are also on the tool. Once you've got the data then let your imagination run wild.
We really do mean all the data
All the data really is a lot of data. If you're not comfortable with large-scale data manipulation, I'd recommend downloading smaller date ranges and working with them in pivot tables.
What are the limitations?
Large websites
This tool runs in the browser and hasn't been tested on very large accounts.
If you start downloading 16 months of data for really large accounts, I suspect it'll crash your browser because it holds everything in RAM.
If you repeatedly run into that, then it's probably time for option 1 or 2.
Rate limiting
When testing this tool, I encountered a number of rate limiting issues which didn't match Google's public documentation.
The tool solves this by waiting 3 seconds between requests and using exponential backoff.
This is far slower than the documented limits, but it does mean the tool seems to be able to run consistently, so please be patient!
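For anyone rolling their own, the wait-and-back-off idea can be sketched roughly like this. It's a minimal illustration and not the tool's actual code: the 3 second base comes from the text above, while the doubling factor, the retry cap and the function names are all assumptions.

```javascript
const BASE_DELAY_MS = 3000; // the 3 second gap described above
const MAX_RETRIES = 5; // assumed cap, not taken from the tool

// Delay before the next retry: 3s, 6s, 12s, ... (attempt starts at 0).
function backoffDelay(attempt, baseMs = BASE_DELAY_MS) {
  return baseMs * 2 ** attempt;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run `request` (any function returning a promise) and retry on failure,
// doubling the wait each time until maxRetries is used up.
async function withBackoff(request, maxRetries = MAX_RETRIES, baseMs = BASE_DELAY_MS) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await request();
    } catch (error) {
      if (attempt >= maxRetries) throw error;
      await sleep(backoffDelay(attempt, baseMs));
    }
  }
}
```

You would wrap each API call, for example `withBackoff(() => fetchDay(date))`, where `fetchDay` is whatever function makes your request.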
How does it work?
It's built using Google's API client library.
You authorize the tool to get access to your search console profile (don't worry, we won't use it for anything other than the tool).
It stores all the authentication credentials locally in your browser.
You select the dimensions and time period, then we loop through, download your data a day at a time, and stitch it all back together.
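The day-at-a-time loop can be sketched like this. It's an illustration of the approach rather than the tool's source: `listDays`, `downloadRange` and `fetchDay` are hypothetical names, and `fetchDay` stands in for whatever function requests one day's rows from the API.

```javascript
// List every day from startDate to endDate inclusive, as YYYY-MM-DD strings.
function listDays(startDate, endDate) {
  const days = [];
  const cursor = new Date(`${startDate}T00:00:00Z`);
  const last = new Date(`${endDate}T00:00:00Z`);
  while (cursor <= last) {
    days.push(cursor.toISOString().slice(0, 10));
    cursor.setUTCDate(cursor.getUTCDate() + 1);
  }
  return days;
}

// Fetch each day in turn and tag every row with its date before merging.
// `fetchDay` is hypothetical: (date) => Promise resolving to an array of rows.
async function downloadRange(startDate, endDate, fetchDay) {
  const allRows = [];
  for (const date of listDays(startDate, endDate)) {
    const rows = await fetchDay(date);
    for (const row of rows) allRows.push({ date, ...row });
  }
  return allRows;
}
```

Tagging each row with its date as it comes in is what makes the stitched-together result usable over time, which the interface export doesn't give you.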
Fun Things
- I wanted to run this in a web worker. No dice.
  - Google API Client needs to be directly referenced as a script. It's not version controlled and so you can't use NPM. I'm using gatsby-plugin-workerize-loader to build some future neat tools and it doesn't work with importScripts().
  - So instead of anything clever, we just queue up the functions async with bottleneck to avoid locking up the browser.
- Auth with Google API Client wasn't too bad in the end, but did not have great documentation.
- async/await is a million times easier to work with than chaining promises.
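To show the queueing idea from the web worker note without pulling in a dependency, here is a tiny stand-in queue. It is not bottleneck's implementation and not the tool's code, and `createQueue` is a made-up name. With bottleneck itself, the rough equivalent is creating a limiter with `new Bottleneck({ maxConcurrent: 1, minTime: 3000 })` and passing jobs to `limiter.schedule()`, though the tool's actual settings aren't given here.

```javascript
// Minimal stand-in for a rate-limited queue: runs one job at a time and
// waits at least `minTimeMs` between the start of consecutive jobs.
function createQueue(minTimeMs) {
  let tail = Promise.resolve();
  let lastStart = 0;
  return function schedule(job) {
    const run = tail.then(async () => {
      const wait = Math.max(0, lastStart + minTimeMs - Date.now());
      if (wait > 0) await new Promise((resolve) => setTimeout(resolve, wait));
      lastStart = Date.now();
      return job();
    });
    tail = run.catch(() => {}); // a failed job must not block later jobs
    return run;
  };
}
```

Each call to the returned `schedule` function hands back a promise for that job's result, so the calling code can still use async/await while the gaps between jobs leave the browser free to stay responsive.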
I can't take SEO advice from someone who's cannibalising his own article with his tool
Reasonable.
There was too much explanation to fit on the tool, but yes point taken. Did you see how terrible I made the title?
You can help me out by only linking to the tool...
Thoughts?
Let us know what you get up to and any problems you run into.
And if you regularly work with search console data and want to make it easier, please get in touch.