Export all your historical Google Search Console data from Keylime Toolbox.
This script reads the Google Search Console queries and URLs (pages) data that Keylime Toolbox has collected for you, for all your GSC properties. It will write a CSV per day, for each property, for each type of data, to an S3 bucket.
This script uses Ruby; make sure you have version 2.1 or later installed:
$ ruby -v
ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-darwin15]
Download the code and extract it to a folder. In a terminal window, change directory to that folder.
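If you prefer git to downloading an archive, cloning the repository (the GitHub URL listed at the end of this README) accomplishes the same thing:

```
git clone https://github.com/keylime-toolbox/keylime-toolbox-export.git
cd keylime-toolbox-export
```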
Set up the dependent gems:
gem install bundler
bundle install
Configure the following environment variables for the Keylime Toolbox API:
- `KEYLIME_TOOLBOX_EMAIL`: The email address of your Keylime Toolbox account.
- `KEYLIME_TOOLBOX_TOKEN`: The API token for your account. Look at the API section in your settings to find or set up your token.
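For example, in a shell you might export these before running the script (the values here are placeholders):

```
export KEYLIME_TOOLBOX_EMAIL="you@example.com"
export KEYLIME_TOOLBOX_TOKEN="your-api-token"
```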
To access S3, set up AWS credentials. Credentials may be supplied with the following environment variables or by setting values in `~/.aws/credentials` or `~/.aws/config`. See the AWS shared credentials documentation for more details.
- `AWS_ACCESS_KEY_ID`: Your access key for AWS.
- `AWS_SECRET_ACCESS_KEY`: Your secret access key for AWS.
Supply the AWS region of the S3 bucket you are using. This may be done through an environment variable or a command-line option (`--region`).

- `AWS_REGION`: The AWS region where the S3 bucket is.
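As an alternative to the environment variables, a minimal `~/.aws/credentials` and `~/.aws/config` pair might look like this (the profile name and values are placeholders):

```
# ~/.aws/credentials
[default]
aws_access_key_id = AKIAEXAMPLEKEY
aws_secret_access_key = example-secret-access-key

# ~/.aws/config
[default]
region = us-east-1
```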
Run the script providing the target bucket where you want to store the data:
./export-data target-bucket
There are additional options available to configure the bucket region, the file path, and other aspects. See the complete list with this command:
./export-data --help
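For example, assuming your bucket is in `us-east-1` (a placeholder region), a run that passes the region on the command line might look like:

```
./export-data --region us-east-1 target-bucket
```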
You can run this script in a Docker container. There is a `Dockerfile` in the repository that should work out of the box. Build the Docker container image like this:
docker build -t keylime-toolbox-export .
You must set the environment variables described above and specify the bucket when you run the script, which you can do with a command like this (presuming you named the image `keylime-toolbox-export` as above):
docker run --env-file .env keylime-toolbox-export ./export-data target-bucket
See options for setting environment variables in the Docker documentation.
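A minimal `.env` file for the `--env-file` option above might look like this (the variable names are the ones described earlier; the values are placeholders):

```
KEYLIME_TOOLBOX_EMAIL=you@example.com
KEYLIME_TOOLBOX_TOKEN=your-api-token
AWS_ACCESS_KEY_ID=AKIAEXAMPLEKEY
AWS_SECRET_ACCESS_KEY=example-secret-access-key
AWS_REGION=us-east-1
```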
You can run the Docker image as a Kubernetes Job in your cluster. This will run the script and terminate the pod when the export job (and container) completes.
There is a sample manifest in `keylime-toolbox-export-job.yml` that you can use as a starting point for launching your export job. You will need to:
- Push the image to a registry
- Set the environment variables described above
- Set the target bucket
Push the image to a registry with `docker push`. See the documentation on `docker push` for details about the registry host, names, and tags.
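For example, assuming you built the image as `keylime-toolbox-export` above, you would typically tag it with your registry name before pushing (the registry host and repository path here are placeholders matching the push command below):

```
docker tag keylime-toolbox-export registry-host:5000/my-company/keylime-toolbox-export
```

Then push it: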
docker push registry-host:5000/my-company/keylime-toolbox-export
Change the `image` line in the manifest based on the name (and tag if desired) you used when you uploaded the image to your registry.
Modify the manifest to set the environment variables. You might consider using Secrets to set the secret tokens and possibly a ConfigMap for the other values.
You will need to change the `command` line in the manifest to set the target bucket and add any other options.
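As a rough sketch only (the `keylime-toolbox-export-job.yml` in the repository is the authoritative starting point), the relevant parts of an edited manifest might look something like this; the Secret and ConfigMap names are hypothetical:

```
apiVersion: batch/v1
kind: Job
metadata:
  name: keylime-toolbox-export
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: keylime-toolbox-export
          # Image name from your registry (see the docker push step above).
          image: registry-host:5000/my-company/keylime-toolbox-export
          # Target bucket and any other options go on the command line.
          command: ["./export-data", "--region", "us-east-1", "target-bucket"]
          env:
            - name: KEYLIME_TOOLBOX_EMAIL
              valueFrom:
                configMapKeyRef:
                  name: keylime-toolbox-export-config   # hypothetical ConfigMap
                  key: email
            - name: KEYLIME_TOOLBOX_TOKEN
              valueFrom:
                secretKeyRef:
                  name: keylime-toolbox-export-secrets  # hypothetical Secret
                  key: api-token
            # AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_REGION
            # follow the same pattern.
```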
Once configured, launch the job with a command like this:
kubectl apply -f keylime-toolbox-export-job.yml
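Once the job is running you can check on it with standard kubectl commands, for example (assuming the Job's `metadata.name` is `keylime-toolbox-export` as in the sketch above):

```
kubectl get job keylime-toolbox-export
kubectl logs job/keylime-toolbox-export
```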
Bug reports and pull requests are welcome on GitHub at https://github.com/keylime-toolbox/keylime-toolbox-export.
Run `rubocop -Da export_data.rb lib/` before submitting to ensure that your code meets style and organizational requirements.
Planned features:

- Add aggregate data downloads for reporting groups.
- Add Crawl Errors download for GSC properties.
- Ability to limit exports to specific groups, properties, or date ranges.
Released under the MIT License. See `LICENSE.txt`.