
Downloading data from Google Webmaster Tools

Today, Google published a set of Python scripts that automate downloading data from Google Webmaster Tools, namely the Search Query data.

UPDATE: I created a Java version of the Search Query downloader. Any feedback would be great.

The two included examples are a simple script that downloads the search query data for the last month into a CSV file, and another that creates a Google Docs spreadsheet containing that data.

The steps required to run the scripts are documented in the Project Wiki, assuming you already have Python running. The ‘selected_downloads’ variable can be extended to include TOP_PAGES alongside TOP_QUERIES in the download:

selected_downloads = ['TOP_QUERIES', 'TOP_PAGES']

Downloading data for all your sites

To download the search query data for all sites in my Google Webmaster Tools account, I changed the example script as follows:

# Instantiate the downloader object
downloader = Downloader()
# Authenticate with your Webmaster Tools sign-in info
downloader.LogIn(email, password)

# Get the list of sites available
sites = downloader.GetSitesList()

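# Initiate the download for each site; a failure for one site
# must not stop the loop
for site in sites:
  print(site.title.text)
  try:
    downloader.DoDownload(site.title.text, selected_downloads)
  except ValueError:
    # Raised when no JSON data could be parsed for this site
    print("No JSON data")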
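The loop above only prints a message when a site returns no data. A minimal sketch of the same pattern that also records which sites failed might look like the following. Note that FakeDownloader, download_all and the example.com / example.org URLs are hypothetical names invented for illustration; the stand-in class replaces the real Downloader so the sketch runs without credentials or network access.

```python
class FakeDownloader(object):
    """Hypothetical stand-in for the real Downloader; no network access."""

    def __init__(self, sites_without_data):
        self.sites_without_data = set(sites_without_data)
        self.downloaded = []

    def DoDownload(self, site, selected_downloads):
        # Mimic the failure mode handled in the article: a ValueError
        # when no JSON data can be parsed for a site.
        if site in self.sites_without_data:
            raise ValueError("No JSON data")
        self.downloaded.append((site, tuple(selected_downloads)))


def download_all(downloader, site_urls, selected_downloads):
    """Try every site and return the list of sites that produced no data."""
    failed = []
    for url in site_urls:
        try:
            downloader.DoDownload(url, selected_downloads)
        except ValueError:
            # Remember the site and carry on with the next one
            failed.append(url)
    return failed


fake = FakeDownloader(sites_without_data=["http://www.example.org/"])
failed_sites = download_all(
    fake,
    ["http://www.example.com/", "http://www.example.org/"],
    ["TOP_QUERIES", "TOP_PAGES"],
)
print(failed_sites)
```

With the real script, the same function could presumably be called with the Downloader instance and the titles of the site entries, which makes it easy to log or retry the failed sites in a scheduled run.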

The function GetSitesList() needs to be added to downloader.py:

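  def GetSitesList(self):
    """Return the feed entries for all sites in the account."""
    # Fetch the sites feed with the already authenticated client
    stream = self._client.request('GET', self.SITES_PATH)
    # Parse the feed; each entry describes one site
    sites = wmt.SitesFeedFromString(stream.read())

    return sites.entry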

with SITES_PATH being defined as

  SITES_PATH = '/webmasters/tools/feeds/sites/'

Additionally, gdata.webmastertools needs to be imported, so add this line to the list of imports at the beginning of the script:

import gdata.webmastertools as wmt
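To illustrate roughly what GetSitesList() extracts, here is a small sketch that pulls the site titles out of an Atom feed using only the standard library. It is not how the gdata library works internally; the SAMPLE_FEED content and the site_titles helper are made up for this example, and the real sites feed contains many more elements per entry.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom feeds
ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Invented, heavily simplified sample; not actual Webmaster Tools output
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sites</title>
  <entry>
    <title>http://www.example.com/</title>
  </entry>
  <entry>
    <title>http://www.example.org/</title>
  </entry>
</feed>"""


def site_titles(feed_xml):
    """Return the <title> text of every <entry> in an Atom feed."""
    root = ET.fromstring(feed_xml)
    return [
        entry.findtext(ATOM_NS + "title")
        for entry in root.findall(ATOM_NS + "entry")
    ]


for title in site_titles(SAMPLE_FEED):
    print(title)
```

The titles correspond to what the loop in the changed example script reads via site.title.text.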

12 Comments

  1. Looks like a very useful script. One small problem, however: is “wmt” not declared somewhere in your example for downloading all data? I get a “global name ‘wmt’ is not defined” error when I add your GetSitesList() to the downloader script.

    • Mark, thanks for spotting this and apologies for the mistake. I forgot to mention the import line which I have now added to the article.

      Sorry for not replying earlier but I’ve been away for some days.

  2. Pingback: Keyword not provided | FrankOli.de

  3. Hi! Did you find out how to download the charts? When I download the 30-day charts from GWT manually, they are saved under the name timeseries_queries, but I cannot find any method to download them automatically. It would be really nice to integrate all stats into one interface.

    • Hi Varun,

      no, I haven’t tried to use OAuth with this script. I opted for the much simpler username / password authentication, which I consider more practical for use in a scheduled script.

      Frank

      • I totally agree with you. For security reasons, I was hoping to have all my robots working without having them store any credentials, using OAuth 2 instead, just in case I put them in my dev environment, where more guys have access than I can count :).

  4. Pingback: Chrome bald auch mit Suche über SSL – noch mehr (not provided) | Webkruscht
