Skip to content


Google top searches export data

One of the best features in Google Webmaster Tools is the Top Search Queries data, which shows what search queries your website appears for, and which result in clickthroughs. In the web interface this is easy to use and provides a great overview. However, it is rather frustrating that the export feature exports this data in a format that is almost impossible to use:

webmaster

As you can see, all of the data in square brackets (columns D & E) is presented in one row, and very difficult to analyse.

I put together a basic perl script that will re-order this information, and split it into two separate spreadsheets – one for impression data, and one for clickthrough data. This generally results in quite large files, but the data is a lot more easily digestible and easier to manipulate using programs like Excel. Enjoy!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
#!/usr/bin/perl -w
use strict;

# You'll need to change the filename to correspond to your downloaded CSV file
open (DATA, 'TopSearchQueries_xxx.csv') or die "Error $!";

open (DIMP, '>wmt-impressions.csv')     or die "Error $!";
open (DCT,  '>wmt-clickthrus.csv')      or die "Error $!";

# create a new file that has 6 new columns (kw, %, pos, x 2)
print DIMP "Month,Locality,Type,Keyword,Percentage,Position\n";
print DCT "Month,Locality,Type,Keyword,Percentage,Position\n";

while (<DATA>) {
    my $line = $_;
   
    $line =~ s!""!!g;
    $line =~ s!"\(Virgin Islands, !"(Virgin Islands !g;

    if ($line =~ m{^([^,]*),    # Month / time period
                   ([^,]+),     # Locality
                   ([^,]+),     # Search type
                   "([^"]+)",   # impressions
                   (?:"
([^"]+)")?\s*$}xi) {
        my $month = $1;
        my $locality = $2;
        my $type = $3;
        my $impr = $4;
        my $ct = $5;
       
        while ($impr =~ m{\[([^,]+),([^,]+),([^,]+)\] }gi) {
            my $kw = $1;
            my $pc = $2;
            my $pos = $3;
            print DIMP "$month,$locality,$type,$kw,$pc,$pos\n";
        }

        if (defined($ct)) {
            if ($ct =~ m{\[([^,]+),([^,]+),([^,]+)\] }gi) {
                my $kw = $1;
                my $pc = $2;
                my $pos = $3;
                print DCT "$month,$locality,$type,$kw,$pc,$pos\n";
            }
        }
    }
}
  • Digg
  • del.icio.us
  • Sphinn
  • Google Bookmarks
  • Facebook
  • TwitThis
  • E-mail this story to a friend!

Posted in google, perl, seo.

0 Responses

Stay in touch with the conversation, subscribe to the RSS feed for comments on this post.

Some HTML is OK

(required)

(required, but never shared)

or, reply to this post via trackback.