RSS Aggregation with Google Tutorial
I wanted to add some good content to my website to hopefully attract some traffic. The best way I can think of to get good free content is by using RSS feeds. But how do you get what you want on your site and keep it updated dynamically?
Well, I have been playing around with Google Reader recently, and thought someone out there may want to know what I've been doing.
The first thing you are going to need is a Google account so that you can access Google Reader. Once you are logged in, just go to http://www.google.com/reader to set it up.
In this tutorial I am going to be combining the ITV and official Formula 1 RSS feeds. You can get these here:
http://www.formula1.com/rss/news/latest.rss
So add these feeds to Google Reader.

Now that you have added the feeds to aggregate, you will want to group them in a folder. This is done in the subscriptions screen. You can get there by clicking the "Manage Subscriptions" link at the bottom of the feed list, or by clicking on Settings at the top right and then moving to the Subscriptions tab.
From the Subscriptions tab you can rename a feed if required. Then use the Change Folders drop-down menu to assign each feed to a folder.

Now that you have grouped your feeds together, you will need to click on the Tags tab. This screen shows the groups you have created. We want to make the group publicly shared so that our website can access it.

Once you have shared the group, return to the main page, find the group in the left-hand menu and click to select it. With the group selected, the viewer shows the posts from all feeds in that group, and the page has a link to the RSS feed for that shared group. If you are using Internet Explorer, you should be able to select the shared feed from an orange icon at the end of the address bar.
Your shared feed should be something like the following:
http://www.google.com/reader/atom/user%2F00000000000000000000%2Flabel%2FF1?r=n
In the URL above, %2F is an encoded forward slash, and the long number is how your Google account is identified (represented by zeros in the example). The parameter on the end is not required, but we can swap it for the parameter n, which controls how many items are returned.
To use this information on our website we want it in a usable form. Luckily, Google's JavaScript interface gives us that simply by changing the URL format to the following:
http://www.google.com/reader/public/javascript/user/00000000000000000000/label/F1?n=5
Notice how we have replaced the start of the URL and turned each %2F back into a plain forward slash, but kept the user and label parts. We have also put the n parameter on the end so that only 5 items are returned in the feed.
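If you would rather not edit the URL by hand, the same rewrite can be sketched in a few lines of Perl. This is only an illustration: the helper name reader_json_url is made up for this example, and it assumes your shared feed URL has the /reader/atom/ form shown earlier.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of any Google API): turn the shared Atom
# feed URL into the JavaScript/JSON form, asking for $count items.
sub reader_json_url {
    my ( $atom_url, $count ) = @_;

    my $url = $atom_url;
    $url =~ s/\?.*\z//;                                    # drop the old query string
    $url =~ s{/reader/atom/}{/reader/public/javascript/};  # swap the start of the URL
    $url =~ s/%2F/\//gi;                                   # decode the encoded slashes

    return "$url?n=$count";
}

print reader_json_url(
    'http://www.google.com/reader/atom/user%2F00000000000000000000%2Flabel%2FF1?r=n',
    5
), "\n";
```

Run against the example feed URL, this prints the JavaScript URL shown above with n=5 on the end.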
Put this in your browser and TADA, it should return a nice big JSON (JavaScript Object Notation) data structure that we can use on our site.
BUT how do we use that?
Don't worry, JSON can be used by almost any language with a simple conversion. Take a look at http://www.json.org/ for more information about the specific language you are using.
I used Perl to implement my feeds. To convert the JSON data structure to a Perl one you can use the CPAN JSON or JSON::PP module: http://search.cpan.org/search?query=json&mode=all
And here is a really small script that shows you how to do it. I won't explain how this simple script works unless someone really wants me to.
#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use JSON::PP;
use LWP::Simple;
{
    my $q = CGI->new;
    # Public JSON view of the shared group; n limits the number of items.
    my $target = q|http://www.google.com/reader/public/javascript/user/00000000000000000000/label/F1?n=5|;
    my $response = get $target;
    die "Could not fetch $target\n" unless defined $response;
    my $rss = decode_json( $response );
    my $html = '';
    # Newest items first.
    for my $feeditem ( sort { $b->{'published'} <=> $a->{'published'} } @{ $rss->{'items'} } ) {
        # Format the published time as dd/mm/yyyy (gmtime months start at 0).
        my @dprts = gmtime $feeditem->{'published'};
        my $pdate = sprintf( "%02d/%02d/%04d", $dprts[3], $dprts[4] + 1, $dprts[5] + 1900 );
        my $href = $feeditem->{'alternate'}->{'href'};
        my $title = $feeditem->{'title'};
        my $source = $feeditem->{'origin'}->{'title'};
        $html .= qq|\n<div>|;
        $html .= qq|\n<div>$pdate: <a href="$href">$title</a></div>|;
        $html .= qq|\n<div>via. $source</div>|;
        $html .= qq|\n</div>|;
        $html .= qq|\n<div> </div>|;
    }
    # decode_json returns character strings, so encode them on output.
    binmode STDOUT, ':encoding(UTF-8)';
    print $q->header(
        -charset => 'utf-8',
        -expires => '-1d'
    ).$html;
}
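If you want to try the decoding and sorting without hitting Google at all, here is a rough offline sketch. The sample payload is made up for this example and contains only the fields the script above reads (published, title, alternate, origin); a real response will carry more than this. The format_date helper is also just a name I picked for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP;

# Made-up sample payload: only the fields the script above reads.
my $sample = <<'END_JSON';
{"items":[
  {"published":0,"title":"Older story","alternate":{"href":"http://example.com/old"},"origin":{"title":"Feed A"}},
  {"published":86400,"title":"Newer story","alternate":{"href":"http://example.com/new"},"origin":{"title":"Feed B"}}
]}
END_JSON

# Format an epoch timestamp as dd/mm/yyyy in UTC.
sub format_date {
    my @t = gmtime shift;
    return sprintf '%02d/%02d/%04d', $t[3], $t[4] + 1, $t[5] + 1900;
}

my $rss = decode_json($sample);

# Newest first, same ordering as the CGI script.
my @lines;
for my $item ( sort { $b->{published} <=> $a->{published} } @{ $rss->{items} } ) {
    push @lines, format_date( $item->{published} ) . ": $item->{title} (via $item->{origin}{title})";
}
print "$_\n" for @lines;
```

With this sample data the newer story is listed first, dated 02/01/1970, followed by the older one dated 01/01/1970.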