"In theory, there is no difference between theory and practice. But, in practice, there is." - Jan L. A. van de Snepscheut
In theory, the project to place RSS news feeds on a map could be very easy. GeoRSS is a standard for encoding geographic locations in RSS feeds, and feeds that carry this encoding are the perfect data source for this project. But, in practice, the feeds I've been looking at do not have GeoRSS information. So, I'm going to start by seeing what I can get done without GeoRSS.

The source of news that I'm working with is the AOL national news RSS feed. The description element of each item in the feed starts with a location. I extracted that location, geocoded it with the MapQuest (MQ) geocoding service, and then placed a point of interest (POI) marker on the map for each location.
In my last post, I laid out my design for MapNews. In this post, I present the working application and the code I've written on top of the MQ client JavaScript toolkit. The following screenshot of MapNews shows POI markers for cities that have news stories in an example feed. I've clicked on the POI marker for Salt Lake City to expose links to its news stories in the information window.
The HTML and JavaScript for MapNews are shown below. I used Prototype both for the AJAX fetch of the RSS feed and for parsing the RSS XML. On line 7, I included Prototype 1.6 because the version bundled with the MQ JavaScript library seemed to be missing the AJAX component (I also set ipr=false on line 6). I installed the PHP version of the MQ proxy on my server and pointed the MQExec object to the proxy on lines 13-24. I installed the MQ API JavaScript files on my server and referenced them on lines 8-10 (excluding mqcommon.js, which is included by line 6).
Lines 34-43 define the Feed class that represents each feed item and its associated geocoding result. I fired off an AJAX request to fetch the RSS file on lines 46-54 and handled the response in processRss (lines 57-71). Lines 60-63 create a Feed and pull the item title, link, and description straight from the RSS feed (the Prototype function cleanWhitespace (line 126) comes in handy here).
The function parseForCityState (called on line 64) uses a regex to find a city and, perhaps, a state in the description RSS element. The regex on line 114 will make more sense if you consider the following examples from the AOL RSS feed:
<description>
ST. LOUIS (AP) - Substandard care at a southern Illinois ...
</description>
<description>
ALBANY, N.Y. (AP) - It's the perfect tax: Government ...
</description>
Basically, I took the text from the start of the description up to the literal '(AP)'. If that text contained a comma, I inferred that the city was the text before the comma and the state was the text after it.
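The same idea can be tried outside the browser. The following is a minimal sketch in plain JavaScript rather than the code from the listing: the function name parseCityState and the returned object shape are my own assumptions for illustration, and it trims whitespace, which the regex alone does not do.

```javascript
// Sketch only: split the dateline that precedes "(AP)" into city and state.
// parseCityState and its return shape are hypothetical, not part of MapNews.
function parseCityState(description) {
  // Capture everything from the start of the text up to the literal "(AP)".
  const match = description.match(/^([^(]+)\(AP\)/);
  if (!match) {
    return null; // no recognizable dateline
  }
  const parts = match[1].split(',');
  return {
    city: parts[0].trim(),
    // Only filled in when the dateline contains a comma, e.g. "ALBANY, N.Y."
    state: parts.length > 1 ? parts[1].trim() : ''
  };
}

console.log(parseCityState("ST. LOUIS (AP) - Substandard care at a southern Illinois ..."));
console.log(parseCityState("ALBANY, N.Y. (AP) - It's the perfect tax: Government ..."));
```

A description that does not start with a dateline followed by '(AP)' yields null, which corresponds to the items that MapNews leaves without a city.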
With the city and state in hand, I called the MQ geocoding service (lines 100-111). The geocoding service worked well: only 1 of 35 items failed to geocode, even though many of them had only a city name. I picked the first element of the returned MQGeoAddress collection (line 109) and skipped items for which the geocoding failed (line 76).
Finally, the geocoded items are converted to POI markers with associated descriptions and links (lines 88-97). I grouped the items by location so that each location gets a single information window listing all of its items (lines 77-84). This was accomplished by building a hash whose keys are MQLatLng.toString() (which is a concatenation of longitude and latitude) and whose values are lists of Feed objects.
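The grouping step does not depend on the MQ API, so it can be sketched on its own. The version below is an illustration in plain JavaScript using a Map instead of Prototype's Hash. The item shape ({ title, link, city, lat, lng }) and the function names groupByLocation and infoWindowHtml are assumptions made for this example.

```javascript
// Sketch only: group feed items that geocode to the same point.
// The item shape and function names are hypothetical, not taken from MapNews.
function groupByLocation(items) {
  const groups = new Map();
  for (const item of items) {
    if (item.lat == null || item.lng == null) {
      continue; // geocoding failed for this item, so skip it
    }
    const key = item.lat + ',' + item.lng; // one key per distinct point
    if (!groups.has(key)) {
      groups.set(key, []);
    }
    groups.get(key).push(item);
  }
  return groups;
}

// Build the HTML for one information window from a group of items.
function infoWindowHtml(group) {
  return group
    .map((item) => "<a href='" + item.link + "'>" + item.title + "</a><br>")
    .join('');
}

// Made-up sample data: two stories share a point, one failed to geocode.
const sampleItems = [
  { title: 'Story A', link: 'a.html', city: 'SALT LAKE CITY', lat: 40.76, lng: -111.89 },
  { title: 'Story B', link: 'b.html', city: 'SALT LAKE CITY', lat: 40.76, lng: -111.89 },
  { title: 'Story C', link: 'c.html', city: 'ALBANY', lat: 42.65, lng: -73.76 },
  { title: 'Story D', link: 'd.html', city: 'NOWHERE', lat: null, lng: null }
];

for (const [key, group] of groupByLocation(sampleItems)) {
  console.log(key, group[0].city, infoWindowHtml(group));
}
```

Each entry of the resulting Map corresponds to one POI marker, with the first item's city as the window title and the concatenated links as its content.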
I'm pretty happy with the application so far. There is one problem, however: I'm firing off one geocoding request per RSS item from the client, and the page load is too slow. I envision that the fix will be either to use the batch geocoding method from the MQ API or to do the geocoding once per RSS feed on the server.
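Whichever of those fixes is chosen, a useful first step is to reduce the items to their distinct locations, since several stories often share a dateline. The sketch below shows only that de-duplication step in plain JavaScript. It does not call the MQ batch geocoding API, and the function name uniqueLocations and the item shape are assumptions for illustration.

```javascript
// Sketch only: collect the distinct city/state pairs from a list of items,
// so that each place needs to be geocoded once rather than once per story.
// uniqueLocations and the item shape are hypothetical, not part of MapNews.
function uniqueLocations(items) {
  const seen = new Map();
  for (const item of items) {
    if (!item.city) {
      continue; // nothing to geocode for this item
    }
    const state = item.state || '';
    const key = item.city + '|' + state;
    if (!seen.has(key)) {
      seen.set(key, { city: item.city, state: state });
    }
  }
  return Array.from(seen.values());
}

// Made-up sample: four stories, two distinct datelines.
const stories = [
  { title: 'Story A', city: 'ST. LOUIS', state: '' },
  { title: 'Story B', city: 'ALBANY', state: 'N.Y.' },
  { title: 'Story C', city: 'ST. LOUIS', state: '' },
  { title: 'Story D', city: '', state: '' }
];

console.log(uniqueLocations(stories));
```

The resulting list is what would be handed to a batch or server-side geocoder, and the results could then be mapped back to the items by the same city and state key.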
The next steps for MapNews are to consume GeoRSS feeds and to adopt batch geocoding for feeds without GeoRSS encoding.
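As a preview of the first of these steps: in the GeoRSS-Simple encoding, a point is written as a latitude and a longitude separated by whitespace inside a georss:point element. The sketch below reads such a point from an item's XML text. It is an assumption-laden illustration: the sample item is made up, the function name parseGeoRssPoint is hypothetical, the georss prefix is assumed, and a regex stands in for a real XML parser only to keep the example self-contained.

```javascript
// Sketch only: read a GeoRSS-Simple point ("latitude longitude") from item XML.
// parseGeoRssPoint is a hypothetical helper; real code should use an XML parser
// and resolve the GeoRSS namespace instead of assuming the "georss" prefix.
function parseGeoRssPoint(itemXml) {
  const number = '(-?\\d+(?:\\.\\d+)?)';
  const pattern = new RegExp(
    '<georss:point>\\s*' + number + '\\s+' + number + '\\s*</georss:point>'
  );
  const match = itemXml.match(pattern);
  if (!match) {
    return null; // the item carries no GeoRSS point
  }
  return { lat: parseFloat(match[1]), lng: parseFloat(match[2]) };
}

// Made-up item for illustration; the coordinates are approximate.
const sampleItem =
  '<item>' +
  '<title>Example story</title>' +
  '<georss:point>40.76 -111.89</georss:point>' +
  '</item>';

console.log(parseGeoRssPoint(sampleItem));
```

With a point like this available, the dateline parsing and the geocoding request could be skipped for that item.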
001 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
002 <html>
003 <head>
004 <title>MapNews</title>
005 <link rel="stylesheet" href="mapnews.css"
type="text/css"/>
006 <script src="http://btilelog.access.mapquest.com/tilelog/transaction?
transaction=script&key=mjtd%7Clu6y290rn9%2C22%3Do5-0utn1
&ipr=false&itk=true&v=5.2.0"
type="text/javascript"></script>
007 <script src='prototype-1.6.0.2.js'
type='text/javascript'></script>
008 <script src='mqutils.js'
type='text/javascript'></script>
009 <script src='mqobjects.js'
type='text/javascript'></script>
010 <script src='mqexec.js'
type='text/javascript'></script>
011 <script language='javascript'>
012
013 var g_proxyServerName = '';
014 var g_proxyServerPort = '';
015 var g_proxyServerPath = '/mq/JSReqHandler.php5'
016
017 var g_serverName = 'geocode.dev.mapquest.com';
018 var g_serverPort = '80';
019 var g_serverPath = 'mq';
020
021 var g_geoExec = new MQExec(
022 g_serverName, g_serverPath, g_serverPort,
023 g_proxyServerName, g_proxyServerPath, g_proxyServerPort
024 );
025
026 var g_mqMap;
027
028 function startMap() {
029 g_mqMap = new MQTileMap(document.getElementById('mapWindow'), 2,
030 new MQLatLng(39.81,-98.56), "map");
031 getRss();
032 }
033
034 function Feed(){}
035 Feed.prototype = {
036 title: '',
037 link: '',
038 description: '',
039 city: '',
040 state: '',
041 country: 'USA',
042 geoAddress: null
043 };
044
045
046 function getRss() {
047 new Ajax.Request('news_top_nat.xml', {
048 method:'get',
049 onSuccess: function(transport){
050 processRss(transport.responseXML);
051 },
052 onFailure: function(){ alert('Something went wrong...') }
053 });
054 }
055
056
057 function processRss(root) {
058 var feeds = new Array();
059 Element.select(root, 'item').each(function(item, i) {
060 var feed = new Feed();
061 feed.title = getChildsText(item, 'title');
062 feed.link = getChildsText(item, 'link');
063 feed.description = getChildsText(item, 'description');
064 parseForCityState(feed);
065
066 geoCode(feed);
067 feeds.push(feed);
068
069 });
070 placeOnMap(feeds);
071 }
072
073 function placeOnMap(feeds) {
074 var locToFeed = new Hash();
075 feeds.each(function(feed) {
076 if(feed.geoAddress) {
077 var thisKey = feed.geoAddress.getMQLatLng().toString();
078 if(locToFeed.keys().indexOf(thisKey) > -1) {
079 locToFeed.get(thisKey).push(feed);
080 } else {
081 var a = new Array();
082 a.push(feed);
083 locToFeed.set(thisKey, a);
084 }
085 }
086 });
087
088 locToFeed.values().each(function(feeds) {
089 var html = '';
090 feeds.each(function(feed) {
091 html += "<a href='#{link}'>#{title}</a><br>".interpolate(feed);
092 });
093 var poi = new MQPoi(feeds[0].geoAddress.getMQLatLng());
094 poi.setInfoTitleHTML(feeds[0].city);
095 poi.setInfoContentHTML(html);
096 g_mqMap.addPoi(poi);
097 });
098 }
099
100 function geoCode(feed) {
101 if(feed.city != '') {
102 var address = new MQAddress();
103 address.setCity(feed.city);
104 address.setState(feed.state);
105 address.setCountry(feed.country);
106
107 var gaCollection = new MQLocationCollection("MQGeoAddress");
108 g_geoExec.geocode(address, gaCollection);
109 feed.geoAddress = gaCollection.get(0);
110 }
111 }
112
113 function parseForCityState(feed) {
114 var match = feed.description.match(/^([^(]+)\(AP\)/);
115 if(match) {
116 var cityAndMaybeState = match[1].split(',');
117 feed.city = cityAndMaybeState[0].strip(); // trim stray whitespace
118 if(cityAndMaybeState.length > 1) {
119 feed.state = cityAndMaybeState[1].strip(); // e.g. ' N.Y. ' becomes 'N.Y.'
120 }
121 }
122 }
123
124 function getChildsText(item, whichChild) {
125 var child = Element.select(item, whichChild)[0];
126 return Element.cleanWhitespace(child).firstChild.nodeValue;
127 }
128 </script>
129 </head>
130
131 <body onload="startMap();">
132 <h1>MapNews</h1>
133 <hr>
134 <div id="mapWindow" style=""></div>
135 <hr>
136 </body>
137 </html>