
Parsing RSS feed using... well, anything, really.


Spudhead
12-19-2006, 11:15 AM
I'm getting a little nervous that I'm not doing this quite right. I'd appreciate any advice - I've looked all over at AJAX / XML DOM / related articles and I've still not seen what I'd describe as a complete solution.

So I have an XML RSS news feed file. (This will be generated by an ASP backend, that's no worries.) Obviously, for RSS news aggregators and stuff, we need to make that file available. It is, here (http://ben.ifthengo.com/rss/rss.xml).

Being all swishy and Web 2.0, we want some AJAX in there. Some (prototype-powered) AJAX:


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Untitled Document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

<script type="text/javascript" src="prototype-1.4.0.js"></script>

<script type="text/javascript">
function getNews(){
var url = 'http://ben.ifthengo.com/rss/rss.xml';
var pars = '';

var myAjax = new Ajax.Request( url, { method: 'get', parameters: pars, onComplete: showResponse });
}

function showResponse(originalRequest){
//$('newsDiv').innerHTML = originalRequest.responseText;
var xml = originalRequest.responseXML;
$('newsBox').value = originalRequest.responseText;

}
</script>

</head>

<body onLoad="getNews()">

<!--<div id="newsDiv">Latest news</div>-->

<textarea id="newsBox" rows="10" cols="50"></textarea>

</body>
</html>



As you can probably see, it's at this point that I start to get confounded. I've got my XML, what the dickens am I supposed to do with it? I want a pretty news box with rounded edges and links and scrollableness and all that stuff.

I could get the responseXML as a DOM object and traverse it in Javascript, generating HTML. Not entirely sure how to do that, the examples I've seen appear to descend quickly into browser-detection; I don't want to go there. I'm using prototype.js - surely that's got some sort of XML-parsing... thing? Right?

I could also write an XSLT for it - I'm reasonably familiar with XSL, I reckon I could muddle through. But I've not written one that'll just generate a BIT of a page before, and I'm still not sure how I'd apply an XSLT to an XML DOM object, or even simply a string of XML text. I've only ever done it with XML files.

So... I need a bit of help to get me the final mile. This has, eventually, got to go on the front page of a corporate web design agency, so it's got to be something that code-minded clients can peek at and go "mmm, that's an elegant and novel use of technology. I like what you've done there."

And if I could have it by about 4pm, that'd be, like, totally cool. :thumbsup:

Thanks for any replies. :)

Spudhead
12-19-2006, 12:47 PM
Ok, I've been playing.


<script type="text/javascript">
function getNews(url){
var pars = '';
var myAjax = new Ajax.Request( url, { method: 'get', parameters: pars, onComplete: showNews });
}

function showNews(newsXML){
var sHTML = "";
var newsItems = newsXML.responseXML.getElementsByTagName("item");
// declare the loop counter and per-item values with var so they don't leak into the global scope
for (var i=0;i<newsItems.length;i++){
var newsTitle = newsItems[i].getElementsByTagName("title")[0].firstChild.nodeValue;
var newsLink = newsItems[i].getElementsByTagName("link")[0].firstChild.nodeValue;
var newsDescription = newsItems[i].getElementsByTagName("description")[0].firstChild.nodeValue;
var newsCategory = newsItems[i].getElementsByTagName("category")[0].firstChild.nodeValue;

sHTML = sHTML + '<h1><a href="' + newsLink + '">' + newsTitle + '</a></h1><p>' + newsDescription + '</p>';

}

$('newsDiv').innerHTML = sHTML;

}
</script>


It does what I want, but I'm not convinced about my XML DOM-traversing. Does that syntax look elegant to you? :o
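
One way to tidy the traversal above is to pull the repeated getElementsByTagName(...)[0].firstChild.nodeValue chain into a small helper. This is only a sketch: childText, escapeHTML and renderItems are hypothetical names, not part of prototype.js, and it assumes each field is a single text node. It also guards against a missing element (for example an item with no category) and escapes the values before they go into innerHTML.

```javascript
// Hypothetical helper: text of the first <tagName> descendant of `node`,
// or `fallback` (default '') when the element is missing or empty.
function childText(node, tagName, fallback) {
  var el = node.getElementsByTagName(tagName)[0];
  return (el && el.firstChild) ? el.firstChild.nodeValue : (fallback || '');
}

// Escape the characters that are significant in HTML text and attribute values.
function escapeHTML(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build the markup for a list of <item> elements (same output shape as showNews).
function renderItems(items) {
  var parts = [];
  for (var i = 0; i < items.length; i++) {
    var title = childText(items[i], 'title');
    var link = childText(items[i], 'link', '#');
    var description = childText(items[i], 'description');
    parts.push('<h1><a href="' + escapeHTML(link) + '">' + escapeHTML(title) +
               '</a></h1><p>' + escapeHTML(description) + '</p>');
  }
  return parts.join('');
}
```

With that in place the callback could shrink to something like $('newsDiv').innerHTML = renderItems(newsXML.responseXML.getElementsByTagName("item")); Note that escaping the description is a design choice: a feed that deliberately puts HTML in its descriptions would need different handling.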

david_kw
12-19-2006, 05:03 PM
It seems like it should have some errors with links to the wrong spots. The problem with getElementsByTagName() is that it doesn't differentiate between elements when a tag like <link> can appear under different parents.

Looking at your xml doc


<?xml version="1.0" ?>
<!-- RSS generated by me on 19/12/2006 09:42:15< -->
<rss version="2.0">
<channel>
<title>Latest News</title>
<link>http://ben.ifthengo.com</link>
<description>Latest News from me</description>
<language>en</language>
<copyright>Copyright 2006 me</copyright>
<lastBuildDate>19/12/2006 09:42:15</lastBuildDate>
<image>
<title>my rss news feed</title>
<url>http://ben.ifthengo.com/img/rss.gif</url>
<link>http://ben.ifthengo.com</link>
</image>
<item>
<title>Man bites dog</title>
<link>http://ben.ifthengo.com/news.asp?id=1</link>
<description>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Mauris tortor elit, egestas in, fringilla at, hendrerit dignissim, augue. Curabitur ac lectus. Vestibulum varius vestibulum nunc.</description>
<category>Press release</category>
</item>
<item>
<title>Dog bites man</title>
<link>http://ben.ifthengo.com/news.asp?id=2</link>
<description>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Mauris tortor elit, egestas in, fringilla at, hendrerit dignissim, augue. Curabitur ac lectus. Vestibulum varius vestibulum nunc.</description>
<category>News</category>
</item>
</channel>
</rss>


You can see 2 occurrences of <item> but 4 occurrences of <link>. That means your links are incorrect for the items you've displayed.

XPath is really a better way to do it since it allows you to specify "item/link" which will only return the correct 2 links. Does script.aculo.us have cross-browser XPath support?

david_kw

Spudhead
12-20-2006, 09:56 AM
I've already got just the news items, in:

var newsItems = newsXML.responseXML.getElementsByTagName("item");

I'm then getting just the link for each item, in:

newsItems[ i ].getElementsByTagName("link")


I think my problem is that I'm expecting too much of prototype.js - and of XML as a useful and flexible data repository. I'm expecting - and trying - to use it too much like I would a traditional server-side database solution, where I can look up values and generally jump around in the data to a much greater extent, without all this code. When I find myself having to select the first child node of the links array of a particular tag, when I know there's only one and that to do it in SQL is just "SELECT link FROM item WHERE id=42", I get worried.
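
A possible middle ground for the "look it up like a database" wish is to convert the feed into plain JavaScript objects once, and then query those instead of the DOM. The sketch below is an illustration only: itemToObject, feedToRecords and findBy are hypothetical names, and it assumes every child of an item holds a single text node (if a tag repeats within an item, the last one wins).

```javascript
// Hypothetical helper: copy the child elements of one <item> into a plain object,
// e.g. { title: '...', link: '...', description: '...', category: '...' }.
function itemToObject(item) {
  var record = {};
  for (var child = item.firstChild; child; child = child.nextSibling) {
    if (child.nodeType === 1) { // element nodes only; skips whitespace text nodes
      record[child.nodeName] = child.firstChild ? child.firstChild.nodeValue : '';
    }
  }
  return record;
}

// Turn every <item> in the feed document into a record.
function feedToRecords(xmlDoc) {
  var items = xmlDoc.getElementsByTagName('item');
  var records = [];
  for (var i = 0; i < items.length; i++) {
    records.push(itemToObject(items[i]));
  }
  return records;
}

// Rough equivalent of "SELECT * FROM item WHERE field = value" (first match only).
function findBy(records, field, value) {
  for (var i = 0; i < records.length; i++) {
    if (records[i][field] === value) {
      return records[i];
    }
  }
  return null;
}
```

Inside the Ajax callback this might be used as var story = findBy(feedToRecords(newsXML.responseXML), 'title', 'Dog bites man'); after which story.link reads much like a column lookup.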

Spudhead
12-20-2006, 04:26 PM
42 views, one reply? Clearly I'm not the only person I've managed to confuse :D

david_kw
12-20-2006, 04:32 PM
Ahh good point. I missed that one.

But XPath is probably what you are looking for. If you wanted to do that SQL query in XPath it would be something like

getNode("item/link[@id=52]");

This says get link where it is a child of item and its id attribute is equal to 52. And the queries can become MUCH more complex.

Unfortunately you need a cross-browser platform of some sort since IE and FF have different XPath solutions. Does Prototype have one?
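
For what it's worth, a thin wrapper over the two native XPath APIs is fairly short. The following is a minimal sketch, not a Prototype feature: xpathNodes is a hypothetical name, it uses document.evaluate where the browser provides it (the W3C DOM Level 3 XPath interface, as in Firefox) and falls back to MSXML's selectNodes for IE. Namespaces are ignored, which should be fine for a plain RSS 2.0 feed.

```javascript
// Hypothetical wrapper: return an array of the nodes matching `xpath`,
// evaluated against `contextNode` (defaults to the document itself).
function xpathNodes(xmlDoc, xpath, contextNode) {
  var context = contextNode || xmlDoc;
  var nodes = [];
  var i;
  if (typeof xmlDoc.evaluate === 'function') {
    // W3C DOM Level 3 XPath. 7 is XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.
    var result = xmlDoc.evaluate(xpath, context, null, 7, null);
    for (i = 0; i < result.snapshotLength; i++) {
      nodes.push(result.snapshotItem(i));
    }
  } else if (typeof context.selectNodes !== 'undefined') {
    // MSXML (IE). Older MSXML versions default to XSLPattern, so ask for XPath.
    xmlDoc.setProperty('SelectionLanguage', 'XPath');
    var list = context.selectNodes(xpath);
    for (i = 0; i < list.length; i++) {
      nodes.push(list.item(i));
    }
  } else {
    throw new Error('No XPath support found');
  }
  return nodes;
}
```

Against the feed quoted earlier, xpathNodes(newsXML.responseXML, '/rss/channel/item/link') should return only the two item links. That feed has no id attribute on its elements (the id appears only inside the link URL), so a lookup by id would need something along the lines of "/rss/channel/item[contains(link, 'id=1')]/title", which is an assumption about what is wanted rather than a direct translation of the SQL.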

david_kw