Working with the DOM and GM_xmlhttpRequest

12-29-2008, 06:55 AM

I'm working on a Greasemonkey script that uses the New York Times comments API. One of the things I want to do is fetch the XML with GM_xmlhttpRequest and parse it.

Here's an example of what I'm pulling out:

and here is my js:

// set up 2 variables
var relatedDiv, commentPreview;
// define relatedDiv
relatedDiv = document.getElementById('related-content');
if (relatedDiv) {
    // fetch and display feed
    GM_xmlhttpRequest({
        method: 'GET',
        //url: 'http://greaseblog.blogspot.com/atom.xml',
        url: 'http://api.nytimes.com/svc/community/v2/comments/url/exact-match.xml?url=http%3A%2F%2Fdotearth.blogs.nytimes.com%2F2008%2F12%2F16%2Fa-cooler-year-on-a-warming-planet%2F&api-key=mschwjy9jpmur8f98nerbjtv',
        headers: {
            'User-agent': 'Mozilla/4.0 (compatible) Greasemonkey/0.3',
            'Accept': 'application/atom+xml,application/xml,text/xml'
        },
        onload: function(responseDetails) {
            // parse the response text into an XML document
            var parser = new DOMParser();
            var dom = parser.parseFromString(responseDetails.responseText,
                'application/xml');
            // go thru comment nodes
            var entries = dom.getElementsByTagName('comment');
            // cTotal is not defined in the posted snippet; assumed here to be the number of comment nodes
            var cTotal = entries.length;
            var cBody, cAuthor, cAuthorLoc, cRecommendations, cSequence;
            for (var i = 0; i < entries.length; i++) {
                cBody = entries[i].getElementsByTagName('commentBody')[0].textContent;
                cAuthor = entries[i].getElementsByTagName('display_name')[0].textContent;
                cAuthorLoc = entries[i].getElementsByTagName('location')[0].textContent;
                cRecommendations = entries[i].getElementsByTagName('recommendations')[0].textContent;
                cSequence = entries[i].getElementsByTagName('commentSequence')[0].textContent;
                // create the div
                commentPreview = document.createElement('div');
                commentPreview.innerHTML = '<p> ' +
                    cSequence + '/' + cTotal + ' ' + cBody +
                    '<br>posted by: ' + cAuthor + ', ' + cAuthorLoc + '<br>' +
                    cRecommendations + ' recommendations</p>';
                // write out my div before the related posts
                relatedDiv.parentNode.insertBefore(commentPreview, relatedDiv);
            }
        }
    });
}

This works fine for plain comments: I can pull out authors, locations, etc. Where it breaks down is with responses, which show up in the XML like this:

Mr. Revkin ... It's about time the NYT editorial board directed it's writers to "report" the facts, rather than perpetuating this outrageous climate change lie.<br /><br />“Make the lie big, make it simple, keep saying it, and eventually they will believe it”<br />
You've got to be more specific here. Is the "lie" that the world is warming, that people are contributing, that it's dangerous, or ...? I think you'll find my coverage is held out on all "sides" to be fair and accurate, both in describing what science has revealed, and what it hasn't figured out yet.
<a href="http://topics.nytimes.com/top/reference/timestopics/people/r/andrew_c_revkin/index.html" target="_blank">Andy Revkin</a
<location>Garrison, NY</location>
<display_name>Rich Pletcher</display_name>
<location>Fuquay Varina, NC</location>

If I request the node named display_name, it is matched at any depth in the hierarchy, when I don't want to go that deep. But as far as I can find in the documentation, there is no way to limit the depth of that lookup on a GM_xmlhttpRequest response.

Am I missing something? How can I tell the script to take the display_name from the top level of a comment first, and only then move into the responses and grab that one too?

Simple to do in xsl, but is there a mechanism for it here?

Thanks for any help.

rnd me
01-04-2009, 03:04 AM
this is either invalid XML, or a cut-off version.
thus there is not enough information to tell.

what i can say is that you can use getElementsByTagName on tags as well as whole documents.

without knowing the structure of the xml, something like


should give you what you want. ask for the parent tag, then the "display_name" tag.
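
The snippet that belonged here did not survive in the archive. As a rough sketch of the same idea (not rnd me's original code): getElementsByTagName on an element still matches at every depth below it, so one option is to walk childNodes and keep only direct children. The helper names directChildrenByTagName and directChildText are hypothetical, and the sketch assumes display_name is a direct child of the element it is called on, which the truncated XML above does not confirm.

```javascript
// Hypothetical helper (not part of the DOM API): return only the direct
// child elements of `parent` whose tag name equals `tagName`.
function directChildrenByTagName(parent, tagName) {
    var matches = [];
    for (var i = 0; i < parent.childNodes.length; i++) {
        var child = parent.childNodes[i];
        // nodeType 1 is an element node; this skips text and whitespace nodes
        if (child.nodeType === 1 && child.nodeName === tagName) {
            matches.push(child);
        }
    }
    return matches;
}

// Text of the first direct child with that tag name, or '' if there is none.
function directChildText(parent, tagName) {
    var matches = directChildrenByTagName(parent, tagName);
    return matches.length ? matches[0].textContent : '';
}
```

Inside the loop this would be called as directChildText(entries[i], 'display_name'). If display_name actually sits one level further down, under some wrapper element, the helper can be applied twice: first to get the wrapper, then to get display_name from it.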

also don't forget that you can easily use xsl.
run the transform in greasemonkey and .innerHTML the result into your document...

01-06-2009, 05:46 AM
Yes, this is a snippet.

But thanks rnd me, I didn't know I could use getElementsByTagName that way. In the meantime I decided to go the xpath route.
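
The XPath version was not posted, so the following is only a sketch of what such a lookup could look like, not the code actually used. It relies on document.evaluate, where a relative step like './display_name' matches direct children only, while './/display_name' would match at any depth. The helper name childTextByXPath is hypothetical, and the sketch assumes display_name is a direct child of the context node.

```javascript
// Numeric value of XPathResult.FIRST_ORDERED_NODE_TYPE in the DOM XPath API.
var FIRST_ORDERED_NODE_TYPE = 9;

// Hypothetical helper: evaluate `tagName` as a single relative XPath step
// from `contextNode`, so only direct children can match.
function childTextByXPath(xmlDoc, contextNode, tagName) {
    var result = xmlDoc.evaluate('./' + tagName, contextNode, null,
        FIRST_ORDERED_NODE_TYPE, null);
    var node = result.singleNodeValue;
    // singleNodeValue is null when nothing matched
    return node ? node.textContent : '';
}
```

In the script above it would be called as childTextByXPath(dom, entries[i], 'display_name'), where dom is the document returned by DOMParser.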