Getting a list of 'authors' from an Atom feed



the-dream
03-05-2009, 06:51 PM
Hi guys...

I'm in search of help again. Basically, I'm trying to get a list of 'authors' from a Twitter Atom feed in the simplest way possible.

This is what I've done so far...



<?php

$options = array(
CURLOPT_RETURNTRANSFER => true, // return web page
CURLOPT_HEADER => false, // don't return headers
CURLOPT_FOLLOWLOCATION => true, // follow redirects
CURLOPT_ENCODING => "", // handle all encodings
CURLOPT_USERAGENT => "spider", // who am i
CURLOPT_AUTOREFERER => true, // set referer on redirect
CURLOPT_CONNECTTIMEOUT => 120, // timeout on connect
CURLOPT_TIMEOUT => 120, // timeout on response
CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
);

$feedURL = "http://search.twitter.com/search.atom?q=dog";

$ch = curl_init($feedURL);
curl_setopt_array($ch,$options);
$content = curl_exec( $ch );
$err = curl_errno( $ch );
$errmsg = curl_error( $ch );
$header = curl_getinfo( $ch );
curl_close( $ch );

$header['errno'] = $err;
$header['errmsg'] = $errmsg;
$twitterFeed = $content;

file_put_contents("tweetdata.xml", $twitterFeed);

?>


So basically, it's grabbing the feed:
http://search.twitter.com/search.atom?q=dog

And putting all of the data into the local file:
tweetdata.xml
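
The script above collects the cURL error number and message but never checks them before writing the file. A minimal sketch of such a check follows; saveFeed() is a hypothetical helper, not part of the original script, and it assumes its first three arguments are the values returned by curl_exec(), curl_errno() and curl_error().

```php
<?php
// Hypothetical helper (not in the original post): write the feed to disk
// only when cURL reported success.
function saveFeed($content, int $errno, string $errmsg, string $path): bool
{
    // With CURLOPT_RETURNTRANSFER set, curl_exec() returns false on failure.
    if ($content === false || $errno !== 0) {
        error_log("Feed download failed ($errno): $errmsg");
        return false;
    }

    // file_put_contents() returns false if the file could not be written.
    return file_put_contents($path, $content) !== false;
}
```

In the script above it would be called as saveFeed($content, $err, $errmsg, "tweetdata.xml") in place of the bare file_put_contents() call.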

Now, I need to figure out how to get the following out of that file...


<entry>
<id>tag:search.twitter.com,2005:1284428632</id>
<published>2009-03-05T18:09:46Z</published>
<link type="text/html" rel="alternate" href="http://twitter.com/Unclefishbits/statuses/1284428632"/>
<title>my dog carries his food into our living room &amp; drops it on the floor in various patterns. I am convinced he is communicating with me.</title>
<content type="html">my &lt;b&gt;dog&lt;/b&gt; carries his food into our living room &amp;amp; drops it on the floor in various patterns. I am convinced he is communicating with me.</content>
<updated>2009-03-05T18:09:46Z</updated>
<link type="image/png" rel="image" href="http://s3.amazonaws.com/twitter_production/profile_images/57614586/TUNATO_normal.jpg"/>
<twitter:source>&lt;a href="http://www.twhirl.org/"&gt;twhirl&lt;/a&gt;</twitter:source>
<author>
<name>Unclefishbits (Fishbits)</name>
<uri>http://twitter.com/Unclefishbits</uri>
</author>
</entry>


The bit I need to parse out is the author's <name> (shown in bold in the original post). I have no idea how to do this, and have been trying for hours!

Please help!

Thanks!

Iszak
03-05-2009, 07:14 PM
Many people will tell you to use an XML parser, but if you want a quick way (quick to write, not necessarily efficient), use a regular expression like this:


preg_match_all('#<name>([A-Z0-9]+)\s\(([^\)]+)\)</name>#i', $twitterFeed, $match);

I've made it so it separates the user name and the real name, but I don't know which characters are valid in user names, so you may have to add extra characters to [A-Z0-9], such as hyphens and underscores. If you want the complete string as a whole, you can use:



preg_match_all('#<name>([^<]+)</name>#i', $twitterFeed, $match);
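
To make the shape of $match concrete, here is a small sketch that runs a variant of the first pattern on sample data. It swaps [A-Z0-9] for \w so that underscores are accepted as well; that substitution and the second author, some_user, are assumptions made for illustration, not taken from the feed.

```php
<?php
// Sample data: the <author> block quoted earlier, plus a made-up second
// author (hypothetical) whose user name contains an underscore.
$twitterFeed = '<author><name>Unclefishbits (Fishbits)</name>'
    . '<uri>http://twitter.com/Unclefishbits</uri></author>'
    . '<author><name>some_user (Some User)</name>'
    . '<uri>http://twitter.com/some_user</uri></author>';

// \w matches letters, digits and underscores.
preg_match_all('#<name>(\w+)\s\(([^)]+)\)</name>#', $twitterFeed, $match);

$userNames = $match[1]; // first capture group: the user names
$realNames = $match[2]; // second capture group: the names in parentheses

print_r($userNames);
print_r($realNames);
```

$match[0] holds the full matched strings, and each later index holds one capture group across all matches, so $match[1][0] and $match[2][0] belong to the same author.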


Enjoy!
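
For comparison, the XML parser route mentioned above might look roughly like the sketch below, using PHP's DOM extension. The inline sample stands in for the downloaded feed, and it assumes the real feed declares the standard Atom namespace on its root element; the prefix "atom" is just a local label chosen for the XPath query.

```php
<?php
// Minimal stand-in for the downloaded feed (assumption: the real feed uses
// the standard Atom namespace as its default namespace).
$twitterFeed = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>my dog carries his food into our living room</title>
    <author>
      <name>Unclefishbits (Fishbits)</name>
      <uri>http://twitter.com/Unclefishbits</uri>
    </author>
  </entry>
</feed>
XML;

$doc = new DOMDocument();
$doc->loadXML($twitterFeed);

// XPath needs a prefix bound to the default namespace before it can match.
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');

$authors = [];
foreach ($xpath->query('//atom:author/atom:name') as $node) {
    $authors[] = $node->textContent;
}

print_r($authors);
```

Unlike the regular expression, this only picks up <name> elements that sit directly inside an <author>, and it decodes XML entities in the text.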

the-dream
03-05-2009, 07:19 PM
Thanks!

Yeah, I wanted to use a regex rather than a full-blown XML parser; a parser just seemed a bit overkill for what I was trying to do.

I'll let you know how I get on!

the-dream
03-05-2009, 07:54 PM
Just in case you're interested, I did some more coding, and this script now returns a list of Twitter usernames for a given Twitter search:



<?php

$options = array(
CURLOPT_RETURNTRANSFER => true, // return web page
CURLOPT_HEADER => false, // don't return headers
CURLOPT_FOLLOWLOCATION => true, // follow redirects
CURLOPT_ENCODING => "", // handle all encodings
CURLOPT_USERAGENT => "spider", // who am i
CURLOPT_AUTOREFERER => true, // set referer on redirect
CURLOPT_CONNECTTIMEOUT => 120, // timeout on connect
CURLOPT_TIMEOUT => 120, // timeout on response
CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
);

$feedURL = "http://search.twitter.com/search.atom?q=%23macbundlebox";

$ch = curl_init($feedURL);
curl_setopt_array($ch,$options);
$content = curl_exec( $ch );
$err = curl_errno( $ch );
$errmsg = curl_error( $ch );
$header = curl_getinfo( $ch );
curl_close( $ch );

$header['errno'] = $err;
$header['errmsg'] = $errmsg;
$twitterFeed = $content;

file_put_contents("tweetdata.xml", $twitterFeed);

preg_match_all('#<name>([^<]+)</name>#i', $twitterFeed, $match);

// Each name looks like "username (Real Name)"; keep only the part before the "(".
foreach ($match[1] as $name) {
    $parts = explode('(', $name, 2);
    echo trim($parts[0]).'<br />';
}

?>


