01-08-2010, 08:17 PM
I wonder if someone would do me a small favor and help me with a regex problem? This code was written by somebody else, and it works great for extracting link URLs on a page. But I don't want it to look for ALL links, just the links that have this beginning:


So actually the links would look like this:

<a href="/search.php?????">

$source = preg_replace("/a[\s]+[^>]*?href[\s]?=[\s\"\']+"."(.*?)[\"\']+.*?>"."([^<]+|.*?)?<\/a>/", urlencode($source), $source);


01-09-2010, 03:06 AM
/(<a href=")(\/search\.php\?.+?)(">)(.+?)(<\/a>)/

Untested. Maybe something like that...

01-09-2010, 04:40 AM
'%<a\s[^>]*href="(?P<url>/search\.php\?[^"]+)"[^>]*>(?P<text>[^<]+)</a>%si'
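
A minimal sketch of how a named-group pattern like this might be used with preg_match_all. The sample HTML and the variable names ($html, $links) are made up for illustration, and the pattern here is written so that it also accepts links with nothing between "<a" and "href" and link text that contains spaces.

```php
<?php
// Hypothetical sample page; in practice $html would hold the fetched source.
$html = '<a href="/search.php?q=php">PHP results</a> '
      . '<a class="nav" href="/about.php">About</a> '
      . '<a class="hit" href="/search.php?q=regex&amp;page=2">Regex</a>';

// Named groups: "url" is the href value, "text" is the link text.
$pattern = '%<a\s[^>]*href="(?P<url>/search\.php\?[^"]+)"[^>]*>(?P<text>[^<]+)</a>%si';

// PREG_SET_ORDER returns one array per matched link.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$links = array();
foreach ($matches as $m) {
    // Named groups are available by name as well as by number.
    $links[] = array('url' => $m['url'], 'text' => $m['text']);
}

print_r($links);
```

Only the two /search.php links should end up in $links; the /about.php link is skipped because the pattern requires the literal "/search.php?" prefix.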

01-09-2010, 02:55 PM
Thanks guys!

01-09-2010, 09:02 PM
All examples above are correct, but I would suggest making it more flexible, since a link on the page could be written in three possible ways:

<a something href='/search.php?variables' something>
<a something href="/search.php?variables" something>
<a something href=/search.php?variables something>

LINK => preg_match('#<a[^>]+href=(?:"|\'|)(/search\.php\?[^"\'\s>]+)#', $data, $values);
TEXT => preg_match('#<a[^>]+href=(?:"|\'|)/search\.php\?[^>]+>([^<]+)#', $data, $values);
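
If you want every matching link on the page rather than just the first one, something along these lines might work. It is only a sketch: the sample $data and the $results array are invented for the example, the two patterns above are merged into one, and ">" is excluded from the URL so that an unquoted href stops at the end of the tag.

```php
<?php
// Hypothetical sample covering the three quoting styles, plus one non-matching link.
$data = "<a class='x' href='/search.php?q=one' rel=nofollow>First</a>\n"
      . "<a id=\"y\" href=\"/search.php?q=two\">Second</a>\n"
      . "<a href=/search.php?q=three>Third</a>\n"
      . "<a href=\"/other.php?q=four\">Other</a>\n";

// Group 1: the URL (stops at a quote, whitespace, or ">").
// Group 2: the link text, up to the next tag.
$pattern = '#<a\s[^>]*?href=["\']?(/search\.php\?[^"\'\s>]+)[^>]*>([^<]+)#i';

// preg_match_all finds every link; preg_match would stop at the first.
preg_match_all($pattern, $data, $found, PREG_SET_ORDER);

$results = array();
foreach ($found as $hit) {
    $results[$hit[1]] = $hit[2]; // URL => link text
}

foreach ($results as $url => $text) {
    echo $url, ' => ', $text, "\n";
}
```

Keep in mind that a regex like this is only a rough filter for HTML; for pages you do not control, DOMDocument is usually the safer choice.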

01-09-2010, 09:18 PM
Thanks PHP6! That works great! :)

