Parsing URL in PHP - only isolate specific section of "path"

01-06-2010, 09:36 PM

I have been doing some reading to figure out exactly how to parse a URL I am working with; maybe you can help.

I have a URL created by WordPress that looks like this:

http://www.domain-name.com/wpblog/index.php/tag/my-tag

I have used the following code to parse the above URL:

$url = 'http://www.domain-name.com/wpblog/index.php/tag/my-tag';


print_r(parse_url($url));

and it returns the following:

[scheme] => http
[host] => www.domain-name.com
[path] => /wpblog/index.php/tag/my-tag
[query] =>
[fragment] =>

What I really want to do is to isolate the last section of the path ("my-tag") and place it into a sentence, such as "You have searched for: my-tag"

Can anybody help? Can the path be split into a deeper array, or is there a bit of code that will capture the text after the last "/"?

Also, is there a way to replace the "-" with a space?


01-06-2010, 09:53 PM
You could try something like this:

$url = 'http://www.domain-name.com/wpblog/index.php/tag/my-tag';

$path = parse_url($url, PHP_URL_PATH);
// split the path
$parts = explode('/', $path);
// get the last item
$tag = end($parts);
// replace the dash with a space
$tag = str_replace('-', ' ', $tag);
echo $tag;
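
One thing to watch with the snippet above: if the URL ends with a slash (".../tag/my-tag/"), end() returns an empty string. Here is a minimal sketch that trims trailing slashes first and builds the sentence from the question. The function name lastPathSegment is made up for illustration, and the htmlspecialchars() call assumes the result is printed into an HTML page.

```php
<?php
// Hypothetical helper (the name is illustrative only): return the last
// path segment of a URL, with dashes turned into spaces.
function lastPathSegment($url)
{
    $path = parse_url($url, PHP_URL_PATH);   // two-argument form, PHP 5.1.2+
    if (!is_string($path)) {
        return '';                           // no path, or a malformed URL
    }
    $path  = rtrim($path, '/');              // tolerate a trailing slash
    $parts = explode('/', $path);            // split the path
    $tag   = end($parts);                    // last item of the array
    return str_replace('-', ' ', $tag);      // replace the dash with a space
}

$tag = lastPathSegment('http://www.domain-name.com/wpblog/index.php/tag/my-tag/');
echo 'You have searched for: ' . htmlspecialchars($tag);
```

Escaping the tag is a precaution rather than something the original question asked for, since the value comes straight from the URL.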

01-06-2010, 10:12 PM
Thanks for your help...but I get the following error:

Warning: parse_url() expects exactly 1 parameter, 2 given in /home/kowski/public_html/testurl.php on line 4

Not sure what the issue is... the line in question is the following:

$path = parse_url($url, PHP_URL_PATH);

Seems it doesn't like the "$url" and the "PHP_URL_PATH" params in the same function...


01-06-2010, 10:46 PM
Your PHP version is < 5.1.2; the second parameter to parse_url was only added in that release.
You can achieve the same effect by calling parse_url with just the URL and pulling the 'path' key out of the array it returns:

$aPath = parse_url($url);
$sPath = $aPath['path'];

Then follow the code above, passing $sPath as the parameter to explode (or just rename it to $path).
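
Put together, the two replies might look like the sketch below. It only uses the one-argument form of parse_url(), so it should also run on versions older than 5.1.2. The isset() check is an addition not mentioned in the thread; it covers URLs that have no path at all.

```php
<?php
$url = 'http://www.domain-name.com/wpblog/index.php/tag/my-tag';

// One-argument form of parse_url(): returns an associative array.
$aPath = parse_url($url);

// The 'path' key is only present when the URL actually has a path.
$sPath = isset($aPath['path']) ? $aPath['path'] : '';

// Split the path, take the last item, and replace dashes with spaces.
$parts = explode('/', $sPath);
$tag   = str_replace('-', ' ', end($parts));

echo 'You have searched for: ' . $tag;
```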

04-27-2012, 01:45 AM

I'm trying to pull just the path.

I'm using parse_url($someUrlString, PHP_URL_PATH)

It works fine when you punch in the full URL "http://www.yahoo.com/pathNameHere/lalala",

but when a user provides only www.yahoo.com/pathNameHere/lalala, without the "http://" portion, the path doesn't show up as isolated from the domain. I still get the full www.yahoo.com/pathNameHere/lalala.

Is there a way to have the path isolated even if www.yahoo.com/pathNameHere/lalala is entered rather than http://www.yahoo.com/pathNameHere/lalala?

Please advise on how to tackle this. Any help would be greatly appreciated.


04-27-2012, 02:06 AM
No, you need to do it manually as parse_url doesn't know how to split up a path without a scheme.
Use pattern matching or cut it up with string manipulation to determine the parts.
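
As a rough sketch of the string-manipulation route: one workaround (not necessarily what the poster above had in mind) is to check for a "scheme://" prefix and prepend "http://" when it is missing, then hand the result to parse_url(). The function name pathFromUserUrl and the choice of http as the default scheme are assumptions made for this example.

```php
<?php
// Hypothetical helper: extract the path even when the scheme is missing.
function pathFromUserUrl($input)
{
    $input = trim($input);

    // If there is no "scheme://" prefix, assume http so that parse_url()
    // can tell the host apart from the path.
    if (!preg_match('#^[a-z][a-z0-9+.\-]*://#i', $input)) {
        $input = 'http://' . $input;
    }

    $path = parse_url($input, PHP_URL_PATH);
    return is_string($path) ? $path : '';    // '' when there is no path
}

echo pathFromUserUrl('www.yahoo.com/pathNameHere/lalala'), "\n";
echo pathFromUserUrl('http://www.yahoo.com/pathNameHere/lalala'), "\n";
```

Both calls should print /pathNameHere/lalala. Input that is not really a URL will still be treated as a host name, so some validation beforehand may be worthwhile.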

04-27-2012, 07:53 PM
Thanks Fou-Lou. You're right: the parse_url() function does not isolate everything for you. I managed to come up with a regular expression that matches the scheme together with the various forms of host (domain and subdomains), and used the preg_replace() function to strip them off, leaving the path by itself.

In case anyone else was struggling to isolate the scheme for the URL, this is the regular expression to punch into the preg_replace() or preg_match().


04-27-2012, 08:24 PM
You can simplify a bit as well. You can group http together and check whether s? exists, and you can just use [a-z0-9] if you set the case-insensitive flag.
The {1} is also not required, as it is implicit. If a token doesn't have a quantifier such as ? (0 or 1), * (0 or more), or + (1 or more), then it has to match exactly once.
A slightly better pattern that should give results similar to parse_url would be:


That should give you the full match, the scheme, the domain, the path, the query string, and the fragment separated in offsets 0 through 5. I haven't tested it much, but it looks to do the job. It also doesn't really obey the rules of DNS naming, but that's a whole other mess.
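
For illustration only, here is a sketch of a pattern with that offset layout. It is a hypothetical pattern written for this example, not the one from the post, and the function name splitUrl is likewise made up. It is deliberately loose: it does not handle ports or user info, and it does not enforce DNS naming rules.

```php
<?php
// Hypothetical pattern, NOT the one from the post.
// Groups: 1 scheme, 2 domain, 3 path, 4 query string, 5 fragment.
function splitUrl($url)
{
    $pattern = '~^(?:(https?)://)?([a-z0-9.-]+)(/[^?#]*)?(?:\?([^#]*))?(?:#(.*))?$~i';

    if (!preg_match($pattern, $url, $m)) {
        return false;                        // input did not match at all
    }

    // preg_match() leaves out trailing groups that did not take part in
    // the match, so pad the result to six entries (offsets 0 through 5).
    return array_pad($m, 6, '');
}

print_r(splitUrl('www.yahoo.com/pathNameHere/lalala'));
print_r(splitUrl('http://www.yahoo.com/pathNameHere/lalala?x=1#top'));
```

Making the scheme group optional is what lets the same pattern cope with input that lacks "http://"; in that case offset 1 is simply an empty string.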