string manipulation help please.

05-06-2009, 11:46 PM
Hi all,

i have searched google and looked at the string manual but i think it's more complicated than i am able to understand :)

i have several records

#1: State Of Play : Complete BBC Series 1 [2003] [DVD]
#2: Cranford : Complete BBC Series [2007] [DVD]
#3: Twilight - 2 Disc Special Edition (DVD) (2008)

and so on, which are being pulled from my db, but i would like to know how to display them without the "#1: " and, if possible, without the info between the []'s and ()'s as well, if they're present

i thought maybe trim would work but looking at the examples in the manual it looks like it may not be as straightforward.

any ideas

here is the code i am using

while ($row = mysql_fetch_array($item_query)) {
    // $title = get_string_between($row['title'], ':', '[');
    print "<div class='item'>

<div class='itemIMG'>
<img src='{$row['image']}'
height='100' />
</div>

<div class='itemTitle'>{$row['title']}</div>
<div class='itemInfo'></div>
<div class='itemButton'>
<img src='images/compare.gif'
alt='Compare {$row['title']} Prices'
title='Compare {$row['title']} Prices'></div>
</div>";
}


$row['title'] is the field the data is stored in, which is what i want to cut down

any ideas

05-06-2009, 11:58 PM
ok i have used

list ($no, $short_title) = split (': ', $row['title']);

to remove the "#1: "

but how do i remove the ['s and ('s after the title???
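A side note on the split() call above, as a minimal sketch rather than a definitive fix. split() treats its first argument as a regular expression, was deprecated in PHP 5.3 and removed in PHP 7. It also cuts at every ': ', so a title containing ' : ' ends up in three pieces. explode() with a limit of 2 splits only at the first ': '. The sample title is copied from the records listed above.

```php
<?php
// Sample title copied from the records listed earlier in the thread.
$title = '#1: State Of Play : Complete BBC Series 1 [2003] [DVD]';

// Limit of 2: split only at the first ': ', so the ' : ' inside the title is left alone.
list($no, $short_title) = explode(': ', $title, 2);

echo $short_title . "\n"; // State Of Play : Complete BBC Series 1 [2003] [DVD]
```

This still leaves the bracketed parts in place, which is the question asked above.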

05-07-2009, 12:03 AM
Regexp is the way to go.

preg_replace("/#\d:\s*|\s*\(.*?\)|\s*\[.*?\]/", "", $row['title'])

05-07-2009, 12:07 AM
wow thats so cool, i really must learn preg_replace :D

thank you!!!!

05-07-2009, 12:09 AM
you can strip out what you don't want with a regex

$row["title"] = "#1: State Of Play : Complete BBC Series 1 [2003] [DVD]";
$row["title"]=preg_replace("/^#\d+:\s+|[(\[][^\])]+[\])]/", "", $row["title"]);

05-07-2009, 12:23 AM
Well you should learn your regexps fast! Otherwise there is no chance you can take mine and timgolding's, which both have their respective deficiencies, and merge them into something that works robustly.

For example, because I took only the three strings that you posted into account, mine will stupidly be broken by #10: State Of Play : Complete BBC Series 1 [2003] [DVD]. timgolding's will break if something like #1: State Of Play : Complete BBC Series 1 [2003] [DVD(Multi5)] comes along.
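One way the two patterns could be merged, as a sketch only: take the \d+ and the ^ anchor from one, and treat [] and () groups separately so that a [DVD(Multi5)] group is removed whole. clean_title() is a hypothetical helper name, and the sketch assumes brackets of the same kind are never nested.

```php
<?php
// clean_title() is a hypothetical helper name, not something from the thread.
function clean_title($title)
{
    $pattern = '/^#\d+:\s*'      // leading "#1: ", "#10: ", ... (any number of digits)
             . '|\s*\[[^\]]*\]'  // a [...] group, whatever is inside it
             . '|\s*\([^)]*\)/'; // a (...) group
    // trim() here only removes whitespace left at either end.
    return trim(preg_replace($pattern, '', $title));
}

echo clean_title('#10: State Of Play : Complete BBC Series 1 [2003] [DVD]') . "\n";
echo clean_title('#1: State Of Play : Complete BBC Series 1 [2003] [DVD(Multi5)]') . "\n";
echo clean_title('#3: Twilight - 2 Disc Special Edition (DVD) (2008)') . "\n";
```

The first two calls should both print "State Of Play : Complete BBC Series 1", and the third "Twilight - 2 Disc Special Edition".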

05-07-2009, 12:26 AM
Yeah i noticed those too. But don't have time to fix them now. I reckon the poster should start here http://uk.php.net/manual/en/book.pcre.php

05-07-2009, 12:35 AM
cheers i will read up on that,

luckily i only have #1:, #2: and #3: so your code should always work for me, but i still want to learn it and i will read that link...


05-11-2009, 02:22 PM
hi guys,

ok i have a similar request

i'm pulling some data out of a feed which contains some html tags (<b> etc) which i don't want to display, and a ~

e.g it might be
<b>DVD</b> ~ Daniel Craig

and if possible i would like it to say
DVD - Daniel Craig

is this possible

05-11-2009, 02:29 PM
ok i have got rid of the <b>'s etc by using strip_tags

now i am left with

DVD ~ Dev Patel
~ Green Day (Artist)

how do i remove everything before and including the ~, and also the (Artist)?

any ideas

05-11-2009, 03:00 PM
ok i have managed to achieve this by going the long way around

$otherinfo = strip_tags($row['other']);
$otherinfo = trim($otherinfo, "DVD");
$otherinfo = trim($otherinfo, "~ ");
$otherinfo = trim($otherinfo, "(Artist)");
$otherinfo = trim($otherinfo, "(Author)");

but at least it works :)


05-11-2009, 03:03 PM
"Well you should learn your regexps fast!"

This reminds me of a nice sig i saw here: "Teach me how to be an engineer, i don't care if it takes all day" :D

05-11-2009, 03:06 PM
correction: the above doesn't work :( as it removes (Artist) as individual characters instead of as a string :(
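A small sketch of what is going on: the second argument of trim() is a set of characters, and every character from that set is stripped from both ends of the string. 'Brad Pitt' is a made-up sample value, not data from the feed.

```php
<?php
// 'Brad Pitt' is a made-up sample value used only to show the effect.
// The mask "(Artist)" means: strip any of ( A r t i s ) from both ends.
$result = trim('Brad Pitt', '(Artist)');

echo $result . "\n"; // Brad P
```

The trailing "itt" goes because i and t are both in the mask, even though "(Artist)" never appears in the value.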

anyone have any ideas?

05-11-2009, 06:54 PM
p.s. does anyone have a good, simple to follow regex tutorial to help me learn? i have found a few which instantly give me a headache lol, i need something simple so even a biff can follow :D

also i still need to convert these
DVD ~ Nicole Kidman => Nicole Kidman
~ Green Day (Artist) => Green Day
by Michael Connelly (Author) => by Michael Connelly

i've tried experimenting but no success, any one have any ideas?


05-11-2009, 07:29 PM
As you have already found out, your "trim" approach does not work, because it removes characters, not strings. You could try something like

preg_replace("/DVD|~ |\(Artist\)|\(Author\)/", "", $otherinfo);
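A slightly stricter variant, as a sketch only. The pattern above removes "DVD" wherever it occurs and can leave a stray space at either end. The version below assumes the "DVD ~" prefix only ever appears at the start and the "(Artist)" / "(Author)" suffix only at the end. strip_credit() is a hypothetical helper name.

```php
<?php
// strip_credit() is a hypothetical helper name used only for this sketch.
function strip_credit($text)
{
    $pattern = '/^\s*(?:DVD\s*)?~\s*'            // optional "DVD", then "~", at the start
             . '|\s*\((?:Artist|Author)\)\s*$/'; // "(Artist)" or "(Author)" at the end
    // trim() here only removes leftover whitespace.
    return trim(preg_replace($pattern, '', $text));
}

echo strip_credit('DVD ~ Nicole Kidman') . "\n";          // Nicole Kidman
echo strip_credit('~ Green Day (Artist)') . "\n";         // Green Day
echo strip_credit('by Michael Connelly (Author)') . "\n"; // by Michael Connelly
```

The three inputs are the ones listed in the request above.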

And while funny, I don't think the comparison of learning regexps to becoming an engineer holds much truth. It is honestly beyond me how there are whole books written about the topic. Well thinking about it, there are books written about HTML, so what do you know.

05-11-2009, 07:45 PM
my point of that comparison was time. you can't become an engineer in a day, just as you can't learn regex fast (in a day, or a month). although it's true you can learn it eventually, whereas you can spend your life trying to be an engineer and never succeed in becoming one :p

and reading books about regex is very useful, but i don't think that there are many programmers out there that know regex "by heart", and when you need to write a somewhat more complicated regex it is always a pain in the bottom (i would write arse but i'm afraid regex will change it to *** ;) ) to do it.

they are powerful but with great power comes great responsibility (:D) of learning how to use that power properly.

and as for HTML, i don't know for sure, but i would bet something valuable that browsers would be useless without the existence of regex :)

05-11-2009, 07:54 PM
That was kind of my point too: I think you can. Regular expressions are a very easy concept, and I believe the main reason why people consider them difficult to learn is that on first glance they look like they are written in an impossible to read magic language that only the highest of wizards are able to understand.

What I grant you though, is that you have to be very precise when working with regexps, and judging by a whole lot of posts I've read in here so far, that's not everybody's strong suit.

05-11-2009, 08:01 PM
just to add a little bit of comment to your comment

only those regex that highest of wizards wrote look like they are written in an impossible to read magic language that only the highest of wizards are able to understand. :D

05-11-2009, 08:36 PM
cheers venegal !!! works a treat :D

05-11-2009, 08:42 PM
This is about as easy to understand as it comes.


Though the tutorials here are based on the Perl 5 regex engine, which is not exactly the same as PCRE, it's pretty much the same. I think it's even got a section explaining the differences. I found php.net the best resource for me when i learned it.

05-20-2009, 01:24 PM
hi guys, i need help again :)

i have many titles which are stored in my db and each title has the word "-inch" or the " symbol. how would i use preg_replace to remove everything after and including the above, and the 2 characters in front of the above?


Samsung LE32B450C4 32" 720p HD Ready LCD TV with Freeview to
Samsung LE32B450C4

Sony Bravia KDL32W5500U 32-inch LCD TV with freeview to
Sony Bravia KDL32W5500U

is this possible, thank you


05-20-2009, 02:52 PM
$strings = array(
    'Samsung LE32B450C4 32" 720p HD Ready LCD TV with Freeview',
    'Sony Bravia KDL32W5500U 32-inch LCD TV with freeview'
);
foreach ($strings as $string) {
    // Remove "-inch" or the " symbol, the 2 characters in front of it, and everything after it.
    echo preg_replace("/.{2}(-inch|\").*/", "", $string) . "\n";
}

05-20-2009, 03:40 PM
hi mate that works well for the -inch but not for the items which have the " symbol instead of the -inch

thank you

05-20-2009, 04:04 PM
Really? It works fine for me. What does the snippet output for you?

05-20-2009, 04:21 PM
mmm that works fine???

so why does that not work for my data???

here (http://www.kernow-connect.com/Price%20Comparison/electronics.php) is my page

and the code i used to replace the titles

$alt_title = $row['title'];
$short_title = preg_replace("/.{2}(-inch|\").*/", "", $alt_title) . "\n";

any ideas mate


05-20-2009, 04:34 PM
it shouldn't matter that i have already used preg_match to get the current string, should it?

05-20-2009, 05:10 PM
Ah, I see your problem. Your strings don't contain "s but &quot;s.

So use
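A minimal sketch of that idea, assuming the stored titles carry the HTML entity &quot; where a literal " was expected. The sample title is adapted from the ones posted above, and rtrim() is added here only to drop the trailing space.

```php
<?php
// Sample title adapted from the thread, with the quote mark stored as an HTML entity.
$alt_title = 'Samsung LE32B450C4 32&quot; 720p HD Ready LCD TV with Freeview';

// Same pattern as before, with &quot; added as a third alternative.
$short_title = rtrim(preg_replace('/.{2}(-inch|"|&quot;).*/', '', $alt_title));

echo $short_title . "\n"; // Samsung LE32B450C4
```

Decoding the entities first with html_entity_decode() and keeping the original pattern would be another route.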

05-20-2009, 05:31 PM
hi mate thanks for this,

however some titles still remain unchanged :(

only 3 or 4, if you can have a look and see what's wrong i'd be grateful
here (http://www.kernow-connect.com/Price%20Comparison/electronics.php) is the page again.

many thanks

05-20-2009, 05:35 PM
ok i have managed to get rid of the remaining ones by changing to this:

preg_replace("/.{2}(-inch|\"|&quot;| inch).*/", "", $alt_title) . "\n";

it is still leaving me with a few -'s
i tried

(-inch|\"|&quot;| - inch)

but that reset it back to show all the data again?

any ideas

05-20-2009, 06:39 PM
ok i have got rid of the -'s using

$short_title = trim($short_title);
$short_title = trim($short_title,'- ');

and that seems to work fine :)
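The pieces from this part of the thread could be folded into one pattern, sketched below under some assumptions: the two characters in front of the size marker are always digits, and the size may be written as 32", 32&quot;, 32-inch, 32 inch or 32 - inch, optionally preceded by a " - " separator. shorten_title() is a hypothetical helper name, and the 'Acme' title is a made-up sample.

```php
<?php
// shorten_title() is a hypothetical helper name used only for this sketch.
function shorten_title($title)
{
    $pattern = '/[\s\-]*'             // optional spaces or dashes before the size
             . '\d{2}\s*-?\s*'        // two digits, then an optional dash
             . '(?:inch|"|&quot;)'    // the size marker itself
             . '.*$/is';              // everything after it
    return trim(preg_replace($pattern, '', $title));
}

echo shorten_title('Samsung LE32B450C4 32" 720p HD Ready LCD TV with Freeview') . "\n";
echo shorten_title('Sony Bravia KDL32W5500U 32-inch LCD TV with freeview') . "\n";
echo shorten_title('Acme TV1000 - 42 inch LCD TV') . "\n"; // made-up sample title
```

These should print "Samsung LE32B450C4", "Sony Bravia KDL32W5500U" and "Acme TV1000". A title with no size marker is returned unchanged apart from the trim.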

thanks for your help!!!!!

05-21-2009, 01:19 AM
Aww, I went out and missed all the action. But glad you could do it on your own.

05-21-2009, 01:16 PM
cheers for all your help mate, i'll probably be calling on you again lol


06-23-2009, 01:31 PM
ok can anyone please tell me why this isnt working?

$short_title = preg_replace("/.{0}( - ).*/", "", $alt_title) . "\n";

it doesnt seem to do anything to the string :(

this is the string
Samsung LE40A856 - 40'' Widescreen 1080P Full HD LCD TV - With Freeview Rose Black

and im trying to get just Samsung LE40A856

any ideas

06-23-2009, 02:22 PM
i don't know what "/.{0}( - ).*/" does as a regex, but it looks wrong.

I would try

$short_title = preg_replace("/^([^\-]+)(\-.*)$/s", "$1", $alt_title);

That's untested, but it's that sort of idea.
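For comparison, a sketch run against the exact string posted above. Since .{0} matches nothing, the original pattern is equivalent to "/( - ).*/" and does cut the posted string at the first " - ". If nothing changes on the real data, the separator there may not be a plain space-hyphen-space (that is a guess, not something the thread confirms). Splitting with explode() avoids the regex altogether.

```php
<?php
// String copied from the post above.
$alt_title = "Samsung LE40A856 - 40'' Widescreen 1080P Full HD LCD TV - With Freeview Rose Black";

// The original pattern: .{0} matches nothing, so this is just "/( - ).*/".
$a = preg_replace("/.{0}( - ).*/", "", $alt_title);

// Splitting at the first " - " needs no regex at all.
$parts = explode(' - ', $alt_title, 2);
$b = $parts[0];

echo $a . "\n"; // Samsung LE40A856
echo $b . "\n"; // Samsung LE40A856
```

The [^\-]+ pattern suggested above gives the same text with a trailing space left on, and it would also cut at a hyphen inside a model name, so matching on " - " with the surrounding spaces is the safer of the two.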