Simple regex not working?



jeddi
03-19-2010, 06:39 AM
Hi,

I am trying to change dashes to underscores in a text file.

The reason is that I am having problems processing some xml tags
so I want to change this <merchant-name> to <merchant_name>

To do this I used this regex:


$pattern = '#<(\w+)-(\w+)>#';
$replacement = '<$1_$2>';
$source = preg_replace( $pattern, $replacement, $source, -1 , $count);


But for some reason, nothing is changed :(

Have I misunderstood something ?

The idea is that <anything-here> gets changed to <anything_here>

There is data between the tags that should not be touched, it is just the tag
names themselves that I want to change.

In the file, there are about 2000 tags which need changing.

Can anyone see what I am doing wrong ?


Thanks.


.

SKDevelopment
03-19-2010, 11:00 AM
Your regexp should work, except it does not take into account closing tags. You could try something like this:


$source = '<root><merchant-name>text</merchant-name><merchant-name>text 2</merchant-name></root>';
$pattern = '#<(\/?\w+)-(\w+)>#';
$replacement = '<$1_$2>';
$source = preg_replace( $pattern, $replacement, $source, -1 , $count);
echo $source;

It was written quickly, though, so you may need something more robust.

Also, I've assumed that the XML document is well-formed and does not contain CDATA sections.
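
If nothing seems to change, one way to narrow it down is to look at $count and at the return value of preg_replace(). The sketch below is only a diagnostic under assumptions: the sample string is hypothetical (the real file is not shown in this thread), and the pattern is the one from the post above.

```php
<?php
// Hypothetical sample input; the real file contents are not shown in the thread.
$source = '<root><merchant-name>text</merchant-name></root>';

// Same idea as the pattern above: optional slash, exactly one hyphen in the name.
$pattern = '#<(/?\w+)-(\w+)>#';
$result = preg_replace($pattern, '<$1_$2>', $source, -1, $count);

if ($result === null) {
    // preg_replace() returns null when the regex engine reports an error.
    echo 'Regex error code: ' . preg_last_error() . "\n";
} else {
    // A count of 0 means the pattern never matched the input at all.
    echo "Replacements made: $count\n";
    echo $result . "\n";
}
```

If the count is 0 on the real data, the tags probably differ from the sample in some way, for example attributes inside the tag or more than one hyphen in the name.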

Rowsdower!
03-19-2010, 12:45 PM
This seems like overkill. What's wrong with a simple str_replace() for this?


$source=str_replace("-","_",$source);

Nevermind, I didn't READ. :D


There is data between the tags that should not be touched, it is just the tag
names themselves that I want to change.

jeddi
03-19-2010, 12:49 PM
Thanks for input,
I amended my script to include the closing tags .)

But I notice that when there are two hyphens, then
they do not get changed e.g. <sales-page-link>

Is there a reason for this, and how can I overcome it ?

Thanks.





.

MattF
03-19-2010, 12:53 PM
The reason is that I am having problems processing some xml tags
so I want to change this <merchant-name> to <merchant_name>


If I recall correctly, using {merchant-name} rather than just merchant-name in your XML parser should sort the hyphen issue, i.e. enclose all IDs/names in {}.
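
For what it is worth, here is a minimal sketch of the brace syntax, assuming the parser is SimpleXML (the thread does not say which parser is in use, and the sample document is made up). In that case the name goes inside braces and quotes:

```php
<?php
// Hypothetical sample document; SimpleXML is an assumption, the thread does not name the parser.
$xml = simplexml_load_string('<root><merchant-name>Acme</merchant-name></root>');

// $xml->merchant-name would be parsed as ($xml->merchant) - name,
// so the element name is wrapped in braces and quotes instead.
$merchantName = (string) $xml->{'merchant-name'};
echo $merchantName . "\n";
```

With this approach the file itself does not have to be rewritten at all.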

SKDevelopment
03-19-2010, 01:12 PM
Thanks for input,
I amended my script to include the closing tags .)

But I notice that when there are two hyphens, then
they do not get changed e.g. <sales-page-link>

Is there a reason for this, and how can I overcome it ?

Thanks.

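The reason is that \w does not match a hyphen, so (\w+)-(\w+) only matches names with exactly one hyphen. You could use the e-modifier. Something like this: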


<?php
$source = '<root><sales-page-link>text</sales-page-link><sales-page-link>text 2</sales-page-link></root>';
$pattern = '#(<\/?[\w-]+>)#e';
$replacement = 'str_replace("-","_","$1")';
$source = preg_replace( $pattern, $replacement, $source, -1 , $count);
echo $source;
?>

Please note: you should be very careful when using the e-modifier, because preg_replace() then executes the replacement string as PHP code. It is as dangerous as using eval() (http://php.net/eval). You must be absolutely sure that no one can inject PHP code into the replacement and so execute their own code on your system.
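
A possible alternative that avoids evaluating a string as code is preg_replace_callback(). The e-modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so something along these lines may be needed on newer versions anyway. This is only a sketch: the sample string is made up, and like the pattern above it assumes tags without attributes and no CDATA sections.

```php
<?php
// Hypothetical sample input with hyphens both in tag names and in the data.
$source = '<root><sales-page-link>text - kept</sales-page-link><merchant-name>a-b</merchant-name></root>';

$count = 0;
$result = preg_replace_callback(
    '#</?[\w-]+>#',                  // opening or closing tag, name may contain any number of hyphens
    function (array $m) {
        // $m[0] is the whole matched tag; only hyphens inside it are replaced.
        return str_replace('-', '_', $m[0]);
    },
    $source,
    -1,
    $count
);

echo $result . "\n";
```

Hyphens in the text between the tags are left alone, because the callback only ever sees the matched tags. Note that $count here counts every tag the pattern matched, including tags that had no hyphen to replace.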