Problem with this preg_match()



jeddi
03-05-2010, 10:35 AM
Hi,

I have to check an XML file
for the last closing tag, because sometimes it is not included!

The last closing tag should be </marketplace>

and this is the code I am using:



$pattern = '#\<\/marketplace.*?>#';
if( preg_match($pattern ,$file_out) == 0) {
file_put_contents( $file_out, '</marketplace>', FILE_APPEND);
write_log("Added the </marketplace>\r\n");
}


My problem is that this check always adds the </marketplace>
even when there is one, so I get a double closing tag !


This is the last bit of the xml file after the code ran:


</product-categories></product></marketplace></marketplace>

What have I done wrong here ?

Thanks.




MattyUK
03-05-2010, 11:57 AM
Could this also work?

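// Assumes $file_out holds the file contents, not the file path.
if (strcasecmp(substr($file_out, -14), "</marketplace>") !== 0) {
    // The last 14 characters are not the closing tag, so append it.
    $file_out .= "</marketplace>\r\n";
}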


Case insensitive and it may be less overhead than a preg_match.

However, I suspect the problem may be in the pattern. If the tag is always at the end, you can use $ to anchor the match to the end of the string. I have no idea why you have the .*? in there. I could be wrong, but don't you need to double escape the \ ?


// Append only when the closing tag is NOT already at the end.
if (!preg_match('#<\\/marketplace>$#', $file_out)) {
    $file_out .= "</marketplace>\r\n";
}

Hope these ideas help.

Matt

PS: you may also wish to examine this:

if (preg_match('%<(marketplace)[^>]*>(.*?)</\\1>$%s', $file_out)) {

}
It should match on the opening tag, then use that capture to look for the closing tag, but only if the closing tag sits at the very end of the string (a single trailing newline is tolerated by $). I assume you will have no unescaped % in the file. I'm not sure whether that would cause a problem; you could always switch the delimiters to # or some other symbol you are happy with. The contents are captured too, so you might want to throw them out for performance. I'll let you figure that out.


I should add I dreamed these up. I didn't test them. I think they should be fine.

jeddi
03-05-2010, 12:08 PM
Hi Matt,

Thanks for the input.

The string I am testing for is: "</marketplace>"

Why do I need to double escape the / character? I thought this was enough:


if (!preg_match('#<\/marketplace>$#', $file_out)) {
    file_put_contents($file_out, '</marketplace>', FILE_APPEND);
}

If I use:
$file_out .= "</marketplace>\r\n";

and $file_out = "my-path/myfile.txt"

Surely I will just get:
$file_out = "my-path/myfile.txt</marketplace>"

i.e. It will change the file name not the file itself.

Or am I wrong ?



MattyUK
03-05-2010, 09:24 PM
Hi

Try it and see, I guess. Or I can if I get more freedom later.

From recollection, you need to double escape the escaped character because PHP and preg_match both use \ to escape the next character.

So the string goes through two passes: PHP parses the literal first and turns \\ into one effective \, and preg_match then applies that remaining \ as an escape to the character after it.

I remember once using preg_match in an include and having to put four backslashes in before it worked. Some guru on here helpfully told me to add a \ for every time it was going to be evaluated. I accepted their word and never looked it up, but I suspect the documentation will be around somewhere. I was using / to surround the pattern, so perhaps that complicated things further.
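
The two passes can be checked by comparing what PHP actually hands to the regex engine. This is only a small sketch with variable names picked for illustration: inside single quotes PHP collapses just \\ and \', so one backslash or two before the slash end up as the same pattern.

```php
<?php
// What the regex engine receives after PHP has parsed each string literal.
$one_backslash  = '#<\/marketplace>$#';   // \/ is left untouched in single quotes
$two_backslashes = '#<\\/marketplace>$#'; // \\ collapses to a single backslash

// Both literals produce the identical pattern string.
var_dump($one_backslash === $two_backslashes); // bool(true)

// Sample input, made up for this demo.
$xml = '<marketplace><product></product></marketplace>';

var_dump(preg_match($one_backslash, $xml));   // int(1)
var_dump(preg_match($two_backslashes, $xml)); // int(1)
```

So with single-quoted patterns the extra backslash is harmless but not needed here. The counting only starts to matter when the pattern itself has to contain a literal backslash.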

The three checks I gave you should work. The third one looks for <marketplace*>*</marketplace> at the end of the string, with * being any number of any characters, which should suffice. The second one looks for </marketplace> at the end of the string, and the first one uses substr to do the same.

Best thing would be to try it with one, two, three ... escapes and the different patterns, and go with the one that works for you.


Quote (jeddi):

If I use:
$file_out .= "</marketplace>\r\n";

and $file_out = "my-path/myfile.txt"

Surely I will just get:
$file_out = "my-path/myfile.txt</marketplace>"

i.e. It will change the file name not the file itself.

I assumed $file_out was a string holding the file contents. If so, I think my approach should work. If it is a file path or handle instead, then you are right and my approach would be bad. But in that case I'm not sure preg_match would do anything useful at all. Doesn't it need the contents as a string? I didn't think you could throw a file name or handle at it and expect the magic to work.
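
For the case where $file_out is a path, as in the original post, here is a minimal sketch of checking the contents rather than the file name. The helper name ensure_closing_tag() is made up for illustration, the rtrim() step is my own addition to ignore trailing whitespace, and the type declarations assume PHP 7 or later; none of that comes from the thread.

```php
<?php
// Hypothetical helper (not from the thread): append the closing tag to the
// file at $path only when the file's contents do not already end with it.
function ensure_closing_tag(string $path, string $tag = '</marketplace>'): bool
{
    // Read the contents. Matching against $path itself would only test the
    // file name, which never contains the tag, so the check would always fail.
    $contents = file_get_contents($path);
    if ($contents === false) {
        throw new RuntimeException("Cannot read $path");
    }

    // Ignore trailing whitespace, then compare the tail case-insensitively.
    $tail = substr(rtrim($contents), -strlen($tag));
    if (strcasecmp($tail, $tag) === 0) {
        return false; // tag already present, nothing appended
    }

    file_put_contents($path, $tag . "\r\n", FILE_APPEND);
    return true; // tag was missing and has been appended
}
```

Calling it twice on the same file should append the tag at most once, which is the behaviour the original check was after.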

Good luck.

Matt

kbluhm
03-05-2010, 10:48 PM
There is no need to escape forward slashes unless the delimiters are forward slashes. You're using #.
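
A quick sketch of that point, using a sample string made up for the demo: with # delimiters the forward slash is an ordinary character, while with / delimiters it has to be escaped.

```php
<?php
$xml = '</product></marketplace>'; // sample input, made up for this demo

// '#' delimiters: the forward slash needs no escaping.
$hash = preg_match('#</marketplace>$#', $xml);

// '/' delimiters: the slash inside the pattern must be escaped,
// otherwise it would be read as the closing delimiter.
$slash = preg_match('/<\/marketplace>$/', $xml);

var_dump($hash, $slash); // int(1) int(1)
```

As far as I know, leaving the slash unescaped with / delimiters makes PHP treat the rest of the pattern as modifiers, so preg_match() fails with a warning instead of matching.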

MattyUK
03-05-2010, 11:22 PM
Ahhh ok, that makes sense. I usually use / delimiters. I should have thought it through. Thank you, kbluhm.


