regex help
firepages
01-02-2003, 05:53 AM
Hi, I am trying to parse the php.ini file using regex to grab all the directives and their values... this nearly works...
<?
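// Read php.ini into an array, one element per line.
$yaks = file('c:\winnt\php.ini');
// Group 1 is the directive name, group 2 is its value.
$pattern = "/^(.*)(?:\s)?=(?:\s)?(.*)/";
foreach ($yaks as $y) {
    // preg_match() returns 1 on a match, so test its result directly.
    if (preg_match($pattern, $y, $rets)) {
        echo $rets[1] . ' >> ' . $rets[2] . '<br />';
    }
}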
?>
it matches
directive = value
&
;directive = value
which is exactly what I want, however it also matches
; comment = moretext
so how do I tell my regex not to match '; whatever', i.e. to only match when there is no whitespace between the ; and the (hopefully) directive?
despite what Mordred often suggests the longer I look at regex the more I hate it :)
edit: err the forum wiped the escaped chars + the path to php.ini but they are there!
mordred
01-02-2003, 01:50 PM
Do you want the pattern not to capture the whitespace within the match, or should the whole match fail if any whitespace is found after the ; ?
$pattern1 = "/^(;?(?:\s)*.*?)(?:\s)?=(?:\s)?(.*)/";
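tolerates whitespace after the ; (the whitespace still ends up in the first capture, because the non-capturing group sits inside it)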
$pattern2 = "/^(;?(?<!\s)\w.*?)(?:\s)?=(?:\s)?(.*)/";
lets the match fail (tested on your posted example).
That's just a quick hack and the difference is kinda hard to explain, but you'll understand when you test those patterns.
Hopefully without any vBulletin tags the characters are left as they are! :)
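For anyone who wants to see the difference rather than take it on trust, here is a minimal sketch that runs both patterns over the three sample lines from the first post. The helper first_group() is a made-up name for illustration, not something from the thread, and the patterns are put in single quotes only so PHP leaves the backslashes alone.

```php
<?php
// The two patterns from the post above, unchanged apart from the quoting style.
$pattern1 = '/^(;?(?:\s)*.*?)(?:\s)?=(?:\s)?(.*)/';
$pattern2 = '/^(;?(?<!\s)\w.*?)(?:\s)?=(?:\s)?(.*)/';

// Sample lines taken from the first post.
$lines = array('directive = value', ';directive = value', '; comment = moretext');

// Hypothetical helper (not from the thread): returns capture group 1,
// or null when the line does not match at all.
function first_group($pattern, $line) {
    return preg_match($pattern, $line, $m) ? $m[1] : null;
}

foreach ($lines as $line) {
    echo $line, "\n";
    echo '  pattern1: ', var_export(first_group($pattern1, $line), true), "\n";
    echo '  pattern2: ', var_export(first_group($pattern2, $line), true), "\n";
}
```

Traced by hand, both patterns give 'directive' and ';directive' for the first two lines. For '; comment = moretext', pattern1 still matches with '; comment' as group 1, while pattern2 does not match, because \w has to come straight after the optional semicolon.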
firepages
01-02-2003, 02:39 PM
"but you'll understand when you test those patterns. "
- no I won't ;)
"or should the whole match fail if a number of whitespace is found after ; ? "
that's the one & $pattern2 worked a treat! thank you
errr while you are here :o
the last issue I have with parsing the .ini file is where PHP have decided to stick comments on the same line as the directive... eg
directive = "value" ; some comment that would normally be on its own line
this happens rarely at the moment and I can dig them out later but is it possible to also lose this in the regex ?
complicating things, I can't just look for a semicolon as it is sometimes a delimiter in the directive value, i.e. here I only really want the directive and its value, not the trailing comment... the amount of whitespace between the value and the comment is supposed to be a tab, but not always, it appears ...
..................
memory_limit = 8M ; Maximum amount of memory a script may consume (8MB)
;include_path = ".:/php/includes" ;annoying comment
.....................
it appears that if there is a delimiter in the directive value then that value is always quoted which may help ?
again I can parse this out later but it's sooo close (thanks to your help) to perfect at the moment ..... ;)
mordred
01-02-2003, 08:50 PM
Here you go:
$pattern3 = "/^(;?(?<!\s)\w.*?)(?:\s)?=\s*(?:'|\")?(.+?)(?:'|\")?(?:\s+)(?:;)?(?:.*)$/";
Took some time figuring out how not to hit the memory size limit, but I think this one's safe enough to use.
Script:
$yaks = file('test.ini');
foreach($yaks as $y) {
preg_match($pattern3, $y, $rets);
if ($rets) {
echo '|' . $rets[1] . '| >> |' . $rets[2] . '|<br />';
}
}
Content of test.ini:
memory_limit = 8M ; Maximum amount of memory a script may consume (8MB)
;include_path = ".;/php/includes" ;annoying comment
Output running the script with PHP4.3 on Win2k:
|memory_limit| >> |8M|
|;include_path| >> |.;/php/includes|
I used those vertical bars to show me how much whitespace had been matched while tuning the pattern.
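If you want to try $pattern3 without a test.ini on disk, a rough sketch along these lines should do. parse_line() is a made-up helper name, and the last two sample lines are invented probes that are not from this thread; they are only there to show what the pattern does with an unquoted value containing spaces and with a final line that has no trailing newline.

```php
<?php
// Pattern exactly as posted above.
$pattern3 = "/^(;?(?<!\s)\w.*?)(?:\s)?=\s*(?:'|\")?(.+?)(?:'|\")?(?:\s+)(?:;)?(?:.*)$/";

// Hypothetical helper (not from the thread): returns array(directive, value),
// or null when the line does not match.
function parse_line($pattern, $line) {
    return preg_match($pattern, $line, $m) ? array($m[1], $m[2]) : null;
}

// First two lines: the test.ini content from the post. file() keeps the trailing "\n".
// Last two lines: invented probes, not from the thread.
$samples = array(
    "memory_limit = 8M ; Maximum amount of memory a script may consume (8MB)\n",
    ";include_path = \".;/php/includes\" ;annoying comment\n",
    "error_reporting = E_ALL & ~E_NOTICE\n", // unquoted value containing spaces
    "engine = On",                           // no trailing newline, no comment
);

foreach ($samples as $s) {
    $r = parse_line($pattern3, $s);
    echo rtrim($s), "\n  => ";
    echo $r === null ? 'no match' : '|' . $r[0] . '| >> |' . $r[1] . '|';
    echo "\n";
}
```

Traced by hand, the first two lines give the same result as the output shown above. The third probe seems to stop at the first space and capture only E_ALL, and the fourth does not match, since the pattern insists on at least one whitespace character after the value (normally the newline that file() leaves on each line). Whether either case matters depends on the php.ini being parsed.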
firepages
01-03-2003, 03:12 AM
Here you go:
$pattern3 = "/^(;?(?<!\s)\w.*?)(?:\s)?=\s*(?:'|\")?(.+?)(?:'|\")?(?:\s+)(?:;)?(?:.*)$/";
<cough>ah well If I had known it would be that easy </cough> :D
absolutely super stuff Mordred, thank you!
I tested it against a few php.ini's I have lying around and it did the job each time spot on.
it still scares me to look at though ;).