preg_matching & overwriting

02-22-2010, 04:52 PM
I'm trying to put together a script that will do the following:

1. Open and parse a csv file

2. Match a specific value within a constant tag on every row

3. Strip away the rest of the string

4. Save and close the file

For example, these might be three relevant rows in the original file.csv:

<tag>My name is John</tag>
<tag>They call him John</tag>
<tag>John is 15 years old</tag>

And here's what I'd like in the overwritten file.csv:


Non-matches should be overwritten with nothing i.e. blanked.

Here is the basic code I'm trying to flesh out. I know I should probably be using fgetcsv somewhere and a proper regex, but that's why I'm here:

if($fh = fopen('file.csv', 'r+')){
$string = fread ($fh, filesize('file.csv'));
preg_match_all("/^(<tag>)(?=.*?John)(?=.*?</tag>)$/", $string, $match);
foreach($match[0] as $value)
$string = str_replace($string, $value, $string);

// then strip the rest..........

} else {
die ("Error");
}

echo $result;

I think the general concept is clear. Any ideas how to get it working?


02-22-2010, 07:20 PM
If you don't need the rest of the content, why not just implode the matches and write them to your file:

$string = fread ($fh, filesize('file.csv'));
preg_match_all("/^(<tag>)(?=.*?John)(?=.*?</tag>)$/", $string, $match);
file_put_contents('file.csv', implode("\r\n", $match[1]));
Not tested, just an example.

02-23-2010, 04:51 AM
I just tried that; it echoes all rows but wipes the csv file entirely.

I think the write function and regex aren't quite right here. It probably also needs an array or something to match each specific row.

This regex seems to work a bit better, but I'm still not sure how to extract only "John" out of the string(s):

preg_match_all('/<tag\b[^>]*>(.*?John)<\/tag>/', $str, $matches);


02-23-2010, 05:10 AM
preg_match_all('#<tag.+?(John).*?</tag>#i', $str, $matches);
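
A minimal sketch of what this pattern captures, assuming the three sample rows from the first post are held in one string (the contents of `$str` here are illustrative). The capture group ends up in `$matches[1]`, so that is the array to write back. Since `.` does not match newlines by default, each match stays within a single row.

```php
<?php
// Sample rows from the first post, joined with newlines (illustrative data).
$str = "<tag>My name is John</tag>\n"
     . "<tag>They call him John</tag>\n"
     . "<tag>John is 15 years old</tag>";

// '#' as the delimiter avoids having to escape the '/' in </tag>.
// The 'i' modifier makes the match case-insensitive.
preg_match_all('#<tag.+?(John).*?</tag>#i', $str, $matches);

// $matches[0] holds the full matched rows, $matches[1] only the captured name.
print_r($matches[1]);
```

With this input, `$matches[1]` should contain one "John" per row.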

02-23-2010, 12:10 PM
Hey, that actually worked, thanks!

Just a couple more things: when no match for "John" is found, it simply deletes the entire cell, which messes up the wider structure of the file. Could something else like "Data Not Found" be substituted instead, as in an if-else statement?
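
One possible way to keep the cell in place when nothing matches is to test each cell on its own with preg_match and fall back to a placeholder. This is only a rough sketch: the helper name `extract_name` and its parameters are made up for illustration and do not come from the thread.

```php
<?php
// Hypothetical helper: returns the captured name, or a placeholder when
// the cell does not contain the name inside <tag>...</tag>.
function extract_name(string $cell, string $name = 'John', string $placeholder = 'Data Not Found'): string
{
    // preg_quote() escapes regex metacharacters in the name; '#' is the delimiter.
    $pattern = '#<tag.+?(' . preg_quote($name, '#') . ').*?</tag>#i';

    // preg_match() returns 1 on a match and 0 when nothing matches.
    if (preg_match($pattern, $cell, $m) === 1) {
        return $m[1];
    }
    return $placeholder;
}

echo extract_name('<tag>They call him John</tag>'), "\n";
echo extract_name('<tag>They call him Bob</tag>'), "\n";
```

Because every cell produces exactly one value, the number of rows stays the same whether or not the name is found.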

Also, the overwrite deletes all other columns. Can this be avoided? I've tried using FILE_APPEND but no luck so far.
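
FILE_APPEND only adds data to the end of the file, so it cannot change cells in place. A sketch of one alternative, assuming the tagged text sits in a known column: read every row with fgetcsv, change only that cell, and write all columns back with fputcsv. The helper name `rewrite_csv_column` and the zero-based `$column` index are assumptions for illustration.

```php
<?php
// Hypothetical helper: rewrites one column of a CSV file and leaves the others alone.
// $column is the zero-based index of the cell holding the <tag>...</tag> text (assumed).
function rewrite_csv_column(string $path, int $column, string $name = 'John'): void
{
    $pattern = '#<tag.+?(' . preg_quote($name, '#') . ').*?</tag>#i';
    $rows = [];

    // Read every row into memory first, changing only the target cell.
    $in = fopen($path, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot open $path");
    }
    while (($row = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
        if (isset($row[$column])) {
            $row[$column] = preg_match($pattern, $row[$column], $m) === 1
                ? $m[1]
                : 'Data Not Found';
        }
        $rows[] = $row;
    }
    fclose($in);

    // Reopen in 'w' mode (which truncates the file) and write all columns back.
    $out = fopen($path, 'w');
    if ($out === false) {
        throw new RuntimeException("Cannot write $path");
    }
    foreach ($rows as $row) {
        fputcsv($out, $row, ',', '"', '\\');
    }
    fclose($out);
}

// Example call, assuming the tagged text is in the first column:
// rewrite_csv_column('file.csv', 0);
```

This holds the whole file in memory, which is fine for small files. For large ones, writing to a temporary file and renaming it afterwards may be the safer choice.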

Oh, and how would ucwords be applied here to capitalize only the first letter of every word e.g. "JOHN SMITH/john smith" -> "John Smith"?
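
ucwords only uppercases the first letter of each word and leaves the remaining letters untouched, so the usual approach is to lowercase the string first. A small sketch; the wrapper name `title_case` is made up for illustration.

```php
<?php
// Hypothetical wrapper: lowercase everything, then uppercase each word's first letter.
function title_case(string $name): string
{
    return ucwords(strtolower($name));
}

echo title_case('JOHN SMITH'), "\n"; // John Smith
echo title_case('john smith'), "\n"; // John Smith
```

It could be applied to the captured value just before writing it back. For names with non-ASCII letters, `mb_convert_case($name, MB_CASE_TITLE, 'UTF-8')` from the mbstring extension may be a better fit.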

Thanks again.