MoRiA
01-05-2004, 11:22 PM
How would I convert a section of text that includes HTML code (from, say, a database) so that newlines are preceded by <br> (or the XHTML-compliant <br />), without putting these line break tags in the middle of other HTML tags such as <img> tags?
nl2br() will change:
This is
some text
<img src="http://blah.com/img.gif"
title="Image">
into
This is<br />
some text<br />
<img src="http://blah.com/img.gif"<br />
title="Image">
The <img> tag will obviously not work with the line break stuck in the middle of it.
After a bit of thinking I came to the conclusion that I'd have to use a regex, but the precise regex eluded me and continues to do so. Therefore I have turned to others for help :D
Thanks in advance :)
firepages
01-06-2004, 02:45 AM
I hate this type of answer ;) but the best bet would be not to put the newlines into the DB in the first place?
nl2br() only adds a break where a newline exists: no line break == no <br />
We have at least one resident regex guru ... but if you are using regex anywhere, I would suggest doing it before you add to the database, not at display time.
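To make that concrete, here is a minimal sketch of how nl2br() behaves. The variable names are made up for illustration, and the image tag is the one from the first post.

```php
<?php
// nl2br() inserts a break tag before each newline and keeps the newline itself.
$withNewline    = "This is\nsome text";
$withoutNewline = "This is some text";

var_dump(nl2br($withNewline));    // "<br />" is inserted before the "\n"
var_dump(nl2br($withoutNewline)); // unchanged, because there is no newline

// A newline inside a tag is treated like any other newline,
// which is how the <img> tag from the first post gets broken.
$tag = "<img src=\"http://blah.com/img.gif\"\ntitle=\"Image\">";
var_dump(nl2br($tag));
```

So the function itself has no notion of HTML; it only looks at the newline characters.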
MoRiA
01-06-2004, 10:22 AM
Yeah, I thought that using the regex before adding to the database would be faster etc., but whether it's while inserting or while displaying, I still need the regex ;)
Currently I have to enter the <br> tags manually while typing into the textarea which is a bit of a pain, I was just trying to make things easier.
Any regex I came up with myself managed to match either ALL the instances of <br> or NONE of them (this was after the nl2br()), neither of which is particularly useful...
mordred
01-06-2004, 12:56 PM
The difficult part is locating the tags that contain newlines; the replacement itself is easy after that. Here's a quick hack that shows one way to do this:
$str = 'This <b id="
foo">is</b>
some text
<img src="http://blah.com/img.gif"
title="Image">';
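// Callback: receives one matched tag and returns it without line breaks.
function stripNewlines($matches) {
// remove the newlines inside the tag
$tag = str_replace(array("\n", "\r"), '', $matches[0]);
// re-insert a space where a closing quote now touches the next attribute name
$tag = preg_replace('/([^=])(["\'])(\w)/', '\1\2 \3', $tag);
return $tag;
}
function cleanText($text) {
// Match only tags that contain at least one newline. [^<>] keeps the match
// inside a single tag, so it cannot run from one tag into the next, and it
// also covers tags that span more than two lines.
return preg_replace_callback('/<[^<>]*\n[^<>]*>/', "stripNewlines", $text);
}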
// test output
var_dump(cleanText($str));
echo "\n\n";
var_dump(nl2br(cleanText($str))); // what it looks like after nl2br() is applied to the text
Note: The forum software tends to strip special characters from regexes. If you don't get the desired output, tell me, and I'll try to upload a correct version.
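Another way to attack this, offered as a rough sketch rather than a drop-in solution: split the text on tags and run nl2br() only on the pieces between them. The function name nl2brOutsideTags is made up for this example, and it assumes that no literal "<" or ">" appears inside an attribute value.

```php
<?php
// Hypothetical helper: apply nl2br() only to the text between tags.
function nl2brOutsideTags($text) {
    // Split on tags but keep them in the result (PREG_SPLIT_DELIM_CAPTURE).
    // Assumption: attribute values never contain a literal "<" or ">".
    $parts = preg_split('/(<[^>]*>)/', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            // Odd indexes hold the captured tags: collapse each line break
            // (plus surrounding whitespace) into a single space.
            $parts[$i] = preg_replace('/\s*[\r\n]+\s*/', ' ', $part);
        } else {
            // Even indexes hold plain text, where converting newlines is safe.
            $parts[$i] = nl2br($part);
        }
    }
    return implode('', $parts);
}

$input = "This is\nsome text\n<img src=\"http://blah.com/img.gif\"\ntitle=\"Image\">";
var_dump(nl2brOutsideTags($input));
```

One trade-off to be aware of: a line break in the middle of an attribute value becomes a space, so the value itself changes slightly. If that matters, the tag branch could strip the newline instead.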