do a multiple character replace on one line with Perl


crmpicco
03-16-2007, 10:55 AM
$hotelreturned->{address} =~ s/<br>/ /g;
$hotelreturned->{address} =~ s/<br\/>/ /g;
$hotelreturned->{address} =~ s/<br \/>/ /g;
$hotelreturned->{address} =~ s/\;//g;
$hotelreturned->{address} =~ s/\#//g;

Is there a way to put this into a sub to do it or can it be done in one line?
Cheers, Picco

KevinADC
03-16-2007, 06:02 PM
$hotelreturned->{address} = mysubname( $hotelreturned->{address});


sub mysubname {
my $t = shift;
$t =~ s/<br\s*\/?>/ /gi;
$t =~ s/[;#]//g;
return $t;
}

shyam
03-16-2007, 06:06 PM
both :)
$hotelreturned->{address} ~= s/([;#]|<br\s*\/?>)/gi;

FishMonger
03-16-2007, 06:52 PM
When a regex (or a double-quoted string) contains the same character as the delimiter, it's preferred to change the delimiter to something else to prevent "leaning toothpick syndrome".

$t =~ s~<br\s*/?>~ ~gi;

This simple regex isn't really a problem, but it gets to be a real mess in more complex regexes or in double-quoted print statements with lots of escaped \" quotes.
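To make the difference concrete, here is a minimal sketch comparing the backslash-heavy forms with their alternate-delimiter equivalents. The sample strings and variable names are hypothetical and not from the thread; the point is only that each pair produces the same result.

```perl
use strict;
use warnings;

# Hypothetical sample input, not from the original thread.
my $html = 'see <a href="http://example.com/a/b">here</a>';

# Slash delimiters: every / inside the pattern must be escaped.
(my $with_slashes = $html) =~ s/http:\/\/example\.com\/a\/b/URL/;

# Alternate delimiters: the same pattern with no escaped slashes.
(my $with_tildes = $html) =~ s~http://example\.com/a/b~URL~;

# The same idea for double-quoted strings: qq{} avoids escaping the quotes.
my $name    = 'Picco';
my $escaped = "<td class=\"name\">$name</td>";
my $cleaner = qq{<td class="name">$name</td>};

print "$with_slashes\n";
print "$with_tildes\n";
print "$cleaner\n";
```

Both substitutions and both string forms yield identical values; only the readability of the source changes.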

miller
03-16-2007, 08:51 PM
Ditto to FishMonger's statement.

My personal preference is to use curly brackets {} as the delimiters if a forward slash isn't convenient. This is for multiple reasons:

1) You don't have to worry about escaping a bracket inside a regex as long as they are balanced. This works just fine:
$string =~ s{(foo){1,5}bar}{baz};

2) As programmers, we're already trained to see brackets since they are used for all blocks.

3) Most text editors already have syntax highlighting for brackets, so they are doubly easy to notice.
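
As a quick check of point 1, the sketch below runs the same substitution with slash and with bracket delimiters, and then applies the bracket form to a pattern that contains a slash. The input strings are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

my $string = 'xfoofoobar';   # hypothetical input

# The {1,5} quantifier needs no escaping: its brackets are balanced.
(my $with_brackets = $string) =~ s{(foo){1,5}bar}{baz};

# The slash-delimited form of the same substitution.
(my $with_slashes = $string) =~ s/(foo){1,5}bar/baz/;

# A pattern that contains a slash reads cleanly with bracket delimiters.
my $address = '1 Main St<BR />Glasgow';   # hypothetical input
(my $cleaned = $address) =~ s{<br\s*/?>}{ }gi;

print "$with_brackets\n";
print "$cleaned\n";
```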

Anyway, this would change Kevin's function like so:


$hotelreturned->{address} = mysubname( $hotelreturned->{address});

sub mysubname {
my $t = shift;
$t =~ s{<br\s*/?>}{ }gi;
$t =~ s/[;#]//g;
return $t;
}


Yes, it's actually more characters, but it's easier to read and demonstrates what FishMonger was talking about.

Also, ignore shyam's comment. Not only is his code broken, but doing it in a single regex is a waste of effort because the replacements are different: one is a space and the other is empty.
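
Keeping one substitution per replacement also makes the cleanup easy to extend. The following is only a sketch of that idea, assuming a hypothetical rule table and a hypothetical helper named clean_address (neither appears in the thread): each rule pairs a pattern with its own replacement, and the rules are applied in order.

```perl
use strict;
use warnings;

# Hypothetical rule table: each entry is [ pattern, replacement ].
my @rules = (
    [ qr{<br\s*/?>}i, ' ' ],   # <br>, <br/> and <br /> each become a space
    [ qr{[;#]},       ''  ],   # stray ; and # characters are dropped
);

# Hypothetical helper: applies every rule in order to a copy of the input.
sub clean_address {
    my ($text) = @_;
    for my $rule (@rules) {
        my ($pattern, $replacement) = @$rule;
        $text =~ s/$pattern/$replacement/g;
    }
    return $text;
}

print clean_address('12 High St<br>Glasgow<BR />G1 1AA;#'), "\n";
```

Adding another cleanup step then means adding one line to the table rather than reworking a combined regex.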

- Miller

ralph l mayo
03-16-2007, 09:46 PM
Not actually recommended, but just for completeness:


$hotelreturned->{address} =~ s{(<br\s*/?>|[;#])}{' 'x(length $1 > 1)}giexms;

miller
03-16-2007, 11:52 PM
ralph l mayo wrote:
> Not actually recommended, but just for completeness:
>
> $hotelreturned->{address} =~ s{(<br\s*/?>|[;#])}{' 'x(length $1 > 1)}giexms;


Yes ralph, as I said, a waste of effort.

The code is obfuscated. Additions and updates are more complex, since you must first decipher what is being done.
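
For anyone who does have to decipher it, the one-liner can be unpacked roughly as below. This is a sketch with a hypothetical sample address; the /x, /m and /s modifiers are left out here because they do not affect this particular pattern.

```perl
use strict;
use warnings;

my $address = '12 High St<br />Glasgow;#';   # hypothetical sample input

# /e evaluates the replacement as Perl code for every match.
# (length($1) > 1) is true (numerically 1) for a <br> tag and false
# (numerically 0) for a lone ; or #, so the x operator repeats the
# space either once or zero times.
$address =~ s{(<br\s*/?>|[;#])}{ ' ' x (length($1) > 1) }gie;

print "$address\n";
```

The two-step version produces the same cleaned string without requiring the reader to work through the repetition trick.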

I would have preferred that you not bother sharing that "for completeness sake". Fortunately, I doubt most people would be silly enough to actually use it.

- M

ralph l mayo
03-17-2007, 01:37 AM
miller wrote:
> I would have preferred that you not bother sharing that "for completeness sake".

I'll try to take your delicate aesthetic sense into account in future posts. :P