Hi,
I am trying to build a string from a set of words that appear in another passage of text, where each word is marked by brackets.
I have tried numerous ways and am getting nowhere.
Any tips you can provide will be very welcome.
$keywords = 'The (quick) brown fox (jumped) over the (lazy) dog';
#$keywords =~ s/\([^()]\)+/$1/;
if ($keywords =~ m/^\([^()]\)+$/){ print qq( s1=$1 $2 $3); }
$keywords =~ s/([(]?[-\w\ ]+[)]?){0,8}/$1/;
The special variable (or rather, the result) should look like this:
$1 = 'quick jumped lazy';
bazz
OK, this gets me the (quick) split across $1 $2 $3 but not the other bracketed words.
[code]
if ($keywords =~ /(\()([\w\ ]+)(\))+/) { print qq( s1=$1 s2=$2 s3=$3 s4=$4 ); }
[/code]
Shannon Blonk
07-13-2009, 01:53 AM
Your first try was pretty close -- just needed your repetition in the right spot.
@words= 'The (quick) brown fox (jumped) over the (lazy) dog'=~/\(([^)]+)\)/g;
or
@words= 'The (quick) brown fox (jumped) over the (lazy) dog'=~/\((.*?)\)/g;
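To get from that list to the single string asked for in the first post, the captured words can be joined with a space. This is only a minimal sketch: the variable names $text, @words and $keywords are my own choice, not something from the posts above.

```perl
use strict;
use warnings;

my $text = 'The (quick) brown fox (jumped) over the (lazy) dog';

# In list context, a /g match returns the capture from every match.
my @words = $text =~ /\(([^)]+)\)/g;

# Join the captured words into one space-separated string.
my $keywords = join ' ', @words;

print "$keywords\n";   # prints: quick jumped lazy
```

The match has to be in list context (assigned to an array) for /g to hand back all the captures at once.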
Thanks Shannon, I'll take a look at that closely.
In the meantime I had mushed this together, which works, but the regex is much tidier.
my $keywords = $description;
$keywords =~ s/'//g;
$description =~ s/\/\(//g;
$description =~ s/\)\///g;
my @search_terms;
my @array = split ( '/' , $keywords);
my $count=0;
foreach my $word (@array)
{
    $count++;
    #print qq( word = $word <br /> );
    if ($word =~ /^(\([-\w\'\ ]+\))$/ )
    {
        my ($keep,$discard) = split /\// , $word, 2;
        $keep =~ s/\(//;
        $keep =~ s/\)//;
        #print qq(keep = $keep);
        push (@search_terms, $keep);
    }
}
bazz
Shannon Blonk
07-13-2009, 03:34 AM
nerh?
From that code it looks like the keywords are delimited by /( and )/, not plain parentheses? You kill all the single quotes in $keywords then allow them in the $word match? Then split on a slash after a match that excludes slashes?
I'm all confusified now.
o-o-h the stripped out bit is sloppy. :(
I'm going back to the regex idea so only the () will be used and not the /( and )/.
Thanks for your answer.
bazz
confused myself now. again :(
This first one technically works, except that it prevents me from using brackets in normal text.
my @words = $keywords =~ /\(([^)]+)\)/g;
If I change it to this, it does not work, but it does not raise an error either:
my @words = $keywords =~ /\({[^}]+}\)/g;
What I really want is to be able to use a symbol which would not be used in English text. Something like |
my @words = $keywords =~ /\(|[^|]+|\)/g;
That doesn't capture the words; instead, it captures the whole paragraph.
And this captures the whole paragraph too.
my @words = $keywords =~ /\(|[^|]+\)/g;
so here is my breakdown of the regex.
/ = regex limiter
\( = regex boundary
| = first item to match on
[ = start of char class
^ = find the first occurrence of the following char
| = the following char to match on (mentioned in the above point)
] = end of char class
+ = 1 or more occurrences
\) = end regex boundary
/ = regex end limiter
g = make it all global
; = end of line
Any pointers most welcome.
bazz
Shannon Blonk
07-18-2009, 03:27 AM
my @words = $keywords =~ /\(|[^|]+\)/g;
would break down as
/ = regex delimiter
\( = open parenthesis (the backslash escapes the begin-capture meaning)
| = alternation (or)
[ = begin character class
^ = invert character class
| = literal pipe character
] = end character class
[^|] = anything but a pipe character
+ = repeat 1 or more times, greedily
\) = literal close paren.
/ = end regex
So, that would match an open paren OR a group of one or more non-pipe characters followed by a close paren.
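A quick way to see the difference is to run the unescaped pattern next to one where the pipe is escaped and the parentheses are left as a capture group. This is just a sketch: the sample strings and the names @unescaped and @escaped are made up for illustration.

```perl
use strict;
use warnings;

my $paren_text = 'The (quick) brown fox (jumped) over the (lazy) dog';
my $pipe_text  = 'The |quick| brown fox |jumped| over the |lazy| dog';

# Unescaped | is alternation: "an open paren" OR "non-pipes then a close paren".
# The greedy [^|]+ runs from the start of the text up to the last ")".
my @unescaped = $paren_text =~ /\(|[^|]+\)/g;

# Escaped \| is a literal pipe, and the unescaped ( ) now capture.
my @escaped = $pipe_text =~ /\|([^|]+)\|/g;

print "unescaped: $_\n" for @unescaped;
print "escaped:   $_\n" for @escaped;
```

The first pattern returns one long match covering almost the whole sentence, which is the "captures the whole paragraph" behaviour described earlier; the second returns quick, jumped and lazy.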
To use your choice of keyword marker:
my ($beginKey,$endKey)=qw{ | | };
my $s = "The |quick| brown fox |jumped| over the |lazy| dog";
# \Q...\E quotes the markers so that | and { are matched literally
my @keywords= $s=~/\Q$beginKey\E(.*?)\Q$endKey\E/g;
# if you want to strip the keyword delimiter out of the original string
($beginKey,$endKey)=qw( -={ }=- );
$s = "The -={quick}=- brown fox -={jumped}=- over the -={lazy}=- dog";
@keywords=();
$s=~s/\Q$beginKey\E(.*?)\Q$endKey\E/push(@keywords,$1),$1/eg;
:)
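One thing to watch when the markers live in variables: characters such as | and { are regex metacharacters, so they need quoting before they are interpolated into a pattern. Below is a rough sketch of the same idea wrapped in a sub. The name extract_keywords and its return shape are hypothetical, not from this thread, and it uses Perl's built-in quotemeta for the quoting.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a reference to the list of keywords found
# between $begin and $end, plus a copy of the text with the markers removed.
sub extract_keywords {
    my ($text, $begin, $end) = @_;

    # quotemeta escapes regex metacharacters so the markers match literally.
    my ($open, $close) = (quotemeta($begin), quotemeta($end));

    my @keywords;
    # Work on a copy; /e evaluates the replacement as code, which records
    # the keyword and then yields it as the replacement text.
    (my $plain = $text) =~ s/$open(.*?)$close/push(@keywords, $1), $1/eg;

    return (\@keywords, $plain);
}

my ($found, $plain) = extract_keywords(
    'The |quick| brown fox |jumped| over the |lazy| dog', '|', '|');

print join(' ', @$found), "\n";   # prints: quick jumped lazy
print "$plain\n";                 # prints: The quick brown fox jumped over the lazy dog
```

The same call works with longer markers, for example extract_keywords($s, '-={', '}=-').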
Blimey, I was way off.
Thanks for your response. I'll study it to try to make sense of it.
bazz