
Thread: regex help

  1. #1
    Regular Coder
    Join Date
    Jan 2010
    Location
    Washington
    Posts
    223
    Thanks
    34
    Thanked 0 Times in 0 Posts

    Question regex help

    Hi, I'm trying to get my crawler to work right and need a little help.
    I'm using this code to extract info such as "10.pr", "10_pr", "10pr", "pr.10", "pr_10", "pr10" and so on.

    Code:
       preg_match('/(\d{1,3})[._]?(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)|(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)[._]?(\d{1,3})/', $t, $m);
    My problem is that the second half of the code
    PHP Code:
    (pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)[._]?(\d{1,3})
    does not seem to index at all. I used a regex tester and all the examples I put in work correctly, but when I use this with the crawler, it seems only the first half of the code works:
    Code:
    (\d{1,3})[._]?(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)
    It indexes my examples "10.pr", "10_pr", "10pr" perfectly, but when the crawler tries to index "pr.10", "pr_10", "pr10", it does not seem to work. I have tried reformatting this regex several times, so I'm out of ideas. Is the | in the middle of the code maybe causing it? Is there a way to make sure the whole regex is read? I used the | in the middle because it was all I could get to work in the regex tester. Can somebody help please? Thanks.
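
One thing that may be worth checking here is which capture group the digits end up in. A minimal sketch, assuming the crawler passes one candidate string at a time as $t (the helper name show_groups is made up for illustration): with the pattern from this post, the number lands in group 1 for the digits-first form and in group 4 for the letters-first form, so code that only reads $m[1] would see an empty string for "pr.10" even though the regex matched.

```php
<?php
// Pattern copied from the first post.
$pattern = '/(\d{1,3})[._]?(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)|(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)[._]?(\d{1,3})/';

// Hypothetical helper: returns the capture groups for one input string.
function show_groups(string $pattern, string $t): array
{
    preg_match($pattern, $t, $m);
    return $m;
}

$a = show_groups($pattern, '10.pr'); // digits-first: groups 1 and 2 are filled
$b = show_groups($pattern, 'pr.10'); // letters-first: groups 3 and 4 are filled

print_r($a);
print_r($b);
```

If that is what is happening, the | itself is not the problem; the two alternatives simply number their groups differently.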

  • #2
    Senior Coder
    Join Date
    Jul 2009
    Location
    South Yorkshire, England
    Posts
    2,318
    Thanks
    6
    Thanked 304 Times in 303 Posts
    Untested. Try:

    Code:
    preg_match('/((\d{1,3})[._]?(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)|(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)[._]?(\d{1,3}))/', $t, $m);

  • #3
    Regular Coder
    Join Date
    Jan 2010
    Location
    Washington
    Posts
    223
    Thanks
    34
    Thanked 0 Times in 0 Posts
    Nope, that didn't seem to do it either.

  • #4
    Senior Coder
    Join Date
    Jul 2009
    Location
    South Yorkshire, England
    Posts
    2,318
    Thanks
    6
    Thanked 304 Times in 303 Posts
    Does it match anything if you just use the latter of the two expressions on its own?

    Code:
    preg_match('/(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)[\._]?(\d{1,3})/', $t, $m);
    Try changing preg_match to preg_match_all too, if there may be more than one match per time.

  • #5
    Regular Coder
    Join Date
    Jan 2010
    Location
    Washington
    Posts
    223
    Thanks
    34
    Thanked 0 Times in 0 Posts
    Quote: Originally posted by MattF
    Does it match anything if you just use the latter of the two expressions on its own?

    Code:
    preg_match_all('/(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)[\._]?(\d{1,3})/', $t, $m);
    Try changing preg_match to preg_match_all too, if there may be more than one match per time.
    It does match. I'm just trying to make it run through the whole regex and grab "10.pr" and "pr.10" in any combination.

    Here is what I just tried, and it almost works.
    PHP Code:
    preg_match_all('/((\d{1,3})[._]?(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)|(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)[._]?(\d{1,3}))/', $t, $m);
    The problem is that it is supposed to group similar types, and it does, but for the "part" table in the db it gives everything a 1 automatically, even if it has distinguished multiple types.

    If I could just get this to index part numbers correctly, it would be working just fine.

    Here is my whole function.
    PHP Code:
    function part_search($t)
    {
        preg_match_all('/((\d{1,3})[._]?(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)|(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)[._]?(\d{1,3}))/', $t, $m);
        if (isset($m[1]))
        {
            return (int)$m[1];
        } else {
            return 0;
        }
    }

    Is there anything you can think of that might help further?
    Last edited by cosmicsea; 03-20-2010 at 12:35 AM. Reason: typo
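
A possible explanation for the 1s, offered as a sketch rather than a confirmed diagnosis: with preg_match_all, $m[1] is an array of matches rather than a string, and in current PHP versions casting a non-empty array to int gives 1. With preg_match and this pattern, $m[1] is the whole matched text, so a value such as "pr.2" that does not start with digits casts to 0. The variable names below are made up for illustration.

```php
<?php
// Pattern copied from the post above (with the outer group added).
$pattern = '/((\d{1,3})[._]?(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)|(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)[._]?(\d{1,3}))/';

preg_match_all($pattern, 'pr.2', $all);
preg_match($pattern, 'pr.2', $one);

// $all[1] is an array holding the string 'pr.2'; a non-empty array casts to 1.
$from_all = (int)$all[1];

// $one[1] is the string 'pr.2'; it has no leading digits, so it casts to 0.
$from_one = (int)$one[1];

var_dump($from_all, $from_one);
```

Note also that isset($m[1]) is always true after preg_match_all, because the sub-arrays exist even when nothing matched.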

  • #6
    Senior Coder
    Join Date
    Jul 2009
    Location
    South Yorkshire, England
    Posts
    2,318
    Thanks
    6
    Thanked 304 Times in 303 Posts
    What do you see printed if you change this line:

    Code:
    return (int)$m[1];
    to:

    Code:
    print_r($m[1]);
    ?

  • #7
    Regular Coder
    Join Date
    Jan 2010
    Location
    Washington
    Posts
    223
    Thanks
    34
    Thanked 0 Times in 0 Posts
    The weird thing is that if I take out preg_match_all and use preg_match, it stops marking everything 1 and works correctly, but the original problem comes back: it only handles the "10.pr" format and not "pr.10".

  • #8
    Regular Coder
    Join Date
    Jan 2010
    Location
    Washington
    Posts
    223
    Thanks
    34
    Thanked 0 Times in 0 Posts
    OK, hold on, I'll test it.

  • #9
    Regular Coder
    Join Date
    Jan 2010
    Location
    Washington
    Posts
    223
    Thanks
    34
    Thanked 0 Times in 0 Posts
    Here is what it output on a sample test, which looks good. It's just that those 0's are supposed to be the part numbers.
    PHP Code:
    Array
    (
        [0] => pr.1
    )
    Array
    (
        [0] => pr.2
    )
    Array
    (
        [0] => pr.3
    )
    Array
    (
        [0] => pr.4
    )
    Array
    (
        [0] => 01.pr
    )
    Array
    (
        [0] => 02.pr
    )


  • #10
    Senior Coder
    Join Date
    Jul 2009
    Location
    South Yorkshire, England
    Posts
    2,318
    Thanks
    6
    Thanked 304 Times in 303 Posts
    You're using preg_match_all now though, hence you're getting a multi-dimensional array rather than a single level array. do print_r($m); and it will show you the full array output. Can't remember offhand which one the relevant parts will be in.
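
For reference, a short sketch of the default layout (PREG_PATTERN_ORDER): $m[0] holds the full matches and $m[N] holds everything captured by group N, one entry per match. The sample string is made up for illustration; the pattern is the one from the earlier posts.

```php
<?php
$pattern = '/((\d{1,3})[._]?(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)|(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)[._]?(\d{1,3}))/';

// Made-up sample containing one match of each ordering.
$count = preg_match_all($pattern, '01.pr and pr.2', $m);

// Default flag is PREG_PATTERN_ORDER: first index = group, second index = match.
echo "matches: $count\n";
print_r($m[0]); // full matches
print_r($m[2]); // digits of the digits-first form (empty string otherwise)
print_r($m[5]); // digits of the letters-first form (empty string otherwise)
```

With this pattern the part number would therefore be in $m[2] or $m[5], depending on which ordering matched.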

  • #11
    Regular Coder
    Join Date
    Jan 2010
    Location
    Washington
    Posts
    223
    Thanks
    34
    Thanked 0 Times in 0 Posts
    When I do that with preg_match_all I get

    PHP Code:
    Array
    (
        [0] => Array
            (
            )

        [1] => Array
            (
            )

        [2] => Array
            (
            )

        [3] => Array
            (
            )

        [4] => Array
            (
            )

        [5] => Array
            (
            )

    )
    and when I do a preg_match I get
    PHP Code:
    Array
    (
        [0] => pr.2
        [1] => pr.2
        [2] => 
        [3] => 
        [4] => pr
        [5] => 2
    )
    Array
    (
        [0] => 01.pr
        [1] => 01.pr
        [2] => 01
        [3] => pr
    )


  • #12
    Regular Coder
    Join Date
    Jan 2010
    Location
    Washington
    Posts
    223
    Thanks
    34
    Thanked 0 Times in 0 Posts
    I just can't understand why I can get the first bit to work correctly with a regular preg_match, yet with preg_match_all it just indexes all 1's for the parts.
    Last edited by cosmicsea; 03-20-2010 at 01:28 AM. Reason: typo

  • #13
    Senior Coder
    Join Date
    Jul 2009
    Location
    South Yorkshire, England
    Posts
    2,318
    Thanks
    6
    Thanked 304 Times in 303 Posts
    You need to find out which arrays the respective parts are in and then use those arrays. Run this code.

    Code:
    foreach ($m as $key => $array)
    {
        print('Key: '.$key."\n");
        print_r($array);
    }

  • #14
    Regular Coder
    Join Date
    Jan 2010
    Location
    Washington
    Posts
    223
    Thanks
    34
    Thanked 0 Times in 0 Posts
    I think I know what will fix it, but I can't get the regex right. If I do this
    PHP Code:
    (\d{1,3})[._]?(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)[._]?(\d{1,3})?
    it correctly gets "01.pr" but won't get "pr.01", and if I do
    PHP Code:
    (\d{1,3})?[._]?(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)[._]?(\d{1,3})
    it gets "pr.01" and not "01.pr". Is there a way to make this regex grab either order of text? I'm sorry for all the questions; this is just getting frustrating for me.
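
One way to accept either order while keeping the digits in the same group is PCRE's branch-reset group, (?|...), which restarts group numbering in each alternative. A minimal sketch under that assumption: this rewritten part_search is an illustration, not something posted in the thread, and it assumes $t holds one candidate string per call.

```php
<?php
// Sketch of part_search() using a branch-reset group so that the digits are
// always group 1, whichever alternative matched. Non-capturing (?:...) groups
// keep the letter codes out of the numbering.
function part_search(string $t): int
{
    $codes = 'pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr';
    $pattern = '/(?|(\d{1,3})[._]?(?:' . $codes . ')|(?:' . $codes . ')[._]?(\d{1,3}))/';

    if (preg_match($pattern, $t, $m)) {
        return (int)$m[1]; // a digit string such as '01' casts to 1
    }
    return 0; // no part number found
}

echo part_search('10.pr'), "\n";
echo part_search('pr_10'), "\n";
echo part_search('pr01'), "\n";
echo part_search('nothing here'), "\n";
```

If branch reset is not available, reading $m[1] when it is non-empty and $m[4] otherwise, with the original two-alternative pattern, should give the same result.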

  • #15
    Regular Coder
    Join Date
    Jan 2010
    Location
    Washington
    Posts
    223
    Thanks
    34
    Thanked 0 Times in 0 Posts
    OK, I will try.

