Hi, I'm trying to get my crawler to work right and need a little help.
I'm using this code to extract info such as "10.pr", "10_pr", "10pr", "pr.10", "pr_10", "pr10" and so on.
It does not seem to index at all. I used a regex tester and all of the examples I put in work correctly there, but when I use this with the crawler, it seems only the first half of the code works.
Code:
(\d{1,3})[._]?(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)
It will index my examples "10.pr", "10_pr", "10pr" perfectly, but when the crawler tries to index "pr.10", "pr_10", "pr10", it does not seem to work. I have tried reformatting this regex several times to make it work, so I'm out of ideas. Is it maybe the | in the middle of the code causing it? Is there a way to make sure the whole regex is read? I used the | in the middle because it was all I could get to work in the regex tester. Can somebody help please? Thanks.
The problem is, it is supposed to group similar types, and it does, but for the "part" table in the db it gives everything a 1 automatically, even if it has distinguished multiple types.
If I could just get this to index part numbers correctly, it would be working just fine.
Here is my whole function.
PHP Code:
function part_search($t) {
    preg_match_all('/((\d{1,3})[._]?(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)|(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)[._]?(\d{1,3}))/', $t, $m);
    if (isset($m[1])) {
        return (int)$m[1];
    } else {
        return 0;
    }
}
Is there anything you can think of that might help further?
Last edited by cosmicsea; 03-20-2010 at 12:35 AM..
Reason: typo
The weird thing is, if I take out preg_match_all and make it preg_match, it stops marking everything 1 and works correctly. But then the original problem comes back: it only handles the "10.pr" format and not "pr.10".
You're using preg_match_all now, though, so you're getting a multi-dimensional array rather than a single-level array. Do print_r($m); and it will show you the full array output. I can't remember offhand which element the relevant parts will be in.
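To make the point about the multi-dimensional array concrete, here is a minimal sketch using the same pattern as the function above. The sample input string and the variable name $castResult are made up for illustration. It suggests why every part ends up as 1: with preg_match_all, $m[1] is itself an array, and casting a non-empty array to int yields 1 in practice (the PHP manual does not recommend relying on this).

```php
<?php
// Same pattern as in part_search() above.
$pattern = '/((\d{1,3})[._]?(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)|(pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr)[._]?(\d{1,3}))/';

// Illustrative input containing one match in each order.
preg_match_all($pattern, '10.pr pr_20', $m);

// With the default PREG_PATTERN_ORDER flag, $m[1] is an array holding
// every match of capture group 1, not a single string.
print_r($m[1]);

// Casting that (non-empty) array to int does not give the part number.
$castResult = (int) $m[1];
var_dump($castResult);
```

So with preg_match_all the function would need to pick an element such as $m[2][0] or $m[5][0] rather than cast $m[1] directly.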
I just can't understand why I can even get the first bit to work correctly with a regular preg_match, but when I use preg_match_all it just indexes all 1's for the parts.
Last edited by cosmicsea; 03-20-2010 at 01:28 AM..
Reason: typo
I think I know what will fix it, but I can't figure out the regex correctly. If I do this
it will get "pr.01" and not "01.pr". Is there a way to make this regex grab either order of text? I'm sorry for all the questions, this is just getting frustrating for me.
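One possible way to grab either order, offered as a sketch rather than a definitive fix: keep the alternation, but capture only the digits and read whichever group is filled. The likely reason the preg_match version only handled "10.pr" is that $m[1] held the whole match, and (int)"10.pr" is 10 while (int)"pr.10" is 0, because the string does not start with a digit. The function name part_search_either and the $types and $digits variables are hypothetical.

```php
<?php
// Hypothetical variant of part_search(); returns the part number or 0.
function part_search_either(string $t): int
{
    $types = 'pr|tr|gr|zr|amc|mp|o|iv|is|ve|sr';

    // Only the digits are captured: group 1 for "10.pr", group 2 for "pr.10".
    // The type lists are non-capturing (?:...) so they do not shift the numbering.
    $pattern = '/(\d{1,3})[._]?(?:' . $types . ')|(?:' . $types . ')[._]?(\d{1,3})/';

    if (preg_match($pattern, $t, $m)) {
        // When the second alternative matches, group 1 is an empty string.
        $digits = ($m[1] !== '') ? $m[1] : $m[2];
        return (int) $digits;
    }

    return 0;
}
```

For example, part_search_either('10_pr') and part_search_either('pr10') should both return 10, and a string with no part number returns 0.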