  1. #1
    New Coder
    Join Date
    Jul 2011
    Location
    USA
    Posts
    39
    Thanks
    0
    Thanked 1 Time in 1 Post

    Help Matching Whole Words

    I'm using this inside my script and it works fine:

    PHP Code:
    $censoredfilter = file("badwords.dat", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    $text = str_replace($censoredfilter, 'bleep', $text);
    But it matches every occurrence of a word, including inside longer words, and I only want it to match whole words. No matter how I try using preg_replace, it doesn't work. Can somebody show me how to use preg_replace to match only whole words with what I have already? Thanks

  • #2
    Senior Coder angst's Avatar
    Join Date
    Apr 2004
    Location
    Toronto, Ontario
    Posts
    2,114
    Thanks
    15
    Thanked 122 Times in 122 Posts
    have a look at this: http://chumby.net/?p=44

  • #3
    God Emperor Fou-Lu's Avatar
    Join Date
    Sep 2002
    Location
    Saskatoon, Saskatchewan
    Posts
    16,987
    Thanks
    4
    Thanked 2,660 Times in 2,629 Posts
    To get around partial matches, you'll need to either evaluate each word in the phrase or use pattern matching. Adding a space to the end of the word would stop the partial match, but it introduces a new error: the word gets through when it is the last thing on the line, and it doesn't handle punctuation such as periods. Between patterns and per-word evaluation, I'm not sure which would win a performance test, but regex would definitely be easier to do.
    You can convert what you have into whole-word patterns with an array walk and a preg_replace. It would be better to modify the existing file and add the boundaries there instead of letting a function do it, since I can't say which words should match as partials and which as whole words.
    PHP Code:
    $s = 'word dogma dog.  Mouse too'; // my phrase
    $aReplace = array('dog', 'mouse'); // results of file
    $sReplace = 'str_repeat("*", strlen($0))'; // replace with

    function wordBoundaryAdd($item)
    {
        return sprintf('\b%s\b', preg_quote($item));
    }

    $aPCREReplace = array_map('wordBoundaryAdd', $aReplace);
    $sSearch = sprintf('#%s#ei', implode('|', $aPCREReplace));
    print preg_replace($sSearch, $sReplace, $s);
    Yeah, that looks like it would work.
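
    For anyone reading this on a newer PHP: the /e modifier was deprecated in PHP 5.5 and removed in PHP 7, so the same idea is usually written with preg_replace_callback now. A minimal sketch of that approach, not the code from this thread: the function name censorWholeWords is made up for illustration, and passing '#' to preg_quote is an addition so a word containing the delimiter can't break the pattern.

```php
<?php
// Hypothetical helper: mask whole-word, case-insensitive matches of $words in $text.
function censorWholeWords($text, array $words)
{
    if (count($words) === 0) {
        return $text; // nothing to filter; avoid building an empty pattern
    }

    // Escape each word (including the '#' delimiter) and wrap it in \b...\b.
    $bounded = array_map(function ($word) {
        return '\b' . preg_quote($word, '#') . '\b';
    }, $words);

    $pattern = '#' . implode('|', $bounded) . '#i';

    // Replace each match with asterisks of the same length.
    return preg_replace_callback($pattern, function ($m) {
        return str_repeat('*', strlen($m[0]));
    }, $text);
}

print censorWholeWords('word dogma dog.  Mouse too', array('dog', 'mouse'));
// prints: word dogma ***.  ***** too
```

    "dogma" is left alone because there is no word boundary between "dog" and "m", while "Mouse" is caught by the i flag.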

  • #4
    New Coder
    Join Date
    Jul 2011
    Location
    USA
    Posts
    39
    Thanks
    0
    Thanked 1 Time in 1 Post
    Thanks guys.

    @angst: I saw that very same article when I was doing a google search but couldn't get it to work from reading a data file with the bad words.

    @Fou-Lu: Your code works, but how do I incorporate my data file into your script?

  • #5
    God Emperor Fou-Lu's Avatar
    Join Date
    Sep 2002
    Location
    Saskatoon, Saskatchewan
    Posts
    16,987
    Thanks
    4
    Thanked 2,660 Times in 2,629 Posts
    Assign $aReplace to the array from file().

  • #6
    New Coder
    Join Date
    Jul 2011
    Location
    USA
    Posts
    39
    Thanks
    0
    Thanked 1 Time in 1 Post
    Like this?

    PHP Code:
    $aReplace = file("note_badwords.dat"); // results of file
    I tried this and it doesn't seem to work.

  • #7
    God Emperor Fou-Lu's Avatar
    Join Date
    Sep 2002
    Location
    Saskatoon, Saskatchewan
    Posts
    16,987
    Thanks
    4
    Thanked 2,660 Times in 2,629 Posts
    Make sure you are trimming off the linefeeds by passing the FILE_IGNORE_NEW_LINES flag to file() as well.
    If that still doesn't work, show what you have as well as a print_r($aReplace) result.
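
    To see why the linefeeds matter: without FILE_IGNORE_NEW_LINES, each entry returned by file() still ends in "\n", so a pattern built from it looks for the word followed by a newline and won't match inside an ordinary one-line phrase. A rough sketch, assuming one word per line; the temp file here is only a stand-in for the real data file.

```php
<?php
// Temporary stand-in for the real word list: one word per line, plus a blank line.
$path = tempnam(sys_get_temp_dir(), 'words');
file_put_contents($path, "dog\n\nmouse\n");

$raw   = file($path); // every entry keeps its trailing "\n"
$clean = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

var_dump($raw);   // entries are "dog\n", "\n", "mouse\n"
var_dump($clean); // entries are "dog", "mouse"

unlink($path);
```

    Note that print_r output viewed in a browser collapses the newlines, so the raw array can look clean even when it isn't. var_dump shows the string lengths, which makes the extra character visible.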

  • #8
    New Coder
    Join Date
    Jul 2011
    Location
    USA
    Posts
    39
    Thanks
    0
    Thanked 1 Time in 1 Post
    Ok here's your script just the way you posted it:

    http://area52.heliohost.org/replace_words_a.php

    Here's with file() & print_r();

    http://area52.heliohost.org/replace_words_b.php

    contents of replace_words_b.php:

    PHP Code:
    <?php
    $s = 'word dogma dog.  Mouse too'; // my phrase
    $aReplace = file("words.dat"); // results of file

    $sReplace = 'str_repeat("*", strlen($0))'; // replace with

    function wordBoundaryAdd($item)
    {
        return sprintf('\b%s\b', preg_quote($item));
    }

    $aPCREReplace = array_map('wordBoundaryAdd', $aReplace);
    $sSearch = sprintf('#%s#ei', implode('|', $aPCREReplace));
    print preg_replace($sSearch, $sReplace, $s);
    print "<br>";
    print_r($aReplace);
    ?>
    When I tried adding FILE_IGNORE_NEW_LINES it returned errors.

  • #9
    God Emperor Fou-Lu's Avatar
    Join Date
    Sep 2002
    Location
    Saskatoon, Saskatchewan
    Posts
    16,987
    Thanks
    4
    Thanked 2,660 Times in 2,629 Posts
    I won't follow external links from work; you'll need to embed or attach them instead.
    file() shouldn't throw any errors when given FILE_IGNORE_NEW_LINES.

  • #10
    New Coder
    Join Date
    Jul 2011
    Location
    USA
    Posts
    39
    Thanks
    0
    Thanked 1 Time in 1 Post
    File a works like it should.

    File b returns this:

    word dogma dog. Mouse too
    Array ( [0] => dog [1] => mouse )
    The data file contains: dog mouse

    When I add FILE_IGNORE_NEW_LINES*|*FILE_SKIP_EMPTY_LINES to file() I get these warnings:

    Warning: file() expects parameter 2 to be long, string given in /home1/area52/public_html/replace_words_c.php on line 3

    Warning: array_map() [function.array-map]: Argument #2 should be an array in /home1/area52/public_html/replace_words_c.php on line 11

    Warning: implode() [function.implode]: Invalid arguments passed in /home1/area52/public_html/replace_words_c.php on line 12

    Warning: strlen() expects exactly 1 parameter, 0 given in /home1/area52/public_html/replace_words_c.php(13) : regexp code on line 1

  • #11
    God Emperor Fou-Lu's Avatar
    Join Date
    Sep 2002
    Location
    Saskatoon, Saskatchewan
    Posts
    16,987
    Thanks
    4
    Thanked 2,660 Times in 2,629 Posts
    They are constants. They shouldn't be wrapped in quotations.
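
    In case it helps, a quick sketch of the difference. The flags are integer constants combined with the bitwise OR operator; anything that turns them into text gives file() a string instead, which is what "expects parameter 2 to be long, string given" is complaining about. The variable names are only for illustration.

```php
<?php
// Integer bitmask built from the two predefined constants.
$flags = FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES;

// The same characters inside quotes are just text, not flags.
$asString = 'FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES';

var_dump(is_int($flags));       // bool(true)
var_dump(is_string($asString)); // bool(true)
```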

    Edit:
    BTW, I assumed from your first post that each word to block is on a line of its own. Is that still the case?

  • #12
    New Coder
    Join Date
    Jul 2011
    Location
    USA
    Posts
    39
    Thanks
    0
    Thanked 1 Time in 1 Post
    They're not wrapped in quotes. They're just like you see in my first post.

    Yes each word is on a new line.

  • #13
    New Coder
    Join Date
    Jul 2011
    Location
    USA
    Posts
    39
    Thanks
    0
    Thanked 1 Time in 1 Post
    Never mind about the errors (warnings). I corrected them. That was an error with my code editor.
    Last edited by byrondallas; 04-18-2012 at 01:02 PM.

