  1. #1
    Regular Coder
    Join Date
    Oct 2009
    Posts
    434
    Thanks
    7
    Thanked 3 Times in 3 Posts

    Was the POST sent from my page or from a robot?

    I have a basic contact form, but I have been getting a lot of dummy posts containing links and other junk.

    I added a "prove you are human" question that asks for the answer to a sum, but the posts are still coming in.

    I think I first need to find out whether the POSTs originate from my own page or from a robot's page, yes? Then I can look at other ways to stop or reduce the automated ones getting through.

    I have used a CAPTCHA, but it was taken off after a lot of people complained that they could not read the text correctly, gave up, and called us on the phone instead.

    What other methods could I use on the form to help reduce the spam?

  • #2
    God Emperor Fou-Lu's Avatar
    Join Date
    Sep 2002
    Location
    Saskatoon, Saskatchewan
    Posts
    16,987
    Thanks
    4
    Thanked 2,660 Times in 2,629 Posts
    There is no guaranteed way to verify that a request was sent from your form.
    You need to combat spam on the server side after receiving the request. This is why it's so difficult to get rid of spam.
    PHP Code:
    header('HTTP/1.1 420 Enhance Your Calm'); 

  • #3
    Master Coder felgall's Avatar
    Join Date
    Sep 2005
    Location
    Sydney, Australia
    Posts
    6,642
    Thanks
    0
    Thanked 649 Times in 639 Posts
    Robots will almost certainly send their requests from your page. Only someone manually trying to bypass the test will create a copy of the page.

    In any case, real people can use security measures to protect their privacy that will prevent you from telling where the request came from.
    Stephen
    Learn Modern JavaScript - http://javascriptexample.net/
    Helping others to solve their computer problem at http://www.felgall.com/

    Don't forget to start your JavaScript code with "use strict"; which makes it easier to find errors in your code.

  • #4
    Regular Coder
    Join Date
    Oct 2009
    Posts
    434
    Thanks
    7
    Thanked 3 Times in 3 Posts
    Yes I can see this happening a lot.

    I had a thought about using a custom hash to store the date and time at which the page is first loaded.

    My method...
    When the page loads, the form gets a new field called myhash, based on a timestamp in month-day-hour-minute form, such as 03211203 (three minutes past midday on 21 March).
    My page would then turn that timestamp into a hash, for example...
    a0b3c2d1e1f2g0h3i

    I have used letters in sequence here, but my real hash would be 32 characters long and contain random letters and numbers, with the positions of the actual date and time digits known only to me. This hash would be stored in the myhash field and submitted with the form.

    Once the form is submitted, the PHP script first checks whether the myhash field contains anything. If it does, the script grabs it and converts it back to the original timestamp that was used to create it. I have written the scripts that encrypt and decrypt the hash and they all work great. After decrypting, the script checks whether that timestamp is within 10 minutes of the time the form is processed. If so, the rest of the form is processed. Otherwise the form shows an error saying that they took too long to fill it out, and asks them to answer the "prove you are human" sum again and resubmit.

    phew...

    Hopefully this sounds OK and can help stop bots from automating submissions of this form. I know that a human who takes too long can just enter the "prove you are human" answer again, click submit, and it will work. But would this still work against the bots?
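
    For illustration, the same "timestamp in a hidden field" idea can be sketched with a keyed hash (HMAC) instead of hiding digits at secret positions. This is a swapped-in technique, not the script described above: the function names, FORM_SECRET, and the 600-second window are all assumptions made for this sketch.

```php
<?php
// Hypothetical secret; in practice it would be loaded from configuration.
const FORM_SECRET = 'change-me-to-a-long-random-string';

// Build the hidden-field value: "<unix time>.<HMAC of that time>".
function make_form_token(int $issuedAt, string $secret = FORM_SECRET): string
{
    return $issuedAt . '.' . hash_hmac('sha256', (string) $issuedAt, $secret);
}

// Return true only if the token is untampered and no older than $maxAge seconds.
function check_form_token(string $token, int $now, int $maxAge = 600, string $secret = FORM_SECRET): bool
{
    $parts = explode('.', $token, 2);
    if (count($parts) !== 2 || !ctype_digit($parts[0])) {
        return false;
    }
    [$issuedAt, $mac] = $parts;

    // Recompute the HMAC and compare in constant time.
    $expected = hash_hmac('sha256', $issuedAt, $secret);
    if (!hash_equals($expected, $mac)) {
        return false;
    }

    $age = $now - (int) $issuedAt;
    return $age >= 0 && $age <= $maxAge;
}
```

    The form would print make_form_token(time()) into the hidden myhash field, and the handler would pass the submitted value and time() to check_form_token(). Note that a bot which fetches a fresh copy of the page before each submission also gets a fresh, valid token, so a check like this only catches stale or replayed submissions.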

  • #5
    Senior Coder
    Join Date
    Aug 2006
    Posts
    1,269
    Thanks
    10
    Thanked 277 Times in 276 Posts
    There are lots of things you can do, all with some success. I think the worst situation is that there are actual human beings who seem to type stuff into forms for some reason. Not quite sure why...

    Anyway, a list of captcha possibilities:
    - Image captcha as you've tried
    - Image with something other than "type these 4 letters"
    - Math - "what's 5+7?"
    - HTML check (is the customer allowed to type html in these fields?)
    - Honeypot (ie, bogus field test)
    - Timeout (as you've described, but you can just pass the time in a session variable)
    - Input field size (is there a min or max amount of text you want to allow?)
    - SPAM words (viagra, etc)
    - See if JS is enabled on the client (ie, assume most real people have it, and most bots don't)

    Try some, try all, nothing will be perfect at least from what I've seen.
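
    As a rough sketch of how several of the checks in the list above could be combined on the server: the field names ('website' as the honeypot, 'message' as the comment box), the thresholds, and the two-word spam list below are illustrative assumptions, not values from this thread.

```php
<?php
// Return a list of reasons the submission looks automated; an empty list means it passed.
function spam_reasons(array $post, int $formShownAt, int $now): array
{
    $reasons = [];
    $rawMessage = $post['message'] ?? '';
    $message = is_string($rawMessage) ? $rawMessage : '';
    $rawTrap = $post['website'] ?? '';

    // Honeypot: 'website' is hidden with CSS, so a human leaves it empty.
    if (!is_string($rawTrap) || trim($rawTrap) !== '') {
        $reasons[] = 'honeypot filled';
    }

    // Timeout: too fast suggests a script, too slow means a stale form.
    $elapsed = $now - $formShownAt;
    if ($elapsed < 3 || $elapsed > 600) {
        $reasons[] = 'bad timing';
    }

    // HTML check: a plain-text contact form has no need for tags or BBCode links.
    if ($message !== strip_tags($message) || stripos($message, '[url') !== false) {
        $reasons[] = 'markup in message';
    }

    // Input size: reject empty or implausibly long messages.
    $length = strlen($message);
    if ($length < 10 || $length > 5000) {
        $reasons[] = 'bad length';
    }

    // Spam words: a tiny sample list; a real one would be longer.
    foreach (['viagra', 'casino'] as $word) {
        if (stripos($message, $word) !== false) {
            $reasons[] = 'spam word';
            break;
        }
    }

    return $reasons;
}
```

    Here $formShownAt would be the time stored in a session variable when the form was displayed, as suggested in the Timeout item, and the handler would call spam_reasons($_POST, $_SESSION['form_shown_at'], time()) with that (hypothetical) session key.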

  • #6
    Regular Coder
    Join Date
    Oct 2009
    Posts
    434
    Thanks
    7
    Thanked 3 Times in 3 Posts
    - Image captcha as you've tried
    Not a fan; as I have said, I tried that and got the T-shirt.
    - Image with something other than "type these 4 letters"
    Same as above. It needs lots of them to be worthwhile, and they would take too long to create.
    - Math - "what's 5+7?"
    I already have this as a basic step, and I have doubled it up with my new custom hash method.
    - HTML check (is the customer allowed to type html in these fields?)
    Yes, I can see this one being possible, but I already run a tag-removal script on all fields.
    - Honeypot (ie, bogus field test)
    I shall look into this one more; I had not heard of it.
    - Timeout (as you've described, but you can just pass the time in a session variable)
    DEEEEERRRRRR.... LOL, why did I not think of that? I spent most of the day creating the scripts to encrypt and decrypt the hash!!!
    - Input field size (is there a min or max amount of text you want to allow?)
    Each field varies, and a limit would stop a real person from using the form if they wanted to enter more text than I allowed.
    - SPAM words (viagra, etc)
    I very rarely get these sent through the form, but I know I will have to think about it one day.
    - See if JS is enabled on the client (ie, assume most real people have it, and most bots don't)
    The form has a lot of JS and degrades gracefully if it is disabled. That happens once in a blue moon, but I cannot afford to block a user because they turned it off.

    thank you for your input

  • #7
    Senior Coder
    Join Date
    Feb 2011
    Location
    Your Monitor
    Posts
    4,332
    Thanks
    60
    Thanked 526 Times in 513 Posts
    Blog Entries
    4
    Quote Originally Posted by needsomehelp View Post
    - Image captcha as you've tried
    not a fan as I have said tried that got the tee-shirt.
    So try a different one with a clearer font, such as this one from White Hat Web Design:
    PHP Captcha Security Images

    Quote Originally Posted by needsomehelp View Post
    - Image with something other than "type these 4 letters"
    same as above, needs lots of them to make it worthwhile but would take too long to create.
    Admittedly not my thing either but I could still get that functioning within a day.

    Quote Originally Posted by needsomehelp View Post
    - Honeypot (ie, bogus field test)
    I shall look more in to this one, not heard of it.
    The principle is simple: two forms that look identical. One uses the bog-standard field names (name, email, comments, etc.); you hide it using CSS and ignore it in PHP when it is submitted. The other, visible form uses completely different field names.

    On that note, randomizing your field names is a good idea. This is a good article that should get you started:
    How to stop form spam | PHP | Spambot | Captcha
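
    A minimal sketch of the decoy and random field name ideas together, assuming the name map is kept server-side (for example in $_SESSION) between showing the form and handling the POST. The function names and the 'f_' prefix are made up for this example.

```php
<?php
// Map each real field to a random name, generated when the form is shown.
function make_field_map(array $realNames): array
{
    $map = [];
    foreach ($realNames as $real) {
        $map[$real] = 'f_' . bin2hex(random_bytes(8));
    }
    return $map;
}

// Translate a submitted array back to real names and flag decoy submissions.
function read_submission(array $post, array $map): array
{
    $values = [];
    $decoyHit = false;
    foreach ($map as $real => $random) {
        // The visible form posts under the random name.
        $raw = $post[$random] ?? '';
        $values[$real] = is_string($raw) ? $raw : '';

        // Anything posted under the plain, well-known name came from the hidden decoy form.
        $decoy = $post[$real] ?? '';
        if (!is_string($decoy) || trim($decoy) !== '') {
            $decoyHit = true;
        }
    }
    return ['values' => $values, 'decoy_hit' => $decoyHit];
}
```

    The visible form would use the random names from the map as its name attributes, while the CSS-hidden decoy form keeps the plain ones, so a submission with decoy_hit set can be dropped silently.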

    One other thing, storing the form data in a database and emailing the user often works well. They have to click a link you sent them in order to have the form delivered. Most bots are capable of bypassing those steps but I suspect very few will be able to bypass all three in that order.
    See my new CodingForums Blog: http://www.codingforums.com/blogs/tangoforce/

    Many useful explanations and tips including: Cannot modify headers - already sent, The IE if (isset($_POST['submit'])) bug explained, unexpected T_CONSTANT_ENCAPSED_STRING, debugging tips and much more!

  • #8
    Regular Coder
    Join Date
    Sep 2002
    Posts
    456
    Thanks
    0
    Thanked 20 Times in 20 Posts
    1. Some bots use Optical Character Recognition to bypass CAPTCHA images, since it can extract the letters and/or objects if the programmer wants it to.
    phpOCR: Recognize text & objects in graphical images - PHP Classes
    http://www.phpclasses.org/.../2874-P...al-images.html
    This is class can be used as a tool for optical character recognition. It can
    recognize text in monochrome graphical images after a training phase. The
    training ...
    2. Randomizing field names is probably the best idea, but what gets randomized may depend on your template design.

    3. Bot pots - I saw some code not too long ago that said you could place a bogus link at the top of your page, which should stop crawlers. I was looking to find out how to build a honeypot.
    NO Limits!! DHCreationStation.com
    ------------------------------------------------------------
    Broken items wanted for tinkerin'! PostItNow@BrokenEquipment.com
    Global Complaint Dept.

  • #9
    Senior Coder
    Join Date
    Feb 2011
    Location
    Your Monitor
    Posts
    4,332
    Thanks
    60
    Thanked 526 Times in 513 Posts
    Blog Entries
    4
    Quote Originally Posted by c1lonewolf View Post
    2. Randomizing field names is probably the best idea, but what gets randomized may depend on your template design.
    It's certainly not the best idea, no matter how clever it is, but it does stop the simpler bots that attempt to submit the same form field names time after time. Mixing this idea with the other countermeasures I mentioned, however, does generally put an end to the spam. I used the random field names technique on its own and it drastically cut the spam, but some still got through. Employing the other tactics as well saw it drop dead completely.

    Quote Originally Posted by c1lonewolf View Post
    3. Bot pots - I saw some code not too long ago that said you could place a bogus link at the top of your page, which should stop crawlers. I was looking to find out how to build a honeypot.
    Never heard of that one, but the normal bot pot is a decoy form with the bog-standard field names such as name, email, comments, etc. that the bot will be drawn to and submit. You hide it using CSS so the real human user never sees it; the bots, however, see it in the code, get excited, and nail it repeatedly while your PHP completely ignores it.

    On the link idea though, if it's a crawling bot (and not all will be), you could simply do sleep(3600) (that's an hour) and hope that the bot doesn't disconnect (I suspect most would), or you could loop once a second outputting a random letter of the alphabet to keep it hanging on (though obviously your max execution time will need adjusting). This one isn't really for the faint-hearted, as it could drive up CPU usage and also infuriate several bot owners, who may then DDoS you in retaliation. Sometimes it's better to let them think they've got one over on you while you silently deal with it.
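
    For what it's worth, the "one random letter a second" loop could look roughly like the sketch below. The function name and its parameters are invented here so the loop can be kept short while experimenting, and the caveats above about resource usage and retaliation still apply.

```php
<?php
// Drip one random lowercase letter per interval to keep a crawler's connection open.
// Returns how many letters were sent before the loop ended or the client went away.
function tarpit(int $iterations, int $interval = 1): int
{
    // Raise the max execution time so the script is not cut off early.
    set_time_limit($iterations * max($interval, 1) + 30);

    $sent = 0;
    for ($i = 0; $i < $iterations; $i++) {
        echo chr(random_int(ord('a'), ord('z')));
        flush(); // push the byte out; an active output buffer would also need flushing
        $sent++;

        // Stop early once the bot has disconnected.
        if (connection_aborted()) {
            break;
        }
        sleep($interval);
    }
    return $sent;
}
```

    The page behind the bogus link would then call something like tarpit(3600) to hold a connection for up to an hour.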

