Finding the number of occurrences of a character in a string



theside
11-14-2010, 07:29 PM
Hi there

I have a forum which is being recently inundated with spam users.

I do have measures in place to prevent spam registrations but they seem to be failing right now.

I think the issue is that a "real person" is registering (bypassing the captcha) and, once registered, a computer program is adding hundreds (yes, hundreds) of spam posts...

The email addresses are all gmail - I do not want to ban these accounts

IP addresses change and are not on look-up!

The only common finding is that the email addresses all have >1 period (.) before the @ symbol.

Therefore, I want to add a check to see whether there are >1 periods (.) prior to the @ symbol in the email address...

If there is then return false...



So,

I think I need to use preg_match_all

This, according to php will find all occurrences of the period like so:


preg_match_all('/\./', $email, $emaildots); // pattern needs delimiters and an escaped dot; a bare "." matches any character


It then returns an array $emaildots

Now, I'm guessing I need to use count here to find the number of occurrences of the period, but I'm not sure how to go about that with this array...

Your help would be much appreciated (again!!)

Thank you

K

mlseim
11-14-2010, 07:36 PM
I recommend that whatever you do, make everything work the same.

For example, the spammer enters a post. Process the entry, see that
it's from a spammer, so you don't post it. But carry-on just as though
you did. If you suddenly display a "spammer detected" message or some
other message, they will know that their post did not work.

I believe when people do the spamming, they pretty much enter as much as
they can, but they don't go into the forums or blogs and see if their actual posts
appear. Let them waste their time entering posts, but don't save or display them.

Someone will pop-in with a REGEX to help you find the .@ (I'm not a REGEX expert).


theside
11-14-2010, 07:40 PM
Actually that's a very good idea

So, I let them register and post, but.. have a function to detect whether it is spam and then do not display it should it be spam (i.e. contains all the usual spam words)

That's a very good idea in fact!!!!

The spammers seem to be upping the game recently - is it because Christmas is coming?

They are very annoying - and waste an awful lot of my time

Kind regards

K

kbluhm
11-14-2010, 07:58 PM
list( $alias, $domain ) = explode( '@', $email );

if ( FALSE !== strpos( $alias, '.' ) )
{
    // email alias contains at least one dot
}
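
Note that strpos only reports whether a dot is present at all, while the original question was about more than one dot. A minimal sketch of a counting variant follows, assuming PHP 7 or later; the function name alias_has_multiple_dots is hypothetical and not part of phpBB or any library.

```php
<?php
// Hypothetical helper: true when the part before the @ contains more than one dot.
function alias_has_multiple_dots(string $email): bool
{
    // Use the last @ so that everything before it counts as the alias.
    $at = strrpos($email, '@');
    if ($at === false) {
        return false; // no @ at all; leave that case to normal e-mail validation
    }

    $alias = substr($email, 0, $at);

    // substr_count counts non-overlapping occurrences of a literal string.
    return substr_count($alias, '.') > 1;
}

var_dump(alias_has_multiple_dots('j.o.h.n@gmail.com'));    // bool(true)
var_dump(alias_has_multiple_dots('john.smith@gmail.com')); // bool(false)
```

Because no regex is involved, there is no pattern to escape or delimit, which is probably the simplest route when the character being counted is a fixed literal.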

poyzn
11-14-2010, 08:03 PM
try this


if (preg_match('#^[^.@]+\.[^.@]+@#', $email)) {
    return true; // returns true if there is exactly one dot before the @
}
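
For the question as first asked (how to get a count out of preg_match_all), here is a rough sketch, assuming PHP 7 or later. The function name count_alias_dots is made up for illustration. preg_match_all returns the number of full matches, and count($matches[0]) gives the same figure from the array.

```php
<?php
// Hypothetical helper: count the literal dots before the last @ using a regex.
function count_alias_dots(string $email): int
{
    $at = strrpos($email, '@');
    if ($at === false) {
        return 0; // no @, so there is no alias part to inspect
    }
    $alias = substr($email, 0, $at);

    // '/\./' is a delimited pattern with the dot escaped, so it matches only
    // a literal period. $matches[0] receives every full match.
    $count = preg_match_all('/\./', $alias, $matches);

    // preg_match_all returns false on error; count($matches[0]) would
    // otherwise hold the same number as $count.
    return $count === false ? 0 : $count;
}

var_dump(count_alias_dots('j.o.h.n@gmail.com')); // int(3)
```

A registration check could then reject or flag the address when the count is greater than 1.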

MattF
11-14-2010, 08:04 PM
The problem with silently dropping input is the fact that legitimate users will get no feedback if they get caught by accident. Give an error message and the user can plainly see why something happened and hence contact you regarding it. Silently drop input though and they'll have no idea what's happening. They may even spend ages thinking they have a problem at their end. From a usability viewpoint, the silent discard method is a horrible approach. Usability should suffer the least amount humanly possible in the antispam crusade. If you have to trample usability in the process, you're doing something wrong.

Besides, if you actually put some basic thought and time into studying it, you will find that spammers and a lot of their tools are supremely dense. Even something as simple as a non-standard input field in a well known/spammed form can scuttle a large percentage of them.

poyzn
11-14-2010, 08:09 PM
Absolutely agree with MattF. You can create an additional verification step for those who have a one-dotted email.

theside
11-14-2010, 11:44 PM
Hi guys, thank you for your comments

I have decided to change the registration form a little - this is so that first, I can decide whether these are human or non-human sign-ups and spam posts...

If they are non-human, the new form should fox them as they will not know the required fields! Although I understand this will not last long.

It is, as pointed out, a well known forum script - phpBB!!!

So yes, it is targeted by these malicious people.

The forum is also very busy, so I do not want to deter people from signing up either...

I'm going to sit on it for a bit and see what happens with my new registration form!

Thanks guys


K

mlseim
11-15-2010, 01:31 AM
phpBB ... I think I know what it is.

You have to turn off the ability for people to see who the members are.
Only the admin can view members.

http://www.google.com/search?q=phpbb+turn+off+members+list

MattF
11-15-2010, 02:42 PM
Other things which may make a difference:

1) Remove, (or preferably just hide via CSS or comment tags), the 'powered by' link. Spammers Google/Yahoo for that stuff to find applicable sites.

2) Add a honeypot input. A text input which is hidden by CSS or comment tags and has no purpose other than to see who fills it in. When it's hidden, a browser won't display it, hence normal users can't fill it in. Bots probably will, however. If that input is set and not empty, block or put them on approval at least.

3) Use the likes of stopforumspam. Dependency on external sites isn't a personal preferred method of mine, but if you're getting hit quite hard, that should weed out a lot of the chaff.

4) Use a text and answer question, (not captcha, to retain accessibility). Something simple like: 'What is 2+2?'
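
Points 2 and 4 could be checked server-side along these lines. This is only a sketch, assuming PHP 7 or later: the field names 'website' (the hidden honeypot) and 'answer' (the reply to 'What is 2+2?') and the function name looks_like_bot are invented for illustration and are not phpBB settings.

```php
<?php
// Hypothetical field names: 'website' is the hidden honeypot input,
// 'answer' is the reply to the question 'What is 2+2?'.
function looks_like_bot(array $post): bool
{
    // Honeypot: a normal browser never shows the field, so it should arrive
    // empty or not at all. Anything else is treated as suspicious.
    $honeypot = $post['website'] ?? '';
    if (!is_string($honeypot) || trim($honeypot) !== '') {
        return true;
    }

    // Text question: accept only "4", ignoring surrounding whitespace.
    $answer = $post['answer'] ?? '';
    return !is_string($answer) || trim($answer) !== '4';
}

// Typical use on the registration handler:
// if (looks_like_bot($_POST)) { /* block, or hold for approval */ }
```

Whether a flagged registration is blocked outright or only held for approval is a policy choice; the function just reports the signal.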

mlseim
11-15-2010, 02:55 PM
Matt's number 4 is a good one to use ... easy for users to work with.

I think another method would be a <select> box with the requirement
that the user picks the bottom <option>

<label for="cap">VERIFY REGISTRATION</label>
<select id="cap" name="cap">
<option value="bad">No, I do not want to register now</option>
<option value="good">Yes, register me now</option>
</select>

Spamming robots will always pick the first one.

Radio buttons would have the same effect.
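
On the server, the matching check might look like the sketch below, assuming PHP 7 or later. The field name cap and the values good and bad come from the markup above; the function name passed_select_check is hypothetical.

```php
<?php
// Hypothetical check for the select-box idea: only the bottom option passes.
function passed_select_check(array $post): bool
{
    // A missing field, or the default first option ("bad"), both count as a failure.
    return isset($post['cap']) && $post['cap'] === 'good';
}

var_dump(passed_select_check(['cap' => 'good'])); // bool(true)
var_dump(passed_select_check(['cap' => 'bad']));  // bool(false)
```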

MattF
11-15-2010, 03:05 PM
Quite like that idea. Might add that one to my current spam magnet to see how it fares. Might as well use them for testing purposes and make some use out of the spambots. :D They seem to love that test site I plonked up a while ago. Around 200 spam registrations within six months, and it's rarely been linked to online, so not exactly widely known.

poyzn
11-15-2010, 03:55 PM
Really useful ideas in this topic. You can also use keycaptcha (https://www.keycaptcha.com/?changelang=en)

MattF
11-16-2010, 03:44 AM
Added that select option to see how it works out. Just on the off chance that some might consider registering so that I can compare some valid server_vars, (proxy connection, encoding, lang etc), against the bot vars, this is the site:

http://gxcr.org/fluxbb/index.php

If you view the source on the registration page, you can see a working example of a hidden text input which they've all filled in since it was applied. It's the one which is enclosed within comment tags, (<!-- [input] -->).

Note: It's a spam magnet with no valid forum value whatsoever, but my results are tainted towards the bots at the moment with having no valid results to compare against.

MattF
11-17-2010, 03:39 AM
I think another method would be a <select> box with the requirement that the user picks the bottom <option>

[...]

Spamming robots will always pick the first one.

Radio buttons would have the same effect.

Definitely seems to work as you suggested. Three new additions so far since implementing that and they've all done as you said.

kbluhm
11-17-2010, 03:52 AM
http://gxcr.org/fluxbb/index.php

If you view the source on the registration page, you can see a working example of a hidden text input which they've all filled in since it was applied. It's the one which is enclosed within comment tags, (<!-- [input] -->).

Note: It's a spam magnet with no valid forum value whatsoever, but my results are tainted towards the bots at the moment with having no valid results to compare against.

Are you sure? I believe it would be treated as a comment and not interpreted as a form input. I've just tried a very simple test:


<form action="" method="get">
<input type="text" name="test1" value="1" />
<!-- <input type="text" name="test2" value="2" /> -->
<input type="submit" />
</form>

...and the query string comes out like so:


/test.html?test1=1

MattF
11-17-2010, 04:02 AM
Yup. Been checking for a non empty POST value for that commented input, (did a test registration myself too via the browser which obviously ignored it and left it unset), and every spam bot has filled it in, (after I corrected my numbnuts moment and set it as a text input as it should be), and the post value, (their e-mail address), has been available in each case.

I did wonder initially if the comment tags might work, but they appear to cause no detriment. Could have gone the CSS route for hiding it, but that would obviously be ineffective and give no indication with text browsers, hence why I tried the comment tags first.

This is the processing code for that input on that registration page. Just bog standard fare.



if (isset($_POST['vemail']) && trim($_POST['vemail']) != '')
{
    $output .= '['.htmlspecialchars($_POST['vemail']).']'."\n";
}

kbluhm
11-17-2010, 06:06 AM
That's odd, my test above shows otherwise. :\

MattF
11-17-2010, 06:24 AM
Just tested that code you posted and have come to the same conclusion. It doesn't work. :D Neither POST nor GET. Only thing I can think is that they're parsing it in a cack-handed manner and removing the comment tags, else whatever parser they're using just doesn't understand comment tags. Browsers seem to be parsing it as expected, and bots appear to be breaking it, (which is actually quite fortunate), and turning it into an active input.


