...

Keyword density script



electrify77
05-21-2009, 06:31 PM
Hi,

Could you point me in the right direction? I'd like to create a keyword density checking script like this one: http://www.seochat.com/seo-tools/keyword-density/

Are there any ready-made scripts or modules that would help with this task?

The main problems I see are how to tell the actual text apart from the HTML tags, and then how to compute the density for 2-3 word phrases, as in the example above.

All help is greatly appreciated! Thank you!

hthought
05-22-2009, 02:57 PM
I don't know of any ready-made scripts, but this is a pretty simple program to write. I would use Python with the urllib2 module to fetch the page and the re (regular expression) module to separate the HTML tags from the text.
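
As a rough illustration of the counting part only, here is a minimal sketch in Python 3 (where urllib2 has become urllib.request). It assumes the tags have already been removed and that "density" means the share of all n-word phrases taken up by a given phrase; tools such as the one linked above may define it differently. The function name phrase_density and its parameters are made up for this example.

```python
import re
from collections import Counter


def phrase_density(text, n=1, top=5):
    """Return (phrase, count, percent) for the `top` most common n-word phrases.

    Assumption: density = occurrences of the phrase / total number of
    n-word phrases in the text, expressed as a percentage.
    """
    # Lowercase the text and treat runs of letters, digits and apostrophes as words.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < n:
        return []
    # Slide a window of n words across the text to build the phrases.
    phrases = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    total = len(phrases)
    counts = Counter(phrases)
    return [(p, c, round(100.0 * c / total, 2)) for p, c in counts.most_common(top)]


if __name__ == "__main__":
    sample = "red fish blue fish red fish"
    for size in (1, 2, 3):
        print(size, phrase_density(sample, n=size, top=3))
```

Calling it with n=2 or n=3 covers the 2-3 word phrases asked about; a real tool would probably also drop common stop words before counting.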

FishMonger
05-22-2009, 04:44 PM
No need to use Python when Perl has plenty of modules designed for HTML retrieval and parsing. Using a regex to parse HTML is very fragile and can easily break. It's better to use an HTML parser.

The most commonly used modules for this type of task are:
LWP
http://search.cpan.org/~gaas/libwww-perl-5.826/lib/LWP.pm

LWP::UserAgent
http://search.cpan.org/~gaas/libwww-perl-5.826/lib/LWP/UserAgent.pm

LWP::Simple
http://search.cpan.org/~gaas/libwww-perl-5.826/lib/LWP/Simple.pm

HTML::Parser
http://search.cpan.org/search?module=HTML::Parser

HTML::HeadParser
http://search.cpan.org/search?module=HTML::HeadParser
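
To illustrate why an event-based parser is sturdier than a regex, here is a minimal sketch of pulling the visible text out of a page. Note that it is written in Python 3 with the standard-library html.parser module as a stand-in, not with Perl's HTML::Parser, although the handler-per-event idea is similar. The class name TextExtractor and the helper html_to_text are invented for this example, and skipping only <script> and <style> is a simplifying assumption.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []      # text fragments in document order
        self.skip_depth = 0   # greater than 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.chunks.append(data)


def html_to_text(html):
    """Return the text of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join fragments with spaces so words from adjacent tags do not run together.
    return " ".join(" ".join(parser.chunks).split())


if __name__ == "__main__":
    page = "<p>Hello <b>world</b></p><script>var s = '<p>no</p>';</script>"
    print(html_to_text(page))
```

The markup inside the script string is the kind of input that trips up a simple tag-stripping regex, while the parser treats it as script content and skips it.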

KevinADC
05-23-2009, 01:43 AM
There is also HTML::Strip, although I have never used it and I'm not sure if it's an actual parser.

electrify77
05-23-2009, 02:30 PM
Thanks guys - that's a nice start. I'll look into them and report back on my progress.


