Counting all the words from a file...

01-15-2004, 11:05 AM
I would like to count all the words in a given file / website...

I mean count the frequency of every word...

What is wrong with my code?



01-18-2004, 05:30 PM
What exactly are you trying to calculate? The number of instances of each word, or how many times words from your predefined dictionary array are found?

I noticed you're reading 500 bytes from the file and parsing them, then looping until the file is completely read. The problem with this is that if a 500-byte chunk ends halfway through a word, it'll mess up all the calculations. I suggest you loop through the file first, adding it all to a string, and then do your calculations once that's done.
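
A minimal sketch of that suggestion, assuming the file is read with fread() in 500-byte chunks as described above. The helper name read_whole_stream and the in-memory test stream are made up for illustration; the point is only that every chunk is appended to one string and the counting happens after the loop has finished.

```php
<?php
// Hypothetical helper: read a stream in fixed-size chunks and return it as one string.
function read_whole_stream($handle, $chunkSize = 500)
{
    $contents = '';
    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false) {
            break; // stop on a read error
        }
        $contents .= $chunk; // a word cut at a chunk boundary is rejoined here
    }
    return $contents;
}

// Demo with an in-memory stream so the sketch runs without a real file.
$handle = fopen('php://memory', 'w+');
fwrite($handle, str_repeat('apple banana ', 100)); // 1300 bytes, so the 500-byte chunks split words
rewind($handle);

$text = read_whole_stream($handle);
fclose($handle);

// Count only after the whole text is in one string.
$counts = array_count_values(preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY));
print_r($counts);
```

With the sample data, both words come out with a count of 100, even though the chunk boundaries fall in the middle of words. For a plain local file, file_get_contents() would do the same job in one call.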

01-19-2004, 02:41 AM
When the code gets to looking like that ... 9 times out of 10 there is an easier way (I found out the hard way, ok ;) )

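// Fetch the page as one string; turn line breaks and tabs into spaces
// so the last word of one line is not glued to the first word of the next
$yaks = str_replace(
array( "\r\n" , "\n" , "\t" ) ,
' ' ,
implode( '' , file( 'http://www.redhits.com/cl/tfile.htm' ) )
) ;
// Split on spaces, drop the empty pieces left by repeated spaces, count each word
$bits = array_count_values( array_filter( explode( ' ' , $yaks ) , 'strlen' ) ) ;
// Most frequent words first
arsort( $bits ) ;
print_r( $bits ) ;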

Note that PHP's strip_tags() is not too good at removing JavaScript (especially inline) and inline styles: it removes the tags themselves but leaves the text between them, so you may need to remove that stuff separately.
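
One possible way to do that separate removal, sketched under the assumption that the page is reasonably well-formed HTML containing plain English (ASCII) words: drop whole script and style elements with preg_replace() before calling strip_tags(). The function name count_words_in_html and the sample markup are made up for illustration, and a regex like this will not cope with every malformed page.

```php
<?php
// Hypothetical helper: turn an HTML string into a word => frequency array.
function count_words_in_html($html)
{
    // Remove <script> and <style> elements together with their contents,
    // since strip_tags() would keep the code between the tags.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1\s*>#is', ' ', $html);

    // Put a space before every tag so "<p>one</p><p>two</p>" does not become "onetwo".
    $text = strip_tags(str_replace('<', ' <', $html));

    // Decode entities such as &amp; and fold case so "The" and "the" match.
    $text = strtolower(html_entity_decode($text));

    // Split on anything that is not an ASCII letter, digit or apostrophe.
    $words = preg_split("/[^a-z0-9']+/", $text, -1, PREG_SPLIT_NO_EMPTY);

    $counts = array_count_values($words);
    arsort($counts); // most frequent words first
    return $counts;
}

$sample = '<html><head><style>p { color: red; }</style>'
        . '<script type="text/javascript">var word = "ignored";</script></head>'
        . '<body><p>The cat</p><p>saw the other cat</p></body></html>';

$result = count_words_in_html($sample);
print_r($result);
```

For the sample markup, only the four words from the two paragraphs are counted ("the" and "cat" twice each, "saw" and "other" once); nothing from the style or script blocks shows up.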