Calculating Data


wadeintothem
09-10-2008, 05:54 AM
I'm having a heck of a time coding this; hopefully someone here has an answer.

Given a set of names in several flat text files:

File 1
Able
Brian
Charlie
David
Edward
Able
Charlie
Edward
Charlie
David
Brian
Able
Brian

File 2
Charlie
David
Brian
Brian
Edward
Charlie
Able

I would like to count the recurrences and sort them, getting the number of times each name is repeated per file plus a grand total. Something like:


NAME     -- 1 -- 2 -- Total
Able     -- 3 -- 1 -- 4
Brian    -- 3 -- 2 -- 5
Charlie  -- 3 -- 2 -- 5
David    -- 2 -- 1 -- 3
Edward   -- 2 -- 1 -- 3



I wrote this to run after reading the file into an array:
$mark1=0; $mark2=1; $number=0;

@thisname = sort(@thisname);
foreach $thisname (@thisname) {
    if ($thisname[$mark1] eq $thisname[$mark2]) {
        $mark1++; $mark2++; $number++;
    } else {
        $number++;
        push(@name, $thisname[$mark1]);
        push(@digit, $number);
        $mark1++; $mark2++; $number=0;
    }
}

which actually works fine for counting recurrences from one text file, but associating the names from one text file to the next is where I break down. Of course I've written it another 100 ways that don't work at all. This is beyond my pea brain, so any help is appreciated!

Thanks

KevinADC
09-10-2008, 06:53 AM
Are you familiar with hashes? That would be the way to go.

wadeintothem
09-10-2008, 03:54 PM
I've not used them much outside of a form handler.
It may be the only way to go. I've printed up a couple of good hash "courses" and am trying to work my way through them.

The name should be a key, but it would need to come from a scalar... so it's actually $name (with a value of "brian"), then I could probably count the value up using ++ each time it matches. Say:

for (@thisname) {
$number++ if SOMETHING eq SOMETHING

maybe I could just use ($_ eq $thisname[$line])
$line++;


something like that. When I get home from work I'm going to delve into this...

any ideas would be appreciated.

KevinADC
09-10-2008, 05:38 PM
I'll attempt to give you a jump start:

my %count = ();
my @files = qw(file1 file2);
foreach my $file (@files) {
    open my $FH, "<", "path/to/$file" or die "$!";
    while (<$FH>) {
        chomp;
        $count{$_}++;
    }
    close($FH);
}
foreach my $name (keys %count) {
    print "$name = $count{$name}\n";
}

That shows how to use a hash to count simple things, like the names in the files. But you will notice the output is the total count from all the files, not a breakdown with subtotals and a grand total. For that you need a more complex data structure, an array of hashes or a hash of hashes, but that might be getting the ox too far out in front of the cart at this point in your Perl education.

wadeintothem
09-11-2008, 06:12 AM
That is a more efficient way of counting across all the files than what I had already written; however, the result is the same: it returns each name and its combined count for all the files.

I do understand this type of coding is complicated, which is probably why I've tried 500 ways of doing it without success :)

So.. that said..

Is there a way of creating array names on the fly?

My thinking for a possible solution is that at the end of the loop for each file read, I could push all the hash data into an array (which I could access later) then clear the hash, start the loop again reading the next file, push into a new array etc.

There is no known quantity of text files within a directory, it could be 1 or 10.

Another possible solution I'm thinking.. cheat! Write the compiled data to a temp text file, do what I need to do, then unlink it.
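
For what it's worth, the "one hash per file" idea does not need array names created on the fly: a single array can hold one hash reference per file. A minimal sketch, assuming the file contents are already in memory (the %contents hash and its file names are made up for illustration; real code would open each file on disk instead):

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-ins for the files, so the sketch runs as-is.
my %contents = (
    'file1.txt' => "Able\nBrian\nAble\n",
    'file2.txt' => "Brian\nCharlie\n",
);

my @per_file;    # one hash reference per file, in sorted file-name order
foreach my $file ( sort keys %contents ) {
    my %count;   # fresh hash on every pass, so nothing needs clearing
    open my $FH, '<', \$contents{$file} or die "Cannot read $file: $!";
    while ( my $name = <$FH> ) {
        chomp $name;
        $count{$name}++;
    }
    close $FH;
    push @per_file, { file => $file, count => \%count };
}

# Walk the array and print each file's counts.
foreach my $entry (@per_file) {
    foreach my $name ( sort keys %{ $entry->{count} } ) {
        print "$entry->{file}: $name = $entry->{count}{$name}\n";
    }
}
```

This copes with any number of files, but lining one name up across files means looking in every hash, which is why a single nested hash tends to be easier for a report.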

FishMonger
09-11-2008, 07:17 AM
This sounds like a homework assignment for a programming class, so we won't be able/willing to give you a complete solution. However, we can guide you.

It is possible to create array names on the fly, but that would be the worst approach.

There are several possible approaches to your problem. You could use an array of hashes, or one hash similar to what Kevin shows but with a more complex structure (which would probably be my choice).

Using this structure:
$hash{name}{file} = number

your sample data would look like this:
$VAR1 = {
          'Brian' => {
                       'file2.txt' => 2,
                       'file1.txt' => 3
                     },
          'Edward' => {
                        'file2.txt' => 1,
                        'file1.txt' => 2
                      },
          'David' => {
                       'file2.txt' => 1,
                       'file1.txt' => 2
                     },
          'Charlie' => {
                         'file2.txt' => 2,
                         'file1.txt' => 3
                       },
          'Able' => {
                      'file2.txt' => 1,
                      'file1.txt' => 3
                    }
        };

Obviously, I've left out the details on how I built the hash and how you'd loop through it to produce the output you need.
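
As a small illustration of reading that structure (not of how it was built), the sketch below hard-codes two of the entries above and pulls out one per-file count and one per-name total. The variable name %hash follows the $hash{name}{file} notation; the other variable names are only illustrative.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order when dumping

# Two entries copied from the dump above.
my %hash = (
    'Able'  => { 'file1.txt' => 3, 'file2.txt' => 1 },
    'Brian' => { 'file1.txt' => 3, 'file2.txt' => 2 },
);

# A single count: how often Brian appears in file2.txt.
my $brian_file2 = $hash{Brian}{'file2.txt'};

# A per-name total: add up the values of the inner hash.
my $able_total = 0;
$able_total += $_ for values %{ $hash{Able} };

print "Brian in file2.txt: $brian_file2\n";
print "Able total: $able_total\n";
print Dumper( \%hash );    # prints the structure in $VAR1 = { ... } form
```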

wadeintothem
09-11-2008, 07:31 AM
LOL, no, I'm well beyond that stage of life, which is why I don't care how it gets done as long as it gets done! For what it's worth, this is for a hobby site, tracking the number of games a particular person is in. Likewise, I'm a hobbyist Perl coder who has goofed with it for years, but every now and then my simple loops aren't good enough :).

For fun, I'll post my solution, which does work. I'll look at yours and see if I can use it.

$myfile = "$tempnum.$mynum.txt";

open(FILE, ">$dir/$myfile");
close(FILE);
foreach $mydata (@mydata) {

    my %count=();
    $name="";
    open(FILE, "$dir/$mydata");
    while (<FILE>) {
        chomp;
        $count{$_}++;
    }
    open(FILE, ">>$dir/$myfile");
    flock(FILE, 2);
    foreach my $name (keys %count) {
        print FILE "$name$count{$name}\n";
        #just fyi, the data is delimited with | so the separation does not matter
    }
    close(FILE);

}

open(FILE, "$dir/$myfile");
chomp(@unsortdata = <FILE>);
close(FILE);

foreach $unsortdata (@unsortdata) {
    ($gnumber, $gname, $gameday) = split(/\|/, $unsortdata);
    $newdata= "$gname|$gnumber|$gameday";
    push (@newdataarray, $newdata);
}
@sorteddata = sort (@newdataarray);

$linea=0; $lineb=1; @gamearray=();

foreach $sorteddata (@sorteddata) {
    ($aname, $aIDnumber, $agameday) = split(/\|/, $sorteddata[$linea]);
    ($bname, $bIDnumber, $bgameday) = split(/\|/, $sorteddata[$lineb]);

    if ($aname eq $bname) {
        push (@gamearray, $agameday);
        $linea++; $lineb++;
    } else {
        push (@gamearray, $agameday);
        print "$aname $aIDnumber";
        foreach $gamearray (@gamearray) {
            print "<input type=\"text\" name=\"whoknows\" size=\"3\" value=\"$gamearray\">";
        }
        print "<BR\>\n";
        $linea++; $lineb++; @gamearray=();
    }

}
unlink("$dir/$myfile");



LOL
Hey but it actually works..

I'll look at yours now...

wadeintothem
09-11-2008, 07:35 AM
Yeah, yours wouldn't work for me, Fish, but thanks anyway! The reason is that the directory holds any number of files, maybe 1, maybe 10. It can't be set up like you have it.

This site is obviously a great resource if only to debug my thought process through whining.. glad I found it.

What do you think about my novice band-aid solution? :)

With mine, once I output it as HTML to a form, life is easy street.

FishMonger
09-11-2008, 07:50 AM
Ok, since this isn't a homework assignment and you've shown that you've made a good effort, I'll show you my solution based on your sample input data and desired output.
#!/usr/bin/perl

use warnings;
use strict;

my %names;
foreach my $file ( <file*.txt> ) {
    open my $FH, '<', $file or die $!;
    while ( my $name = <$FH> ) {
        chomp $name;
        $names{$name}{$file}++;
    }
}

foreach my $name ( sort keys %names ) {
    my $total;
    print $name;
    foreach my $file ( sort keys %{$names{$name}} ) {
        $total += $names{$name}{$file};
        print " --$names{$name}{$file}";
    }
    print " --$total\n";
}

FishMonger
09-11-2008, 07:57 AM
You should note that the first foreach loop will process all files, without having to specify (hard code) each and every file.
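
One caveat with the script above: a name that is missing from one of the files simply gets no column for that file, so its numbers shift left. A possible variation, assuming fixed columns and a header row are wanted, collects the list of files first and prints 0 for missing entries. It is only a sketch: the %names data is hard-coded here (with an invented extra name, Frank) rather than read from disk, and the column widths are arbitrary.

```perl
use strict;
use warnings;

# Counts in the same shape the script above builds: $names{name}{file}.
# Hard-coded for the sketch; "Frank" is invented to show a missing entry.
my %names = (
    'Able'  => { 'file1.txt' => 3, 'file2.txt' => 1 },
    'Brian' => { 'file1.txt' => 3, 'file2.txt' => 2 },
    'Frank' => { 'file2.txt' => 1 },
);

# Collect every file name that appears anywhere in the structure.
my %seen;
foreach my $name ( keys %names ) {
    $seen{$_} = 1 for keys %{ $names{$name} };
}
my @files = sort keys %seen;

my @rows;    # formatted report lines, kept so they can be inspected
push @rows, sprintf( "%-10s" . ( " %10s" x @files ) . " %6s", 'NAME', @files, 'Total' );

foreach my $name ( sort keys %names ) {
    my $total  = 0;
    my @counts = map { $names{$name}{$_} || 0 } @files;    # 0 when absent
    $total += $_ for @counts;
    push @rows, sprintf( "%-10s" . ( " %10d" x @files ) . " %6d", $name, @counts, $total );
}

print "$_\n" for @rows;
```

Because the file list is computed from the data, this still works whether the directory holds 1 file or 10.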

KevinADC
09-11-2008, 09:36 AM
Fish,

Last edited by FishMonger; 09-10-2008 at 06:05 PM.

What did you edit in my post?

wadeintothem
09-11-2008, 01:57 PM
Wow, that works. Maybe I should have done my homework when I was young!

Gentlemen, I thank you!

FishMonger
09-11-2008, 02:42 PM
Sorry, I should have explained the change when I did it.

I changed
open my $FH, ">", "path/to/$file" or die "$!";

to
open my $FH, "<", "path/to/$file" or die "$!";
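
The difference matters more than one character suggests: ">" opens the file for writing and empties it first, whereas "<" opens it read-only. A small sketch using a scratch file from the core File::Temp module (the sample names and variable names are just test data for the illustration):

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file holding two names.
my ( $tmp_fh, $path ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "Able\nBrian\n";
close $tmp_fh or die "close failed: $!";

# "<" opens for reading; the contents are untouched.
open my $in, '<', $path or die "Cannot read $path: $!";
my @before = <$in>;
close $in;

# ">" opens for writing and truncates the file immediately.
open my $out, '>', $path or die "Cannot write $path: $!";
close $out;

my $size_after = -s $path;    # false once the file is empty

print "lines read with '<': ", scalar(@before), "\n";
print "bytes left after '>': ", ( $size_after || 0 ), "\n";
```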

KevinADC
09-12-2008, 09:12 AM
Ahh..... good catch. Thanks.