converting from PHP - foreach with key



Phil Jackson
01-23-2010, 12:33 AM
Hi all, I was wondering if someone could help me out. If anyone knows PHP: to get the key of each element while looping over an array, you do the following:


$x = array('foo', 'bar', 'foobar');
foreach($x as $key => $value){
    echo "$value has a key of $key <br />";
}

// foo has a key of 0
// bar has a key of 1
// foobar has a key of 2

and then you can do things such as


unset($x[0]);

I'm finding it very hard to understand how to do this in Perl.

I have something like:




#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
use Data::Dumper;
use DBI;

print "\nEnter a website: http://www.";
chomp(my $url = <>);
my %pagesArray = ($url);

while ( $x!=0 ) {
if( scalar %pagesArray != 0 ) {
foreach my $page (%pagesArray) {
my $key = $pagesArray{$page};
...
...
...
...

But I'm really not sure what I'm doing. I've read numerous tutorials and I still can't get my head around it. If someone could give me some hints it would be much appreciated.

Regards,

Phil

FishMonger
01-23-2010, 04:44 AM
I have a very limited knowledge of php, but I do know your $x array is defined as an array, not a hash.

A php hash would be defined as:

$x = array("foo" => "bar", 12 => true);

The difference being that arrays are ordered lists numerically indexed 0..n whereas hashes are unordered string indexed key/value pairs.

In Perl, you need to specify the hash key when assigning.

In your Perl script, the %pagesArray ends up with the key being the user's response and its value is undef. In addition to that, since you enabled warnings, as you should, your %pagesArray assignment will generate a warning.


C:\test>type perl-1.pl
#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;

print "\nEnter a website: http://www.";
chomp(my $url = <>);
my %pagesArray = ($url);

print Dumper \%pagesArray;

C:\test>perl-1.pl

Enter a website: http://www.google.com
Odd number of elements in hash assignment at C:\test\Perl-1.pl line 9, <> line 1.
$VAR1 = {
'google.com' => undef
};


What did you want to use as the hash key?

Using a hash in the sample code you posted is odd, but assuming it's needed for the portions that you didn't show, here's a possible assignment solution.

print "\nEnter a website: http://www.";
chomp(my $url = <>);
my %pagesArray;
$pagesArray{url1} = ($url);

C:\test>perl-1.pl

Enter a website: http://www.google.com
$VAR1 = {
'url1' => 'google.com'
};
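
For the original foreach question, here is a minimal sketch of how PHP's foreach($x as $key => $value) might be written in Perl, once for an array and once for a hash. The sub names index_pairs and hash_pairs are made up for this example and are not from any module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: one "value has a key of index" line per array element.
sub index_pairs {
    my @x = @_;
    # $#x is the last index, so 0 .. $#x visits every index in order.
    return map { "$x[$_] has a key of $_" } 0 .. $#x;
}

# Hypothetical helper: the same for a hash. Hash keys come back in no
# particular order, so they are sorted here to get repeatable output.
sub hash_pairs {
    my %h = @_;
    return map { "$h{$_} has a key of $_" } sort keys %h;
}

print "$_\n" for index_pairs('foo', 'bar', 'foobar');
print "$_\n" for hash_pairs(foo => 'bar', baz => 'qux');
```

The array version loops over the indexes and looks each value up; the hash version loops over the keys. Either way the "key" is what you iterate over, and the value is fetched from it.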

oesxyl
01-23-2010, 09:46 AM
Phil Jackson wrote:
> Hi all, was wondering if someone could help me out. If anyone knows PHP then to get a key from an array you do the following:
>
> $x = array('foo', 'bar', 'foobar');
> foreach($x as $key => $value){
>     echo "$value has a key of $key <br />";
> }
>
> // foo has a key of 0
> // bar has a key of 1
> // foobar has a key of 2
>
> and then you can do things such as
>
> unset($x[0]);

In PHP you always have an implicit key; in Perl you need a hash if you want arbitrary keys. In your case you are using an array, so you can use a plain for loop over the indexes:


my @x = ('foo', 'bar', 'foobar');
my $len = scalar @x;    # number of elements in @x
for (my $i = 0; $i < $len; $i++) {
    print "key: ", $i, " value: ", $x[$i], "\n";
}

It can be done in many ways, but this is the simplest, I guess. :)


> Im finding it very hard to understand how to do this in perl

In Perl, arrays and hashes are distinct data structures; in PHP both are the same thing. :)


> But im really not sure what im doing. I've read numerous tutorials and i still cant get my head around it. If someone could give me some hints it would be much appreciated.

Try reading the perldsc manpage.

best regards

Phil Jackson
01-23-2010, 11:53 AM
oesxyl wrote:
> my $len = scalar @x;
> for (my $i = 0; $i < $len; $i++) {
>     print "key: ", $i, " value: ", $x[$i], "\n";
> }

I think I understand. I will try some things and let you know how I get on. Thanks.

oesxyl
01-23-2010, 12:00 PM
Phil Jackson wrote:
> I think i understand. I will try some things and let you know how i get on thx.

There are some examples and a very good explanation in the perldsc manpage:

http://velociraptor.mni.fh-giessen.de/Perl/man/perldsc.html

best regards

Phil Jackson
01-23-2010, 12:04 PM
What I'm trying to do is convert a PHP script I've written. It runs sooo slow. It's a spider that finds all the pages on a website.



<?php

include("func.fullPath.php");
$website = 'http://actwebdesigns.co.uk';

// check for valid domain
if(!preg_match("#^[^\.]*\.(?:(?:com)|(?:co\.uk)|(?:net)|(?:org)|(?:cc)|(?:tv)|(?:info)|(?:org\.uk)|(?:me\.uk)|(?:biz)|(?:name)|(?:eu)|(?:uk\.com)|(?:eu\.com)|(?:gb\.com)|(?:gb\.net)|(?:uk\.net)|(?:me)|(?:mobi))$#is", $website)){
    die("not valid domain name");
}

// check the status of the domain
function domainStatusCheck($url){
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, true);
    curl_setopt($ch, CURLOPT_NOBODY, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $header = curl_exec($ch);
    if(preg_match("#HTTP/1\.1\s200\s#is", $header)){
        return $url;
    }elseif(preg_match("#HTTP/1\.1\s301\sMoved\sPermanently#is", $header)){
        if(preg_match("#Location:\s([^\s]+)#is", $header, $newUrl)){
            return $newUrl[1];
        }else{
            return FALSE;
        }
    }else{
        return FALSE;
    }
}

// exit if domain status returns anything other than 200 or 301
if( ! $website = domainStatusCheck($website)){
    die("We are not able to continue with our requests as your server is returning an error.");
}

function ACTsiteMap($website){
    $pagesArray = array($website);
    $foundPagesArray = array();
    $mainPagesArray = array();
    while(true){
        if(count($pagesArray)!=0){
            foreach($pagesArray as $key => $page){
                if(preg_match("#((?:\.html)|(?:\.php)|(?:\.htm)|(?:\.asp)|(?:\.shtml)|(?:/))$#is", $page)){
                    if(preg_match("#^".preg_quote($website, "#")."#is", $page)){
                        if($contents = @file_get_contents($page)){
                            if(preg_match_all("#<a[^>]*href=\"([^\"\#\?]+(?:(?:\.html)|(?:\.php)|(?:\.htm)|(?:\.asp)|(?:\.shtml)|(?:/)))\"#is", $contents, $matches)){
                                foreach($matches[1] as $linkHref){
                                    //echo $linkHref."<br />";
                                    if(!preg_match("#^".preg_quote($website, "#")."#is", $linkHref))
                                        $fullPath = fullPath($page, $linkHref);
                                    else
                                        $fullPath = $linkHref;
                                    if(!in_array($fullPath, $foundPagesArray) && !in_array($fullPath, $pagesArray))
                                        $pagesArray[] = $fullPath;
                                    if(!in_array($fullPath, $foundPagesArray))
                                        $foundPagesArray[] = $fullPath;
                                }
                            }
                            if(!in_array($page, $mainPagesArray))
                                $mainPagesArray[] = $page;
                            unset($pagesArray[$key]);
                        }else{
                            $tempArray = array_reverse($mainPagesArray);
                            if(isset($tempArray[0])){
                                $url = $tempArray[0];
                                $ch = curl_init();
                                curl_setopt($ch, CURLOPT_URL, $url);
                                curl_setopt($ch, CURLOPT_HEADER, true);
                                curl_setopt($ch, CURLOPT_NOBODY, true);
                                curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
                                curl_setopt($ch, CURLOPT_TIMEOUT, 10);
                                $header = curl_exec($ch);
                                if(preg_match("#HTTP/1\.1\s301\sMoved\sPermanently#is", $header)){
                                    if(preg_match("#Location:\s([^\s]+)#is", $header, $newUrl)){
                                        $page2 = preg_replace("#^".preg_quote($website, "#")."#is", "", $page, 1);
                                        $fullPath = fullPath(trim($newUrl[1]), $page2);
                                        $pagesArray[] = $fullPath;
                                    }
                                }
                            }
                            unset($pagesArray[$key]);
                        }
                    }else{
                        unset($pagesArray[$key]);
                    }
                }else{
                    unset($pagesArray[$key]);
                }
            }
        }else{
            break;
        }
    }
    unset($mainPagesArray[0]);
    if(count($mainPagesArray)!=0)
        return $mainPagesArray;
    else
        return FALSE;
}

print_r(ACTsiteMap($website));

?>



and this is my start in perl



#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
use Data::Dumper;
use DBI;

print "\nEnter a website: http://www.";
chomp(my $url = <>);
my @pagesArray = ($url);
my @foundPagesArray = ('');
my @mainPagesArray = ('');
my $x = 1;
print "\n\n";

while ( $x!=0 ) {

    if( (scalar @pagesArray) != 0 ) {
        for ( my $i = 0; $i < scalar @pagesArray; $i++ ) {
            my $page = $pagesArray[$i];
            $page =~ s/^\///;
            if( $page =~ m#((?:\.html)|(?:\.php)|(?:\.aspx)|(?:\.htm)|(?:\.asp)|(?:\.shtml)|(?:/))$#is ){
                my $pregQuote = "http://www.".$url->quote;
                if( $page =~ m#^$pregQuote#is ){
                    if( my $content = get "http://www." . $url . "/" ){
                        print $content;
                    }
                }else{
                    delete $pagesArray[$i];
                }
            }else{
                delete $pagesArray[$i];
            }
        }
    }else{
        $x=0;
    }
}

Phil Jackson
01-23-2010, 12:30 PM
I think i understand. I will try some things and let you know how i get on thx.

The problem is that when I add items to and delete items from the array, $something[$number] is no longer going to point at the right element.

Phil Jackson
01-23-2010, 12:38 PM
I'm just reading and thinking out loud here, but would this work?



#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
use Data::Dumper;

my @recs = ('foo', 'bar', 'fooobar');

foreach my $rec (@recs => my $foo ) {
print $rec, "->", $foo, "\n";
}

oesxyl
01-23-2010, 12:44 PM
Forget about the PHP code; the algorithm itself is slow. I will come back with some code later. :)

best regards

oesxyl
01-23-2010, 01:11 PM
Phil Jackson wrote:
> im just reading and thinking out loud here but would this work?
>
> #!/usr/bin/perl
> use strict;
> use warnings;
> use LWP::Simple;
> use Data::Dumper;
>
> my @recs = ('foo', 'bar', 'fooobar');
>
> foreach my $rec (@recs => my $foo ) {
>     print $rec, "->", $foo, "\n";
> }

When you dynamically add or remove items from an array, for/foreach is a bad fit; a while loop is the solution (same as in PHP):


my @recs = ('first', 'second', 'third');

while(@recs){                     # while @recs is not empty
    my $item = shift @recs;       # get the first item from @recs; this removes 'first' from @recs
    push @recs, 'another one';    # put an item at the tail of @recs; @recs will be longer now
}

You need to be sure that you remove more items than you add, or else the while loop will never end.
Also look at the unshift and pop Perl functions.
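
As a small sketch of how the four built-ins fit together (the item names are only for illustration): shift and unshift work on the front of the array, pop and push on the end.

```perl
use strict;
use warnings;

my @recs = ('first', 'second', 'third');

my $head = shift @recs;     # removes and returns the first item: 'first'
my $tail = pop @recs;       # removes and returns the last item: 'third'
unshift @recs, 'zero';      # adds an item at the front
push @recs, 'last';         # adds an item at the end

print "head=$head tail=$tail recs=@recs\n";
# prints: head=first tail=third recs=zero second last
```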

best regards

Phil Jackson
01-23-2010, 03:04 PM
oesxyl wrote:
> when you dinamicaly add or remove items from a array, for/foreach is bad, a while is the solution( same like in php)
>
> my @recs = ('first', 'second', 'third');
>
> while(@recs){ # while @recs is not empty
>     my $item = shift @recs; # get first item from rec, will remove 'first' from @rec
>     push @recs, 'another one'; # put a item at the tail of @recs, @recs will be longer now
> }
>
> you need to be sure that you remove more items then you add else while loop will never end.
> also look for unshift and pop perl function.
>
> best regards

That's some quality information, thank you! Could you tell me: when using "exists" to check whether a string is in an array, does it have to be a hash?



while( @pagesArray ){
    my $page = shift @pagesArray;
    if( !exists ( $foundPages[$page] ) ){
        push(@foundPagesArray, $page);
        $page =~ s/^\///;
        push @pagesArray, 'another one'; # put a item at the tail of @recs, @recs will be longer now
    }
}

FishMonger
01-23-2010, 04:01 PM
There are numerous problems with most of the posted code which makes it difficult for me to know where to start/stop.


oesxyl wrote:
> when you dinamicaly add or remove items from a array, for/foreach is bad, a while is the solution( same like in php)

If you use poor logic in the for/foreach loop, then I would agree. However, using a while loop doesn't prevent you from using poor logic, so it too could end up being bad.



> my @recs = ('first', 'second', 'third');
>
> while(@recs){ # while @recs is not empty
>     my $item = shift @recs; # get first item from rec, will remove 'first' from @rec
>     push @recs, 'another one'; # put a item at the tail of @recs, @recs will be longer now
> }
> you need to be sure that you remove more items then you add else while loop will never end.

Hmm, you give a warning about creating an unwanted infinite loop, but you create one anyway?
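
For comparison, a sketch of a queue loop that does terminate: it only pushes items it has never queued before, so the queue eventually drains. The %links_on table is a made-up stand-in for "fetch a page and extract its links", and crawl is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for fetching a page and extracting its links.
my %links_on = (
    'a' => ['b', 'c'],
    'b' => ['a', 'c'],
    'c' => [],
);

sub crawl {
    my ($start) = @_;
    my @queue = ($start);        # items still to process
    my %seen  = ($start => 1);   # everything that has ever been queued
    my @order;                   # items in the order they were processed

    while (@queue) {
        my $item = shift @queue;            # take from the front
        push @order, $item;
        for my $next (@{ $links_on{$item} || [] }) {
            next if $seen{$next}++;         # skip anything queued before
            push @queue, $next;             # a new item goes to the back
        }
    }
    return @order;
}

print join(', ', crawl('a')), "\n";   # prints: a, b, c
```

Each item can enter the queue at most once, so the loop runs at most once per distinct item no matter how the links point at each other.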


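Phil Jackson wrote:
> chomp(my $url = <>);
> ...
> ...
> $url->quote

$url holds a plain string, not an object, so it's not going to have that quote method. Perl's counterpart to PHP's preg_quote is the built-in quotemeta function, or \Q...\E inside a regex.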
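
A minimal sketch of the preg_quote idea in Perl, assuming the goal is to test whether a page URL starts with http://www. followed by whatever the user typed. The sub name on_site is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $page starts with http://www.$url,
# treating $url as literal text rather than as a regex.
sub on_site {
    my ($url, $page) = @_;
    my $prefix = 'http://www.' . $url;
    # \Q...\E applies quotemeta, so the dots in $url match only real dots.
    return $page =~ /^\Q$prefix\E/i ? 1 : 0;
}

print on_site('google.com', 'http://www.google.com/index.html'), "\n";  # prints 1
print on_site('google.com', 'http://www.googleXcom/index.html'), "\n";  # prints 0
```

Without the quoting, the dot in google.com would match any character, so the second URL would be accepted as well.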


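> if( !exists ( $foundPages[$page] ) ){

The exists function checks whether a hash key (or an array index) is present; it does not search an array for a value. Here $page is a URL string, so it makes no sense as an array index. The confusion is probably due to the syntax difference between Perl and PHP: Perl uses { } braces around a hash key, and [ ] brackets only for array indexes.

You're making a similar error when using the delete function: delete on an array element does not shift the remaining elements down.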
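
A short sketch of the two operations as they are usually done in Perl, with made-up page names: a hash for the "have I seen this string?" test, and splice to take an element out of an array.

```perl
use strict;
use warnings;

my @pages = ('index.html', 'about.html', 'contact.html');

# Membership test: build a hash whose keys are the array's values.
my %found = map { $_ => 1 } @pages;
print "about.html was found\n"    if     exists $found{'about.html'};
print "news.html was not found\n" unless exists $found{'news.html'};

# Removal: splice takes the element out and closes the gap, so the
# array shrinks and the later elements move down one index.
splice @pages, 0, 1;
print scalar(@pages), " pages left: @pages\n";
# prints: 2 pages left: about.html contact.html
```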

There are a number of other things I could point out, but I'm out of time and need to work on some VoIP issues at work.

Phil Jackson
01-23-2010, 04:06 PM
You're all good. I've got a good start now: it retrieves all valid links on the first page. I'll let you know when things go tits up again!



#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
use Data::Dumper;
use DBI;

print "\nEnter a website: http://www.";
chomp(my $url = <>);
my @pagesArray = ("http://www.".$url);
my @foundPagesArray = ('');
my @mainPagesArray = ('');
print "\n\n";

while( @pagesArray ) {
    my $page = shift @pagesArray;
    if( ! grep {$_ eq $page} @foundPagesArray ) {
        push(@foundPagesArray, $page);
        $page =~ s/^\///;
        if( $page =~ m#((?:\.html)|(?:\.php)|(?:\.aspx)|(?:\.htm)|(?:\.asp)|(?:\.shtml)|(?:/))$#is ) {
            if( $page =~ m#^http://(www\.)?$url#is ){
                if( my $content = get $page ){
                    my @content = ($content =~ m#<a[^>]*href=\"([^\"\#\?]+(?:(?:\.html)|(?:\.php)|(?:\.htm)|(?:\.asp)|(?:\.shtml)|(?:/)))\"#g);
                    foreach ( @content ) {
                        print $_, "\n";
                    }
                }else{
                    print "3";
                }
            }else{
                print "2";
            }
        }else{
            print "1";
        }
    }
}

Phil Jackson
01-23-2010, 04:22 PM
I'm back again. Just a quick one: am I using "push" correctly? It is not pushing the links into the array.



#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
use Data::Dumper;
use DBI;

print "\nEnter a website: http://www.";
chomp(my $url = <>);
my @pagesArray = ("http://www.".$url);
my @foundPagesArray = ('');
my @mainPagesArray = ('');
print "\n\n";

while( @pagesArray ) {
    my $page = shift @pagesArray;
    if( ! grep {$_ eq $page} @foundPagesArray ) {
        push(@foundPagesArray, $page);
        if( $page =~ m#((?:\.html)|(?:\.php)|(?:\.aspx)|(?:\.htm)|(?:\.asp)|(?:\.shtml)|(?:/))$#is ) {
            if( $page =~ m#^http://(www\.)?$url#is ){
                if( my $content = get $page ){
                    my @content = ($content =~ m#<a[^>]*href=\"([^\"\#\?]+(?:(?:\.html)|(?:\.php)|(?:\.htm)|(?:\.asp)|(?:\.shtml)|(?:/)))\"#g);
                    foreach ( @content ) {
                        my $link = $_;
                        if( $link ne "./" && $link ne "../" ) {
                            if( ! grep {$_ eq $link} @foundPagesArray ) {
                                push ( @foundPagesArray, $link );
                                push ( @pagesArray, $link );
                                ##print $link, "\n";
                            }
                        }
                    }
                }else{
                    print "3";
                }
            }else{
                print "2";
            }
        }else{
            print "1";
        }
    }
}

FishMonger
01-23-2010, 04:54 PM
You are using push correctly. However, your code prior to that is flawed.

Do yourself a favor and use one of the tested and well established modules that is designed for this purpose.

HTML::LinkExtor - Extract links from an HTML document
http://search.cpan.org/~gaas/HTML-Parser-3.64/lib/HTML/LinkExtor.pm

List of other HTML modules.
http://search.cpan.org/modlist/World_Wide_Web/HTML

Phil Jackson
01-23-2010, 07:47 PM
I understand what you're saying here, but I think I am going to carry on as I am. I'm learning sooo much at the minute, even if it's only the basics.


