View Full Version : Need to change text separator from , to | in cgi form script
NickCR
08-17-2007, 07:26 AM
Hi. I know virtually nothing about Perl/CGI, but I work for a not-for-profit org and I am trying to get some online forms to work. At the moment, the html forms use the hazrep cgi script below to output the form fields to a csv text file. I want to change the comma to a | instead, as people have been putting commas in their responses and this throws out the data import into excel - ie puts the data into too many columns. How do I make this change - is it as simple as changing the "$FORM{'name'}',$FORM{'dateday'}\n"; to "$FORM{'name'}|$FORM{'dateday'}\n"; in the print FILE section of the script below? The script works fine when there are commas, but doesn't seem to work when there are | instead. Any help would be greatly appreciated. Here is the CGI script.
#!/usr/bin/perl
# hazrep.cgi
$| = 1;
print "Content-type: text/html\n\n";
if ($ENV{'REQUEST_METHOD'} eq "POST")
{
read(STDIN, $request, $ENV{'CONTENT_LENGTH'});
@rxpairs = split(/&/, $request);
foreach $pair (@rxpairs) {
($name, $value) = split(/=/, $pair);
$value =~ tr/+/ /;
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/ge;
$FORM{$name} = $value;
# print "$name/$value - $FORM{$name} <BR>\n";
}
}
open(FILE, ">> /clientdata/www/au/a/accessarts.org.au/data/hazrep.txt");
print FILE "$FORM{'name'}|$FORM{'dateday'}|$FORM{'datemonth'}|$FORM{'dateyear'}|$FORM{'hour'}|$FORM{'minutes'}|$ FORM{'ampm'}|$FORM{'location'}|$FORM{'natureofhazard'}|$FORM{'avoidhazard'}|$FORM{'notes'}|$FORM{'co mpletedby'}|$FORM{'datesubmitday'}|$FORM{'datesubmitmonth'}|$FORM{'datesubmityear'}\n";
close (FILE);
my($mailTo) = 'nick@accessarts.org.au';
my($img_link) = 'http://accessarts.org.au/sound_circles_facilitator_pages_images/aai_logo_new.jpg';
#
#
#print to client
print <<"HTML";
<HTML>
<head>
<title>Access Arts - Hazard Form Completed</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body bgcolor="#FFFFFF" link="#990000" vlink="#990000" alink="#FF0000" leftmargin="0" topmargin="0" marginwidth="0" marginheight="0">
<TABLE width=90% border=1 align=center borderColor=#999999
background=indexxx_files/gry.jpg bgColor=#ffffff>
<TBODY>
<TR align="center" vAlign=top borderColor=#ffffff bgColor=#ffffff>
<TD width="100%" height="0"> <table width="100%" border="0" align="center" bordercolor="#FFFFFF" bgcolor="#FFFFFF">
<tr valign="top">
<td colspan="2"><table width="100%" border="0">
<tr>
<td width\=12%><img src=\"$img_link\" alt="Access Arts Logo" width\="128" height\="80"><\/td>
<td width="88%"> <blockquote><font size="5" face="Arial, Helvetica, sans-serif"><a name="TOP"></a>Access
Arts Inc® - The Key to Success in the Arts</font></blockquote></td>
</tr>
</table></td>
</tr>
<tr bgcolor="#FFFFFF">
<td colspan="2" bordercolor="#000000"><font face="Arial, Helvetica, sans-serif" color=#ffffff
size=3><strong><font color=#000000 size=2><p align="center">Location - Access Arts Level 1
Stores Building Brisbane Powerhouse 119 Lamington Street New Farm
Qld 4005<strong> <br>
Postal Address Access Arts Inc<strong>®</strong> PO Box 1034, New
Farm, Qld 4005<br>
Contacts - Phone: (07) 3358 6200 Fax: (07) 3358 6211<br>
Regional Callers 1300 663651 (local call charge only)</strong></font></strong></font></p>
</td>
</tr>
</TBODY>
</TABLE>
<BR>
<table width="100%" border="0" cellpadding="5" cellspacing="2" align="center">
<tbody>
<TR align\=\"center\" VALIGN\=\"top\" borderColor\=\#ffffff bgColor\=\#ffffff><TD VALIGN\=\"top\"><font face="Arial, Helvetica, sans-serif">
<p><H3>Thank you $FORM{'name'}</H3></p>
<p>Your Hazard Report Form has been submitted to Access Arts Web Server at Net Registry. Please <A HREF\=\"mailto:\/\/$mailTo?subject=Hazard Report Submitted by $FORM{'name'}">email<\/A> Nick Reilly to let him know that you have completed the Hazard Report Form and he will download and process the data. Click
<A HREF\=\"http:\/\/accessarts.org.au/qaforms/forms_main.htm">here<\/A> to return to the Access Arts Forms Home Page, or click <A HREF\=\"http:\/\/accessarts.org.au/index.htm">here<\/A> to go to the Access Arts Main Page</p>
</font>
</td>
</tr>
</table>
<br />
<table width="100%" border="0" cellpadding="5" cellspacing="2" align="center">
<tr>
<td height="33" colspan="3" valign="top"><table width="100%" border="0" cellspacing="0" cellpadding="2" bgcolor="#CCCCCC">
<tr>
<td height="25"> ©2007 <font color="#000000" face="Arial, Helvetica, sans-serif"><strong>Access</strong>
<strong>Arts</strong> <strong>Inc</strong></font><font color="#000000" size="5" face="Arial, Helvetica, sans-serif"><strong><font size="4">®</font></strong></font></td>
<td align="right"><a href="#top"><font size="3" face="Arial, Helvetica, sans-serif">Back
to the Top of the Page</font></a></td>
<\/TR>
</table><\/TD>
<\/TR>
</table>
<\/TD>
<\/TR>
</TBODY>
</TABLE>
<p> </p>
<\/BODY>
<\/HTML>
HTML
exit;
KevinADC
08-17-2007, 05:42 PM
remove or escape the | in the user input:
if ($ENV{'REQUEST_METHOD'} eq "POST")
{
read(STDIN, $request, $ENV{'CONTENT_LENGTH'});
@rxpairs = split(/&/, $request);
foreach $pair (@rxpairs) {
($name, $value) = split(/=/, $pair);
$value =~ tr/+/ /;
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/ge;
$value =~ tr/|//d; #removes the '|' character from input
$FORM{$name} = $value;
# print "$name/$value - $FORM{$name} <BR>\n";
}
}
The script is very poorly written. I would not use it if I were you unless you really trust the people that use the form. It is about as insecure as a cgi script can possibly be.
FishMonger
08-17-2007, 05:43 PM
What "doesn't seem to work when there are | instead"?
Writing to the file will work the same when using either the , or | delimiter, but when importing the "csv" file into excel, you'll need to "tell" excel that you're using | as the delimiter.
BTW,
That's a really poorly written script. Here's a list of what's wrong.
1) It doesn't have warnings enabled, so it can't tell you about non fatal errors.
2) It's not using the strict pragma, which is used to help you use good programming practices and forces you to predeclare you variables.
3) It's not running in taint mode which is used to prevent you from doing something unsafe with user supplied data.
4) It's not using the CGI module for either the form submission processing or html output.
5) It's not using the CGI::Carp module to redirect the warnings and fatal errors to the browser.
6) It's not checking the return code of the open call to verify that it was successful. In fact, there is no error handling at all.
7) It's using depreciated html tags and poorly written html.
javabits
08-17-2007, 07:26 PM
You might consider using quotes around your items instead of changing the delimeter.
So your csv file would look like (this would fix your comma issues)
"the", "quick", "brown", "fox, jumps"
Your code would need to change to something like this
print FILE "\"$FORM{'name'}\",\"$FORM{'dateday'}\""
semper fi...
KevinADC
08-17-2007, 08:02 PM
What "doesn't seem to work when there are | instead"?
It messes up his fields in the data file because he is using it as the field delimiter. I think you covered everything else though ;)
FishMonger
08-17-2007, 08:15 PM
You might consider using quotes around your items instead of changing the delimeter.
So your csv file would look like (this would fix your comma issues)
"the", "quick", "brown", "fox, jumps"
Your code would need to change to something like this
print FILE "\"$FORM{'name'}\",\"$FORM{'dateday'}\""
semper fi...
First, the print statement would be better written as:
print FILE qq("$FORM{'name'}","$FORM{'dateday'}");
Secondly, your suggestion of using the double quotes won't solve the problem.
With a line in the csv file such:
"the","quick","brown","fox, jumps","high"
The resulting cell entries once imported into the spreadsheet would be:
A1 = the
B1 = "quick"
C1 = "brown"
D1 = "fox
E1 = jumps"
F1 = "high"
As you see, "fox, jumps" was split into 2 cells (D1 & E1) and E1 has a leading space. In addition to that, if the fields in the csv file are also space separated in addition to the comma separation as in your example, each field except A1 would have a leading space plus the unneeded double qoutes.
FishMonger
08-17-2007, 08:23 PM
It messes up his fields in the data file because he is using it as the field delimiter.)
It shouldn't and doesn't in my test.
A test file with this:
the|quick|brown fox|jumps|high
Imported as:
A1 = the
B1 = quick
C1 = brown fox
D1 = jumps
E1 = high
Note, there's no problem with leading spaces or unneeded quotes.
However, if the first half of the file is using comma separation and the second half is using pipe separation, then of course that may cause a problem.
javabits
08-17-2007, 08:55 PM
Secondly, your suggestion of using the double quotes won't solve the problem.
With a line in the csv file such:
"the","quick","brown","fox, jumps","high"
The resulting cell entries once imported into the spreadsheet would be:
A1 = the
B1 = "quick"
C1 = "brown"
D1 = "fox
E1 = jumps"
F1 = "high"
As you see, "fox, jumps" was split into 2 cells (D1 & E1) and E1 has a leading space. In addition to that, if the fields in the csv file are also space separated in addition to the comma separation as in your example, each field except A1 would have a leading space plus the unneeded double qoutes.
I'm using Excel 2003 and the csv file actually imports fine. I opened the csv directly, opened it from excel and did a data import from a new worksheet, and pasted the text and then did text to columns. All of these methods worked for me. I did notice when importing data there is a text identifier that can be specified (in my case the double quote). Perhaps this doesn't work for older versions of Excel.
semper fi...
KevinADC
08-17-2007, 10:30 PM
It shouldn't and doesn't in my test.
A test file with this:
the|quick|brown fox|jumps|high
Imported as:
A1 = the
B1 = quick
C1 = brown fox
D1 = jumps
E1 = high
Note, there's no problem with leading spaces or unneeded quotes.
However, if the first half of the file is using comma separation and the second half is using pipe separation, then of course that may cause a problem.
I think your test was incorrect Fish. Assume I enter this data: "brown|fox", so now the data file is:
the|quick|brown|fox|jumps|high|
that will not export correctly.
FishMonger
08-18-2007, 12:03 AM
I'm using Excel 2003 and the csv file actually imports fine. I opened the csv directly, opened it from excel and did a data import from a new worksheet, and pasted the text and then did text to columns. All of these methods worked for me. I did notice when importing data there is a text identifier that can be specified (in my case the double quote). Perhaps this doesn't work for older versions of Excel.
semper fi...
It could be a version issue. I tested on the older Excel 2000 version and I used the "text identifier" specifying the double quote and had the same failed results as when not specifying the text identifier.
FishMonger
08-18-2007, 12:10 AM
I think your test was incorrect Fish. Assume I enter this data: "brown|fox", so now the data file is:
the|quick|brown|fox|jumps|high|
that will not export correctly.
Given that interpretation of the problem, then you'd be correct. However, that's not how I interpreted it and is why my first post was asking for clarification.
netroact
08-18-2007, 01:27 AM
7) It's using depreciated html tags and poorly written html.
What is poorly written in the html?
FishMonger
08-18-2007, 02:17 AM
What is poorly written in the html?
Maybe I should have chosen my words more carefully. Would you agree that it doesn't do a good job of following the current level of programming standards/practices?
It uses the depreciated font tag and uses a mixture of upper and lowercase tags instead of all lowercase. It also uses tables for page layout instead of css. The fourth issue is falls into poor Perl practices, which is the inconsistent and unnecessary escaping of the quotes and forward slashes in the here document. IMO, those don't follow current programming standards and is why I feel it is poorly written.
netroact
08-18-2007, 02:49 AM
Maybe I should have chosen my words more carefully. Would you agree that it doesn't do a good job of following the current level of programming standards/practices?
It uses the depreciated font tag and uses a mixture of upper and lowercase tags instead of all lowercase. It also uses tables for page layout instead of css. The fourth issue is falls into poor Perl practices, which is the inconsistent and unnecessary escaping of the quotes and forward slashes in the here document. IMO, those don't follow current programming standards and is why I feel it is poorly written.
I didn't really mean it as a criticism. I was just curious about your view on the html.
I would agree that a uppercase/lowercase mixture is a VERY BAD practice. I say this after fixing so many web pages and scripts where clients have mixed cases in such things as image source tags on a Unix server. I decided early on to use lowercase on almost everything, except where standards like FILEHANDLES apply.
As far as new standards go, I'm still waiting on the main browser developers to start appying css stardards across browser platforms before I leave the simplicity of basic tables for tabular data. People tend to "throw out the baby with the bath water". I usually prefer to style fonts over font tags though.
One thing I have noticed: Many people skip the basics when learning web page programming. I can't convince clients of the importance of just learning about basic tags like html, head, title, body. To me that's like trying to build a house without framing it first.
And without a basic knowlege of html, css will be confusing to say the least.
FishMonger
08-18-2007, 03:29 AM
I didn't really mean it as a criticism. I was just curious about your view on the html.
No offense taken. I'm the first to admit that I'm not very good at web development (most of my stuff is for back-end system administration), but I do have strong opinions on writing good quality and compliant code, be that Perl or html or whatever.
One thing I have noticed: Many people skip the basics when learning web page programming. I can't convince clients of the importance of just learning about basic tags like html, head, title, body. To me that's like trying to build a house without framing it first.
And without a basic knowlege of html, css will be confusing to say the least.
I fully agree with that, but once you have that knowledge, there's no problem with utilizing the cleaner shortcut methods for producing the code.
KevinADC
08-18-2007, 04:54 AM
Given that interpretation of the problem, then you'd be correct. However, that's not how I interpreted it and is why my first post was asking for clarification.
And for all I know your interpretation was the correct one. But so far the OP has not replied to this thread so it's impossible to know. But we're all having a good discussion without him. :D
NickCR
08-21-2007, 04:58 AM
Hi. Thanks to everyone who posted replies, your assistance is much appreciated. Sorry I was unable to reply before now I did not have access to my email.
The script finally worked using '|' as a delimiter instead of ',' - it seems that although the updated version was on the server (I could see it via FTP), it had not updated for some reason. It was easy enough to designate | as the delimiter in Excel 2007 so it now works beautifully, and hopefully no one will use | in one of the forms.
As I said in my original post, I know nothing about Perl/CGI scripting, and was only able to get this far by looking at a lot of script examples on the internet and trying to work out what the variables were and how to modify them. I did not have time to learn Perl before the forms needed to be online and working. Also, I have only been doing HTML for 2 weeks or so, and I had not really looked at the HTML component as I wanted to get the script working first.
I have a few questions in response to your replies, particularly an early one from FishMonger, which had the following points:
"1) It doesn't have warnings enabled, so it can't tell you about non fatal errors.
2) It's not using the strict pragma, which is used to help you use good programming practices and forces you to predeclare you variables.
3) It's not running in taint mode which is used to prevent you from doing something unsafe with user supplied data.
4) It's not using the CGI module for either the form submission processing or html output.
5) It's not using the CGI::Carp module to redirect the warnings and fatal errors to the browser.
6) It's not checking the return code of the open call to verify that it was successful. In fact, there is no error handling at all.
7) It's using depreciated html tags and poorly written html."
As I do not know Perl/CGI, unfortunately I do not understand any of these points except for no 7, which I will review with my 2 weeks of HTML expertise now that the script is working. Are there any tips that anyone can give me that are relatively easy to implement and that would improve the script, particularly if there are security concerns?
Any assistance would be much appreciated. I am happy to put the time in to learn Perl as a longer term project, but this role landed in my lap with a fairly short deadline to be online, and was completely outside of my normal IT admin role.
Thanks again for your input and assistance.
nkrgupta
08-21-2007, 05:53 AM
I have a few questions in response to your replies, particularly an early one from FishMonger, which had the following points:
"1) It doesn't have warnings enabled, so it can't tell you about non fatal errors.
2) It's not using the strict pragma, which is used to help you use good programming practices and forces you to predeclare you variables.
3) It's not running in taint mode which is used to prevent you from doing something unsafe with user supplied data.
4) It's not using the CGI module for either the form submission processing or html output.
5) It's not using the CGI::Carp module to redirect the warnings and fatal errors to the browser.
6) It's not checking the return code of the open call to verify that it was successful. In fact, there is no error handling at all.
7) It's using depreciated html tags and poorly written html."
As I do not know Perl/CGI, unfortunately I do not understand any of these points except for no 7, which I will review with my 2 weeks of HTML expertise now that the script is working. Are there any tips that anyone can give me that are relatively easy to implement and that would improve the script, particularly if there are security concerns?
Any assistance would be much appreciated. I am happy to put the time in to learn Perl as a longer term project, but this role landed in my lap with a fairly short deadline to be online, and was completely outside of my normal IT admin role.
Thanks again for your input and assistance.
Below is a modified but untested version of the perl part of your code as per FishMonger's suggestions. Also you should go through CGI (http://search.cpan.org/~lds/CGI.pm-3.29/CGI.pm) documentation and try to use the module for most of your HTML work, such as printing standard html headers etc.
#!/usr/bin/perl -wT
# hazrep.cgi
$| = 1;
use CGI;
use CGI::Carp qw(warningsToBrowser fatalsToBrowser);
use strict;
my $cgi=CGI->new();
print $cgi->header(-type=>'text/html');
my %FORM;
if ($cgi->request_method() eq "POST")
{
my $request;
read(STDIN, $request, $ENV{'CONTENT_LENGTH'});
my @rxpairs = split(/&/, $request);
foreach my $pair (@rxpairs) {
my ($name, $value) = split(/=/, $pair);
$value =~ tr/+/ /;
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/ge;
$FORM{$name} = $value;
}
}
my $file="/clientdata/www/au/a/accessarts.org.au/data/hazrep.txt";
open(FILE, ">>$file ") or die ("Could not open file: $!");
print FILE join('|',$FORM{'name'},$FORM{'dateday'},$FORM{'datemonth'},$FORM{'dateyear'},$FORM{'hour'},$FORM{'mi nutes'}, $FORM{'ampm'},$FORM{'location'},$FORM{'natureofhazard'},$FORM{'avoidhazard'},$FORM{'notes'},$FORM{'c ompletedby'},$FORM{'datesubmitday'},$FORM{'datesubmitmonth'},$FORM{'datesubmityear'}\n};
close (FILE);
my($mailTo) = 'nick@accessarts.org.au';
my($img_link) = 'http://accessarts.org.au/sound_circles_facilitator_pages_images/aai_logo_new.jpg';
Confession: Its a shame that instead of going through these recommendations in almost every post, and knowing the possible consequences of not using them, i fail to apply them consistently to my codes. :(
Naveen
KevinADC
08-21-2007, 06:14 AM
hehehe... you did not use the CGI module for what it is best used for: parsing the form data:
my $cgi=CGI->new();
print $cgi->header(-type=>'text/html');
my %FORM = $cgi->Vars;
the last line above replaces these lines:
if ($cgi->request_method() eq "POST")
{
my $request;
read(STDIN, $request, $ENV{'CONTENT_LENGTH'});
my @rxpairs = split(/&/, $request);
foreach my $pair (@rxpairs) {
my ($name, $value) = split(/=/, $pair);
$value =~ tr/+/ /;
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/ge;
$FORM{$name} = $value;
}
}
nkrgupta
08-21-2007, 06:17 AM
hehehe... you did not use the CGI module for what it is best used for: parsing the form data:
my $cgi=CGI->new();
print $cgi->header(-type=>'text/html');
my %FORM = $cgi->Vars;
the last line above replaces these lines:
if ($cgi->request_method() eq "POST")
{
my $request;
read(STDIN, $request, $ENV{'CONTENT_LENGTH'});
my @rxpairs = split(/&/, $request);
foreach my $pair (@rxpairs) {
my ($name, $value) = split(/=/, $pair);
$value =~ tr/+/ /;
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/ge;
$FORM{$name} = $value;
}
}
That's what i call morning blues!!:eek: :D
FishMonger
08-21-2007, 06:25 AM
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use CGI ':standard';
use CGI::Carp qw(fatalsToBrowser);
use Fcntl ':flock';
my $mailTo = 'nick@accessarts.org.au';
my $img_link = 'http://accessarts.org.au/sound_circles_facilitator_pages_images/aai_logo_new.jpg';
my $file = '/clientdata/www/au/a/accessarts.org.au/data/hazrep.txt';
my @fields;
foreach my $field ( qw(
name dateday datemonth dateyear hour
minutes ampm location natureofhazard avoidhazard
notes completedby datesubmitday datesubmitmonth datesubmityear)
) {
push @fields, param($field);
}
open(my $fh, '>>', $file) or die "Could not open data file: $!";
flock($fh, LOCK_EX) or die "Can't lock data file $!";
print $fh join '|', @fields;
close $fh;
print header(),
start_html(-title=>'Access Arts - Hazard Form Completed',
-BGCOLOR=>'#FFFFFF',
);
NickCR
08-21-2007, 08:01 AM
Thanks Naveen, KevinADC and FishMonger, I will try your suggested scripts and let you know how they go.
vBulletin® v3.8.2, Copyright ©2000-2010, Jelsoft Enterprises Ltd.