...

View Full Version : PHP Best Practices - Discussion



JohnDubya
02-22-2007, 09:47 PM
Hello there, all you mad PHP coders out there. :) I am starting this thread in the hopes that it will turn out really helpful to those wishing to make their experience with coding in PHP more efficient and practical. My wish is that this would turn into a sticky, but we'll see how it goes.

As you all well know, there are a hundred different ways to do the same thing, but some solutions are better than others. That's what this thread is for. What way of doing things is the best way to do it? (what an awkward sentence!) lol

Here's how I think this will look: this thread is for discussing "Best Practices" that coders should use while coding PHP. Please discuss things out and try to come to a middle ground. Once a topic has been pretty thoroughly discussed, and there seems to be a general consensus, I will post it to the PHP Best Practices thread.

Please don't post on the PHP Best Practices thread. Of course, I can't enforce this, but I would like to keep that one specifically on-topic to allow readers to browse it easily.

Now, these "Best Practices" could look like just about anything, but I would hope that they stay somewhat general and not be too complex. I would like the intermediate PHP coders to be able to follow along.

Here are some examples of topics for which we could discuss "Best Practices":


In which situations is it better to use a switch() statement rather than an if() statement?
When should you use mysql_fetch_assoc() instead of mysql_fetch_array()?
Should you use extract() or while() to get info from a MySQL query?
What is the best way to organize your code (use comments, put separate chunks of code in another .php file and include() it, etc.)?


Instructions for Posting a New BP
Please put the subject of the BP in the Title of your post. Be as descriptive and complete as possible, while maintaining a low level of complexity (in other words, don't go too in-depth, or you will lose people).

Instructions for Replying to a BP
Please Quote the part of the person's post you are replying to, so we know what your comments are related to. That will also help me when I'm looking for a general consensus on a BP subject.

I am wanting to expand my boundaries in coding PHP, and I like to be as efficient as possible. Oh yeah, and I like to do things the fast, easy way if at all possible. :D So go ahead...post away! Show us your unfathomable wisdom, oh glorious gods of PHP goodness!

ralph l mayo
02-22-2007, 11:45 PM
1. Use switch in preference to if anytime you can. To be able to refactor a bunch of ifs as a switch (a) they must all test the same subject and (b) they must test for single values (==) and not ranges (<, >, <=, >=, != [although one of these can often be covered by the default case])

If the goal of the chain of ifs is to assign a single variable different values depending on the conditions *and there are no side effects of testing the conditions* it should be expressed as ternary instead of as if or else. Ternary for assignment is idiomatic in many languages, inc. PHP.

2. Prefer to fetch an array when it would not compromise readability... when you're only selecting 1 or 2 fields and when they'll go out of scope quickly so it doesn't particularly matter what they are named. Another good time to fetch the array is when you're going to assign the results to member data right away, so they'll get meaningful names without the need of associating with the column names.

3. Don't use extract for anything. It seems it was made for people incapable of understanding hashes.

4. The best way to organize your code is a matter of opinion, but MVC is very strongly established for web programming for (imho) good reason. At a minimum, organize your code into classes that provide abstractions for concepts in the problem space.

If you like to do things the efficient, fast, and organized way consider not using PHP at all :P Also, while I'm being inflammatory, tabs and vim will one day win the holy war.

firepages
02-23-2007, 01:35 AM
1) To get around single values or equality mentioned above you can also use...


<?php
switch(true){
case $cond && $cond2 :/*blah*/;
}
?>


2) & 3) +1
4) hmmm, this is why there are no best practices for a loosely typed language like PHP, MVC holds gotcha's for interpreted languages like PHP, & the classic JAVA interpretation of it does not, IMO, work especially well for PHP though some of the design patterns associated with MVC are useful across the boards.

Other issues with `best practices` are things like... I use ++$x because technically its faster that $x++ (at least in php4 not sure about php5) but others will say that the speed difference is negligible and I am being anal about it .. and they are probably right ;) , but thats best practice to me though not for many.
The use of eval() is something that has set off many a flame war with equally capable proponents, I personally fall on the side that thinks that eval() is inherently evil, extract as mentioned is a no-no (though I think there can be little argument there~)

Anyway, it will be interesting to see some other comments, as to whether this will end up a sticky.. not sure since this sort of thread can turn funny.

JohnDubya
02-23-2007, 01:52 AM
One "Best Practice" that I found was while needing to find a way to not send MySQL UPDATE's or INSERT's if an error occurred. At first, I was doing my error code like this:


if (!eregi("^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$", $email)) {
$Error_Message .= $email.' is an invalid e-mail address!';
} else {
if ($first_name == '') {
$Error_Message .= $first_name.' cannot be blank!';
} else {
if ($Error_Message == '') {
$x = mysql_query("insert all the user info into the database");
}
}
}


As you can see, I was wasting time and space, and it would get very complicated very quickly. Then, I found the glories of using something like $Error_Stat. Simple, but I just didn't realize it until I read it in someone else's code. So like in the example above, instead of specifying an "else" for every if(), make each error, if true, set $Error_Stat (or whatever variable you choose) to 1. Make sure to set $Error_Stat to equal 0 at the top of the page first. Then, if $Error_Stat still equals 0 by the time the query comes up, do the UPDATE or INSERT.


$Error_Stat = 0;

if (!eregi("^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$", $email)) {
$Error_Stat = 1;
$Error_Message .= $email.' is an invalid e-mail address!';
}

if ($first_name == '') {
$Error_Stat = 1;
$Error_Message .= $first_name.' cannot be blank!';
}

if ($Error_Stat == 0) {
$x = mysql_query("insert all the user info into the database");
}

ralph l mayo
02-23-2007, 03:05 AM
I have a few problems with that snippet, while we're talking best practices.

1. Don't reinvent the wheel. Email validation has been solved a million times before. The answer wasn't ^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$ either. PHP lacks a really ubiquitious expansion library like Perl's CPAN but Pear is probably the closest. Before you set out to do something see if it's already done for you. See Not Invented Here (http://en.wikipedia.org/wiki/Not_Invented_Here) if this seems like cheating.

2. Normalize data structures unless you have a (typically performance-related) reason not to do so. This is more of a database design issue, but your snip isn't normalized because $Error_Stat doesn't contain any information that isn't derivable from the existing $Error_Message (since every error sets a message, if $Error_Message is its default [presumably blank] value there haven't been any errors)

3. Don't concatenate a string to keep a list. Arrays are AKA lists.

This is somewhat how I'd do it, but I find I'm rusty with PHP Object Oriented Programming (POOP, as I affectionately know it) after a period of disuse:



require_once('/var/www/php/Pear/Validate.php');

class UserDataException extends Exception
{
public $errors;
public function __construct($errors)
{
$this->errors = $errors;
}
}

class User
{
# Private so we can trap changes and see if they constitute errors.
# Also because private data with public accessors are generally good policy.
private $email, $errors, $first_name;

public function __construct()
{
$this->errors = array();
}

public function set_email($email)
{
$this->email = $email;
if (!Validate::email($email))
{
array_push($this->errors, 'Email failed validation');
}
}

public function set_first_name($first_name)
{
$this->first_name = $first_name;
if ($first_name == '')
{
array_push($this->errors, 'First name cannot be empty');
}
}

public function is_valid()
{
return count($this->errors) == 0;
}

public function save()
{
if (!$this->is_valid()) throw new UserDataException($this->errors);

// insert here
}
}


# Ideally all the above would be in some include file, so your client code could be this simple:

$user = new User();
$user->set_email('not gonna work');
$user->set_first_name('');
try
{
$user->save();
echo 'Everything went swimmingly';
}
catch (UserDataException $e)
{
echo 'Problems: <br />' . join($e->errors, '<br />');
}

aedrin
02-23-2007, 05:52 PM
PHP lacks a really ubiquitious expansion library like Perl's CPAN but Pear is probably the closest.

In my opinion this is one of the best practices:

If you can, avoid PEAR. Bulky, oversized libraries that have outdated code in them.

Code it yourself. This is not reinventing the wheel, this is learning.

Imagine trying to learn PHP, but you weren't allowed to code anything other people have.

You will learn more from doing things yourself than you ever will from books and/or tutorials.

And you will end up with lean/up to date code.


One "Best Practice" that I found was while needing to find a way to not send MySQL UPDATE's or INSERT's if an error occurred. At first, I was doing my error code like this:

The best practice you were looking for is: Don't go too deep in if statements. Try to keep it as flat as possible. Easier to read, more efficient code.

aedrin
02-23-2007, 05:54 PM
The use of eval() is something that has set off many a flame war with equally capable proponents, I personally fall on the side that thinks that eval() is inherently evil, extract as mentioned is a no-no (though I think there can be little argument there~)

+1

Just as in JavaScript, I have not found a single use for eval. This function can be safely ignored.

Webmonkey
02-23-2007, 06:03 PM
A pretty basic one, Always use
<?php Not..
<? becuase if you are working for a client or your server doesnt have php_short_tags on, Your code will be broken.

ralph l mayo
02-23-2007, 06:40 PM
If you can, avoid PEAR. Bulky, oversized libraries that have outdated code in them.
You probably have a point here, I haven't looked at anything in Pear on the source level but it may be both those things in some cases. The better solution than striking out on your own here is to submit a patch to the library. The whole point of the community library is peer review and honing of the sometimes admittedly wobbly wheels instead of wholesale square wheel manufacture.


Code it yourself. This is not reinventing the wheel, this is learning.
It's not a black-and-white issue of always using the libraries or never using them, you have to take it on a case-by-case basis.
Email validation is a perfect example of something you should never write yourself because it absolutely is reinventing the wheel. Either you're going to fudge it and come up with a fundamentally flawed solution like the one above or you're going to meticulously sift through the RFC and it's going to take a lot of a time and the end product is going to approximate the work that's already done iif you've done it right. Another example is bbcode parsing. Turns out it's not trivial to do it completely right. You can half-*** it, bang your head against it, or use a library solution that's been tested and proven.


Imagine trying to learn PHP, but you weren't allowed to code anything other people have.
T_T

You mean I'd have to just put my web-app together by linking modules? Without getting to wrestle with a bunch of minor implementation details? The horror...

JohnDubya
02-23-2007, 06:52 PM
This is some good stuff! Thanks for everyone's input. I edited the first post a bit, so please check it out. I think it's going to work better to have one post be purely for posting Best Practices (BP) that we've discussed out, and one post be for discussing (this one). And as it looks like we come to a general consensus about a topic, I will post people's comments (and give them credit). So keep on posting! :)

JohnDubya
02-23-2007, 08:53 PM
1. Don't reinvent the wheel. Email validation has been solved a million times before. The answer wasn't ^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$ either. PHP lacks a really ubiquitious expansion library like Perl's CPAN but Pear is probably the closest. Before you set out to do something see if it's already done for you. See Not Invented Here (http://en.wikipedia.org/wiki/Not_Invented_Here) if this seems like cheating.

I found this on a scriptlets website. Someone else had already tested this and had it working, so I'm using it too. What's wrong with that? If there was an if_email() function, I would definitely use it.

JohnDubya
02-23-2007, 09:13 PM
2. Prefer to fetch an array when it would not compromise readability... when you're only selecting 1 or 2 fields and when they'll go out of scope quickly so it doesn't particularly matter what they are named. Another good time to fetch the array is when you're going to assign the results to member data right away, so they'll get meaningful names without the need of associating with the column names.


Ralph, could you revise this answer please? I'm having a hard time understanding it completely.

gsnedders
02-23-2007, 09:39 PM
I found this on a scriptlets website. Someone else had already tested this and had it working, so I'm using it too. What's wrong with that? If there was an if_email() function, I would definitely use it.

Try taking an email like:


"this "" is a test"@example.com

I'd probably add that regular expressions are slow, and there are a large number of times when they should be avoided, yet are commonly used.

Alex!
02-23-2007, 11:34 PM
Ok..I've got a couple of suggestions.

This is issue has come up many a time in other forums...where to put your curly brackets. Although it won't change anything at all for the end user, if you follow a few simple rules then when other people come to view and edit your code. Personally, I set them out as follows:


if(statement)
{
do this
}else
{
}
The reason is that it makes the curly brackets line up so that you can see where they start and end. Some people however thinks that this looks messy...

My second suggestion is for indentation..

Most people tend to use the tab for indentation if they use editors such as Dreamweaver or AceHTML. The main reason this is a bad idea is because the size of the tab in Dreamweaver is about 2 or 3 spaces where as in Emacs or Notepad the size of tab is about 4 or 5 spaces. As a result, if someone looks at your code in Emacs or Notepad etc. then if you have a couple of nested if statements then half your code is off the page making it harder to read.

My advice is instead to use the spacebar when indenting your code (using 2 spaces is quite common). Obviously, if you know you're always going to be working in one editor then it's fine to stick with tab.

My third piece of advice is about variable names. This one really gets on my nerves when reading someone's code. Please please please use descriptive variable names, not names such as 'xhj' or 'k' but instead use names such as 'title' or 'description' etc. etc.

Alex

Fumigator
02-23-2007, 11:41 PM
I see a lot of posts concerning MySQL queries where there is no error reporting, so the poster gets the error message "invalid resource", where if there was decent error reporting, the actual error would be displayed and the poster could figure out what's wrong from there.

So... I'd like to contribute my version of error reporting in PHP 4. (PHP 5 has try/catch logic available so this is mostly for the structured programming style.)

There is an important issue you must address when reporting query errors, and that is the issue of displaying error details to the whole world-- you don't want to do this. The more detail you display to John Q Public (table names, field names, etc), the more information a hacker potentially has to try to inject malicious code into your queries. Rather than displaying the details, create an error log table and insert a row into this table. Optionally, you can email an error notice to your own email account to alert you when (and how often) errors occur.

While you are testing your code, however, you want to be able to see as much detail as possible when an error occurs. Here is a method that makes it easy to do both.

First, write an error handling function that is called every time a query error occurs. To this function you will send:

The error text (the results of mysql_error())
The query text that caused the error
The name of the file the error occured in (optional; use $_SERVER['REQUEST_URI'])
The function name the error occured in (optional)

This function should be in a separate file that you INCLUDE in your PHP files. You should have an environment file that sets a global variable (a constant) that indicates the type of environment the script is running in as well.

Code to catch a query error and report it:


include('sqlerr.php');

.
.
.
function getInfo() {
$query = sprintf("
SELECT my_field
FROM my_table
WHERE my_field = '%s'
",
mysql_real_escape_string($_POST['myValue'])
);

$result = mysql_query($query);
if (!$result) {
SQLError(mysql_error(), $query, 'getInfo', $_SERVER['REQUEST_URI']);
}
}


The error reporting function:


//file sqlerr.php
function SQLError($errorText, $errorQuery, $errorFunction, $errorModule) {
global $envCode;

//insert a row into the error log table
$query = sprintf("INSERT into errorlog_tbl (
created_ts,
module_nm,
function_nm,
error_ds,
query_ds)
values (%s, '%s', '%s', '%s', '%s')",
"CURRENT_TIMESTAMP",
$errorModule,
$errorFunction,
mysql_real_escape_string($errorText),
mysql_real_escape_string($errorQuery));

$logresult = mysql_query($query);

if (!$logresult) {
print "<p>Attempt to log error failed.</p>\n";
} else {

//Display the error details ONLY when in a test environment
//The variable $envCode will be global and set in an environment file
if ($envCode == 'TEST') {
print "<p class=\"title\">Fatal Error Has Occurred.</p>\n";
print "<p class=\"title\">SQLERROR!</p>\n";
print "<p>Raw error text = $errorText</p>\n";
print "<p>Original query = $errorQuery</p>\n";
print "<p>Function = $errorFunction</p>\n";
print "<p>Module = $errorModule</p>\n";
print "<p>A fatal error has occcured and a report has been logged. If the error persists, contact Customer Service.</p>\n";
print "<p><a href=\"home.php\">Return to home page</a></p>\n";
}
}

die();
}

aedrin
02-23-2007, 11:46 PM
My advice is instead to use the spacebar when indenting your code (using 2 spaces is quite common).

This is a classic example of solving a problem with a problem.

You actually counter proved your own point.

Yes, editors translate tabs differently. This is based on a setting. The result? You can fine tune what code looks like to your own preference, and you bother no one.

Versus the forced look of spaces. I wouldn't read code if it was indented in 2 spaces (unless I had to of course).


The reason is that it makes the curly brackets line up so that you can see where they start and end. Some people however thinks that this looks messy...

And this is why tabs is good. It is all about personal preference. Eclipse allows you to right click and format a document. I hate it when curly brackets are on a newline, it makes code take up 2 pages when it could be half.

But like I said, personal preference. So I think we should avoid anything that is a personal preference. As you cannot consider them a best practice either way (both methods work).

aedrin
02-23-2007, 11:49 PM
Fumigator:


global $envCode;

Best practice: Avoid 'global' at all costs. It pollutes the global scope and allows for bad programming. If you need to define whether it is a test environment, do it like this.



define('DEBUG', true);

if (DEBUG) {
echo "var=".$var;
}

aedrin
02-23-2007, 11:53 PM
I wrote a function to do queries instead of using the ones in PHP directly. This keeps me from having to repeat 'or die(etc.)'. It also allows me to automatically use prepared statements without having to switch functions.

I keep those functions in a seperate file. This way you have a little library that provides your standard DB functions.

Then you could write a similar file, using a different DB. All you do is change your include/require, and you're using a new database.

As long as the functions keep the same signature (name+arguments).

Alex!
02-24-2007, 03:13 AM
So I think we should avoid anything that is a personal preference

Ok...point taken. Anyway, here's some links I found to some coding guidelines..

http://www.dagbladet.no/development/phpcodingstandard/
http://www.evolt.org/node/60247
http://area51.phpbb.com/docs/coding-guidelines.html

Notable points include the use of quotes:


Strings in PHP can either be quoted with single quotes ('') or double quotes (""). The difference between the two is that the parser will use variable-interpolation in double-quoted strings, but not with single-quoted strings. So if your string contains no variables, use single quotes and save the parser the trouble of attempting to interpolate the string for variables

and using quotes for associative arrays:


Finally, when using associative arrays, you should include the key within single quotes to prevent any ambiguities, especially with constants:

Alex

Fumigator
02-24-2007, 04:38 AM
About the global thing...

I agree that the accepted "best practices" is not to use them, but when something is actually global, such as an environment variable, then my opinion is they are ok.

But I do agree, using a "constant" rather than the keyword "global" is a better practice.

Fumigator
02-24-2007, 04:41 AM
I don't know if John W.'s intentions are to rewrite a complete coding standard, as they are vast and well known. The brackets thing and the indent thing starts getting into that coding standard area which is probably best left to individual shops (and people).

And there's my $.02 on THAT!

firepages
02-24-2007, 06:19 AM
About the global thing...

I agree that the accepted "best practices" is not to use them, but when something is actually global, such as an environment variable, then my opinion is they are ok.

But I do agree, using a "constant" rather than the keyword "global" is a better practice.

The superglobals $_GET,$_POST,$_ENV etc are already global assuming you reference them that way, no need for globals at all.

JohnDubya
02-24-2007, 08:43 PM
I don't know if John W.'s intentions are to rewrite a complete coding standard, as they are vast and well known. The brackets thing and the indent thing starts getting into that coding standard area which is probably best left to individual shops (and people).


I have no intention to rewrite anything. I'm glad Alex provided those links because I knew that there were sources out there that had already covered coding style conventions. In the post I wrote on the main thread, I just did a brief overview and then provided the links for further reading. Let me know if you guys think that works well.

bugman
02-25-2007, 01:04 PM
Besides single quotes and double quotes, you can use the heredoc notation. This is particularly useful for large chunks of text:



$sample_text= <<<END
Here's a line
Here's another line
Here's a third line
END;


Easier and clearer than standard quoted concatenation, as far as I'm concerned.

aedrin
02-26-2007, 04:19 PM
Except that it is harder to read, and you end up with at least 1 line improperly indented. (the last one)

JohnDubya
02-26-2007, 04:30 PM
Would some people mind explaining best practices for cleaning information to be put into a database? I've been reading so much lately, and it doesn't all fit together for me. I know you can use mysql_real_escape_string() to protect from SQL injection. There is also the whole "magic_quotes" system, but I know nothing about that. I've also heard of making your own function to clean up strings before sending them to the database. What is the easiest, most efficient, and safe way to get strings ready to go to the database?

aedrin
02-26-2007, 06:02 PM
The best approach in my opinion is to use prepared statements. Normally these require quite some extra code. But you can write a wrapper for it and it'll be almost nothing more.

This is similar to how I have it.



$result = query('INSERT INTO table(name, description) VALUES(?, ?)', array($name, $description));


You don't have to worry about escaping quotes/etc. with prepared statements as the statement is compiled first, so whatever they add to it is considered the value, and won't be read as SQL.

It's also easier to read in my opinion. ;)

Fumigator
02-26-2007, 06:28 PM
That's a nice idea. I think I'm going to incorporate that going into the future.

You should post the guts of it though-- your query() function ;)

aedrin
02-26-2007, 10:59 PM
This is the MySQL specific implementation.

It can be easily adjusted to fit other databases.


if (is_array($params)) {
foreach ($params as $key => $value) {
$params[$key] = mysql_real_escape_string($value);
}
}
$sid = 'sql'.uniqid();
mysql_query('PREPARE ' . $sid . ' FROM "' . $sql . '"') or die(queryError(mysql_error(), $sql));
$keys = array_keys($params);
$count = count($keys);
$using = array();
for ($i = 0;$i < $count;$i++) {
if (is_int($params[$keys[$i]])) {
mysql_query('SET @'.$i.' = ' . $params[$keys[$i]] . '');
} else if (is_string($params[$keys[$i]])) {
mysql_query('SET @'.$i.' = \'' . $params[$keys[$i]] . '\'');
} else {
mysql_query('SET @'.$i.' = \'' . $params[$keys[$i]] . '\'');
}
$using[] = $i;
}
$result = mysql_query('EXECUTE ' . $sid . ' USING @'.implode(', @', $using)) or die(queryError(mysql_error(), $sql));


Note: This can still be optimized. It was something I created quickly a while ago because I needed a replacement for PEAR's MDB2.

JohnDubya
03-01-2007, 05:37 PM
Let me ask this. What I've done as of late is validate strings by using the ctype_ PHP functions to make sure my data is ONLY composed of that type of data. If the string should only be letters and numbers, I check the string against ctype_alnum(). If the string is supposed to be only numbers, I use ctype_digit(). That way, I don't have to escape ' and " because the ctype_ functions won't allow them. Is this an equally good way to validate data going into the database and protecting against SQL injection?

aedrin
03-01-2007, 08:08 PM
If you do not execute the query when one of those checks returns false, then that would be enough protection against injection as far as I can see.

JohnDubya
03-01-2007, 08:49 PM
If you do not execute the query when one of those checks returns false, then that would be enough protection against injection as far as I can see.

Yeah, if the ctype_ check fails, it makes $Error_Stat equal 1. If $Error_Stat doesn't equal 0, then the query is not made.

Cool, I'm glad this works! Thanks.



EZ Archive Ads Plugin for vBulletin Copyright 2006 Computer Help Forum