...

GPC Stripping Tutorial

Fou-Lu
07-14-2008, 12:58 AM
* I hope nobody minds this being a tutorial on top of giving code. I do prefer to explain how something works so people who use such code know what it does, and more importantly how it works and why they should use it.
Contents

GPC
magic_quotes_gpc
magic_quotes_runtime
How can this break my code?
How can I fix this?
What about _REQUEST?
What we have learned
This is too long, what can I do to shorten it?


GPC
GPC refers to the Get/Post/Cookie superglobals. Prior to PHP 5, the order in which they were processed could be changed through configuration. The order is important because it determines how values are overridden when the sources are merged: $_GET['name'] is overridden by $_POST['name'], which in turn is overridden by $_COOKIE['name'].
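To make the override order concrete, here is a minimal sketch that simulates it with plain arrays. The $get, $post and $cookie variables and their values are invented for illustration; array_merge lets later arrays win for duplicate string keys, which mirrors a G, then P, then C order.

```php
<?php
// Invented sample data standing in for $_GET, $_POST and $_COOKIE.
$get    = array('name' => 'from-get', 'page' => '2');
$post   = array('name' => 'from-post');
$cookie = array('name' => 'from-cookie');

// Later arrays override earlier ones for the same string key (G, then P, then C).
$merged = array_merge($get, $post, $cookie);

echo $merged['name'], "\n"; // from-cookie
echo $merged['page'], "\n"; // 2
```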
magic_quotes_gpc
Magic quotes is a built-in PHP directive intended to protect data integrity by escaping specific characters: ', ", NULL (the NUL byte) and \. Magic quotes cannot be configured at runtime and is incompatible with database cleaning methods. It is on by default.
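The escaping itself can be reproduced by hand with addslashes, which targets the same four characters. This is only a sketch of the effect on a made-up string, not of how the directive is wired into PHP:

```php
<?php
// addslashes() escapes ', ", \ and the NUL byte with a leading backslash.
$raw     = "O'Reilly said \"hi\" \\ bye";
$escaped = addslashes($raw);

echo $raw, "\n";     // O'Reilly said "hi" \ bye
echo $escaped, "\n"; // O\'Reilly said \"hi\" \\ bye

// The NUL byte becomes a backslash followed by the character 0.
var_dump(addslashes("a\0b") === 'a\\0b'); // bool(true)
```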
magic_quotes_runtime
Magic quotes runtime is a built-in directive similar to magic quotes. It acts on data arriving from external sources, such as databases and files. Magic quotes runtime can be configured at runtime. It is off by default.
How can this break my code?
Magic quotes can damage your data if used in combination with database cleaning methods (e.g. mysql_real_escape_string). This happens because PHP escapes special characters in your data while you attempt to do the same. Consider a form entry allowing a user to enter their last name. Most names will not create a conflict unless they contain a special character; O'Reilly is a great example of this. Magic quotes will escape the ' within the name. Before insertion into your database you also add your own escaping through mysql_real_escape_string. Here would be the result:

Input: O'Reilly
Magic_quotes: O\'Reilly
mysql_real_escape_string: O\\\'Reilly
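
The three stages can be reproduced with a short sketch. As an assumption for the sake of a runnable example, addslashes stands in for both magic_quotes_gpc and mysql_real_escape_string; for this particular input they add the same backslashes.

```php
<?php
// Assumption: addslashes() stands in for both escaping passes.
$input       = "O'Reilly";
$afterMagic  = addslashes($input);      // what magic quotes would hand to the script
$afterEscape = addslashes($afterMagic); // what a second, manual escape produces

echo $input, "\n";       // O'Reilly
echo $afterMagic, "\n";  // O\'Reilly
echo $afterEscape, "\n"; // O\\\'Reilly
```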

How can I fix this?
So what can you do to fix this? Well, you can determine if magic_quotes has been enabled and stripslash each of your entry fields. I would not recommend the use of stripslashes without verifying the value of magic_quotes_gpc: there are situations where the backslashes are legitimate data, and an unconditional stripslashes would remove them. Note also that a single pass undoes only one level of escaping. It turns O\'Reilly back into O'Reilly, but the doubly escaped O\\\'Reilly only becomes O\'Reilly.
My recommendation would be to create a strip that's handled at an initialization level, one that takes care of all the data before it's ever used. This initialization file can be included into all working scripts to handle these for you. There are several methods of doing this, but the basis is always the same:
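As a quick check, this sketch runs one pass of stripslashes over each form of the name from the example above:

```php
<?php
// The three forms of the name: raw, escaped once, escaped twice.
$forms = array("O'Reilly", "O\\'Reilly", "O\\\\\\'Reilly");

foreach ($forms as $value) {
    echo $value, ' => ', stripslashes($value), "\n";
}
// O'Reilly => O'Reilly
// O\'Reilly => O'Reilly
// O\\\'Reilly => O\'Reilly
```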

Check to see if magic_quotes is enabled
If it is, run your superglobals through a recursive method taking care of stripslashes. This can be either by reference or by value depending on the working version of PHP.
Let's break this down into steps:

Check to see if magic quotes are enabled. Since PHP 6 will remove magic_quotes, check first if the function exists:

if (function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc())

Next, let's write the stripslashes function. We'll do it step by step, but I will point out this can be done in a single line of code. Step by step is more readable than the condensed version. I prefer adding this function inside the actual check for gpc, since there is no point in wasting memory if it's not necessary:

function GPCStrip($arr)
{

The function can be done in several ways. We will do ours the step-by-step way to show what it is doing. The first thing we need to do is determine if the value of $arr is an array, and if it is we will iterate through each item of the array and recursively reassign the values for it:

    if (is_array($arr))
    {
        foreach ($arr AS $arrKey => $arrVal)
        {
            $arr[$arrKey] = GPCStrip($arrVal);
        }
    }

Now we need to handle the stripslashes. Since only a string needs to be stripslashed I will use an else if branch. Please note that you could also call stripslashes on every non-array value, since any primitive data can be stripslashed.

    else if (is_string($arr))
    {
        $arr = stripslashes($arr);
    }

And finally we need to return the new value for $arr:

return $arr;

Next we need to actually filter our globals through the newly created function. We will do $_GET, $_POST and $_COOKIE:

$_GET = GPCStrip($_GET);
$_POST = GPCStrip($_POST);
$_COOKIE = GPCStrip($_COOKIE);

Simple, right? Good. Now let's do _FILES. These are a little different to handle, specifically the temporary name. Some systems are capable of using \ inside their filenames, so we want to account for that in case we happen to get two \ in succession:

if (is_array($_FILES))
{
    foreach ($_FILES AS $key => $val)
    {
        $_FILES[$key]['tmp_name'] = str_replace('\\', '\\\\', $val['tmp_name']);
    }
}
$_FILES = GPCStrip($_FILES);

What we have done above is double every backslash in the temporary name (in a single-quoted PHP string, '\\' is one literal backslash and '\\\\' is two). This allows the above function to strip the extra backslashes again, giving us the original name.
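The round trip can be checked in isolation. The path below is an invented Windows-style temporary name; the sketch also shows what stripslashes would do to it without the doubling step.

```php
<?php
// Invented temporary path; in single quotes '\\' is one literal backslash.
$tmpName  = 'C:\\Temp\\php1A2B.tmp';             // C:\Temp\php1A2B.tmp
$doubled  = str_replace('\\', '\\\\', $tmpName); // every \ becomes \\
$restored = stripslashes($doubled);              // each \\ collapses back to \

echo $doubled, "\n";               // C:\\Temp\\php1A2B.tmp
echo $restored, "\n";              // C:\Temp\php1A2B.tmp
echo stripslashes($tmpName), "\n"; // C:Tempphp1A2B.tmp (path damaged without doubling)
```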
Finally, we want to be lazy and not have to worry about the incoming data being stripslashed. This is handled by the magic_quotes_runtime directive and is generally off. We want to be thorough though so we will also handle that. Like magic_quotes_gpc, magic_quotes_runtime will be removed in PHP 6, so we must check for its existence prior to disabling:

if (function_exists('set_magic_quotes_runtime'))
{
    set_magic_quotes_runtime(0);
}

Let's put this all together now:

if (function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc())
{
    function GPCStrip($arr)
    {
        if (is_array($arr))
        {
            foreach ($arr AS $arrKey => $arrVal)
            {
                $arr[$arrKey] = GPCStrip($arrVal);
            }
        }
        else if (is_string($arr))
        {
            $arr = stripslashes($arr);
        }
        return $arr;
    }
    $_GET = GPCStrip($_GET);
    $_POST = GPCStrip($_POST);
    $_COOKIE = GPCStrip($_COOKIE);
    if (is_array($_FILES))
    {
        foreach ($_FILES AS $key => $val)
        {
            $_FILES[$key]['tmp_name'] = str_replace('\\', '\\\\', $val['tmp_name']);
        }
    }
    $_FILES = GPCStrip($_FILES);
}
if (function_exists('set_magic_quotes_runtime'))
{
    set_magic_quotes_runtime(0);
}
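
To see the recursion at work without depending on the server's configuration, the sketch below repeats the GPCStrip function and feeds it a hand-built nested array. The $escapedPost data is invented and already escaped by hand, as if magic quotes had been on.

```php
<?php
function GPCStrip($arr)
{
    if (is_array($arr))
    {
        // Recurse into nested arrays such as fields named tags[].
        foreach ($arr AS $arrKey => $arrVal)
        {
            $arr[$arrKey] = GPCStrip($arrVal);
        }
    }
    else if (is_string($arr))
    {
        $arr = stripslashes($arr);
    }
    return $arr;
}

// Invented sample shaped like a form post with a nested field.
$escapedPost = array(
    'name' => "O\\'Reilly",
    'tags' => array("it\\'s", 'plain'),
);

$clean = GPCStrip($escapedPost);
echo $clean['name'], "\n";    // O'Reilly
echo $clean['tags'][0], "\n"; // it's
echo $clean['tags'][1], "\n"; // plain
```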



What about _REQUEST?
The _REQUEST superglobal is a double-edged sword: it is simple to use when either POST or GET is acceptable, but it creates potential dangers through the inclusion and overriding of COOKIE data. This is why _REQUEST needs to be controlled. If you anticipate the use of _REQUEST and only plan on using it for form data, chances are you only need to worry about _GET and _POST. With that said, reconstruct your _REQUEST superglobal with only get and post data:

$_REQUEST = array_merge($_GET, $_POST);

Done and done. I do not normally recommend the use of _REQUEST, but once you have some control over it, it becomes quite valuable.
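A small sketch of the effect, using invented arrays in place of the real superglobals: POST overrides GET for shared keys, and cookie values never reach the merged result.

```php
<?php
// Invented sample data standing in for the real superglobals.
$get    = array('id' => '5', 'mode' => 'view');
$post   = array('mode' => 'edit');
$cookie = array('id' => '999'); // deliberately left out of the merge

$request = array_merge($get, $post);

echo $request['id'], "\n";   // 5
echo $request['mode'], "\n"; // edit
```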
What we have learned
You have just learned how to control two major directives in PHP: magic_quotes_gpc and magic_quotes_runtime. We have also gone over what you need to do in order to clean your own data without fear of PHP attempting to do the same. PHP is a fantastic language filled with flexibility, but it is not intuitive enough to do this for you.
This is too long, what can I do to shorten it?
You can do many things to shorten this code, depending on the versions available for your use. The above code will work from PHP 4.2+, and still function in PHP 6 (even though it will not be necessary, it will not break your code).
Things that can be done to shorten the code include:

Use pass by reference techniques
Use array_map to apply the function to each element.

With careful uses of the above you can take this entire block of code down into eight lines of code. I will let you explore how to perform this should you want to try.
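
For readers who want a starting point, here is one possible direction based on pass by reference. It is only a sketch, not the eight-line version mentioned above: the helper name stripSlashesInPlace is hypothetical, it swaps array_map for array_walk_recursive, and the closure needs PHP 5.3 or later, so it does not keep the PHP 4.2 compatibility of the long version.

```php
<?php
// Hypothetical helper: strips one level of slashes from every string leaf, in place.
function stripSlashesInPlace(array &$data)
{
    array_walk_recursive($data, function (&$value) {
        if (is_string($value)) {
            $value = stripslashes($value);
        }
    });
}

// Invented, hand-escaped sample data.
$sample = array('name' => "O\\'Reilly", 'nested' => array('q' => "say \\\"hi\\\""));
stripSlashesInPlace($sample);

echo $sample['name'], "\n";        // O'Reilly
echo $sample['nested']['q'], "\n"; // say "hi"
```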


I hope you all enjoyed and learned something from my little article about stripping your variables on a global level. Have fun, and keep on coding!

_Aerospace_Eng_
07-14-2008, 06:48 AM
Great tutorial. Thanks.

binaryWeapon
07-14-2008, 06:35 PM
Who wants to take bets on how long it will take for this thing to be stickied in the PHP category? :p Great tutorial, thanks!

semi-offtopic:
You got Aero's second thank in almost 4 years :eek:

Fou-Lu
07-14-2008, 06:43 PM
Lol, thanks guys, I was kinda expecting posts about it being too long :P
The thanks system has only been around for about a year or so and I spent all of last year in my last year of college - too busy to come around. Man, I just noticed that I've been a member here since 2002! I remember my first post when I was a programming noob, it had to do with baseball scores if I recall correctly :D. That reminds me, I haven't seen raf in a long time either...

Inigoesdr
07-15-2008, 01:07 AM
Nice tutorial. I've added it to the FAQ thread (http://www.codingforums.com/showthread.php?p=187898#post187898).

Fou-Lu
07-15-2008, 01:57 AM
Oh great mate thanx! I didn't even realize we had a faq thread, lol I never pay attention to stickies >.<

Northie
08-19-2008, 09:08 AM
The article is a nice overview, but I'm not sure if it's a help or, worse, promoting the wrong approach...

GET and POST aren't just different ways of passing variables to the server, they have a semantic meaning

GET - means to get data from the server, and a list of & separated variables are usually supplied. GET requests should not modify the state or content of the data on the server

POST - sends data to the server to be processed/stored etc etc. Use POST requests to modify the state or content on the server

PUT and DELETE are two other HTTP directives, but are beyond the scope of this discussion.

So, rather than use $_REQUEST, one should be using $_GET, $_POST and $_COOKIE explicitly. Personally I think $_REQUEST is very insecure for anyone who doesn't fully understand these basic, but obscure, principles.

I feel that developers should know by what method they are expecting data (G,P or C) and build scripts that follow the semantic principles of GET and POST.

Back to magic quotes - They don't fully protect your databases and un-escaped data is only harmful when you try to take it out of PHP (database, command line, etc).

I would recommend reading why not to use magic quotes here - http://www.php.net/manual/en/security.magicquotes.whynot.php

I would also recommend reading up on more rigorous methods of data sanitising

Fou-Lu
08-19-2008, 09:31 AM
I'm not certain you read through the entirety of the tutorial. I did mention the use of _REQUEST and recommended not using it. The suggestion for _REQUEST is to control your data handling, should you desire to use it, by limiting it to only post and get data.

The purpose of this is not to add quotes, it is to remove the ones added by the magic_quotes_gpc directive should it be enabled. Since other sanitation techniques are not sensitive to the existence of magic_quotes_gpc, it is necessary to strip them from your data prior to performing the sanitation process. Most sanitation processes escape single quotations and escape characters, which are also escaped by magic_quotes_gpc. This will cause repetitive escape sequences to plague your data. Database libraries have individual sanitation methods specifically designed for handling their character escapes. This was also mentioned in the tutorial.

As for your remark on un-escaped data being only harmful when you take it out from PHP, if this means what I think it means, I completely disagree. I have seen (and performed) many database and XSS injections allowing me to extend their original code intentions. These are obviously caused by poor escaping or by using exploits such as PHP_SELF.

This tutorial is not designed to show how to sanitize your data. It's designed to show how to allow sanitizing your data while preventing repetitive escaping due to built-in and defaulted PHP directives.


