
View Full Version : 404 redirect script not working



galahad3
09-17-2009, 09:58 AM
Hi, I've set up a PHP script which checks the requested URL against records in a db and does one of two things: (1) sends the user to a specified URL if the typed URL matches a record in the db, or (2) just outputs the 404 error page if there's no match in the db.

The problem is that it always outputs the 404, regardless of whether or not there's a matching record in the db. I guess something must be wrong with the script but I can't see what:

This is the code:

<?php
include ('manager/exhibitions/inc/dbconnect.php');

$query = "SELECT showname FROM exhibitionstable WHERE showname = '". mysql_real_escape_string($_SERVER['REQUEST_URI']) ."' ";

$numresults=mysql_query($query);
$numrows=mysql_num_rows($numresults);

if ($numrows == 0)
{
    echo "<p>404 error</p>";
}
else
{
    header('location: exhibitions.html');
    exit;
}
?>

Any ideas?

SKDevelopment
09-17-2009, 10:07 AM
I would try 2 things:

1) echo your $query. It sometimes helps to see the problem at once. I would also try to run the echoed query outside of PHP (from phpMyAdmin or - if under Windows - HeidiSQL (http://www.heidisql.com/)).

2) try to output MySQL error message


$numresults=mysql_query($query) or die(mysql_error());

Of course, or die(mysql_error()); must be commented out or removed in the production environment.
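One way to keep the diagnostic without exposing it to visitors is to branch on a debug flag. This is only a minimal sketch: the helper name db_error_message() and the $debug flag are hypothetical and not from the thread, and the detail string is assumed to be whatever mysql_error() returned.

```php
<?php
// Hypothetical helper: decides what to show when a query fails.
// $detail is assumed to be the string returned by mysql_error().
function db_error_message(string $detail, bool $debug): string
{
    if ($debug) {
        // Development: show the real error so the problem is visible at once.
        return 'Query failed: ' . $detail;
    }
    // Production: write the detail to the server error log and
    // return a message that reveals nothing about the database.
    error_log('Query failed: ' . $detail);
    return 'Sorry, something went wrong.';
}
```

In the script above it could be used as mysql_query($query) or die(db_error_message(mysql_error(), $debug)); so that only the flag changes between environments.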

galahad3
09-17-2009, 10:14 AM
Thanks, tried echoing the $query and interestingly it displays in the output page:

SELECT showname FROM exhibitionstable WHERE showname = '/Show2'

"Show2" being the typed URL- but what I don't see is where on earth it's getting the slash from as it isn't in the SELECT statement...

I guess it's picking it from the typed URL but I need a way of stopping the script including it...

???

SKDevelopment
09-17-2009, 10:41 AM
Do you need to cut off the query string? (The part of the URL after "?", e.g. in URLs like /my_script?a=1.)

I think you could use basename() (http://php.net/basename) and String functions (http://www.php.net/manual/en/ref.strings.php) like this:


$name = basename($_SERVER['REQUEST_URI']);
if(false!==strpos($name,'?'))
{
    $name = substr($name,0,strpos($name,'?'));
}
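An alternative sketch that strips the query string before taking the last path segment, using parse_url(). The function name show_name_from_uri() is hypothetical. Doing it in this order means a slash inside the query string (for example /my_script?next=/other) cannot be mistaken for part of the path.

```php
<?php
// Hypothetical helper: returns the last path segment of a request URI,
// without the leading slash and without any query string.
function show_name_from_uri(string $uri): string
{
    // parse_url() separates the path from the query string.
    $path = parse_url($uri, PHP_URL_PATH);
    if (!is_string($path)) {
        // Malformed URI, or no path component at all.
        return '';
    }
    // basename() drops any leading directories and the leading slash.
    return basename($path);
}
```

With this helper, $name = show_name_from_uri($_SERVER['REQUEST_URI']); would take the place of the basename()/substr() lines.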

galahad3
09-17-2009, 11:00 AM
Okay- so we can then use $name in place of the global SERVER variable inside the $query, right?

Thanks

SKDevelopment
09-17-2009, 11:27 AM
Yes.

If you need stricter validation for anything that comes from $_SERVER['REQUEST_URI'], you could also use regular expressions. I am not absolutely sure if $_SERVER['REQUEST_URI'] could be changed by a potential attacker (maybe not). And also your script already looks like a validation script so I mentioned this just in case.

String functions are usually considered much faster, but regular expressions normally give more control over what you validate without writing a lot of code. Regexps are slower, but in cases where I want to be absolutely sure that something corresponds exactly to the required pattern I always use regexps.
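As an illustration of the regexp approach, here is a minimal sketch of a whitelist check. The function name is_valid_show_name(), the allowed characters and the 64-character limit are all assumptions chosen for the example; the real pattern depends on what the showname column may contain.

```php
<?php
// Hypothetical validator: accepts only letters, digits, underscore,
// dot and hyphen, between 1 and 64 characters (assumed limits).
function is_valid_show_name(string $name): bool
{
    // \A and \z anchor the match to the whole string, so nothing
    // can be smuggled in before or after the allowed characters.
    return preg_match('/\A[A-Za-z0-9_.-]{1,64}\z/', $name) === 1;
}
```

A name that fails the check can go straight to the 404 output without querying the database at all.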

galahad3
09-17-2009, 01:52 PM
That seems to be working, many thanks. It's forwarding to the specified page if the URL matches the db entry anyway, so that's a big step forwards.

Many thanks for the pointers.

Hmm... well it *was* working... but now in the testing echo I get this output:

SELECT showname FROM exhibitionstable WHERE showname = 'fixed.htc'

Where that comes from I've no idea as it certainly isn't in the database!

This is the current code:

<?php
include ('manager/exhibitions/inc/dbconnect.php');

$name = basename($_SERVER['REQUEST_URI']);
if(false!==strpos($name,'?'))
{
    $name = substr($name,0,strpos($name,'?'));
}

$query = "SELECT showname FROM exhibitionstable WHERE showname = '". mysql_real_escape_string($name) ."' ";

echo $query;

$numresults=mysql_query($query) or die(mysql_error());
$numrows=mysql_num_rows($numresults);

if ($numrows == 0)
{
    echo "<p>404 error</p>";
}
else
{
    header('location: exhibitions.html');
    exit;
}
?>

What's interesting is that it worked ONCE: I typed in http://mydomain/travejl.html (travejl.html is an entry in the db) and it forwarded to exhibitions.html.

But now it just shows the output as above...

galahad3
09-17-2009, 04:28 PM
Okay, this is *very* weird.

For some reason, it only works once for each mis-typed URL, and ONLY if I make some arbitrary change to the script, undo the change, and save the script!

For example:

I save and upload the script
I type in a URL which is in the db, for example, mydomain.com/GNETIG
It forwards to exhibitions.html. GREAT!

But...

I go back and type the URL again (even after completely clearing the cache or using a different machine) and it doesn't work. It just goes to the 404 output.

However if I stay on that page, go and re-save the script file again, and then refresh the page- bingo, it forwards to the exhibitions.html page.

Why would it be doing this?! Obviously it isn't workable as it needs to work every time, not just one time only and just after the script has been saved. Also I tested from a separate machine that hadn't browsed to that page before, and got the same result, so it's not session-based.

I should note that when I re-save the script I'm not making any changes to it. The script itself seems to work fine.

Anyone know why this bizarre behavior would take place?

SKDevelopment
09-17-2009, 04:39 PM
I think you would need to debug ... Echo $_SERVER['REQUEST_URI'] each time instead of redirecting, echo the query, check how many rows returned - if 0, run the SELECT query on the table and see why this happened ...

Most important - check that the script names are present in the table exhibitionstable and no script deletes them from the table.

galahad3
09-17-2009, 04:55 PM
Well, if I change the script so I echo the SERVER variable instead:

if ($numrows == 0)
{
echo "$_SERVER['REQUEST_URI']";
}

I actually get no output at all from the script...

There are no other scripts running on the table and I've also manually checked the table and found all the records are still in place and unchanged...

SKDevelopment
09-17-2009, 05:05 PM
The correct syntax is either


echo $_SERVER['REQUEST_URI'];

or


echo "{$_SERVER['REQUEST_URI']}";


No output means the error output is suppressed.

Add the following to the very top of the script while debugging:


error_reporting(E_ALL);
ini_set('display_errors','1');

After debugging is over, please comment out or delete these 2 lines.
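To avoid editing the script each time, the two settings can be wrapped in a small switch. This is a sketch only: configure_error_display() and its $debug parameter are hypothetical names. Note also that settings made at runtime cannot reveal a parse error in the same file, because a file with a syntax error never starts running.

```php
<?php
// Hypothetical helper: report everything, but only display errors
// on screen while debugging; always allow them to be logged.
function configure_error_display(bool $debug): void
{
    error_reporting(E_ALL);
    ini_set('display_errors', $debug ? '1' : '0');
    ini_set('log_errors', '1');
}
```

Calling configure_error_display(true) at the very top of the script corresponds to the two debugging lines; configure_error_display(false) corresponds to removing them.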

galahad3
09-17-2009, 05:09 PM
Okay, interestingly it now outputs as follows:

SELECT showname FROM exhibitionstable WHERE showname = 'fixed.htc' /fixed.htc

I still have no idea what fixed.htc is or where the script gets this from, as it certainly isn't in the db nor is in the typed URL. ???

SKDevelopment
09-17-2009, 05:17 PM
How do you redirect to this script to process URL's ? With .htaccess ? You are using Apache mod_rewrite ? If .htaccess does not contain any information you consider sensitive (only in this case) could you post the file content here ?

galahad3
09-17-2009, 05:26 PM
There is a .htaccess in the web root, this is the contents of the file:

Options +FollowSymlinks
RewriteEngine On
RewriteBase /

# Fix Apache internal dummy connections from breaking [(site_url)] cache
RewriteCond %{HTTP_USER_AGENT} ^.*internal\ dummy\ connection.*$ [NC]
RewriteRule .* - [F,L]

# Rewrite domain.com -> www.domain.com -- used with SEO Strict URLs plugin
#RewriteCond %{HTTP_HOST} .
#RewriteCond %{HTTP_HOST} !^example\.com [NC]
#RewriteRule (.*) http://example.com/$1 [R=301,L]

# Exclude /assets and /manager directories from rewrite rules
RewriteRule ^(manager|assets) - [L]

# For Friendly URLs
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]

# Reduce server overhead by enabling output compression if supported.
#php_flag zlib.output_compression On
#php_value zlib.output_compression_level 5

galahad3
09-18-2009, 10:10 AM
Is there anything in the .htaccess that needs to be changed?

I just don't see how the script works fine but only if it's just been saved!

SKDevelopment
09-18-2009, 10:56 AM
I do not see any reference to fixed.htc in .htaccess ...

I think I would try to output the whole array $_SERVER, not only $_SERVER['REQUEST_URI'] ... Maybe this would help to see the source of the problem ...



echo '<pre>' . print_r($_SERVER,true) . '</pre>';
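If the full dump is too noisy, a narrower variant can print only the entries that usually matter when tracing a rewrite. A sketch, with a hypothetical helper name rewrite_debug_info() and a key list that is an assumption for this example:

```php
<?php
// Hypothetical helper: picks the $_SERVER entries that describe
// what was requested and how it reached the script.
function rewrite_debug_info(array $server): array
{
    // Assumed to be the interesting keys; extend the list as needed.
    $keys = ['REQUEST_URI', 'REDIRECT_URL', 'QUERY_STRING',
             'HTTP_REFERER', 'HTTP_USER_AGENT'];
    $info = [];
    foreach ($keys as $key) {
        // Mark missing entries instead of raising a notice.
        $info[$key] = $server[$key] ?? '(not set)';
    }
    return $info;
}
```

It would be printed the same way as the full array, passing rewrite_debug_info($_SERVER) to print_r() in place of $_SERVER.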

galahad3
09-18-2009, 11:01 AM
This is the output from that: (I can see a reference to fixed.htc though I've no idea what this is)

SELECT showname FROM exhibitionstable WHERE showname = 'fixed.htc'

Array
(
[REDIRECT_STATUS] => 200
[HTTP_ACCEPT] => */*
[HTTP_ACCEPT_ENCODING] => gzip, deflate
[HTTP_USER_AGENT] => Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; WOW64; Trident/4.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; InfoPath.2; .NET CLR 3.5.30729; .NET CLR 3.0.30618)
[HTTP_HOST] => 78.136.31.10
[HTTP_CONNECTION] => Keep-Alive
[HTTP_COOKIE] => __utma=124380556.1943753217.1253192956.1253198091.1253209624.4; __utmz=124380556.1253192956.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); SN4a7987ae5bf24=i29e53imnt93p7eh01m3r3jjr6; webfxtab_resourcesPane=3
[PATH] => /bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin
[SERVER_SIGNATURE] => Apache/2.2.3 (Red Hat) Server at 78.136.31.10 Port 80

[SERVER_SOFTWARE] => Apache/2.2.3 (Red Hat)
[SERVER_NAME] => 78.136.31.10
[SERVER_ADDR] => 192.168.100.10
[SERVER_PORT] => 80
[REMOTE_ADDR] => 86.133.15.149
[DOCUMENT_ROOT] => /var/www/html
[SERVER_ADMIN] => root@localhost
[SCRIPT_FILENAME] => /var/www/html/index.php
[REMOTE_PORT] => 53913
[REDIRECT_QUERY_STRING] => q=fixed.htc
[REDIRECT_URL] => /fixed.htc
[GATEWAY_INTERFACE] => CGI/1.1
[SERVER_PROTOCOL] => HTTP/1.1
[REQUEST_METHOD] => GET
[QUERY_STRING] => q=fixed.htc
[REQUEST_URI] => /fixed.htc
[SCRIPT_NAME] => /index.php
[PHP_SELF] => /index.php
[REQUEST_TIME] => 1253268031
[HTTP_REFERER] =>
)

SKDevelopment
09-18-2009, 11:30 AM
Yes, according to this the initial URL seems to be /fixed.htc ...

I think you could have chain of redirects, though I do not see what initially redirected to fixed.htc if you say it was not the URL you have been testing with ...

I think it would be necessary to see which script redirects to which ... I would install the Firefox add-on Live HTTP Headers (https://addons.mozilla.org/en-US/firefox/addon/3829) and then check the page in Firefox. It is a very convenient add-on. You would be able to see which headers your browser sends and which headers are sent back. This would let you see the whole chain of redirects you could possibly have after the initial request.
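One possibility worth ruling out, stated here purely as an assumption: /fixed.htc may be a separate request made by the browser for a file that does not exist on disk, in which case the friendly-URL rule in .htaccess would pass it to index.php like any other unknown path. If that is what is happening, the script could skip the lookup for such requests. A sketch, with a hypothetical helper name and an assumed extension list:

```php
<?php
// Hypothetical helper: true when the URI looks like a request for a
// static asset rather than a show name typed by a visitor.
function looks_like_asset_request(string $uri): bool
{
    $path = parse_url($uri, PHP_URL_PATH);
    if (!is_string($path)) {
        return false;
    }
    // Extension of the requested path, lower-cased ('' if none).
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    // Assumed list of asset extensions; adjust to the site.
    $assetExtensions = ['htc', 'css', 'js', 'gif', 'png', 'jpg', 'ico'];
    return in_array($ext, $assetExtensions, true);
}
```

Logging REQUEST_URI for every hit on index.php would confirm or refute the assumption before anything is changed.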

galahad3
09-18-2009, 12:30 PM
Well, I installed and ran the add-on whilst browsing to /GNETIG (one of the URLs that should forward to the exhibitions.html page)

This is the output- not sure what if anything can be done with it: (also the page just displays the 404 whereas it should forward)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.13) Gecko/2009073022 Firefox/3.0.13 (.NET CLR 3.5.30729)
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://78.136.31.10/GNETIG

HTTP/1.x 200 OK
Content-Length: 35
Date: Sun, 19 Jul 2009 17:30:00 GMT
Pragma: no-cache
Expires: Mon, 19 Jun 2000 05:41:28 GMT
Last-Modified: Wed, 21 Jan 2004 19:50:30 GMT
Content-Type: image/gif
Server: Golfe
Cache-Control: private, no-cache, no-cache=Set-Cookie, proxy-revalidate
----------------------------------------------------------
http://safebrowsing-cache.google.com/safebrowsing/rd/goog-malware-shavar_a_14361-14400.14361-14398.14399-14400:

GET /safebrowsing/rd/goog-malware-shavar_a_14361-14400.14361-14398.14399-14400: HTTP/1.1
Host: safebrowsing-cache.google.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.13) Gecko/2009073022 Firefox/3.0.13 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

HTTP/1.x 200 OK
Cache-Control: private,max-age=21600
Content-Type: application/vnd.google.safebrowsing-chunk
Expires: Fri, 18 Sep 2009 11:00:42 GMT
Date: Fri, 18 Sep 2009 11:28:48 GMT
Server: HTTP server (unknown)
Content-Length: 58385
----------------------------------------------------------

SKDevelopment
09-18-2009, 03:18 PM
These are 2 pages requested by GET from Google: one from www.google-analytics.com and the other from safebrowsing-cache.google.com. The browser certainly requested these pages and they have been successfully sent to the browser (200 OK server response).

But there must be more data shown to you. The requests to your pages are not shown at all.

Please clear all the data in Live HTTP Headers, reload the page where you saw the problem and publish the following lines here:

Basically we need in request: only GET and Host headers (normally these are the first 2 lines).
In responses we need only status codes (e.g. 200 OK or e.g. 301 Moved Permanently) and Location headers (would be present on redirects - when status code starts with "3").

Just in case I would remove any "Cookie" headers from the published data ... I am not sure which data your cookies could contain, so it is better not to publish them.

Also I would recommend to remove cookies from the previous post. Please edit it. Your browser sent only 1 cookie to safebrowsing-cache.google.com, I do not believe it is really dangerous, but I would not publish any cookies here anyway.
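To make clear which lines are meant, here is a sketch that filters a pasted header dump down to request lines, Host headers, status lines and Location headers, dropping everything else (cookies included). The function name summarize_http_trace() is hypothetical, and the dump is assumed to be plain text with one header per line.

```php
<?php
// Hypothetical helper: keeps only the lines needed to follow a
// chain of redirects in a raw request/response dump.
function summarize_http_trace(string $raw): array
{
    $keep = [];
    foreach (preg_split('/\R/', $raw) as $line) {
        $line = trim($line);
        if (preg_match('/^(GET|POST|HEAD) \S+ HTTP\/\S+$/', $line) // request line
            || preg_match('/^HTTP\/\S+ \d{3}\b/', $line)           // status line
            || preg_match('/^(Host|Location):/i', $line)) {        // wanted headers
            $keep[] = $line;
        }
    }
    return $keep;
}
```

The status code is the three-digit number on the line that starts with HTTP/, so a header such as Keep-Alive: 300 is not a status code and is filtered out.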

galahad3
09-18-2009, 03:33 PM
Okay, this is the complete content in the HTTP Headers add-on when browsing to /GNETIG as before. I've stripped out everything apart from where I see a status code or a Location header, which I guess is (for example) Keep-Alive: 300, as that code starts with a 3. I don't know if I need to include more information than I have.

http://78.136.31.10/GNETIG

GET /GNETIG HTTP/1.1
Host: 78.136.31.10

Keep-Alive: 300


HTTP/1.x 404 Not Found

Keep-Alive: 300

Referer: http://78.136.31.10/GNETIG

HTTP/1.x 304 Not Modified

Keep-Alive: 300

Referer: http://78.136.31.10/GNETIG

HTTP/1.x 304 Not Modified

Keep-Alive: 300
Referer: http://78.136.31.10/GNETIG

HTTP/1.x 200 OK

SKDevelopment
09-18-2009, 04:10 PM
There should be more GET requests there. They come in request/response pairs. You showed 1 request and 5 responses (there were 5 requests there for sure). Still, I think the responses could give an idea of what is wrong ... You have got "304 Not Modified", which means the browser is taking pages from its local cache, not from the server. This is why it works once for each mistyped URL ... It simply does not get the page from the server again, taking it from the local browser cache instead ...

Try to add to the very top of index.php the following lines:


header("Cache-Control: no-cache, must-revalidate");
header("Pragma: no-cache");

Then please manually clear cache in your browser (in the browser settings) and try to test your system again.

These headers ask the browser not to cache the page. So they would probably help to avoid the problem where the script works only once for each mistyped URL, and then only after the modification date of the script has changed.
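A slightly fuller sketch of the same idea, adding no-store and an Expires date in the past for older caches. The helper names are hypothetical, and whether the extra headers change anything in this particular case is not known.

```php
<?php
// Hypothetical helper: the header lines that ask browsers and
// proxies not to store or reuse the response.
function no_cache_headers(): array
{
    return [
        'Cache-Control: no-store, no-cache, must-revalidate, max-age=0',
        'Pragma: no-cache',
        'Expires: Thu, 01 Jan 1970 00:00:00 GMT',
    ];
}

// Sends the headers, but only while that is still possible:
// header() has no effect once output has started.
function send_no_cache_headers(): void
{
    if (headers_sent()) {
        return;
    }
    foreach (no_cache_headers() as $line) {
        header($line);
    }
}
```

send_no_cache_headers() would be called at the very top of index.php, before any echo.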

galahad3
09-18-2009, 04:36 PM
Well, I tried that but it still only works the one time after I save the script.

Added this to the very top of index.php (and also another page on the site as well):

<?php
header("Cache-Control: no-cache, must-revalidate");
header("Pragma: no-cache");
?>

Something seems to be stopping it from clearing the local cache- even when I manually clear all browser history and settings...

galahad3
09-18-2009, 08:05 PM
I've done some testing in other parts of the site and found something strange- it looks as if other areas (for example, a set of pages which just run queries against a separate db as criteria and have nothing to do with the 404) are showing similar behaviour- i.e query works once, after saving script.

So it looks as if some sort of MySQL database caching (though of what sort I don't know) might be taking place. So it might not even be a problem with the 404 per se! Not sure what we can do to get to the bottom of it, but I have asked the server tech staff.

Thanks for the input however- and I'm sure this issue has more twists and turns yet.

SKDevelopment
09-19-2009, 02:27 PM
I do not think MySQL DB caching could affect this. It simply affects the queries in the way the result is taken from the cache and server is not queried. It would return the correct result and I think it would not affect the redirect ...

Still just in case: if you think it could be tried without caching, you could change your query to


$query = "SELECT SQL_NO_CACHE showname FROM exhibitionstable WHERE showname = '" . mysql_real_escape_string($name) . "'";

(I have added SQL_NO_CACHE after SELECT).

galahad3
09-21-2009, 10:30 AM
I'll try adding that and see if it makes a difference.

It's just odd that exactly the same behaviour is showing for a completely separate area of the site, unrelated to the 404, which utilises a separate db- which to me indicates a global setting somewhere. Although where that might be I still have no idea as yet.


