PDA

View Full Version : query to find closest to [viewer] by zipcode?


boeing747fp
01-21-2010, 03:34 AM
ok i found a php class that compares the distance between 2 zipcodes.... however, i have a list of let's say 50,000 users with zipcodes and i need to have a query that finds the closest ones to [viewer] by zipcode distance without grabbing all the sql rows and running them through the php class.... is there something in MySQL that can do this? i've seen something about "OpenGIS" or "Spacial Extensions in MySQL" but i dont see how to do it...

Old Pedant
01-21-2010, 06:21 AM
If you don't mind looking at ASP code, I show an example here:
http://www.ClearviewDesign.com/Newbie

The ASP code isn't the important part; the DB design is the key to it all.

True, I then use some ASP code to set up the query to make it more efficient, but translating that part to PHP should be easy.

Download the ZIP file, read the "README", and then read the ASP file.

The important parts of the operations are explained reasonably well, I think.

All you care about in the ASP file is the code between
' first, find that zip in the DB:
...
... and ...
...
' and now we are ready to produce the HTML results and invoke the Google map

boeing747fp
01-22-2010, 02:33 AM
that doesnt help me...

Old Pedant
01-22-2010, 03:00 AM
Because? Because you don't understand the ASP code well enough to translate it? Or don't understand the changes I made to the DB table? Or???

Bazz, another regular here, is in the process of adapting this scheme to PHP for his site.

The principle is pretty simple: You get the latitude/longitude of the "home" zip code.

Then you make "box" around that lat/long big enough to make sure you are getting all other zips within range of the desired radius. That box becomes the min/max lat/long that you will use in the WHERE clause of your query, so that you don't have to actually do the distance calculation on all 50,000 records. Say that gets you down to 30 to 50 records to look through. So then you just use the pythagorean theorem to calculate the distance from the "home" zip code to each of the ones in range, per the "box".

The refinement I put on it--to make the performance better--is to *precalculate* the distance in miles from (0,0) lat/long, based on the zip code's lat/long. And to recognize that the number of miles in one degree of longitude varies, from the same as a degree of latitude at the equator to zero miles at the poles. So the basic distance calculation comes down to
(milesForLatPoint - milesForLatHome) ^ 2
+ ( milesPerDegreeLongAtHomeLat * ( longForPoint - longForHome ) ) ^ 2

That gets the square of the distance, so just take the square root of it to get he distance.

That's all reflected in the SQL query that I then use. Tranlsated into PHP terms, something like this:

$sql = "SELECT SQRT(Z.dist) AS radius, Z.* "
. " FROM (
. " SELECT ((latitudeMiles-$homeLat)^2 "
. " + ($longMPD*(longitude-$homeLong))^2) AS dist, * "
. " FROM ZipCodes "
. " WHERE latitudeMiles BETWEEN $minLat AND $maxLat "
. " AND longitude BETWEEN $minLong AND $maxLong "
. " ) AS Z "
. " WHERE Z.dist < $squareOfRadius "
. " ORDER BY 1, Z.zipcode; ";

Not a PHP person, but that should be close.

boeing747fp
01-22-2010, 06:09 PM
well i have to sort ALL 50,000 users by zipcode closest to the viewer, not just within a certain radius...

boeing747fp
01-22-2010, 06:51 PM
i've found a query that can do what i want however inside the query there is a large math equation and some of the values of the equation need to come from a separate table based on the current queried-row's zipcode....


$origin_lat=-97.676864;
$origin_lon=31.100505;

SELECT *,3956*2*ASIN(SQRT(POWER(SIN((".$origin_lat." - abs(".$destination_lat."))
*pi()/180/2),2)+COS(".$origin_lat."*pi()/180)*COS(abs(".$destination_lat.")*pi()/180)
*POWER(SIN((".$origin_lon."-".$destination_lon.")*pi()/180/2),2))) as distance FROM
accounts WHERE zipcode != '' AND zipcode >0 ORDER BY distance ASC

$destination_lat and $destination_lon need to come on a row-by-row basis to grab lat/lon from table "zip_codes" where zip_codes.`zipcode` = accounts.`zipcode`

.... is this even doable?

Old Pedant
01-22-2010, 08:20 PM
Certainly if you *NEED* all 50,000 distances, then yes, you need something like this.

But if you only need (say) the 10 closest, then a variation on what I show makes more sense.

Except in Alaska (and maybe Montana? Nevada?) if you used my code with a radius of, say, 100 miles, you'd almost surely get a lot more than 10 zip codes.

You could certainly do a query where you asked for all those in a radius of 25 miles and, if you didn't get enough results, expand to 50 mile radius, etc. Surely would be faster than processing all 50,000.

But yes, if you need the distance between zip codes that are more than a couple of hundred miles apart, my code won't be accurate enough.

Did you try my little online demo there?

http://www.clearviewdesign.com/Newbie/Demos/zipcodes/zipLocatorAndMapDemo.asp

Pick a zip code (e.g, 90210, Beverly Hills) and a good sized radius (say 50 miles?). That gets you roughly the *EIGHT HUNDRED* (800) closest zip codes. Surely that's enough?? It's sure more than show comfortably on a single web page, as you can see.

Even if you use a zip code of 80101 (Agate, Colorado, which is out in the middle of nowhere) and a radius of 50 miles, you get over 60 "hits". (Although you only get 1 if you use 15 miles!)

boeing747fp
01-22-2010, 10:12 PM
well it's going to paginate the results but the query needs to be able to find all of them closest to [viewer] and then the pagination will just handle LIMIT ##,## at the end of the query...

boeing747fp
01-23-2010, 12:01 AM
i figured it out. added `lat` and `lon` fields to table `accounts`, filled them with the lat/lon values for those zipcodes, and then this query works :D


$mylat=##########; //viewer lat
$mylon=##########; //viewer lon
mysql_query("SELECT *,3956*2*ASIN(SQRT(POWER(SIN((".$mylat." -
abs(`lat`))*pi()/180/2),2)+COS(".$mylat."*pi()/180)*COS(abs(`lat`)
*pi()/180)*POWER(SIN((".$mylon."-`lon`)*pi()/180/2),2))) as distance
FROM `accounts2` WHERE zipcode != '' AND zipcode >0 ORDER BY
distance ASC");

Old Pedant
01-23-2010, 12:01 AM
Good luck.

I think this is going to run slower than all get out *and* eat up *tons* of CPU resources.

You may have to resort to creating a semi-temp table in order of distance from the viewer and then using LIMIT on the records in it.

boeing747fp
01-23-2010, 12:07 AM
well it took 0.054 seconds for 14,000 results which isnt too bad. this page is not going to be hit constantly so it should be alright

Old Pedant
01-23-2010, 01:11 AM
I am, quite frankly, amazed by that number. Good old MySQL continues to surprise.

bazz
01-23-2010, 06:17 AM
i've found a query that can do what i want however inside the query there is a large math equation and some of the values of the equation need to come from a separate table based on the current queried-row's zipcode....

Sounds to me like you are trying to make your database fit the query but, as Old Pedant said in a post, I have been trying to do something loosely similar. He is a wise one and despite my temporary inability to grasp the thing sufficiently quickly - even though I had use the basic (POW) structure for grid refs, - he helped me greatly. Now it works (which doesn't surprise me).

Anyway, have you tried to get a list of the central points of all your relevant zip codes? If you have those, OP's suggestion near the beginning will work for you.


bazz