Hello and welcome to our community! Is this your first visit?
Register
Enjoy an ad free experience by logging in. Not a member yet? Register.
Results 1 to 5 of 5
  1. #1
    Regular Coder
    Join Date
    Jun 2010
    Posts
    278
    Thanks
    63
    Thanked 8 Times in 8 Posts

    How to speed up my query?

    I have a couple of tables that I am querying together (a join or inner join? Not sure what it's called).

    Anyway, here's the background:

    I have a grid of cartesian points, each point has a name, and various other attributes. Note that this grid is changeable.

    I have a set of items which sit at various grid points. Note that when the grid chages, the item locations need to change with the grid (i.e. each item is referenced to a grid point itself and not to its location).

    There are two tables, created as follows:

    Firstly, the grid of points:
    Code:
    create table gridPoints (
      pointId int(32) unsigned not null auto_increment primary key,  -- index into the table.
      gridX double precision,
      gridY double precision,
      ... (various other bits and pieces).
    );
    Secondly, the grid of items:
    Code:
    create table gridItems (
      pointId int(32) unsigned not null,  -- The reference to the item's location.
      ... (various other bits and peices).
    );
    So that's straightforward.

    Now, I have written a bit of PHP which returns the items that are within a certain distance of a user-selected cartesian coordinate.

    The MySQL query is trivial, it's just pythagoras:

    Code:
    select gi.*, gp.* from gridItems gi, gridPoints gp where
    sqrt(pow(gp.gridX - coordX, 2) + pow(gp.gridY - coordY, 2)) < distance
    and gi.pointId = gp.pointId;
    -- coordX and coordY are numerical values created by the PHP code.
    You'll agree that that is extremely simple,

    *HOWEVER*

    There is a problem!!! It's very simple ... and very slow. The reason is that gridPoints contains in excess of 3,000,000 (yes, 3 million) rows!! gridItems contains about 10,000 rows. The huge number of rows in gridPoints makes the distance query run very very slowly.

    One obvious solution is to change gridItems so that it looks like this:

    Code:
    create table gridItems (
      pointId int(32) unsigned not null,  -- The reference to the item's location.
      gridX double precision,
      gridY double precision,
      ... (various other bits and peices).
    );
    I.e., so that when an item row is added, it's grid X and Y components are copied from the gridPoints table, that way I avoid having the join on a table with 3,000,000 rows. That will make the query much faster. BUT, it also means that the position data is duplicated, which means that when the grid positions change, I need to change all the grid X and grid Y positions in the gridItems table too ... it's clear that's an accident waiting to happen! (in my experience, duplicating data is rarely a good idea).

    So ...is there a method that one of you MySQL gurus knows that I can use to speed up my query as it is, without needing to duplicate data?

    Thanks in advance

  • #2
    Super Moderator guelphdad's Avatar
    Join Date
    Mar 2006
    Location
    St. Catharines, Ontario Canada
    Posts
    2,631
    Thanks
    4
    Thanked 147 Times in 138 Posts
    please run a SHOW CREATE TABLE statement on each of your tables and show us the output here. Paste between CODE tags.

    While you may think 3 million rows is a lot, it is still relatively small in database terms.

    If you are missing useful indexes that would be the first step in improving performance.

  • #3
    Regular Coder
    Join Date
    Jun 2010
    Posts
    278
    Thanks
    63
    Thanked 8 Times in 8 Posts
    One learns a new command every day!!

    Ok here, we go. Note that this is on the real tables, not the example ones as shown in my original post. tw_geolocations is the real gridPoints table and tw_members is the real gridItems table.

    easting and northing in the geolocations are WGS84 grid easting and northing coordinates (analagous to x and y).

    Code:
    mysql> show create table tw_geolocations;
    +-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Table           | Create Table                                                                                                                                                                                                                                                                                 |
    +-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | tw_geolocations | CREATE TABLE `tw_geolocations` (
      `row_id` int(32) unsigned NOT NULL,
      `outcode` varchar(4) NOT NULL,
      `incode` varchar(3) NOT NULL,
      `easting` double NOT NULL,
      `northing` double NOT NULL,
      `district` text NOT NULL,
      PRIMARY KEY (`row_id`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
    +-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.01 sec)
    
    mysql> show create table tw_members;
    +------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Table      | Create Table                                                                                                                                                                                                                                                                                                                                                                                             |
    +------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | tw_members | CREATE TABLE `tw_members` (
      `row_id` int(32) unsigned NOT NULL AUTO_INCREMENT,
      `when_created` datetime NOT NULL,
      `username` text NOT NULL,
      `email` text NOT NULL,
      `passwordEncrypted` text NOT NULL,
      `actid` text NOT NULL,
      `access` enum('pending','waiting','member','admin','su','suspended') DEFAULT NULL,
      PRIMARY KEY (`row_id`)
    ) ENGINE=MyISAM AUTO_INCREMENT=41692 DEFAULT CHARSET=latin1 |
    +------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.00 sec)
    Last edited by XmisterIS; 12-19-2011 at 03:02 PM.

  • #4
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,020
    Thanks
    75
    Thanked 4,323 Times in 4,289 Posts
    I have done something very similar to this when selecting zip codes to be within X miles of a home zip code.

    The "trick" is to PRE-select the points that have any possibility at all of being in range.

    And, of course, that is simple to do.

    You just exclude those where the x or y coordinate is more than distance away from your target point.

    And now, if the x and y coordinates are indexed, you end up having to do the actual distance calculation on a much smaller set of points.

    The rough idea of the code would be this:
    Code:
    SELECT primaryKey
    FROM gridPoints 
    WHERE gridx BETWEEN coordX-distance AND coordX+distance
    AND gridy BETWEEN coordY-distance AND coordY+distance
    Make that a NESTED SELECT of your main query.

    That is:
    Code:
    SELECT list, of, fields 
    FROM gridItems AS gi, 
         gridPoints AS gp,
         ( SELECT primaryKey
           FROM gridPoints 
           WHERE gridx BETWEEN coordX-distance AND coordX+distance
           AND gridy BETWEEN coordY-distance AND coordY+distance ) AS PRE
    WHERE
    gp.primarykey = PRE.primarykey
    AND (pow(gp.gridX - coordX, 2) + pow(gp.gridY - coordY, 2)) < distanceSquared
    AND gi.pointId = gp.pointId;
    Notes:
    (1) Don't use SELECT * (even SELECT gi.*, gp.*), please! Select ONLY the fields you actually need in your PHP code.
    (2) Calculate the values shown in red in PHP code, ahead of time.
    (3) Of course make sure that gridX and gridY are INDEXED in your table.
    (4) BETWEEN *can and will* make use of the indexes on gridX and gridY better than other conditions would.
    (5) It's a minor gain, but don't do the SQRT calculation. Square the distance, as shown, to avoid the need for that.

    Give it a shot. How much it will improve performance will, of course, depend on how many points lie within the square area that the inner SELECT limits you to but are then outside the circular radius, compared to how many points you eliminate with the pre-selecton of the square.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #5
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,020
    Thanks
    75
    Thanked 4,323 Times in 4,289 Posts
    p.s.: I think that doing DESCRIBE tablename is easier to read than SHOW CREATE tablename
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.


  •  

    Posting Permissions

    • You may not post new threads
    • You may not post replies
    • You may not post attachments
    • You may not edit your posts
    •