12-07-2011, 08:26 PM
I have a table with over 50 Million rows. Two of the columns were Text fields, storing URLs and UserAgents. I just completed a change wherein I moved the URLs to a separate table and changed the field to Int, to store ID numbers pointing to the associated URLs in the other table.
In doing this I expected the filesize of the table to plummet, but it barely changed. The table was 16 GB, and I assumed that storing a URL for each row (with an average length of 40-50 characters, but sometimes 300+ characters) was about 40% of the total size. After changing over to IDs the table is still over 15GB. What kind of sense does that make?
The table is MyISAM and I understand that Text data is stored outside of the row itself, but it's still obviously part of the table and contained in the MYD file, so it should factor into the total size just the same. But it doesn't?
12-07-2011, 08:48 PM
Maybe you need to compact the database? If you did all this "in place" in the current database, MySQL may not have been able to recover the now unused space because the data in the table is so badly fragmented. Expecially, for example, if you still have the UserAgent fields in there as TEXT fields. Consider: Record 1718 has two TEXT fields. So MySQL allocates space in another disk block for those two TEXT fields, puts the data there, and then stores just a pointer to the data in the main record. You come along and (effectively) delete one of the two TEXT fields. But the other one is still there in that same disk block. And now there is a lot of unused space in that disk block, but until you compact that table, MySQL has no way to recover that space. (It might not even have a good algorithm for reusing the space...that I don't know.)
You might try running mysqlcheck to see if it tells you anything. and/or try running it to "repair" the tables. I don't know if it will then compact them or not. It's quite possible that the only way to compact them would be to dump them and rebuild them.
12-07-2011, 09:06 PM
Hmm, well what I did is I replaced the text data with IDs over the course of a few days on a cronjob. Left the fields as Text datatype at first. Before it was quite finished I saw Overhead of around 500 MB and ran Optimize Table, which got rid of that (it did a Repair By Sorting). When i was fully finished and changed the data type, that caused a Repair By Sorting as well, so I would have thought that the table was fully rebuilt.
I need as close to 100% uptime as possible so I hesitate to run a Check Table which I think causes about 30 minutes of downtime.
12-07-2011, 09:54 PM
I honestly don't know if Optimize Table recovers unused TEXT space or not. It would only be able to do so if it consolidated disk blocks containing TEXT, and since that would be a pretty intensive process, if the optimize took less than a few minutes I doubt that it did so.
*IF* MySQL (MyISAM) is smart enough to be able to reuse the now unused TEXT space, then at least as you add data you shouldn't find the table growing quite as fast. But MyISAM isn't the brightest DB engine in the world, so I dunno if it can reuse the space or not.