...

View Full Version : Efficient way of inserting thousands of rows into database?



XtremeGamer99
12-01-2010, 05:57 AM
I'm building an application that parses chatlog files line by line and inserts them into a database. I currently have this bit of code:


if (!empty($this->goesIntoDatabase)) {
foreach ($this->goesIntoDatabase as $id => $text) {
$data = explode('|', $id);
$this->DB->e("
INSERT INTO `data` (channelID, time, author, text)
VALUES (?, UNIX_TIMESTAMP(?), ?, ?)",
$channelInternalID,
$data[0],
$data[1],
$text);
}
}

It basically runs through an array that contains data, and inserts a row into the database.

This is all fine and dandy, but in some extreme circumstances this data could be thousands of lines long. I have one example here of 9,100 lines, and it takes 6.523 seconds to run through it (which, for the amount of data, isn't too bad).

I'm trying to refine it a bit. I was wondering if there was a more efficient way of doing bulk INSERTS rather than doing them one at a time. I've read that transactions might be very useful in this situation. I've also heard about dumping the data into a file in a certain format and then loading that via a SQL statement. I'm looking for more opinions/advise/thoughts on what can be done.

Thanks!

Lamped
12-01-2010, 11:00 AM
This may not be entirely helpful, but a few years ago I had to put around 100,000 rows into a database. phpMyAdmin took around 5 minutes. Running mysql from the terminal and piping the SQL directly into it took around 5 seconds. I don't know why there's such an extreme difference in speed...

Have you looked into INSERT DELAYED?
http://dev.mysql.com/doc/refman/5.1/en/insert-delayed.html

I doubt the data will actually be added any faster, but at least your script should finish pretty quickly and you can just leave MySQL to get on with it.

XtremeGamer99
12-01-2010, 11:19 PM
Well, that kinda helped, but not enough. It dropped it down about 1-2 seconds... so instead of 6-7 secs it's about 5ish seconds

EDIT: Also, while I'm currently using MyISAM, I plan to switch to Inno for transaction support down the road... and DELAYED doesn't work with Inno

Lamped
12-01-2010, 11:26 PM
Two suggestions then:

1. If you have shell access, dump it to a file and load it with the mysql cli client. You can do this by writing SQL statements to the file. You can also use the SQL function to load a CSV-like file as documented here: http://dev.mysql.com/doc/refman/5.1/en/load-data.html - though I find a lot of hosts have the required permissions for this feature disabled by default.

2. If you're using INSERT DELAYED, the amount of data may be an issue. You can reduce this by combining multiple VALUES into one statement like so:


INSERT DELAYED INTO `table` (`fields`, `fields`...) VALUES (...), (...), (...);

You'll have to figure out how big you can make the individual INSERT statements before PHP runs out of memory or otherwise causes performance issues.

You might also consider storing time() in a variable, and passing that directly instead of UNIX_TIMESTAMP() - it's extra parsing and system calls each time.

XtremeGamer99
12-02-2010, 01:18 AM
I'm trying to create a portable app, so using shell isn't helpful.

I also tried combining multiple values into one statement, but I went past the max packet size or whatever. I'm sure I can increase this, but again, many people cannot, and so I'd rather not mess with it.

Also, each insert has a different time, which is why I have UNIX_TIMESTAMP running, utherwise I would do as you suggested.

But I think I have it under control now. I tried LOAD DATA INFILE and that seems to work like a charm! Where 9,000 rows used to take 6 seconds, it now takes, on average, 600-700ms. Very doable. I may also be able to optimize that just a tad bit more...

=)

Lamped
12-02-2010, 09:07 AM
Great, that's good news.

One note about LOAD DATA INFILE and portability: you might want to check for errors after running it, and fall back to the old slow insert. Last time I used LOAD DATA INFILE was in cPanel (arguably one of the most popular web server control panels) and I had to go playing around with the user permissions to get it working.



EZ Archive Ads Plugin for vBulletin Copyright 2006 Computer Help Forum