I *think* FouLu is agreeing with me. He wrote
This is pretty much exactly what I was thinking as well, minus the staging table. I can see your point on this one, but treating it as staging then truncating it is a good idea.
So... It's really a variation on your original plan.
Yes, you have a separate table that you import the CSV file into.
But you use your master table to keep *ALL* records *EVER* imported. And you use that flag, isCurrentlyActive
(or any other name you prefer, of course), to indicate "live" records. That way, all your prior records are always there. And having all in a single table is *highly* preferable. For example, it means you can search both historical and current data in a single query by simply ignoring th isCurrentlyActive
Do you understand the steps involved in this:
UPDATE maintable SET isCurrentlyActive = 0;
INSERT INTO maintable (primaryKeyField, field1, field2, field3, isCurrentlyActive )
SELECT primaryKeyField, field1, field2, field3, 1 FROM stagingtable
ON DUPLICATE KEY UPDATE isCurrentlyActive = 1;
TRUNCATE TABLE stagingTable;
It's really pretty simple:
First, you import the CSV file into your "staging" table. (That step is not shown above.)
Then, you change *ALL* records in the main table to indicate that they are historic. That is, they are *NOT* currently active.
Now you merge your "staging" records into the main table. But the ON DUPLICATE KEY
simply says "if I try to merge a record from staging into the main table *AND* the KEY for that record already exists in the main table, then forget about the merge and instead simply mark the record in the main table as *ACTIVE* again." Doesn't that make sense? You first made all the records inactive and then, on the merge, you "resurrect" any records that are being re-imported.
Once the merge is done, you wipe out all the records in the staging table, ready to do it all again next week.