ConfusedOfLife
08-23-2004, 12:20 PM
Hi
Lemme explain the full scenario for you at first. I’m writing an online investment program. Clients should be able to put their money in the web site for a special period, let’s say a month, and after it by the confirmation of the administrator, their money + their interest will be returned to them. The problem that I’m facing is the same old 2 concurrent threads running against each other problem, something that’s been the major problem in OS developments!
Let’s say I have a field called “paid” in my investment table. Tonight when the admin gets online and visits his administration page, he sees that 4 persons are eligible to get their money back. He checks the checkboxes for all these 4 ppl and confirms that they get their money back. When the form is posted to the server, my PHP script checks to see if the “paid” field for these records is not “1”, if it’s “0”, then it starts to pay the money from web site account to these ppl’s credit cards. Let’s say at the same time (before the admin selects these records and submits the payment form) he opens the same page in another browser. So, he’d see the same 4 ppl in there too, and he checks these ppl and submits these 2 forms at the same time?! This is a slight possibility that my script pays a person twice! Let’s say how:
First of all I have two big problems that I think do not let me use transactions; or better to say transactions become completely worthless in my case:
1- I’m communicating with the bank gateway (e-gold), and I’m sending money to it, when I send money to a person’s e-gold account, there is NO WAY to get it back! So, I can’t do any ROLLBACK!
2- There are multiple threads running concurrently, which means even if I wasn’t communicating with another gateway, transactions couldn’t help me, because transactions follow the ACID rules, and the “I” here means “Isolations”, which means before a transaction is committed/rollbacked, no changes are visible to other threads!
Basically my problem is that I’m really afraid that 2 threads read “paid” field at the same time when it’s still zero and start their work to pay the user!
What did I come up with myself? Well, I thought of several solutions for my problem:
1- I use another field for accounts table like “block”, so, when a thread gets a payment request, it firsts checks to see if this field is 1, if yes, it should abort, if not, it sets it to 1 and starts the payment process. But this solution doesn’t work either, because: 1- We still have the same problem! What if 2 threads read it simultaneously when it’s zero, and both UPDATE it to 1?! So, they can still start the same process at the same time! 2- What if the working thread forgets to UPDATE the block field to 0 when a problem comes up that it can not pay the money? This way the poor person never gets his money back because his thread is locked forever! Of course I can run a cron job every night for example to UPDATE investments SET block=0 WHEN block_date > 24 hours ago! But it’s dirty work, I mean I wana do everything at database level and it’s what databases are for!
2- I use GET_LOCK() function. By using GET_LOCK() at the beginning of my thread, I can lock the whole database till my process is finished and I can be sure that no other thread can unlock the database till the whole job is done. This method is so safe because MySQL itself promises that it does everything by itself, But you know what? It’s better if I could lock the records I needed, not the whole database! This way the poor admin won’t be able to have 2 concurrent processes at the same time even for those records that really do not go against each other!
3- I use another table called “block”. This table has 2 fields: account_id, date_time; by account_id being the primary key. So, whenever my script receives several record IDs to work on, first of all it tries to insert them all in “block” table. If the operation is successful it means that all the records we want are free and the thread can work with them! So, it starts its work and after the whole job is finished, it DELETE FROM blocks WHERE account_id IN ( $IDsArray). Yes, as you noticed yourself this way is really close to my first solution, but it has one big advantage: instead of using UPDATE to set the block field to 1, I use another table and use INSERT. What’s the difference? Well, MySQL guarantees that even if 100000000 threads are running at the same time for the same set of ID, they can’t insert it all at the same time in “block” table, because “account_id” is primary key and doesn’t let duplication! So, the first lucky thread can insert the ID and start its work, but other threads get an error message which means they have to wait! I still have the same problem of the CRON job, that should check this table every 24 hours and deletes the records that are older than 24 hours old. Even though this trick make my program error prone almost 99.99999999%, but classically we still have the same problem: what if a process takes a time more then 24 hours?! In this method I’m really relying on the fact that my PHP process doesn’t take 24 hours!
What do you think guys? I know this is a really really long post, but this is a design architecture that I wanted to share with you to see your point of views and potential solutions.
Thanks a lot
Lemme explain the full scenario for you at first. I’m writing an online investment program. Clients should be able to put their money in the web site for a special period, let’s say a month, and after it by the confirmation of the administrator, their money + their interest will be returned to them. The problem that I’m facing is the same old 2 concurrent threads running against each other problem, something that’s been the major problem in OS developments!
Let’s say I have a field called “paid” in my investment table. Tonight when the admin gets online and visits his administration page, he sees that 4 persons are eligible to get their money back. He checks the checkboxes for all these 4 ppl and confirms that they get their money back. When the form is posted to the server, my PHP script checks to see if the “paid” field for these records is not “1”, if it’s “0”, then it starts to pay the money from web site account to these ppl’s credit cards. Let’s say at the same time (before the admin selects these records and submits the payment form) he opens the same page in another browser. So, he’d see the same 4 ppl in there too, and he checks these ppl and submits these 2 forms at the same time?! This is a slight possibility that my script pays a person twice! Let’s say how:
First of all I have two big problems that I think do not let me use transactions; or better to say transactions become completely worthless in my case:
1- I’m communicating with the bank gateway (e-gold), and I’m sending money to it, when I send money to a person’s e-gold account, there is NO WAY to get it back! So, I can’t do any ROLLBACK!
2- There are multiple threads running concurrently, which means even if I wasn’t communicating with another gateway, transactions couldn’t help me, because transactions follow the ACID rules, and the “I” here means “Isolations”, which means before a transaction is committed/rollbacked, no changes are visible to other threads!
Basically my problem is that I’m really afraid that 2 threads read “paid” field at the same time when it’s still zero and start their work to pay the user!
What did I come up with myself? Well, I thought of several solutions for my problem:
1- I use another field for accounts table like “block”, so, when a thread gets a payment request, it firsts checks to see if this field is 1, if yes, it should abort, if not, it sets it to 1 and starts the payment process. But this solution doesn’t work either, because: 1- We still have the same problem! What if 2 threads read it simultaneously when it’s zero, and both UPDATE it to 1?! So, they can still start the same process at the same time! 2- What if the working thread forgets to UPDATE the block field to 0 when a problem comes up that it can not pay the money? This way the poor person never gets his money back because his thread is locked forever! Of course I can run a cron job every night for example to UPDATE investments SET block=0 WHEN block_date > 24 hours ago! But it’s dirty work, I mean I wana do everything at database level and it’s what databases are for!
2- I use GET_LOCK() function. By using GET_LOCK() at the beginning of my thread, I can lock the whole database till my process is finished and I can be sure that no other thread can unlock the database till the whole job is done. This method is so safe because MySQL itself promises that it does everything by itself, But you know what? It’s better if I could lock the records I needed, not the whole database! This way the poor admin won’t be able to have 2 concurrent processes at the same time even for those records that really do not go against each other!
3- I use another table called “block”. This table has 2 fields: account_id, date_time; by account_id being the primary key. So, whenever my script receives several record IDs to work on, first of all it tries to insert them all in “block” table. If the operation is successful it means that all the records we want are free and the thread can work with them! So, it starts its work and after the whole job is finished, it DELETE FROM blocks WHERE account_id IN ( $IDsArray). Yes, as you noticed yourself this way is really close to my first solution, but it has one big advantage: instead of using UPDATE to set the block field to 1, I use another table and use INSERT. What’s the difference? Well, MySQL guarantees that even if 100000000 threads are running at the same time for the same set of ID, they can’t insert it all at the same time in “block” table, because “account_id” is primary key and doesn’t let duplication! So, the first lucky thread can insert the ID and start its work, but other threads get an error message which means they have to wait! I still have the same problem of the CRON job, that should check this table every 24 hours and deletes the records that are older than 24 hours old. Even though this trick make my program error prone almost 99.99999999%, but classically we still have the same problem: what if a process takes a time more then 24 hours?! In this method I’m really relying on the fact that my PHP process doesn’t take 24 hours!
What do you think guys? I know this is a really really long post, but this is a design architecture that I wanted to share with you to see your point of views and potential solutions.
Thanks a lot