5

I need a simple table that acts as a queue. My MySQL server has a restriction: I can't use InnoDB tables, only MyISAM.

Clients/workers will run at the same time, and each one needs to receive a different job every time.

My idea is to do the following (pseudo-code):

$job <- SELECT * FROM queue ORDER BY last_pop ASC LIMIT 1;
UPDATE queue SET last_pop = NOW() WHERE id = $job->id;
return $job

I have tried table locks and GET_LOCK, but neither helps: workers sometimes receive the same job.

0

3 Answers

13

You need to reverse the order of operations: claim the row first, then read it, so there is no timing window between the read and the write.

Consumer POP (each consumer has a unique $consumer_id)

-- Claim the oldest unclaimed row for this consumer.
UPDATE queue
SET last_pop = '$consumer_id'
WHERE last_pop IS NULL
ORDER BY id LIMIT 1;

-- Read back the row this consumer claimed most recently.
$job =
  SELECT * FROM queue
  WHERE last_pop = '$consumer_id'
  ORDER BY id DESC
  LIMIT 1;

Supplier PUSH

INSERT INTO queue
  (id, last_pop, ...)
VALUES
  (NULL, NULL, ...);

The queue is ordered in time by the id column, and each row is assigned to a consumer_id upon POP.
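To make the claim-then-read order concrete, here is a minimal sketch in Python. It is not the MySQL setup from the question: SQLite (bundled with Python) stands in for MyISAM, the `payload` column and the `make_queue`, `push`, and `pop` helpers are hypothetical, and because stock SQLite builds do not accept `UPDATE ... ORDER BY ... LIMIT`, the claim is written with a subquery instead. The shape of the logic is the same: one UPDATE claims a row, and the SELECT only looks at rows already owned by this consumer.

```python
import sqlite3


def make_queue(conn):
    # Hypothetical schema: id orders the queue, last_pop holds the claiming consumer.
    conn.execute(
        "CREATE TABLE queue ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " last_pop TEXT,"
        " payload TEXT)"
    )


def push(conn, payload):
    # Supplier PUSH: new rows start unclaimed (last_pop IS NULL).
    conn.execute("INSERT INTO queue (last_pop, payload) VALUES (NULL, ?)", (payload,))
    conn.commit()


def pop(conn, consumer_id):
    # Step 1: claim the oldest unclaimed row in a single UPDATE statement.
    cur = conn.execute(
        "UPDATE queue SET last_pop = ?"
        " WHERE id = (SELECT id FROM queue WHERE last_pop IS NULL"
        "             ORDER BY id LIMIT 1)",
        (consumer_id,),
    )
    conn.commit()
    if cur.rowcount == 0:
        return None  # nothing left to claim
    # Step 2: read back the row this consumer claimed most recently.
    return conn.execute(
        "SELECT id, last_pop, payload FROM queue"
        " WHERE last_pop = ? ORDER BY id DESC LIMIT 1",
        (consumer_id,),
    ).fetchone()


conn = sqlite3.connect(":memory:")
make_queue(conn)
push(conn, "job-a")
push(conn, "job-b")

first = pop(conn, "worker-1")
second = pop(conn, "worker-2")
third = pop(conn, "worker-1")  # queue is empty by now
```

Checking the affected-row count after the UPDATE is what tells a worker that the queue is empty, so it never falls through to the SELECT and picks up one of its own older jobs.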


7 Comments

+1 - I was going with a status field, but your idea is along the same lines as mine.
I'm pretty sure this solution is wrong. There's a race condition between the UPDATE and the SELECT. Imagine you have two parallel requests running this code. What if this is the order it gets executed in: UPDATE 1, UPDATE 2, SELECT 1, SELECT 2. You will end up SELECTing the same exact row.
@OlegKikin, UPDATE2 cannot possibly collide with UPDATE1, as last_pop cannot be NULL after UPDATE1.
A failure after the UPDATE and before the SELECT will result in an orphaned item. START TRANSACTION and COMMIT are necessary to avoid that risk.
@ethanpil, The last_pop field will be NULL until the item is claimed, and then will contain the unique consumer ID when selected for processing. There is no timing window because the select does not need to be coincident with the update.
1

Just for information, another option is to use Gearman instead of a table for the queue: Rasmus Lerdorf wrote a very nice article about it.


0

Oleg,

The solution is correct. $consumer_id must be a unique identifier for the processor. If you had a couple of cron jobs on one machine, for example, you could use their PID as the consumer ID.
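As a rough illustration of building such an ID, the sketch below combines the host name with the PID. The host name part is my own assumption for the case where workers run on more than one machine, and `make_consumer_id` is a hypothetical helper. Keep in mind that the operating system can reuse a PID after a process exits, so this is only unique among processes running at the same time.

```python
import os
import socket


def make_consumer_id():
    # Hypothetical helper: the PID is unique among live processes on one
    # machine; the host name (an assumption beyond the answer) separates machines.
    return f"{socket.gethostname()}:{os.getpid()}"


consumer_id = make_consumer_id()
```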

The UPDATE is atomic, so it marks exactly one row in the queue as being consumed by your ID.

For some applications I also keep a finished status field, so that if last_pop holds a consumer_id but the finished flag is not set and the job is older than X, the job can be marked for restart.
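A possible shape for that restart step, sketched in Python with SQLite standing in for MySQL. The `claimed_at` and `finished` columns, the integer timestamps, and the `requeue_stale` helper are all assumptions for illustration; the idea is simply to clear last_pop on rows that were claimed long ago and never finished, so the next POP can pick them up again.

```python
import sqlite3


def requeue_stale(conn, now, max_age):
    # Release rows that were claimed more than max_age ago and never finished.
    cur = conn.execute(
        "UPDATE queue SET last_pop = NULL, claimed_at = NULL"
        " WHERE last_pop IS NOT NULL AND finished = 0 AND claimed_at < ?",
        (now - max_age,),
    )
    conn.commit()
    return cur.rowcount  # number of jobs put back on the queue


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queue (id INTEGER PRIMARY KEY, last_pop TEXT,"
    " claimed_at INTEGER, finished INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO queue (id, last_pop, claimed_at, finished) VALUES (?, ?, ?, ?)",
    [
        (1, "worker-1", 100, 0),  # claimed long ago, never finished: stale
        (2, "worker-2", 100, 1),  # claimed long ago but finished: keep
        (3, "worker-3", 990, 0),  # claimed recently: still in progress
        (4, None, None, 0),       # never claimed
    ],
)
conn.commit()

released = requeue_stale(conn, now=1000, max_age=300)
unclaimed = [
    row[0]
    for row in conn.execute("SELECT id FROM queue WHERE last_pop IS NULL ORDER BY id")
]
```

Note that a job released this way may have been partly processed by the original worker, so the job itself has to be safe to run twice.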

