1

I've got an incoming message queue implemented in a SQL database (currently MS SQL Server), accessible via a web service. This web service runs a legacy protocol we have no control over, and must return either a 'no messages in queue' response, or both:

  • Details of the oldest message in the queue
  • The number of other messages in the queue

The algorithm used for this is currently very crude:

  1. Select all messages in queue
  2. Number of messages remaining = row count - 1
  3. Update oldest message to set collected flag

We are moving to a scenario where this may now be concurrently accessed, and have obviously run into concurrency problems where the same message gets picked up twice.

The issue we have is that we need a portable solution with minimal deployment requirements. Ideally that means no stored procedures, and no engine-specific locks.

Does anyone have any bright ideas?

Thanks!

2 Answers 2

2

Since you are already using SQL server you should look into using SQL Server Service Broker.

However this ties you to SQL server...

One thing you might consider is you do a select... get a ROW ID ... update the ROW to mark it as collected with a new rowID (guid or such) that you generated... the select it AGAIN... if the row id is the one you set then you are the one who got it... if not someone got in before you and you have to get go get another.

Basically you are implementing dirty read protection yourself... because you don't want to use a transaction or stored proc...

Sign up to request clarification or add additional context in comments.

1 Comment

Actually correction... you should update the rowID where the ROWID is still the same as the one you got... then you can use the number of rows updated (row count) or select the thing back again to see if you got it first... as I originally stated it... you could still get concurency problems...
1

Ultimately, given the constraints you describe, dequeuing must be done one message at time - however there's a lot more involved in getting the message to the subscriber than just the database transaction (all the webservice handling code, network and HTTP latency).

The current system does not obviously implement much in the way of transactional control - so I assume that's not an issue for the concurrent solution either.

If it were me I'd set up a stored procedure (why do you think they're not portable?) with the operations in a transaction with a serializable lock - I'd also cache the number of queue messages with a TTL based on the number of operations / age of the cached value to reduce the number of costly aggregate operations on the underlying table. This could be done by issuing DML an operation at a time from the controllnig language - but the net effect is the operation will take significantly longer.

An alternative approach (which makes implementing locking more easy) would be to shard the data - not sure if MSSQL supports named locks - certainly its not a very portable solution - so splitting the data across seperate tables and using table locks. This makes implementing transactional control a lot simpler - but there are problems with load-blanacing / in-sequence processing (but again, the concurrent approach applies that the latter is not a big issue).

2 Comments

I'll assume that stored procs aren't portable because not all DBs have them... and if they PORT to another db solution it wouldn't work.
Basically yeah - we're concerned about the functionality not being available on another SQL engine, as well as adding complexity to deployment requirements.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.