Microservices distributed lock mechanism

Question

I am using Spring Boot with MongoDB (Azure Cosmos DB) in my microservices. Currently I have an Orders collection that stores orders. These documents have a field userId that is null when the document is created.

Afterwards, a user makes a request to have an order assigned to him, and this field is populated. The issue is that I want to make sure a user has at most one order assigned to him. If I receive two requests at the same instant for the same user, only one of them should assign the order to the user, and both should return the assigned order. I also have several replicas of the microservice deployed.

Due to some performance constraints I cannot guarantee this constraint with unique indexes.

I was thinking of adding a distributed lock mechanism such as Redis, where a lock would be taken per user before assigning the order to that user.

Is there a better approach to meet this requirement?

Do you want to make sure a user has no more than one order assigned to them, or do you want to make sure an order is only assigned to one user? The perspective makes a difference. Why can't a user work on more than one order at a time? The business impact might be fulfilling the same order more than once.
– Greg Burghardt, Commented Apr 13, 2023 at 20:40
A user cannot have more than one order assigned to him; that is the business requirement.
– Andre Silva, Commented Apr 13, 2023 at 22:25
Whatever you end up designing will probably be a reinvention of something that relational databases already do (such as unique indexes, as you mentioned, but probably transactions).
– Stack Exchange Broke The Law, Commented Apr 14, 2023 at 8:03

J_H · Accepted Answer · 2023-04-13 19:26:54Z

want to make sure an user has at most one order assigned

Due to some performance constraints I cannot guarantee this constraint with unique indexes.

I flat out don't believe that, absent benchmark results. But let's assume there exists no technology that can support your required read and write transaction rates.

I will design for ten billion user records, and steady-state one thousand orders per second.

... receive two requests at the same instant, for the same user, only one of them should assign the order to the user and both should return the assigned order. And I have several replicas of the microservice deployed.

Use Postgres or any other major RDBMS. Partition your user table, and add a NULLable currentOrder attribute to it. Here the microservice replicas are not interesting, but the number of nodes hosting your partitioned relation is.

... and both should return the assigned order.

That's fine, at the API level.

The database transaction will be an UPDATE of the user WHERE currentOrder IS NULL. If two overlapping transactions are attempted, only one of them can set the value; the other finds no matching row and changes nothing, which is what we want. The loser can re-query a moment later to retrieve the winning order ID, and return that as the API response.
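
A minimal sketch of that conditional UPDATE, using Python's built-in sqlite3 purely so the example runs anywhere (the answer suggests Postgres). The table and column names (users, current_order) and the helper assign_order are hypothetical. In this sketch the losing request does not raise an error: its UPDATE matches zero rows, which the caller detects through rowcount before re-reading the winning order ID.

```python
import sqlite3


def assign_order(conn, user_id, order_id):
    """Try to claim order_id for user_id.

    Returns (won, current_order): won is True only for the request whose
    UPDATE actually changed the row.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE users SET current_order = ? "
            "WHERE id = ? AND current_order IS NULL",
            (order_id, user_id),
        )
        won = cur.rowcount == 1
    # Winner and loser both read back whichever order is now assigned.
    row = conn.execute(
        "SELECT current_order FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return won, (row[0] if row else None)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, current_order TEXT)")
conn.execute("INSERT INTO users VALUES ('u1', NULL)")
conn.commit()

first = assign_order(conn, "u1", "order-A")   # claims the slot
second = assign_order(conn, "u1", "order-B")  # finds it taken, returns order-A
```

Both calls end up reporting the same order, so the API can return the assigned order to both clients as the question requires.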

There's nothing fancy going on here, since hashing the user ID has brought both clients to the node hosting the user of interest. We don't need 2PC or ancillary log tables or app-visible distributed locking, since a SQL UPDATE suffices.
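
To illustrate the routing idea only: a partitioned database performs this mapping internally, so application code would not normally contain it. The function name node_for_user and the node count of 16 are assumptions made for this sketch.

```python
import hashlib


def node_for_user(user_id: str, node_count: int) -> int:
    """Map a user ID to a partition index with a stable hash.

    A stable hash (unlike Python's per-process salted hash()) gives the same
    answer on every replica, so all requests for one user reach one node.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


# Two different service replicas computing the target for the same user.
replica_a = node_for_user("user-42", 16)
replica_b = node_for_user("user-42", 16)
```

Because both replicas arrive at the same node, that node alone arbitrates the conditional UPDATE.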

tl;dr: Rely on RDBMS features (partition across nodes, high availability, locking, transactions) to deliver required API functionality at required performance level.

EDIT

what do I win with [an RDBMS] approach?

ACID transactions are kind of a big deal. They let the infrastructure sweat the details, rather than the app developer. Being well tested, they are more likely to get the details right.

Here is what I know about your chosen NoSQL persistence layer:

https://en.wikipedia.org/wiki/Cosmos_DB#Partitioning

Cosmos DB added automatic partitioning capability in 2016 with the introduction of partitioned containers. Behind the scenes, partitioned containers span multiple physical partitions with items distributed by a client-supplied partition key.

That sounds like it should be enough to accomplish what you want, which is getting clients in a horizontally-scaled high-availability setup to agree to send their updates to a single arbiter node. Does Cosmos expose a powerful enough API to meet your needs, in the way that Postgres and other ACID solutions do? I don't know, I have never used that product.

If your chosen NoSQL approach needs help from other technologies on the side, this might be a good juncture to list out the pros & cons, weighing whether relaxing ACID guarantees is a good match for your business needs.

Redis is a nice cache that I have used to good effect. I have seen operational issues when it is viewed as a Source of Truth for an entity which appears only in Redis.

https://redis.io/docs/manual/transactions

What about rollbacks?

Redis does not support rollbacks of transactions ...

Maybe this is an easy-to-use paradigm which app developers seldom get wrong. I couldn't say, as I have not used that aspect of Redis. I definitely find some value in how the various mainstream ACID technologies have been battle tested in racy conditions. Reduced risk and reduced testing cost may tip the ROI scales.

We're worried about races happening when updating a User record. It is perfectly fine for currentOrder to be an opaque Mongo guid -- there's no need for a FK relationship from User to Order. Storing user records in a backend which lacks convenient transactional support sounds like you're choosing to shoulder some additional burdens up at the app layer. Whether that is worth it is a business tradeoff. Some teams are good at testing for races. History shows that it's not an easy thing to get right.

Thank you for your reply. I cannot use indexes to guarantee this constraint because Azure Cosmos DB supports neither partial nor sparse indexes. The idea of saving the record in a relational database could work, but what do I win with this approach? I would have to save and then update the record when the order is deleted. Isn't it easier to just use Redis as a lock for the user before assigning the order to the user?
– Andre Silva, Commented Apr 13, 2023 at 17:47
I understand that ACID transactions are a big deal. But in this case the Order is persisted in a non-relational database (MongoDB) and this cannot be changed. It has to stay in a non-relational database. In this database I cannot guarantee the constraint that I want, which is why I was proposing Redis in front of the update in MongoDB. (Redis would simply be a lock per user; I would not save anything to it.) In this case I could use Postgres instead of Redis, but it would only be used as a lock. That is why I am asking what the advantage of this solution would be.
– Andre Silva, Commented Apr 13, 2023 at 19:21
@AndreSilva, why is it that you're having to work with inappropriate technologies? I don't accept that this cannot be changed - there's no sane business that would task a developer of your experience with tackling the challenge of transactional consistency without using an existing technology designed for it. Relational database technologies are ubiquitous, free (if necessary), and they've had billions of pounds spent on their development, decades of theoretical analysis, and decades of real-world use by the largest corporations, and they solve problems nobody thinks of until they hit them.
– Steve, Commented Apr 15, 2023 at 8:44

Hans-Martin Mosner · Accepted Answer · 2023-04-13 12:07:45Z

So your users are not the ones who create the order, but are your employees ("you" = the organization who will use the software).

First question: How will those users be able to request two orders be assigned to them at the same time? Logging in at two different terminals side-by-side, hitting the enter key at the same moment? Or is this a possible race condition that could theoretically happen once in a blue moon?

Second question: What are the consequences when a user has two orders assigned to them, and when will it be noticed? Is it some minor unfairness as one of their coworkers could not work on that order at the time? Does it cause severe havoc within the whole order workflow?

Third question: Why store the user/order connection with the order when the order microservice may be replicated? As @DavidT has pointed out, it might be more sensible to store the current order processed by a user in the UserService. You may also have an "AssignedOrdersService" that is not replicated and which stores these relationships to avoid possible race conditions.

Thank you for your reply. It is not frequent, but it might happen and we need to avoid it. It should be protected on the backend side. If a user is assigned 2 orders, the consequences (business-wise) are very severe. Right now we have an Orders collection (MongoDB) where orders are stored. They are initially created without being assigned to a user. Then there is an endpoint to assign the first available order to a user (setting the userId field). This flow cannot be changed now. What I would like to know is the best way to prevent a user from getting 2 orders assigned to him.
– Andre Silva, Commented Apr 13, 2023 at 17:54

DavidT · Accepted Answer · 2023-04-15 01:10:28Z

The condition you are trying to enforce is that a user has at most one order.

This seems to be a (currentOrder) attribute of the User. Therefore you should be able to do a findAndModify() on the user that includes a condition that the currentOrder is not set.

You have said that this is a Microservice architecture - does that mean that the User object is controlled by a separate (Users) service? If so, you could still implement an "Assign Current Order" endpoint on the User service which does the findAndModify() for you.
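
The conditional findAndModify() idea can be sketched with an in-memory stand-in. This is a simulation, not MongoDB client code: the UserStore class and its method names are hypothetical, and the lock models the assumption that the database applies a single-document conditional update atomically. Against a real database the equivalent would be a findAndModify whose filter requires currentOrder to be unset.

```python
import threading


class UserStore:
    """In-memory stand-in for a Users collection (hypothetical, illustration only)."""

    def __init__(self):
        self._docs = {}
        # Models the assumed atomicity of a single-document conditional update.
        self._lock = threading.Lock()

    def insert(self, user_id):
        self._docs[user_id] = {"_id": user_id, "currentOrder": None}

    def find_and_modify_if_unassigned(self, user_id, order_id):
        """Set currentOrder only if it is unset; return (won, currentOrder)."""
        with self._lock:
            doc = self._docs[user_id]
            if doc["currentOrder"] is None:
                doc["currentOrder"] = order_id
                return True, order_id
            return False, doc["currentOrder"]


store = UserStore()
store.insert("u1")
results = []


def worker(order_id):
    # Each thread plays the role of one service replica handling a request.
    results.append(store.find_and_modify_if_unassigned("u1", order_id))


threads = [threading.Thread(target=worker, args=(f"order-{i}",)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

winners = [r for r in results if r[0]]
```

However many replicas race, only one request wins, and every request sees the same currentOrder, because the check and the write happen as one step on the user record rather than as two separate operations on the Orders collection.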

I am a little worried that you are creating Orders without a UserId, because you may end up with orphans if it later turns out that another Order has been assigned to a given user. Is there any way you can create Orders with the UserId set in the initial version?

If you can create all orders with the user assigned, you still may end up with multiple Orders assigned to the same User in the Orders collection, but there will only be one "currentOrder" in the Users collection, hence you can:

Identify any cases where there is more than one Order.
Use the Users service to identify which one is the real Order.
Clean up the other Orders.

Edit: Providing A Second Solution

Given the pre-condition that we are not allowed to use the User collection to solve the problem (Previous Solution).

The race condition you are concerned about only exists for a finite amount of time (let's call that amount of time T), specifically how long it takes to run two MongoDB operations:

Check to see if an Order exists that has already been assigned to user U.
Update an Order to assign it to user U.

Time T may be longer than you initially expect; for example, you may need to allow for replication delays within the MongoDB cluster. Anyway, the critical observation is that we only need to protect against the race condition for time T. After that, the record will exist in MongoDB and no protection is required.

Therefore all you need is a solution that prevents additional requests from trying to assign any further Orders to user U for T seconds after an existing assignment has been attempted for user U.

Note: The text below is speculative (only for illustrative purposes), the true solution is the previous line.

I think even the Memcached add command can be used to solve that problem:

https://github.com/memcached/memcached/wiki/Commands#add

I.e.:

Create a new Memcached instance (single node).
Every time a request to assign a user to an order comes in, try to "add" the user to Memcached (the user ID is the key; the value doesn't matter).
Set the Memcached record to expire in T * 10 (or whatever value makes sense).
If the add fails because the record already exists in Memcached, return a failure from your REST service.
If the Memcached add succeeds:
- Check to make sure no order already exists in Mongo; this check could also fail.
- If all is good, assign the user to the order.
Add a health check to Memcached so it is auto-reprovisioned if it fails.
Ensure the re-provisioning takes more than T seconds.
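
The steps above can be sketched with an in-memory stand-in for the cache. This is a simulation under stated assumptions, not Memcached client code: TtlStore, try_assign, and the 10-second window are hypothetical, the orders dict stands in for the Mongo collection, and the sketch is single-threaded (a real cache server would make the add itself atomic).

```python
import time


class TtlStore:
    """In-memory stand-in for a single cache node with 'add' semantics."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._expiry = {}  # key -> time at which the entry expires

    def add(self, key, ttl_seconds):
        """Store key only if it is absent or expired; return True when stored."""
        now = self._clock()
        expires_at = self._expiry.get(key)
        if expires_at is not None and expires_at > now:
            return False
        self._expiry[key] = now + ttl_seconds
        return True


def try_assign(store, orders, user_id, order_id, ttl_seconds=10.0):
    """Guard the check-then-update window with a short-lived per-user marker."""
    if not store.add(user_id, ttl_seconds):
        return "rejected: assignment in progress"
    # Stand-in for the Mongo check that the user has no order yet.
    if any(owner == user_id for owner in orders.values()):
        return "rejected: user already has an order"
    orders[order_id] = user_id
    return "assigned"


now = [0.0]  # fake clock so the example does not need to sleep
store = TtlStore(clock=lambda: now[0])
orders = {"order-A": None, "order-B": None}

r1 = try_assign(store, orders, "u1", "order-A")
r2 = try_assign(store, orders, "u1", "order-B")  # inside the guard window
now[0] = 11.0                                     # guard has expired
r3 = try_assign(store, orders, "u1", "order-B")  # database check now catches it
```

The marker only has to outlive the window T; once it expires, the ordinary existence check is enough because the first assignment is already visible.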

Our business requirements demand creating orders that are unassigned (userId field is null). This microservice, which owns the orders collection, has an endpoint that assigns the first available order to that user.
– Andre Silva, Commented Apr 13, 2023 at 11:44
If I receive requests to assign orders for the same user at exactly the same time, how can I prevent that user from ending up with 2 orders? findAndModify will not work, because both replicas will execute it, and the user will end up with 2 orders assigned to him.
– Andre Silva, Commented Apr 13, 2023 at 11:46

Greg Burghardt · Accepted Answer · 2023-04-13 21:00:54Z

If you were using a relational database, creating a stored procedure that returns an error code if the order is already assigned to a user would be a good solution. This eliminates the race condition, because the database handles the concurrency.

You mentioned MongoDB, which doesn't have stored procedures, but Atlas Functions serve the same purpose. Create an Atlas Function in the database that does the actual assigning of the order to a user. Since this executes at the database level, you let MongoDB handle the concurrency issue. This function could return an error code if the order has already been assigned to a user.

It could be as simple as return order.userId; no matter what. If the order is already assigned to a user, return the userId of that user. If the order has not been previously assigned, return the userId of the user that was passed to the Atlas Function. The client just needs to check that the userId it sent is the same one it got back. This serves two purposes:

The operation of assigning an order to a user can be called multiple times safely without unassigning an existing user.
The client is provided with information so that it can communicate upstream that the order has already been assigned, and optionally inform an upstream system which user that is.
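
The return contract described above can be sketched in a few lines. This shows only the "always return order.userId" behaviour; it is plain Python on a dict, so it is not atomic across processes, and the atomicity is assumed to come from the database-side function. The name assign_and_report_owner is hypothetical.

```python
def assign_and_report_owner(order: dict, user_id: str) -> str:
    """Assign the order only if it is unassigned; always return its userId."""
    if order.get("userId") is None:
        order["userId"] = user_id
    return order["userId"]


order = {"_id": "order-A", "userId": None}

first = assign_and_report_owner(order, "alice")   # assigns the order
second = assign_and_report_owner(order, "bob")    # already taken by alice
repeat = assign_and_report_owner(order, "alice")  # safe to call again

# A caller compares what it sent with what came back.
bob_got_the_order = second == "bob"
```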

Ideally, each microservice should be designed so duplicate requests or messages get ignored. Sometimes the microservice is not the best place to handle this. Use any tool available to that microservice, which in your case could be Atlas Functions in MongoDB.
