
I have a question concerning Redis in a distributed architecture.

Assume I have n clients, either Windows desktop applications or ASP.NET web/Web API servers.

One of the clients, let's say Client A, queries the cache for some data and gets a miss (the data is not in the cache). The client then fetches the real data (from, let's say, a database) and sets it in the cache when it's done.

Client B comes along, wants the same data, fetches from the cache, and since it's also a miss, does the same processing.

Is there a way for Clients B through N not to do the processing (i.e. go to the database), but instead to wait until the data is in the cache and fetch it from there once it's available?

I understand that on a single app (or web server) this is easy to check using threads, but what about in a distributed architecture?

Is this a correct way of thinking about the wait process: could Client A put a flag somewhere stating that it's loading Data X, so that all other clients wait until it's done?

Otherwise, the idea then would be something along the lines of :

Client A requests Data X
Miss in cache
Processes Data X
Checks whether Data X is now in the cache
If not, adds Data X to the cache; otherwise, uses the cached copy and doesn't overwrite it
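The flow above can be sketched as follows. This is a minimal illustration, not production code: a plain dict stands in for the Redis client (a real implementation would use redis-py's `get`/`set` with the same semantics), and `load_from_database` is a hypothetical placeholder for the expensive query.

```python
import json

# Hypothetical stand-in for a Redis connection; a real client would be
# redis.Redis() from redis-py, which exposes the same get/set calls.
cache = {}

def load_from_database(key):
    # Placeholder for the expensive database query.
    return {"key": key, "value": 42}

def get_data(key):
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)       # cache hit
    data = load_from_database(key)      # cache miss: do the work
    # Re-check: another client may have populated the cache meanwhile.
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)       # use the other client's result
    cache[key] = json.dumps(data)       # we were first: store it
    return data
```

Note that this re-check only avoids overwriting a fresher entry; it does not stop Clients B through N from hitting the database in the first place, which is what the lock approach in the answer addresses.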

Thanks!

    What you're describing is a cache stampede. If you look at the Mitigation section of that article you'll see the basic ideas, and if you look around you can find some libraries that attempt to deal with this in Redis. Commented Aug 10, 2016 at 20:19

1 Answer


As Kevin said, this is called a cache stampede. One of the best documents on this problem I have read is Using memcached: How to scale your website easily (by Josef Finsel):

What we need in this instance is some way to tell our program that another program is working on fetching the data. The best way to handle that is by using another memcached entry as a lock.

When our program queries memcached and fails to find data, the first thing it attempts to do is to write a value to a specific key. In our example where we are using the actual SQL request for the key name we can just append ":lock" to the SQL to create our new key.

What we do next depends on whether the client supports returning success messages on memcached storage commands. If it does, then we attempt to ADD the value. If we are the first one to attempt this then we’ll get a success message back. If the value exists then we get a failure indication and we know that another process is trying to update the data and we wait for some predetermined time before we try to get the data again.

When the process that’s updating the cache is done, it deletes the lock key.
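The same pattern translates to Redis, where memcached's ADD corresponds to `SET ... NX` (set only if the key does not exist). Here is a hedged sketch; the `FakeRedis` class below is a hypothetical in-memory stand-in so the example is self-contained, and a real deployment would swap in `redis.Redis()` from redis-py, whose `get`/`set(nx=True)`/`delete` calls have the same shape.

```python
import time

class FakeRedis:
    """Minimal in-memory stand-in for a Redis client (hypothetical;
    replace with redis.Redis() in a real system)."""
    def __init__(self):
        self.store = {}
    def get(self, key):
        return self.store.get(key)
    def set(self, key, value, nx=False):
        # nx=True mirrors Redis's SET ... NX: only set if key is absent.
        if nx and key in self.store:
            return None
        self.store[key] = value
        return True
    def delete(self, key):
        self.store.pop(key, None)

r = FakeRedis()

def fetch_with_lock(key, compute, retry_delay=0.05, max_retries=100):
    lock_key = key + ":lock"
    for _ in range(max_retries):
        value = r.get(key)
        if value is not None:
            return value                        # cache hit
        if r.set(lock_key, "1", nx=True):       # we won the lock
            try:
                value = compute()               # the expensive work
                r.set(key, value)
                return value
            finally:
                r.delete(lock_key)              # release the lock
        time.sleep(retry_delay)                 # another client is loading
    raise TimeoutError("gave up waiting for " + key)
```

In a real Redis deployment the lock key should also carry an expiry (e.g. `r.set(lock_key, "1", nx=True, ex=30)`) so that a client that crashes mid-load doesn't leave the lock in place and deadlock everyone else.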

