I am using Redis as a data store/cache for my application. I pickle my data into a string and push it to the Redis instance. Each item is an instance of a Python class (i.e., key-value pairs, pickled into a string). I am using the Redis library for Python.

My data gets pushed periodically, and data from a certain host can stop arriving, for example because the host goes down. I want to purge that host's data once it goes down. I already have a trigger in place that notifies my app when a host goes down.

However, I am unsure how to purge data from Redis efficiently by un-pickling the data and checking for a certain key-value pair in it. I would like to do this in place if possible. Any help with this will be truly appreciated!

EDIT:

This is what I use to push data to redis:

self.redis.zadd("mymsgs", pickle.dumps(msg), int(time.time()+360))

The message itself is of the format:

{'hostname': 'abc1', 'version': 'foo', 'uptime': 'bar'}
  • Can't you add the hostname to the Redis key somehow? That way you could use patterns like hostname-foo*. Or maybe you could check whether a timer that auto-deletes the key after some time would help? (I'm referring to redis.io/commands/expire.) Commented Oct 24, 2016 at 19:44
  • @BorrajaX So here's the thing. The data I am storing in Redis, is a pickled version of a dict that contains the hostname and its value as a kvp. I am new to Redis as such. Is there no other way to delete from the data store? Commented Oct 24, 2016 at 19:57
  • When you say kvp... Is that the Redis key? Because, AFAIK, you can use wildcards only in the keys, not in the values. What does your key look like? Can you edit the question to provide an example? Commented Oct 24, 2016 at 20:07
  • Please see my edit above. kvp does not refer to Redis key here. My understanding is that mymsgs would be the Redis key in my case? Please correct me if I am wrong .... Also, within a key, is there a way to delete data based on a string pattern match? Commented Oct 24, 2016 at 20:12

1 Answer

If I understood correctly, I would recommend (if possible, of course) changing the format of the keys a bit. Instead of using a generic mymsgs as the key, add the hostname to the key itself. For instance, it could be messages_from_host_HOSTNAME.

Since you can use wildcards to fetch keys, when you want to get all the messages, you can list the keys matching messages_from_host_* and then get the values of those keys. Then, when you learn that the host called HOSTNAME is down, you can quickly purge all its entries with delete("messages_from_host_HOSTNAME").

See this example:

import redis
import time
import pickle

redis_connection = redis.Redis(host='localhost', port=6379, db=0)

# This "for" loop is just a simple populator, to put a bunch of key/values in Redis
for hostname in ['abc1', 'foo2', 'foo3']:
    msg = {'hostname': hostname, 'version': 'foo', 'uptime': 'bar'}

    # Step 1, store the data using a key that contains the hostname:
    redis_key = "messages_from_host_%s" % hostname
    redis_connection.zadd(redis_key, pickle.dumps(msg), int(time.time() + 360))

# Ok... I have some sample data in Redis now...
# Shall we begin?...

# Let's say I wanna get all the messages from all the hosts:
# First, I find all the keys that can contain messages from hosts
matching_keys = redis_connection.keys("messages_from_host_*")
print "Got these keys that match what I wanna get: %s" % matching_keys
# Then I iterate through the keys and get the actual zrange (~value) of each 
print "These are the messages from all those hosts:"
for matching_key in matching_keys:
    messages = [pickle.loads(s) for s in redis_connection.zrange(matching_key, 0, -1)]
    print messages

# Let's say that now, I discover that host called `foo2` is down, and I want
# to remove all its information:
redis_connection.delete("messages_from_host_foo2")

# All the entries referred to the host `foo2` should be gone:
print "Now, I shouldn't be seeing information from `foo2`"
matching_keys = redis_connection.keys("messages_from_host_*")
for matching_key in matching_keys:
    messages = [pickle.loads(s) for s in redis_connection.zrange(matching_key, 0, -1)]
    print messages

Which outputs:

Got these keys that match what I wanna get: ['messages_from_host_foo2', 'messages_from_host_foo3', 'messages_from_host_abc1']
These are the messages from all those hosts:
[{'uptime': 'bar', 'hostname': 'foo2', 'version': 'foo'}]
[{'uptime': 'bar', 'hostname': 'foo3', 'version': 'foo'}]
[{'uptime': 'bar', 'hostname': 'abc1', 'version': 'foo'}]
Now, I shouldn't be seeing information from `foo2`
[{'uptime': 'bar', 'hostname': 'foo3', 'version': 'foo'}]
[{'uptime': 'bar', 'hostname': 'abc1', 'version': 'foo'}]
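
Note that the example above is written for Python 2 and the redis-py 2.x client. As far as I know, redis-py 3.0 and later expect zadd to receive a {member: score} mapping, and KEYS walks the whole keyspace in one blocking call, so SCAN (scan_iter in redis-py) is usually preferred on a busy server. Below is a minimal sketch of the same idea under those assumptions. The helper names (push_message, all_messages, purge_host, purge_host_in_place) and the client parameter are hypothetical and not part of redis-py; client is assumed to be a redis.Redis instance. The last helper is a fallback for the original single mymsgs sorted set, in case the key layout cannot be changed: it un-pickles every member and removes the matching ones with ZREM.

```python
import pickle
import time

# Same key prefix as in the example above.
KEY_PREFIX = "messages_from_host_"


def push_message(client, msg, ttl_seconds=360):
    """Store msg in the sorted set that belongs to msg['hostname']."""
    key = KEY_PREFIX + msg["hostname"]
    score = int(time.time() + ttl_seconds)
    # Assumption: redis-py 3.0+ signature, zadd(name, {member: score}).
    client.zadd(key, {pickle.dumps(msg): score})


def all_messages(client):
    """Return the un-pickled messages of every host."""
    messages = []
    # scan_iter iterates over matching keys incrementally (SCAN),
    # instead of fetching them all in one blocking KEYS call.
    for key in client.scan_iter(match=KEY_PREFIX + "*"):
        messages.extend(pickle.loads(raw) for raw in client.zrange(key, 0, -1))
    return messages


def purge_host(client, hostname):
    """Delete the whole sorted set of one host; returns the number of keys removed."""
    return client.delete(KEY_PREFIX + hostname)


def purge_host_in_place(client, key, hostname):
    """Fallback for one shared sorted set (e.g. "mymsgs").

    Un-pickles every member and removes those whose 'hostname' matches.
    Returns the number of members removed.
    """
    stale = [
        raw
        for raw in client.zrange(key, 0, -1)
        if pickle.loads(raw).get("hostname") == hostname
    ]
    if not stale:
        return 0
    return client.zrem(key, *stale)
```

With a real server this could be used as client = redis.Redis(host='localhost', port=6379, db=0), then push_message(client, msg) and purge_host(client, 'foo2'). The in-place fallback reads and un-pickles the whole sorted set on every purge and is not atomic, which is why the per-host key layout is the cheaper option.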

2 Comments

Thanks! Will give this a try and mark the question as answered if I can get it to work :)
I hope it helps :)
