I'm brand new to Redis and am just experimenting with caching some data to see how memory usage and performance compare to other options like Memcached. I'm using the ServiceStack.Redis client library via IRedisClient.

I have been testing Redis: 25k key/value objects take around 250 MB of memory, with a 100 MB dump.rdb file. I need to cache a lot more than this, and I'm looking to reduce the memory consumption if possible. My best guess is that each cached item's text (a JSON blob) is around 4 KB, but if my basic math is correct, each item consumes around 10 KB in Redis, at least from a memory-footprint point of view. The vast difference between the dump size and the in-memory size is a bit alarming to me.
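My back-of-the-envelope math, for reference (just restating the measured figures above):

```python
# Rough per-item math from the figures above (measured values, not exact).
items = 25_000
memory_bytes = 250 * 1024 * 1024   # ~250 MB resident in Redis
dump_bytes = 100 * 1024 * 1024     # ~100 MB dump.rdb on disk

mem_per_item = memory_bytes / items    # ~10.5 KB per item in memory
dump_per_item = dump_bytes / items     # ~4.2 KB per item on disk
ratio = memory_bytes / dump_bytes      # ~2.5x memory-to-disk

print(mem_per_item, dump_per_item, ratio)
```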

I'm also running on a 64-bit VM right now, which I understand wastes extra space compared to 32-bit, so I'll look into that as well. It looks like Redis needs 2x the memory for each pointer (per cached key/value?). Could this be where the 2.5x disk-to-memory ratio is coming from?

I understand I can write code on my side to compress/decompress data on the way in and out of Redis, but I'm curious whether there is some way to configure the client library to do something similar, say with StreamExtensions.

The usage pattern is read-heavy, with infrequent writes and/or batch cache-refresh writes.

Anyway, I'm looking for any suggestions on how to fit more cache items into a given amount of memory.

2 Answers

There are multiple points you need to consider. In what follows, I assume your data are stored in strings, each containing a JSON object.

The first point is that you are storing 4 KB JSON objects. The overhead Redis adds for dynamic data structures and pointers is absolutely negligible compared to the size of the useful data. That overhead would be high if you had plenty of very small objects (it is about 80 bytes per key), but with 4 KB objects it should not be a problem.

So using a 32-bit build (which reduces the size of pointers) will not help.

The second point is that the difference between the memory footprint and the dump file size is easily explained by the fact that strings in the dump file are compressed with the LZF algorithm (and JSON compresses quite well). For non-compressed data, the memory footprint is generally much larger than the dump file size.
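As a quick illustration of how well JSON compresses (using zlib from the Python standard library as a stand-in for LZF, which is what Redis actually uses for dump files; the sample record is made up):

```python
import json, zlib

# A hypothetical JSON blob with the kind of repetitive structure typical
# of cached records (field names repeat, values are similar).
record = {"id": 0, "name": "example-item", "tags": ["alpha", "beta"],
          "description": "some descriptive text " * 20}
blob = json.dumps([dict(record, id=i) for i in range(10)]).encode("utf-8")

compressed = zlib.compress(blob)
print(len(blob), "->", len(compressed), "bytes")
# The repeated keys and strings compress well, which is why the on-disk
# (compressed) size is a fraction of the in-memory (raw) size.
```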

Now, the difference you see between the real size of your data and the memory footprint is probably due to the allocator's internal fragmentation. People generally only consider external fragmentation (i.e., what is commonly referred to as memory fragmentation), but in some situations internal fragmentation can also represent a major overhead. See the definitions here.

In your situation, 4 KB objects are actually one of the worst cases. Redis uses the jemalloc allocator, which features well-defined allocation classes. You can see that 4 KB is an allocation class, and the next one is 8 KB. This means that if a number of your objects weigh a bit more than 4 KB (including the Redis string overhead of 8 bytes), 8 KB will be allocated instead of 4 KB, and half of that memory will be wasted.
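To make the internal-fragmentation effect concrete, here is a toy model of the allocation-class rounding (the size-class ladder below is simplified, not jemalloc's real list; only the 4 KB to 8 KB jump described above matters here):

```python
# Toy model of allocation-class rounding around the 4 KB boundary.
SIZE_CLASSES = [512, 1024, 2048, 4096, 8192, 16384]  # simplified ladder

def allocated_size(requested: int) -> int:
    """Round a request up to the next size class, as an allocator would."""
    for cls in SIZE_CLASSES:
        if requested <= cls:
            return cls
    raise ValueError("request too large for this toy model")

STRING_OVERHEAD = 8  # the Redis string overhead mentioned above

for payload in (4000, 4089, 4100, 4500):
    need = payload + STRING_OVERHEAD
    alloc = allocated_size(need)
    print(f"{payload} B payload -> {alloc} B allocated, {alloc - need} B wasted")
# A 4000 B payload still fits the 4096 B class; anything pushing the
# request past 4096 B jumps to 8192 B and wastes almost half of it.
```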

You can easily check this by storing only objects a bit smaller than 4 KB and calculating the ratio between the memory footprint and the expected size of the useful data. Then repeat the same operation with objects a bit larger than 4 KB and compare the results.

Possible solutions to reduce the overhead:

  • client-side compression. Use any lightweight compression algorithm (LZF, LZO, quicklz, snappy). It will work well if you can keep the size of most of your objects below 4 KB.

  • change the memory allocator. The Redis makefile also supports tcmalloc (Google's allocator). It could reduce the memory overhead for these 4 KB objects since its allocation classes are different.
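A minimal sketch of the first option, client-side compression, in Python for illustration (zlib stands in for LZF/LZO/quicklz/snappy, which require third-party bindings; the wiring into your actual Redis client is an assumption, not shown):

```python
import json, zlib

def compress_value(obj) -> bytes:
    """Serialize to JSON and compress before handing the bytes to Redis."""
    return zlib.compress(json.dumps(obj).encode("utf-8"))

def decompress_value(data: bytes):
    """Decompress and parse JSON on the way out of Redis."""
    return json.loads(zlib.decompress(data).decode("utf-8"))

value = {"user": 42, "payload": "x" * 4000}   # hypothetical ~4 KB blob
stored = compress_value(value)
assert decompress_value(stored) == value      # round-trip is lossless
print(len(json.dumps(value)), "->", len(stored), "bytes")
```

The goal is to keep most compressed values below the 4 KB allocation class; with a typical client you would store `compress_value(obj)` under the key and run `decompress_value` on reads.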

Please note that with other in-memory stores you will get the same kind of overhead. With memcached, for instance, it is the job of the slab allocator to optimize memory consumption and minimize internal and external fragmentation.

2 Comments

What are your thoughts on hashing of the key as in the other answer? Does that impact size or just search?
Using hash objects as a memory optimization is only interesting if your values are very small (a few bytes). In that case it brings a major gain in terms of size, for a very small CPU overhead at search time. If your values are 4 KB objects, it will not help at all.

I myself had a hard time understanding how to use Redis efficiently, especially coming from Memcached (get/set) to Redis (strings, hashes, lists, sets & sorted sets).

You should read this article about Redis memory usage: http://nosql.mypopescu.com/post/1010844204/redis-memory-usage. It's an old article (2010), but still interesting.

I see two solutions here:

  • Compile and use 32-bit instances. Dump files are compatible between 32-bit and 64-bit, so you can switch later if you need to.

  • Using hashes looks better to me: http://redis.io/topics/memory-optimization. Read the section "Using hashes to abstract a very memory efficient plain key-value store on top of Redis". ServiceStack.Redis provides a RedisClientHash. It should be easy to use!
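The hash-bucketing pattern from that linked doc can be sketched like this (Python purely for illustration; the key naming and bucket size are made up, and ServiceStack.Redis's hash APIs would be the C# equivalent):

```python
# Sketch of "hash bucketing": instead of one top-level key per object,
# group objects into small hashes so Redis can use its compact hash
# encoding. Key names below are hypothetical.

BUCKET_SIZE = 100  # objects per hash; keep under hash-max-ziplist-entries

def bucket_for(object_id: int) -> tuple:
    """Map an object id to (hash key, field), e.g. 12345 -> ('obj:123', '45')."""
    return f"obj:{object_id // BUCKET_SIZE}", str(object_id % BUCKET_SIZE)

key, field = bucket_for(12345)
print(key, field)  # obj:123 45
# A client would then store with HSET key field value and read with
# HGET key field, instead of SET/GET on one key per object.
```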

Hope it can help you!

1 Comment

It looks like there is no overload to provide an expiration date for IRedisClient.SetEntryInHash() like there is for the IRedisClient.SetEntry() method... :(
