4

I know this may have been asked a zillion times before but I cannot seem to find the golden solution for my exact use case.

I only have one data structure: a map where the key is a string. The values of the map are maps themselves, but this time the values are simple objects/primitives such as string, int, double, etc. So a map of maps. The keys of the innermost map are constant, i.e. no entries are ever added to or removed from the innermost map after it is created. So it is kind of like a traditional table, albeit one where each row may have arbitrary columns.
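To make the shape concrete, here is a minimal in-memory sketch of such a table. The class name `TableSketch` and its method names are made up for illustration, and it assumes a single writer thread, so plain `HashMap`s are used. It covers only the in-memory side, not persistence or replication.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the "map of maps" described above.
class TableSketch {
    // Outer map: row key -> row. Rows are inserted or deleted only rarely.
    private final Map<String, Map<String, Object>> rows = new HashMap<>();

    // Insert a row; its set of column names is fixed from this point on.
    void insert(String rowKey, Map<String, Object> columns) {
        rows.put(rowKey, new HashMap<>(columns)); // defensive copy
    }

    // Update an existing column of an existing row (the hot path).
    void update(String rowKey, String column, Object value) {
        Map<String, Object> row = rows.get(rowKey);
        if (row == null) {
            throw new IllegalArgumentException("No such row: " + rowKey);
        }
        if (!row.containsKey(column)) {
            // The inner key set is constant, so unknown columns are rejected.
            throw new IllegalArgumentException("No such column: " + column);
        }
        row.put(column, value);
    }

    Object get(String rowKey, String column) {
        Map<String, Object> row = rows.get(rowKey);
        return row == null ? null : row.get(column);
    }

    public static void main(String[] args) {
        TableSketch table = new TableSketch();
        Map<String, Object> columns = new HashMap<>();
        columns.put("price", 1.5);
        columns.put("qty", 10);
        table.insert("row1", columns);
        table.update("row1", "qty", 11);
        System.out.println(table.get("row1", "qty"));
    }
}
```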

I need this data structure to be persistent and replicated.

Here are my requirements:

  1. Pure Java solution
  2. The disk map is only used in case of a restart. Hence there are never any reads from disk during normal operation, and all the writing is done by a single application.
  3. Embedded.
  4. Performance. It is the UPDATE performance of existing records that is important. UPDATEs will happen potentially 100k times per second (but more likely 20-50k per second). As for INSERTs/DELETEs they do of course happen but probably only a few times per day. Hence I do not worry too much about INSERT/DELETE performance.
  5. Replicated. For resilience I need the disk copy of the map to be replicated. The replication from master to slave does not need to be part of the original transaction, i.e. I can sacrifice some ACIDness for performance.
  6. Number of records is expected to be 100k-200k, but not much higher. The size of each record is probably 100-200 KBytes so really not that much data in total. I'm guessing the total size of the data file will be below 100 MBytes and that is probably an estimate on the high side.
  7. The total amount of data is not more than it can always fit in memory. (this is why I can guarantee that there will be no disk reads, except during startup)
  8. My application is not distributed. At any given point in time there's only one active process that writes to disk.
  9. Liberal open source license. (Apache, BSD, LGPL, should be fine)

The application in question never needs to store anything but the above data structure, i.e. it will not have a future uncovered need for other persistent data structures. Hence it sounds fair to optimize based on this particular data structure.

I've looked at Berkeley DB Java Edition but it fails on requirement #6. I've looked at TokyoCabinet/KyotoCabinet but they fail on requirement #1.

So what would you recommend?

4
  • I use ehcache. Commented Jan 30, 2013 at 15:27
  • Wouldn't replication of the Ehcache disk cache need the Enterprise version? (Hence it breaks requirement #9.) Or have I misunderstood? Commented Jan 30, 2013 at 15:47
  • You may well be right. I missed that requirement. Commented Jan 30, 2013 at 16:03
  • Have you looked at Hazelcast? Commented Feb 4, 2013 at 5:20

5 Answers

1

There are several options, but neo4j seems to match what you want. HBase and Cassandra are also options, but more than you probably need.


1 Comment

Thx. My bad, I had never heard of neo4j. Wouldn't I need the Enterprise license for replication (and hence break my requirement #9)? Caveat: I've only looked briefly at their website.
0

Have you looked at Redis? It's an in-memory "database" (key-value store) that, IMHO, meets all your needs.

2 Comments

How does Redis play with req #1 or req #3 above ??
My apologies - I thought you were asking for a Java solution ... missed the pure Java one. Redis has a Java library, but I guess that's not what you are looking for.
0

Have a look at Hazelcast. It meets most of your requirements, except that it's distributed.


0

I think Chronicle Map is a good match for your case


0

I would suggest

  1. Nitrite database. Very easy and simple to use, feature rich, fast, Key-value storage (not pure Java)
  2. Java object Serialization: use an object containing a map to store everything and serialize it to a file on exit. Deserialize it on startup.
  3. Jackson serialization. Similar to 2. Much faster than Java serialization and gives pure text output, but produces a much bigger file.
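Option 2 could look roughly like the sketch below. The class name `SnapshotSketch` and the temp-file naming are assumptions made for illustration. It writes to a temporary file first and then moves it over the real one, so an interrupted save should not clobber the previous snapshot. A snapshot written only on exit will not capture updates made before a crash.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.HashMap;

// Hypothetical illustration of snapshotting a map of maps with Java serialization.
class SnapshotSketch {

    // Serialize the whole map to a temp file, then replace the snapshot file.
    // All inner values must be Serializable (String, Integer, Double, ... are).
    static void save(HashMap<String, HashMap<String, Object>> data, Path file)
            throws IOException {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(Files.newOutputStream(tmp)))) {
            out.writeObject(data);
        }
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    // Read the snapshot back on startup.
    @SuppressWarnings("unchecked")
    static HashMap<String, HashMap<String, Object>> load(Path file)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            return (HashMap<String, HashMap<String, Object>>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        HashMap<String, Object> row = new HashMap<>();
        row.put("name", "example");
        row.put("qty", 10);
        HashMap<String, HashMap<String, Object>> data = new HashMap<>();
        data.put("row1", row);

        Path file = Files.createTempFile("snapshot", ".bin");
        save(data, file);
        System.out.println(load(file).equals(data));
        Files.deleteIfExists(file);
    }
}
```

The same save/load structure would carry over to option 3 by swapping the object streams for a JSON mapper.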
