I'm exploring various alternatives for selecting a Java data structure for a write-intensive application. I know that no single data structure can provide a universal solution for a write-intensive application, but I'm surprised by the lack of discussion out there on the topic.
There are many people talking about read-intensive/rare-write or concurrent read-only applications, but I cannot find any conversation about the data structures used for a write-intensive application.
Based on the following requirements:
- key/value pairs stored in a Map - unsorted, for the sake of simplicity
- 1000+ writes per minute / negligible reads
- all data stored in memory
I am thinking of the following approaches:
- Simple ConcurrentHashMap. Although, based on this excerpt from the official Oracle docs,
[...] even though all operations are thread-safe, retrieval operations do not entail locking
it seems to be better suited for read-intensive applications.
- Combination of a BlockingQueue and a set of ConcurrentHashMaps. In batches, the queue is drained of all its elements and the updates are then allocated to the appropriate underlying maps. With this approach, though, I would need an additional map to identify which map each key belongs to, acting like an orchestrator (see the sketch after this list).
- Use a HashMap and synchronize at the API level, meaning that every write-related method is synchronized:
private final Map<Integer, String> thisWriteIntensiveMap = new HashMap<>();

synchronized void aWriteMethod(Integer aKey, String aValue) {
    thisWriteIntensiveMap.put(aKey, aValue);
}
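For reference, here is a minimal sketch of the second approach. All the names (BatchingWriter, WriteEvent, drainBatch) are mine, and the parity-based routing is just a stand-in for whatever orchestrator/sharding logic fits the real key space:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

class BatchingWriter {
    record WriteEvent(Integer key, String value) {}

    private final BlockingQueue<WriteEvent> queue = new ArrayBlockingQueue<>(10_000);
    private final Map<Integer, String> evenShard = new ConcurrentHashMap<>();
    private final Map<Integer, String> oddShard = new ConcurrentHashMap<>();

    // Producers enqueue; put() blocks when the queue is full, giving back-pressure.
    void write(Integer key, String value) throws InterruptedException {
        queue.put(new WriteEvent(key, value));
    }

    // A single drainer thread calls this periodically, applying updates in batches.
    void drainBatch() {
        List<WriteEvent> batch = new ArrayList<>();
        queue.drainTo(batch);
        for (WriteEvent e : batch) {
            // The "orchestrator": route each key to its shard (by parity here).
            Map<Integer, String> target = (e.key() % 2 == 0) ? evenShard : oddShard;
            target.put(e.key(), e.value());
        }
    }
}

Note that with a single drainer thread the shards never see concurrent writes, which is what makes the batching worthwhile in the first place.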
It'd be great if this question did not just receive criticism of the aforementioned options but also suggestions for new and better solutions.
PS: Apart from data integrity, order of operations, and throttling issues, what else needs to be taken into account when choosing the "best" approach for a write-intensive application?
I know that this might look a bit open-ended, but it'd be interesting to hear how people think about this problem.
ArrayBlockingQueue uses a lock, whereas JCTools offers bounded lock-free queues. Either way, there is the complexity of sharding and draining the queues. Your write rate of 16+/s is lowish, as ConcurrentHashMap can do 50M writes/s on a Zipfian distribution. You should benchmark via JMH, as you'll probably be fine with the stock hash map.
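To follow up on the JMH suggestion, a benchmark along these lines would compare ConcurrentHashMap against a synchronized HashMap under contended writes. The class and method names are mine, and the key range of 100_000 and thread count of 8 are arbitrary choices to create some contention:

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Threads;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.Throughput)
@Threads(8) // simulate concurrent writers
public class WriteBenchmark {
    private Map<Integer, String> concurrentMap;
    private Map<Integer, String> synchronizedMap;

    @Setup
    public void setup() {
        concurrentMap = new ConcurrentHashMap<>();
        synchronizedMap = Collections.synchronizedMap(new HashMap<>());
    }

    @Benchmark
    public void concurrentHashMapPut() {
        // Random keys spread writes across bins, reducing per-bin contention.
        concurrentMap.put(ThreadLocalRandom.current().nextInt(100_000), "v");
    }

    @Benchmark
    public void synchronizedHashMapPut() {
        // Every write here contends on the single monitor wrapping the map.
        synchronizedMap.put(ThreadLocalRandom.current().nextInt(100_000), "v");
    }
}

Run it via the JMH Maven archetype or Gradle plugin; the throughput numbers should make it obvious whether the stock map is already orders of magnitude faster than your ~17 writes/s actually require.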