
I set up a simple cluster with 2 DB servers and 2 coordinators. When I run a simple select query, I see significant performance degradation compared with a single-machine setup, even with minimal data.

FOR key IN @keys 
FOR user IN User FILTER user.UserId == key 
RETURN user

I have a hash index set up on UserId. Even with 100 users in the collection and @keys containing 2 keys, this query takes ~300 ms versus ~4 ms on a single-machine configuration.

The User collection has 4 shards, sharded by _key.

1 Answer

Clustering involves more network connections and therefore more network latency. Data has to be serialized and deserialized (which involves parsing, among other things), and shards have to be managed centrally.

Depending on your query (e.g. a subquery that relies on the sorted result of its enclosing query), parts of the query have to be distributed across the cluster, with several round trips that add even more communication.
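To illustrate why the query in the question touches more than one server, here is a minimal Python sketch of shard routing. It assumes a coordinator can go straight to one shard only when the filter is on the shard key (`_key` in the question); a filter on any other attribute, such as `UserId`, has to be sent to every shard. The names `shard_for` and `shards_to_contact` are hypothetical, and the CRC32 hash is only a stand-in, not ArangoDB's actual hashing.

```python
import zlib
from typing import List

NUM_SHARDS = 4  # the collection in the question has 4 shards


def shard_for(shard_key_value: str) -> int:
    """Map a shard-key value to a shard index (toy stand-in for the real hash)."""
    return zlib.crc32(shard_key_value.encode("utf-8")) % NUM_SHARDS


def shards_to_contact(filter_attribute: str, value: str,
                      shard_key: str = "_key") -> List[int]:
    """Return the shards a coordinator would have to ask for an equality filter."""
    if filter_attribute == shard_key:
        # The hash of the shard key pins the document to exactly one shard.
        return [shard_for(value)]
    # Any other attribute could live on any shard, so all of them are asked,
    # even if each shard has a local index on that attribute.
    return list(range(NUM_SHARDS))


if __name__ == "__main__":
    print(len(shards_to_contact("_key", "user-1")))    # one shard
    print(len(shards_to_contact("UserId", "user-1")))  # all shards
```

Under this model, each key in `@keys` fans out to all 4 shards, which is one plausible source of the extra round trips; the actual execution plan would have to be checked to confirm it.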

Clustering is intended to give you higher throughput and access to more computing resources, not the low latency that a single-server environment can provide.
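The trade-off between latency and throughput can be sketched with a toy model in Python. All numbers and both function names are hypothetical and chosen only for illustration; the model assumes each network round trip adds a fixed cost and that independent queries spread evenly over the DB servers.

```python
def query_latency_ms(work_ms: float, round_trips: int, hop_ms: float) -> float:
    """Latency of one query: local work plus a fixed cost per network round trip."""
    return work_ms + round_trips * hop_ms


def max_throughput_qps(work_ms: float, num_db_servers: int) -> float:
    """Upper bound on queries per second if work spreads evenly over the servers."""
    return num_db_servers * 1000.0 / work_ms


# Hypothetical numbers, not measurements.
single = query_latency_ms(work_ms=4.0, round_trips=0, hop_ms=0.0)
clustered = query_latency_ms(work_ms=4.0, round_trips=6, hop_ms=2.0)

print(single, clustered)                  # each clustered query is slower
print(max_throughput_qps(4.0, 1),
      max_throughput_qps(4.0, 2))         # but more servers raise the ceiling
```

In this model a single query never gets faster by adding servers, only the number of queries that can be served in parallel grows.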

As long as a single machine can handle your workload, clustering simply isn't the right solution. This will change with our upcoming 3.0 version, in which the new synchronous replication gives you fault tolerance and high availability in addition to scalability. Currently, you can distribute query load across several machines using replication.

Read more about ArangoDB cluster performance in Max's blog article, which describes scaling to a large environment while keeping latency reasonably low.


1 Comment

Looks like I'll be waiting for the 3.0 release, as the current cluster implementation looks pretty unusable and unreliable. Thanks.
