At first let me introduce my use case - I have a collection with documents where I store my XML requests and corresponding responses. Also each document has plenty of accompanying properties and some of them are indexed, but request and response aren't.
Whenever I search using indexed field the performance is sufficient. But there are situations where I have to prepare search using a regular expression basing on the request or response value.
For now I do something like this:
db.traffic.find(
{ $or:
[ { request: { $regex: "some.* code=\"123\"} },
{ response: { $regex: "some.* code=\"123\"} }] })
and I translated this into Java code. But querying is to slow and it takes significant amount of time comparing to other queries.
I can see two solutions:
- indexing
requests andresponses - but I suppose that this is not a good idea as they are really long and most probably index will be huge. querying using some
indexedfield first and then applying already mentioned query but in the descending order of found records and picking the very first found so I would like to do something likedb.traffic.find({"conversationID": { $regex: "vendorName" }}).sort({"counter": -1}) .findOne( { $or: [ { request: { $regex: "some.* code=\"123\"} }, { response: { $regex: "some.* code=\"123\"} }] })
So wrapping up - my question is: should I choose the simpler solution which is indexing requests and responses? And what impact will it have on the size of my index?
Or should I choose the second way? But is my code correct and does it what I want?