Between your two options (one nested doc per class vs. one nested doc per class/method pair), there should be no noticeable difference in search times. Personally, I would prefer the first option, since it seems a better model of your data. It also means fewer documents in total. (Keep in mind that a "nested" doc in ES is really just another full document in Lucene under the hood. ES simply stores the nested docs directly next to the parent doc so the relationship can be resolved efficiently.)
Internally, ES treats every field as multi-valued (there is no dedicated array type), so a single value and an array of values are handled the same way, and the first option works fine. Assuming an example mapping like this:
PUT /test_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "someField": { "type": "string" },
        "classes": {
          "type": "nested",
          "properties": {
            "class": { "type": "string", "index": "not_analyzed" },
            "method": { "type": "string", "index": "not_analyzed" }
          }
        }
      }
    }
  }
}
You can then input your documents, such as:
POST test_index/my_type
{
  "someField": "A",
  "classes": {
    "class": "Java.lang.class1",
    "method": ["myMethod1", "myMethod2"]
  }
}

POST test_index/my_type
{
  "someField": "B",
  "classes": {
    "class": "Java.lang.class2",
    "method": ["myMethod3", "myMethod4"]
  }
}
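If a parent document has several classes, the first option means "classes" becomes an array with one entry per class, each carrying its own list of methods. Here is a minimal Python sketch of how such a body could be assembled before indexing. The helper name `build_document` and the class-to-methods input shape are my own assumptions for illustration, not part of any ES client API:

```python
import json


def build_document(some_field, methods_by_class):
    """Build a document body with one nested entry per class.

    `methods_by_class` is a hypothetical input shape: a dict mapping each
    class name to the list of its method names.
    """
    return {
        "someField": some_field,
        "classes": [
            {"class": class_name, "method": list(methods)}
            for class_name, methods in methods_by_class.items()
        ],
    }


doc = build_document(
    "A",
    {
        "Java.lang.class1": ["myMethod1", "myMethod2"],
        "Java.lang.class2": ["myMethod3"],
    },
)

# The resulting JSON can be sent as the body of the POST request above.
print(json.dumps(doc, indent=2))
```

Each entry of the "classes" array ends up as its own nested doc, so a class stays tied to its own methods only.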
In order to satisfy your sample query, you can simply use a bool filter inside a nested query. For example:
GET test_index/my_type/_search
{
  "query": {
    "nested": {
      "path": "classes",
      "query": {
        "bool": {
          "filter": [
            { "term": { "classes.class": "Java.lang.class2" } },
            { "term": { "classes.method": "myMethod3" } }
          ]
        }
      }
    }
  }
}
This returns only the second document from my example, because both term clauses have to match within the same nested doc.
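To make the matching rule concrete, here is a rough Python sketch that imitates it locally, with no ES involved. It is only a simulation under my own assumptions: the function names are hypothetical, exact string comparison stands in for a term filter on a not_analyzed field, and the third document ("C") is made up to show the difference from a non-nested (flattened) object field:

```python
def nested_query(class_name, method_name):
    """Build the same request body as the nested query in this answer."""
    return {"query": {"nested": {"path": "classes", "query": {"bool": {"filter": [
        {"term": {"classes.class": class_name}},
        {"term": {"classes.method": method_name}},
    ]}}}}}


def as_list(value):
    """Treat a single value like a one-element array."""
    return value if isinstance(value, list) else [value]


def matches_nested(doc, class_name, method_name):
    """True only if ONE nested entry has both the class and the method."""
    return any(
        entry["class"] == class_name and method_name in as_list(entry["method"])
        for entry in as_list(doc["classes"])
    )


def matches_flattened(doc, class_name, method_name):
    """What a plain object field would do: values from all entries are pooled."""
    entries = as_list(doc["classes"])
    all_classes = {e["class"] for e in entries}
    all_methods = {m for e in entries for m in as_list(e["method"])}
    return class_name in all_classes and method_name in all_methods


docs = [
    {"someField": "A", "classes": {"class": "Java.lang.class1",
                                   "method": ["myMethod1", "myMethod2"]}},
    {"someField": "B", "classes": {"class": "Java.lang.class2",
                                   "method": ["myMethod3", "myMethod4"]}},
    # Hypothetical extra document with two classes.
    {"someField": "C", "classes": [
        {"class": "Java.lang.class1", "method": ["myMethod1"]},
        {"class": "Java.lang.class2", "method": ["myMethod4"]},
    ]},
]

hits = [d["someField"] for d in docs
        if matches_nested(d, "Java.lang.class2", "myMethod3")]

# class1 paired with a method that only class2 has:
cross_nested = [d["someField"] for d in docs
                if matches_nested(d, "Java.lang.class1", "myMethod4")]
cross_flat = [d["someField"] for d in docs
              if matches_flattened(d, "Java.lang.class1", "myMethod4")]
```

Here `hits` contains only "B". The cross-class lookup finds nothing with nested matching but would find "C" with flattened matching, which is the false positive the nested type is there to prevent.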