Your dataset (1,000 products and 100 cities) is very small. If you do not expect it to grow much larger, you can probably use a nested data structure, which is the most obvious solution here. Your mapping would look something like this:
{
"product": {
"properties": {
"product": {"type": "keyword"},
"cities": {
"type": "nested",
"properties": {
"name": {"type": "keyword"},
"available": {"type": "integer"}
}
}
}
}
}
Then you would index documents that look like this:
{
"product": "product1",
"cities": [
{
"name": "city1",
"available": 0
},
{
"name": "city2",
"available": 1
}
]
}
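To illustrate how such documents could be searched, here is a minimal sketch that builds a nested query body as a plain Python dict. The helper name `nested_city_query` and its parameters are made up for this example; the field names come from the mapping above. The point of the `nested` query is that both conditions must hold for the same city entry, rather than for any two entries in the array.

```python
import json


def nested_city_query(city, min_available=1):
    """Build a search body matching products that have at least
    `min_available` units in `city` (hypothetical helper)."""
    return {
        "query": {
            "nested": {
                "path": "cities",
                "query": {
                    "bool": {
                        # Both clauses are evaluated against the same nested object.
                        "filter": [
                            {"term": {"cities.name": city}},
                            {"range": {"cities.available": {"gte": min_available}}},
                        ]
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the body so it can be pasted into a search request.
    print(json.dumps(nested_city_query("city2"), indent=2))
```

The resulting dict can be serialized and sent as the body of a search request with whatever client you already use.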
However, nested queries and aggregations are comparatively expensive (each nested object is indexed as a separate hidden document), so if you expect your dataset to grow substantially, you may want to consider denormalizing your data. In your case, I can see a few possible approaches, and the right one depends on how you want to query your data.
Simple flattening (one doc per city/product combo):
Doc 1:
{
"product": "product1",
"city": "city1",
"available": 0
}
Doc 2:
{
"product": "product1",
"city": "city2",
"available": 1
}
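As a rough sketch of how the flattening could be done, the function below turns one nested-style document into one flat document per city. The name `flatten_product` is hypothetical; the input and output shapes are the ones shown above.

```python
def flatten_product(doc):
    """Turn one nested product document into one flat document per city."""
    return [
        {
            "product": doc["product"],
            "city": city["name"],
            "available": city["available"],
        }
        for city in doc["cities"]
    ]


if __name__ == "__main__":
    nested_doc = {
        "product": "product1",
        "cities": [
            {"name": "city1", "available": 0},
            {"name": "city2", "available": 1},
        ],
    }
    for flat_doc in flatten_product(nested_doc):
        print(flat_doc)
```

With 1,000 products and 100 cities this produces at most 100,000 small documents, which is still a modest index size.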
The downside here is that you can't easily search by product, since each product is now duplicated across many documents. You may be able to work around that by keeping a separate index of products and querying it when you need product-level results.
If you never expect to have more than 100 (or even 1,000) cities, you could have one field per city, like this:
{
"product": "product1",
"city1": 0,
"city2": 1,
...
}
Note that if you do this, you don't need to include every city in each source document; missing keys are fine. The downside is that you need to know the names of the cities you're interested in ahead of time in order to query them. This is probably not the right solution for you, but it is useful in some cases.
If your availability counts are always low, and you expect that to remain the case (for example, you never expect to have more than 10 available), you could do something like this:
{
"product": "product1",
"available": {
"0": ["city1", "city2"],
"1": ["city2"],
"2": [],
...
}
}
Here each key N lists the cities that have at least N available. So if you want to see whether city1 carries the product at all (regardless of whether any units are available), you can query available.0, and if you want to see whether it has at least 1 available, you can query available.1, and so on. If you want to see the cities where product1 has at least 1 available, you can do a terms aggregation on available.1. If you use this kind of data structure, you would probably want to add another field containing the exact numbers for each city (not nested, so not very useful for querying, but convenient once you've retrieved the data).
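A minimal sketch of how such a document could be built from per-city counts follows. The function names, the `max_level` cap, and the `counts` field (standing in for the extra field with exact numbers) are all assumptions made for this example.

```python
def build_available_buckets(counts, max_level):
    """Map each level N (as a string key) to the cities with at least N available."""
    buckets = {str(level): [] for level in range(max_level + 1)}
    for city, count in sorted(counts.items()):
        # A city with `count` units belongs to every level from 0 up to `count`,
        # capped at `max_level`.
        for level in range(min(count, max_level) + 1):
            buckets[str(level)].append(city)
    return buckets


def to_document(product, counts, max_level=10):
    """Build the indexable document; `counts` keeps the exact numbers for convenience."""
    return {
        "product": product,
        "available": build_available_buckets(counts, max_level),
        "counts": dict(counts),  # hypothetical field name
    }


if __name__ == "__main__":
    print(to_document("product1", {"city1": 0, "city2": 1}, max_level=2))
```

Any count above `max_level` is simply treated as `max_level` in the buckets, which is why this layout only makes sense when the counts are known to stay small.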