I have a production_order document type, for example:

{
  part_number: "abc123",
  start_date: "2018-01-20"
},
{
  part_number: "1234",
  start_date: "2018-04-16"
}

I want to create a commodity document type, for example:

{
  part_number: "abc123",
  commodity: "1 meter machining"
},
{
  part_number: "1234",
  commodity: "small flat & form"
}

Production orders are loaded into the data warehouse every week and are immutable.

Commodities, on the other hand, can change over time. For example, abc123 could change from 1 meter machining to 5 meter machining, so I don't want to store this data with the production_order records.

If a user searches for "small flat & form" in the commodity document type, I want to pull all matching records from the production_order document type, matching on part number.

Obviously I can do this in a relational database with a join. Is it possible to do the same in Elasticsearch?

If it helps, we have about 500k part numbers that will be commoditized and our production order data warehouse currently holds 20 million records.

  • Elasticsearch does allow parent-child relationships between docs in the same index. This has evolved over different versions and the latest looks like this elastic.co/guide/en/elasticsearch/reference/current/… Commented Jun 20, 2018 at 2:41
  • So joins are possible and depending on your ES version the semantics are a bit different and tend to be slower as the size increases. Commented Jun 20, 2018 at 2:43
  • Thank you. Looking at the link, I don't understand the process. Searching google and youtube all tutorials seem to be between 2 and 4 years old, which suggests developers are avoiding using joins. Do you know of a link to a tutorial that uses the current language version? Commented Jun 20, 2018 at 7:16

2 Answers

1

I have found that you can indeed now query across indices in Elasticsearch, but you have to make sure your data is stored in the right shape. Here is an example from the Elasticsearch 6.3 docs:

Terms lookup twitter example

At first we index the information for user with id 2, specifically, its followers, then index a tweet from user with id 1. Finally we search on all the tweets that match the followers of user 2.

PUT /users/user/2
{
    "followers" : ["1", "3"]
}

PUT /tweets/tweet/1
{
    "user" : "1"
}

GET /tweets/_search
{
    "query" : {
        "terms" : {
            "user" : {
                "index" : "users",
                "type" : "user",
                "id" : "2",
                "path" : "followers"
            }
        }
    }
}

Here is the link to the original page

https://www.elastic.co/guide/en/elasticsearch/reference/6.1/query-dsl-terms-query.html

In my case above, I need to set up my storage so that commodity is a field and its values are an array of part numbers, i.e.

{
  "1 meter machining": ["abc1234", "1234"]
}

I can then look up the 1 meter machining part numbers against my production_order documents.

I have tested this and it works.
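Applied to the data in the question, the lookup might be sketched as follows. This is only an illustration under assumptions: the index names `commodities` and `production_orders`, the document id `small-flat-and-form`, and the `part_numbers` field are all made up here, and the bodies follow the 6.x syntax quoted above (which still includes a `type` key). The function only builds the two request bodies; sending them is left to whatever client is in use.

```python
import json


def commodity_lookup_requests(commodity_id, part_numbers):
    """Build the two request bodies for a terms lookup (6.x syntax).

    All index, type, and field names are assumptions for illustration.
    """
    # Body for: PUT /commodities/commodity/<commodity_id>
    # One document per commodity, holding the part numbers that belong to it.
    commodity_doc = {"part_numbers": list(part_numbers)}

    # Body for: GET /production_orders/_search
    # The terms query pulls its values from the commodity document above.
    search_body = {
        "query": {
            "terms": {
                "part_number": {
                    "index": "commodities",
                    "type": "commodity",
                    "id": commodity_id,
                    "path": "part_numbers",
                }
            }
        }
    }
    return commodity_doc, search_body


doc, search = commodity_lookup_requests("small-flat-and-form", ["1234"])
print(json.dumps(doc))
print(json.dumps(search, indent=2))
```

A terms query matches exact values, so this sketch assumes `part_number` is indexed in a way that preserves the stored value (for example as a keyword field).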

0

Elasticsearch does not support relational-style joins.

You can query twice: first get all the part numbers for "small flat & form", then use those part numbers to query the other index.
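A minimal sketch of that two-step approach is shown here. The `search(index, body)` callable is hypothetical: it stands in for whatever client call runs a search and returns the parsed JSON response. The index names `commodities` and `production_orders` are likewise assumptions, and `fake_search` is an in-memory stand-in built from the sample documents in the question so that the example runs without a cluster.

```python
def orders_for_commodity(search, commodity):
    """Application-side join: two searches linked by part_number."""
    # Step 1: find the part numbers that carry this commodity.
    first = search("commodities", {
        "_source": ["part_number"],
        "query": {"match_phrase": {"commodity": commodity}},
    })
    part_numbers = sorted(
        {hit["_source"]["part_number"] for hit in first["hits"]["hits"]}
    )
    if not part_numbers:
        return []

    # Step 2: fetch the production orders for those part numbers.
    second = search("production_orders", {
        "query": {"terms": {"part_number": part_numbers}},
    })
    return [hit["_source"] for hit in second["hits"]["hits"]]


def fake_search(index, body):
    """In-memory stand-in for a real client (exact-match only)."""
    data = {
        "commodities": [
            {"part_number": "abc123", "commodity": "1 meter machining"},
            {"part_number": "1234", "commodity": "small flat & form"},
        ],
        "production_orders": [
            {"part_number": "abc123", "start_date": "2018-01-20"},
            {"part_number": "1234", "start_date": "2018-04-16"},
        ],
    }
    query = body["query"]
    if "match_phrase" in query:
        wanted = query["match_phrase"]["commodity"]
        docs = [d for d in data[index] if d["commodity"] == wanted]
    else:
        wanted = set(query["terms"]["part_number"])
        docs = [d for d in data[index] if d["part_number"] in wanted]
    return {"hits": {"hits": [{"_source": d} for d in docs]}}


result = orders_for_commodity(fake_search, "small flat & form")
print(result)
```

Against a real cluster, each search returns only the first 10 hits by default, so both steps would need an explicit `size` or some form of paging before this could cover every matching part number and order.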

Otherwise, try to find a way to merge these into a single index; that would be better. Updating the commodities would not cause you any problem once the two are combined.
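If the commodity were copied onto each production order, a change of commodity could be pushed to the affected orders with an update-by-query request. The sketch below only builds the request body. It assumes an index named `production_orders`, a `commodity` field stored on each order, and a `part_number` field that supports exact matching; note that this approach gives up on treating production orders as immutable.

```python
import json


def recommodity_request(part_number, new_commodity):
    """Body for: POST /production_orders/_update_by_query (index name assumed)."""
    return {
        # Only touch orders for the part whose commodity changed.
        "query": {"term": {"part_number": part_number}},
        # Overwrite the denormalized commodity value on each matching order.
        "script": {
            "lang": "painless",
            "source": "ctx._source.commodity = params.commodity",
            "params": {"commodity": new_commodity},
        },
    }


body = recommodity_request("abc123", "5 meter machining")
print(json.dumps(body, indent=2))
```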

1 Comment

Thank you. I thought that might be the case. I think I will try the query-twice approach, as I have many other data sets that also relate to the production_order type, and things could get busy and messy quite quickly. I will wait until morning, and if I don't get any other answer I will accept yours.
