I'm trying to load a JSON file into Redshift using the COPY command together with a JSONPath file. As I understand it, the COPY command produces one table row for each record in the JSON file.

I need to produce multiple rows from a single JSON record, but I don't see how to do that.

Here is an example. Say we have the following JSON file:

{
    "id": 1,
    "value": [1, 2, 3, 4],
    "other": "ops"
}
{
    "id": 2,
    "value": [5, 6, 7, 8]
}

I want to generate the following rows in the table:

id value
1  1
1  2
1  3
1  4
2  5
2  6
2  7
2  8

What should the JSONPath file look like? Is this doable at all?

In a related SO post, the suggested solution is to generate data with the right schema before loading it into Redshift. I could preprocess the JSON file to flatten it and store the result back in S3, but that complicates things a lot.

A related question: how can I set a default value when a field is missing from a record (e.g. the "other" field in the second record of the example above)?

1 Answer

You can't perform transformations in the COPY command, so it cannot turn one JSON record into several rows. Use an ETL tool to reshape the data instead of copying it directly into Redshift. When you load from JSON, default values are assigned based on the table DDL.
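
If a full ETL tool is more than you need, the flattening can also be done with a small preprocessing script before the file goes back to S3. Below is a minimal sketch in Python, assuming the records have already been parsed into dictionaries; the function names `flatten_records` and `to_json_lines` and the fallback value `"n/a"` are made up for illustration and are not part of Redshift or any library.

```python
import json


def flatten_records(records, default_other=None):
    """Yield one flat row per element of each record's "value" array.

    `default_other` is a hypothetical fallback used when a record has no
    "other" field; pick whatever suits your table.
    """
    for record in records:
        for item in record.get("value", []):
            yield {
                "id": record["id"],
                "value": item,
                "other": record.get("other", default_other),
            }


def to_json_lines(rows):
    """Serialize rows as one JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows)


# Sample data mirroring the example in the question.
records = [
    {"id": 1, "value": [1, 2, 3, 4], "other": "ops"},
    {"id": 2, "value": [5, 6, 7, 8]},
]

rows = list(flatten_records(records, default_other="n/a"))
print(to_json_lines(rows))
```

Each output line is then a single flat object, so COPY can map it to exactly one row. Filling in the missing "other" field here is one option; you could instead leave it out and let the load handle it.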
