I want to extract only the value of val2 from column1 of the table below. I'm not sure how to do it in SQL, though I've tried regexp_substr/instr. The length of the column values is dynamic, but the content of val2 is what needs to be extracted.

Input:

column1
[{'val1': '54', 'val2': 'luis long, F (Peter)', 'val3': '[email protected]', 'val4': 'xxxxyyy://somevalue', 'val5': 'Category', 'val6': 'some other value'}]

Output:

column 1: [{'val1': '54', 'val2': 'luis long, F (Peter)', 'val3': '[email protected]', 'val4': 'xxxxyyy://somevalue', 'val5': 'Category', 'val6': 'some other value'}]
column 2: luis long, F (Peter)
  • Please show us what you have tried. Commented Feb 26 at 7:36
  • @DaleK, I tried with regexp_instr & substring as shared in the post but didn't get expected result, would you be able to help on this ? Commented Feb 26 at 7:40
  • I mean, actually show us your attempt. Commented Feb 26 at 7:45
  • Redshift has JSON functions, why aren't you using those? docs.aws.amazon.com/redshift/latest/dg/json-functions.html Commented Feb 26 at 7:47
  • Hi @DaleK, I found a way out using Tim's JSON parser method. Just a thought: since the post already mentions that regex was tried, it would be really helpful in future to share your approach or answer rather than asking for more detail. Anyway, thanks! Commented Feb 26 at 10:50

1 Answer

Rather than using regex to parse JSON, which is not fool-proof, you should rely on Redshift's JSON API here:

SELECT
    col1,
    JSON_EXTRACT_PATH_TEXT(JSON_EXTRACT_ARRAY_ELEMENT_TEXT(col1, 0), 'val2') AS col2
FROM yourTable;
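
If some rows might not hold valid JSON, the query above fails as a whole. A hedged sketch of one way to find the offending rows instead: IS_VALID_JSON_ARRAY reports whether the column parses as a JSON array, and the optional trailing null_if_invalid argument on both extraction functions asks Redshift to return NULL rather than raise an error. The names yourTable and col1 are the same placeholders used above, and the behavior should be confirmed on your own cluster.

```sql
-- Sketch: flag rows that do not parse instead of aborting the query.
SELECT
    col1,
    IS_VALID_JSON_ARRAY(col1) AS is_json_array,        -- true only if col1 parses as a JSON array
    JSON_EXTRACT_PATH_TEXT(
        JSON_EXTRACT_ARRAY_ELEMENT_TEXT(col1, 0, true), -- true = null_if_invalid
        'val2',
        true                                            -- true = null_if_invalid
    ) AS col2
FROM yourTable;
```

Rows where is_json_array is false are the ones to inspect or repair first.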

Edit:

Based on your feedback in the comments section, it appears that Redshift's JSON parser strictly expects keys and values to be enclosed in double quotes, not single quotes. One possible workaround here (with a caveat given below) would be a blanket replacement of all single quotes with double quotes. Hence, the following might work:

SELECT
    col1,
    JSON_EXTRACT_PATH_TEXT(
        JSON_EXTRACT_ARRAY_ELEMENT_TEXT(REPLACE(col1, '''', '"'), 0), 'val2') AS col2
FROM yourTable;

The caveat here is that replacing all single quotes might alter the JSON structure if the data contains literal single quotes. The best fix would be to go back to your JSON source and fix it there.
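
If the source cannot be fixed and val2 is the only field needed, a regex fallback is one option. The following is a minimal sketch, not a general parser. It assumes every row uses exactly the 'val2': '...' layout shown in the question (single quotes, one space after the colon) and that the val2 value itself never contains a single quote. It relies on the e parameter of Redshift's REGEXP_SUBSTR, which returns the first parenthesized subexpression rather than the whole match.

```sql
-- Sketch only: assumes the 'val2': '...' layout from the question
-- and that the value itself contains no single quote.
SELECT
    col1,
    REGEXP_SUBSTR(
        col1,
        '''val2'': ''([^'']*)''',  -- pattern after unescaping: 'val2': '([^']*)'
        1,                          -- start searching at position 1
        1,                          -- use the first occurrence
        'e'                         -- return the captured subexpression
    ) AS col2
FROM yourTable;
```

According to the Redshift documentation, REGEXP_SUBSTR returns an empty string when nothing matches, so rows that break the assumed layout should be easy to spot.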

9 Comments

Thanks @Tim Biegeleisen, ran into an error [Amazon](500310) Invalid operation: JSON parsing error Details: error: JSON parsing error code:8001 context: invalid json array object, could you help with this please? Thanks!
Your column 1 appears to be a JSON array...in case it's not, try: JSON_EXTRACT_PATH_TEXT(col1, 'val2')
Yes it seems json only, but even JSON_EXTRACT_PATH_TEXT(col1, 'val2') is giving same error, any other ways please?
If you have malformed JSON, then you need to fix it first. Try using an online JSON formatter/parser on your content. If it chokes, let it highlight where the issue is.
I'll check that, but is it possible to extract this using a non-JSON method? Thanks!