2

I have data in a table:

id  question
1   1.1 Covid-19 [cases]
2   1.1 Covid-19 [deaths]

I want to split the data into columns to get the output below:

id  questionid  question_name  sub_question_name
1   1.1         Covid-19       cases
2   1.1         Covid-19       deaths

Is there any function that produces the above output?

1
  • What have you researched and tried so far with the documented string functions? Add your current query to your question and explain where you are having trouble. Commented Jun 4, 2022 at 19:29

2 Answers

5

One way of doing this is with PostgreSQL's very useful SPLIT_PART function, which splits a string on a delimiter (in your case, a space) and returns the requested field. Since you don't need the brackets in the last field, you can split on the opening bracket and strip the closing bracket with the RTRIM function.

SELECT id, 
       SPLIT_PART(question, ' ', 1)             AS questionid,
       SPLIT_PART(question, ' ', 2)             AS question_name,
       RTRIM(SPLIT_PART(question, '[', 2), ']') AS sub_question_name
FROM  tab
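A quick way to try this without creating a table is to feed a couple of sample rows through a VALUES list. This is only a sketch: the inline `tab` below is a stand-in for the real table, and the second row is the multi-word example raised in the comments.

```sql
-- Stand-in for the real table: two sample rows supplied inline.
WITH tab (id, question) AS (
    VALUES (1, '1.1 Covid-19 [cases]'),
           (2, '1.2 Total population by 2022[male]')
)
SELECT id,
       SPLIT_PART(question, ' ', 1)             AS questionid,
       SPLIT_PART(question, ' ', 2)             AS question_name,  -- second space-separated field only
       RTRIM(SPLIT_PART(question, '[', 2), ']') AS sub_question_name
FROM   tab;
```

For the second row, question_name comes back as just Total, because SPLIT_PART returns a single space-delimited field. This version therefore fits only when question_name is one word.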


You can learn more about these functions in the string functions section of the official PostgreSQL documentation.


EDIT: For more advanced matching, consider using regular expressions with PostgreSQL pattern matching:

SELECT id,
       (REGEXP_MATCHES(question, '^[\d\.]+'))[1]           AS questionid,
       -- RTRIM drops the space left before the bracket, e.g. 'Covid-19 '
       RTRIM((REGEXP_MATCHES(question, '(?<= )[^[]+'))[1]) AS question_name,
       (REGEXP_MATCHES(question, '(?<=\[).*(?=\]$)'))[1]   AS sub_question_name
FROM  tab

Explanation of the regex for questionid:

  • ^: start of string
  • [\d\.]+: one or more digits or dots

Explanation of the regex for question_name:

  • (?<= ): positive lookbehind that requires a space immediately before the match
  • [^[]+: one or more characters other than [

Explanation of the regex for sub_question_name:

  • (?<=\[): positive lookbehind that requires an opening bracket immediately before the match
  • .*: any sequence of characters
  • (?=\]$): positive lookahead that requires a closing bracket at the end of the string, right after the match
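The three patterns above can also be folded into one expression with capture groups. The sketch below is one possible variation, not part of the original query: it assumes PostgreSQL 10 or later (for regexp_match), uses an inline `tab` as a stand-in for the real table, and the alias `m` is just an illustrative name.

```sql
-- Stand-in for the real table: two sample rows supplied inline.
WITH tab (id, question) AS (
    VALUES (1, '1.1 Covid-19 [cases]'),
           (2, '1.2 Total population by 2022[male]')
)
SELECT id,
       m[1]        AS questionid,         -- group 1: leading digits and dots
       RTRIM(m[2]) AS question_name,      -- group 2: text before '[', trailing space removed
       m[3]        AS sub_question_name   -- group 3: text between '[' and the final ']'
FROM   tab
-- regexp_match returns a text array of the captured groups, or NULL if nothing matches
CROSS JOIN LATERAL regexp_match(question, '^([\d.]+)\s+([^[]+)\[(.*)\]$') AS m;
```

Because regexp_match returns a single array rather than a set of rows, a question that does not fit the pattern still appears in the result, with NULL in the three derived columns.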


2 Comments

It works, but what if question_name has two or more words, e.g. 1.2 Total population by 2022[male]?
Check the updates on this answer. @Sudarshan
2

You can also use regexp_replace. In this example the pattern has three groups: group 1, ^(\[), matches the opening square bracket; group 2, (.*), matches the text inside; and group 3, (\])$, matches the closing square bracket. The replacement argument '\2' tells the function to keep only the second group, so both brackets are dropped.

select 
    id,  
    split_part(question, ' ', 1) p1,
    split_part(question, ' ', 2) p2,
    regexp_replace(split_part(question, ' ', 3), '^(\[)(.*)(\])$', '\2') p3
from 
    covid;
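Splitting on spaces assumes the bracketed part is always the third word, which does not hold for the multi-word example from the comments on the other answer. One possible variation, sketched here with an inline `covid` as a stand-in for the real table, applies regexp_replace to the whole string instead:

```sql
-- Stand-in for the real table: two sample rows supplied inline.
WITH covid (id, question) AS (
    VALUES (1, '1.1 Covid-19 [cases]'),
           (2, '1.2 Total population by 2022[male]')
)
SELECT id,
       -- Keep only what sits between the last '[' and the final ']'.
       regexp_replace(question, '^.*\[(.*)\]$', '\1') AS p3
FROM   covid;
```

Note that regexp_replace returns the input unchanged when the pattern does not match, so a question without brackets would come back whole rather than as NULL.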
