Fetching strings before special characters from a postgres column

Question

I have a column with the following values in a Postgres table.

col1
uniprotkb:P62158(protein(MI:0326), 9606 - Homo sapiens)
uniprotkb:O00602-PRO_0000009136(protein(MI:0326), 9606 - Homo sapiens)

I would like to extract the following value from each of the above column values.

col2
P62158
O00602

I am using the following regexp match on my column:

select
        uniprotkb:(.*)\-|\([a-zA-Z].* as col2
from table;

But the above regexp captures the text before the last '-'. I want to capture the text after uniprotkb: and before the first occurrence of either '(' or '-'. Any suggestions would be helpful.

Well, it seems the requirement you mention does not quite match your pattern. Did you mean to use uniprotkb:(.*?)[-(][a-zA-Z].*? The greedy * is only a part of the problem, isn't it?
– Wiktor Stribiżew, commented Mar 2, 2020 at 11:12

Wiktor Stribiżew · Accepted Answer · 2020-03-02 11:32:35Z

You may use

uniprotkb:(.*?)[-(][a-zA-Z]
           ^^^ ^^^^

See the regex demo.

Details

uniprotkb: - a literal string
(.*?) - Group 1: any 0+ chars as few as possible
[-(] - a - or (
[a-zA-Z] - a letter.
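
To see why the lazy quantifier matters, the greedy and lazy forms can be compared side by side on the second sample value from the question. This is only a sketch: it uses substring(string from pattern), which returns the first parenthesized group, in place of the REGEXP_MATCHES call used in this answer, and the alias names are made up for illustration.

```sql
-- Compare greedy (.*) with lazy (.*?) on one sample value from the question.
-- The aliases s, greedy_group and lazy_group are illustrative names only.
SELECT
    substring(s from 'uniprotkb:(.*)[-(][a-zA-Z]')  AS greedy_group,  -- runs to the last "- or ( followed by a letter"
    substring(s from 'uniprotkb:(.*?)[-(][a-zA-Z]') AS lazy_group     -- stops at the first such position
FROM (
    VALUES ('uniprotkb:O00602-PRO_0000009136(protein(MI:0326), 9606 - Homo sapiens)')
) AS t(s);
```

The greedy group should extend well past the accession, while the lazy group should stop right after it.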

PostgreSQL test:

SELECT (REGEXP_MATCHES (
      'uniprotkb:P62158(protein(MI:0326), 9606 - Homo sapiens)',
      'uniprotkb:(.*?)[-(][a-zA-Z]'
   ))[1]

Outputs:

P62158
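
To run the pattern against every row rather than a single literal, one possible sketch is shown here. The CTE named sample is a hypothetical stand-in for the asker's table (only col1 and col2 come from the question), and substring(string from pattern) is used instead of REGEXP_MATCHES because it returns plain text and yields NULL for rows that do not match. The second column uses a negated bracket expression, which is an alternative technique and not the pattern from this answer.

```sql
-- "sample" is a hypothetical stand-in for the real table; col1 holds the values from the question.
WITH sample(col1) AS (
    VALUES
        ('uniprotkb:P62158(protein(MI:0326), 9606 - Homo sapiens)'),
        ('uniprotkb:O00602-PRO_0000009136(protein(MI:0326), 9606 - Homo sapiens)')
)
SELECT
    col1,
    -- Pattern from this answer: lazy group up to the first "-" or "(" that is followed by a letter.
    substring(col1 from 'uniprotkb:(.*?)[-(][a-zA-Z]') AS col2,
    -- Alternative (not from the answer): one or more characters that are neither "-" nor "(".
    substring(col1 from 'uniprotkb:([^-(]+)') AS col2_alt
FROM sample;
```

For the two sample values, both columns are expected to hold the accessions the question asks for. The negated-class variant does not require a letter after the delimiter, so the two can differ on other inputs.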
