select regexp_matches('Hi, I am Harry Potter', '^[a-zA-Z0-9]*\W+\w+');
select regexp_matches('Hi, I am Harry Potter', '\w+\W+\w+');

Both queries return {Hi, I}, but I expect {Hi I}. Related question: extract the first word from a string - regex

2 Answers

You cannot match disjoint (non-adjoining) parts of a string into a single group.
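As a rough illustration of that point (the column aliases are made up here, and regexp_match assumes PostgreSQL 10 or later): without parentheses the function returns the whole match as a single array element, separator included, while two capturing groups yield two elements.

```sql
-- No capturing groups: the array holds one element, the entire match 'Hi, I'.
-- Two capturing groups: the array holds two elements, 'Hi' and 'I'.
select array_length(regexp_match('Hi, I am Harry Potter', '\w+\W+\w+'), 1)     as without_groups,  -- 1
       array_length(regexp_match('Hi, I am Harry Potter', '(\w+)\W+(\w+)'), 1) as with_groups;     -- 2
```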

You can use REGEXP_REPLACE to capture the first two words into separate groups, then refer to both groups with backreferences in the replacement pattern to get what you need:

select regexp_replace('Hi, I am Harry Potter', '^\W*(\w+)\W+(\w+).*', '\1 \2');

The regex means:

  • ^ - start of string
  • \W* - zero or more non-word chars
  • (\w+) - Group 1 (\1): one or more word chars
  • \W+ - one or more non-word chars
  • (\w+) - Group 2 (\2): one or more word chars
  • .* - the rest of the string.
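A small sketch of how this pattern behaves on a few inputs. The sample strings other than the first are invented for illustration, and default settings (standard_conforming_strings = on) are assumed. Note that regexp_replace returns the input unchanged when the pattern does not match, so a one-word string comes back as-is.

```sql
-- Apply the same replacement to several hypothetical inputs.
select s,
       regexp_replace(s, '^\W*(\w+)\W+(\w+).*', '\1 \2') as first_two
from (values
        ('Hi, I am Harry Potter'),   -- first_two: Hi I
        ('  --Hello... world!'),     -- first_two: Hello world
        ('Single')                   -- no second word, no match: Single
     ) as t(s);
```

If a one-word input should produce NULL or an empty string instead, that case has to be handled separately, for example with a CASE expression around the call.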

2 Comments

I wonder why {Hi, I} cannot be matched together?
@JianHe "You cannot match disjoint (non-adjoining) parts of a string into a single group." The whole matched string is the Group 0 value.

You can use this pattern:

select regexp_match(
          'Hi, I am Harry Potter',
          '^([[:alnum:]]+)[^[:alnum:]]+([[:alnum:]]+)'
       );

 regexp_match 
══════════════
 {Hi,I}
(1 row)

The pattern matches the first sequence of alphanumeric characters, then a sequence of non-alphanumeric characters, then another sequence of alphanumeric characters. The result is an array containing the first and third subexpressions, which are the ones parenthesized in the pattern.
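If a single space-separated string such as 'Hi I' is wanted rather than an array, one possible follow-up (not part of the original answer) is to join the array elements with array_to_string. The alias first_two is hypothetical.

```sql
-- Join the two captured words with a single space.
-- regexp_match returns NULL when nothing matches, and the result is then NULL too.
select array_to_string(
          regexp_match(
             'Hi, I am Harry Potter',
             '^([[:alnum:]]+)[^[:alnum:]]+([[:alnum:]]+)'
          ),
          ' '
       ) as first_two;   -- Hi I
```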
