0

I have the following string:

abc-12d-ef-oy-5678-xyz--**--20190120075439322am--**--ghi-66d-ef-oy-8877-sdf--**--sfdfdsgfg--**--20190120075765487am

It is a multi-character delimited string, delimited by '--**--'. I am trying to extract the first and second words that have the -oy- tag in them. This is a column in a table. I am using the regex_extract method, but I am not able to extract a substring that contains one string and ends with another.

Here is one pattern that I tried: .*(.*oy.*)--

7
  • Here is one pattern that I have tried: .*(.*oy.*)-- Commented Feb 28, 2019 at 16:31
  • The correct pattern you need to use is ^(.*?)--\*\*--(.*?)--\*\*-- where group 1 and group 2 capture the first and second fields, respectively. Demo Commented Feb 28, 2019 at 16:33
  • Hi @PushpeshKumarRajwanshi, the above pattern groups the text between delimiters, but how can we add the condition that each group contains -oy-? Thank you so much for the help. I will try from my end as well. Commented Feb 28, 2019 at 16:42
  • Do you want to reject the match if any of the groups does not contain -oy-? Commented Feb 28, 2019 at 16:44
  • Yes, that's exactly what I am looking for. Commented Feb 28, 2019 at 16:44

3 Answers

2

If the -oy- cannot be at the start or at the end of a field, you could use this pattern to match the two hyphen-delimited strings that contain -oy-:

[a-z0-9]+(?:-[a-z0-9]+)*-oy(?:-[a-z0-9]+)+

Regex details

  • [a-z0-9]+ Match 1+ times a-z0-9
  • (?: Non capturing group
    • -[a-z0-9]+ Match - and 1+ times a-z0-9
  • )* Close group and repeat 0+ times
  • -oy Match literally
  • (?:-[a-z0-9]+)+ Repeat 1+ times a group which will match - and 1+ times a-z0-9

You can extend the character class, for example to [A-Za-z0-9], to allow whatever else you want to match, such as uppercase characters.

Regex demo | Java demo
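As a rough illustration of the pattern above, here is a minimal sketch that applies it to the sample string from the question. It uses Python's `re` module rather than the Java/Hive engine the question targets, and the names `OY_TOKEN`, `sample` and `find_oy_tokens` are hypothetical, chosen only for this example.

```python
import re

# The pattern from this answer: hyphen-separated lowercase/digit parts,
# with "-oy" followed by at least one more "-part".
OY_TOKEN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*-oy(?:-[a-z0-9]+)+")

# Sample string copied from the question.
sample = (
    "abc-12d-ef-oy-5678-xyz--**--20190120075439322am--**--"
    "ghi-66d-ef-oy-8877-sdf--**--sfdfdsgfg--**--20190120075765487am"
)


def find_oy_tokens(text):
    """Return every substring matched by OY_TOKEN, in order (hypothetical helper)."""
    return OY_TOKEN.findall(text)


print(find_oy_tokens(sample))
# ['abc-12d-ef-oy-5678-xyz', 'ghi-66d-ef-oy-8877-sdf']
```

Because the character class excludes `*`, a match can never run across the `--**--` delimiter, even though the pattern does not mention the delimiter itself.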

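If the matches should be between delimiters, you could use a positive lookbehind and a positive lookahead, each with an alternation. The pattern below is written as it would appear in a Java string literal, so each backslash is doubled: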

(?<=^|--\\*\\*--)[a-z0-9]+(?:-[a-z0-9]+)*-oy(?:-[a-z0-9]+)+(?=--\\*\\*--|$)

See a Java demo


2 Comments

This does not answer what the OP wants. He wants to reject the match if the text in group 1 and group 2, strictly between the delimiters, does not contain -oy-. Your regex does not even consider the delimiters the OP has mentioned. See this
@PushpeshKumarRajwanshi I added a version which uses lookarounds to take the delimiter into account. The question stated "I am trying to extract the first and second words which has the -oy- tag in it", and that is what my answer is based on.
1

You can use this regex, which matches the two strings containing -oy- and captures them in group 1 and group 2.

^.*?(\w+(?:-\w+)*-oy-\w+(?:-\w+)*).*?(\w+(?:-\w+)*-oy-\w+(?:-\w+)*)

The regex uses the sub-pattern (\w+(?:-\w+)*-oy-\w+(?:-\w+)*) twice, once per capture group, to capture each delimiter-separated string that contains -oy-. If the input does not contain two such strings, the whole match fails.

Demo
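To make the two-group behaviour concrete, here is a small sketch in Python's `re` module (an assumption; the question itself uses regex_extract on a table column). The names `TWO_OY`, `sample` and `extract_two` are hypothetical.

```python
import re

# The pattern from this answer: two capture groups, each a hyphen-separated
# run of \w+ parts that contains "-oy-" somewhere in the middle.
TWO_OY = re.compile(
    r"^.*?(\w+(?:-\w+)*-oy-\w+(?:-\w+)*).*?(\w+(?:-\w+)*-oy-\w+(?:-\w+)*)"
)

# Sample string copied from the question.
sample = (
    "abc-12d-ef-oy-5678-xyz--**--20190120075439322am--**--"
    "ghi-66d-ef-oy-8877-sdf--**--sfdfdsgfg--**--20190120075765487am"
)


def extract_two(text):
    """Return (first, second) -oy- strings, or None if fewer than two exist."""
    m = TWO_OY.search(text)
    return m.groups() if m else None


print(extract_two(sample))
# ('abc-12d-ef-oy-5678-xyz', 'ghi-66d-ef-oy-8877-sdf')
print(extract_two("abc-oy-1--**--zzz"))
# None
```

The second call shows the rejection case: with only one -oy- string in the input, the second group cannot match, so there is no match at all.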

2 Comments

Thank you so much. This helped solving the problem
In the Demo link, why didn't the 1st and 3rd lines have a match? Can you please explain a bit more?
1

Are you able to select values from capture groups?

(?:--\*\*--|^)(.*?-oy-.*?)(?:--\*\*--|$)

?: - Non-capturing group; it matches the delimiter, the beginning of the line, or the end of the line, but does not create a capture group

*? - Lazy match, so you only grab the contents of the field

https://regex101.com/r/aUAvcx/1
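A quick way to see what this first pattern does, and where it falls short, is to run it on two inputs. The sketch below uses Python's `re` module as a stand-in for the engine in the question; `FIELD_LAZY`, `sample` and `lazy_fields` are hypothetical names.

```python
import re

# First-attempt pattern from this answer: one capture group framed by
# non-capturing "delimiter or line boundary" groups.
FIELD_LAZY = re.compile(r"(?:--\*\*--|^)(.*?-oy-.*?)(?:--\*\*--|$)")

# Sample string copied from the question.
sample = (
    "abc-12d-ef-oy-5678-xyz--**--20190120075439322am--**--"
    "ghi-66d-ef-oy-8877-sdf--**--sfdfdsgfg--**--20190120075765487am"
)


def lazy_fields(text):
    """Return the contents of group 1 for every match (hypothetical helper)."""
    return FIELD_LAZY.findall(text)


print(lazy_fields(sample))
# ['abc-12d-ef-oy-5678-xyz', 'ghi-66d-ef-oy-8877-sdf']

# Limitation: when the first field has no -oy-, the lazy ".*?" keeps
# expanding across the delimiter until it finds one.
print(lazy_fields("aaa--**--b-oy-c--**--ddd"))
# ['aaa--**--b-oy-c']
```

The second result includes the preceding field and the delimiter, because nothing in `.*?` forbids it from crossing `--**--`.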

--- Second stab at this follows ---

This is convoluted. Hopefully you can use lookahead and lookbehind. The last problem I had was that the match for the final record was "greedy" and sucked up the field before it too, so I had to add an exclusion for your delimiter inside the capture group.

See if this works for you.

(?<=--\*\*--|^)((?:(?:(?!--\*\*--).)*)-oy-(?:(?:(?!--\*\*--).)*))(?=--\*\*--|$)

https://regex101.com/r/aUAvcx/3

Basically, the (?: groups are non-capturing, so we do not get too many capture groups to work with.

There are three parts to this:

  1. The lookbehind - Make sure the field is framed by the delimiter (or start of line)
  2. The capture group - Grab the contents of the field, making sure a delimiter isn't sucked up into it
  3. The lookahead - Make sure the field is framed by the delimiter (or end of line)

As far as the capture group goes, I check the left and right side of the -oy- to make sure the delimiter isn't there.
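The three parts described above can be sketched in Python, with one adaptation that is not in the original answer: Python's `re` module only accepts fixed-width lookbehinds, so `(?<=--\*\*--|^)` is replaced here by a non-capturing group `(?:^|--\*\*--)`. The lookahead is kept, so the trailing delimiter is not consumed and adjacent fields can both match. The names `FIELD`, `oy_fields` and `first_two_oy_fields` are hypothetical.

```python
import re

FIELD = re.compile(
    r"(?:^|--\*\*--)"                               # framed on the left by start or delimiter
    r"((?:(?!--\*\*--).)*-oy-(?:(?!--\*\*--).)*)"   # field containing -oy-, never crossing a delimiter
    r"(?=--\*\*--|$)"                               # framed on the right by delimiter or end
)


def oy_fields(text):
    """Return every delimiter-framed field that contains -oy-."""
    return FIELD.findall(text)


def first_two_oy_fields(text):
    """Return the first two -oy- fields as a tuple, or None if fewer than two exist."""
    fields = oy_fields(text)
    return tuple(fields[:2]) if len(fields) >= 2 else None


print(oy_fields("aaa--**--b-oy-c--**--ddd"))
# ['b-oy-c']
print(oy_fields("a-oy-b--**--c-oy-d"))
# ['a-oy-b', 'c-oy-d']
```

Collecting the matches in a list and taking the first two is one way to get both values at once when the engine returns them as separate matches rather than as two groups of a single match.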

1 Comment

Thank you so much @curtis. With your pattern, the required strings come back as two matches, each in the first group of its match. Can we get them into two groups of a single match instead? Also, if my -oy- string is in the second word of the delimited string, it does not match properly.
