I want to extract KEY:VALUE pairs from a string in which KEYs are separated from VALUEs by a colon (:) and pairs are separated by a comma (,). The problem is that VALUEs can themselves contain commas. For example:

category:information technology, computer,publisher:Elsevier (EV),subject:Ecology, Evolution, Behavior and Systematics

In this example the KEYs that must be extracted are category, publisher and subject. The final result must be as follows:

category = information technology, computer
publisher = Elsevier (EV)
subject = Ecology, Evolution, Behavior and Systematics

I tried to write a recursive regex, but it doesn't work:

(category|publisher|subject):(.*?)(?:,(?R)|.?)

Can someone help me solve this problem? Thanks.

1 Answer

Well, if you can add a comma to the end of the string, I think this works:

(\w+):([^:]+),

Edit:

Jonathan Kuhn is totally right: matching either a comma or the end of the string removes the need to append a comma:

(\w+):([^:]+)(?:,|$)

This works on the example string as given, as long as the values themselves contain no colon.
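Here is a minimal Python sketch of how this pattern behaves on the example from the question. Python's `re` module is my assumption, since the question does not name a language, and `parse_pairs` is a hypothetical helper name. The pattern relies on backtracking: `[^:]+` first runs up to the next colon, swallowing the next key, and then gives characters back until a comma or the end of the string follows.

```python
import re

text = (
    "category:information technology, computer,"
    "publisher:Elsevier (EV),"
    "subject:Ecology, Evolution, Behavior and Systematics"
)

# Key: one or more word characters. Value: anything except a colon.
# The value must be followed by a comma or by the end of the string.
PAIR_RE = re.compile(r"(\w+):([^:]+)(?:,|$)")


def parse_pairs(s):
    """Return a dict of key -> value for every pair found in s."""
    # findall yields (key, value) tuples; the (?:...) group is not captured.
    return {key: value for key, value in PAIR_RE.findall(s)}


pairs = parse_pairs(text)
for key, value in pairs.items():
    print(f"{key} = {value}")
```

Keep in mind that this only holds while values never contain a colon and keys consist solely of word characters.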

2 Comments

If you replaced the trailing comma with (?:,|$), it would match a comma or the end of the string in a non-capturing group, eliminating the need to add a comma to the end of the string. regex101.com/r/gG9kF3/1
Thanks, this regex works correctly. Can you also analyse the regex below, which works too: (\b\w+):(.*?(?=,\w+:|$))
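
A rough analysis of that second pattern, again sketched in Python as an assumption about the target language: `.*?` is lazy and grows one character at a time until the lookahead `(?=,\w+:|$)` sees either a comma followed directly by a key and a colon, or the end of the string. A comma followed by a space (as in `, computer`) does not satisfy `,\w+:`, so it stays inside the value. Because the value is not restricted to `[^:]+`, it may also contain colons; the `example.com` input below is made up to illustrate that, and `parse_pairs_lookahead` is a hypothetical helper name.

```python
import re

# Key: word characters starting at a word boundary.
# Value: as few characters as possible, up to a point where the next thing
# is ",key:" or the end of the string (the lookahead consumes nothing).
LOOKAHEAD_RE = re.compile(r"(\b\w+):(.*?(?=,\w+:|$))")


def parse_pairs_lookahead(s):
    """Return a dict of key -> value using the lookahead-based pattern."""
    return dict(LOOKAHEAD_RE.findall(s))


sample = (
    "category:information technology, computer,"
    "publisher:Elsevier (EV),"
    "subject:Ecology, Evolution, Behavior and Systematics"
)
result = parse_pairs_lookahead(sample)

# Made-up input: the first value contains colons of its own.
with_colon = parse_pairs_lookahead("site:http://example.com,publisher:Elsevier (EV)")

print(result)
print(with_colon)
```

The trade-off is that a value containing a comma followed immediately by a word and a colon would be cut at that comma, so this version still depends on the shape of the data.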