I want to extract values from the string below using a regular expression:

"a:4:{i:0;s:24:\"hello \"tejo krishna\"!!!`\";i:1;s:11:\"hello \"xyz\"\";i:2;s:6:\"defeat\";i:3;s:7:\"pattern\";}"

From the string above, I want to extract the text shown in italics. Any help is appreciated.

Thanks,

Regex differs (slightly) in implementation and structure from language to language, so please always mention the language you're interested in. Commented Jul 17, 2016 at 4:24

1 Answer

The exact set of acceptable characters is not clear, and you don't say which language you are using. In Python, with your example, the regex below works. If you expect more kinds of characters in the input, just extend the character classes:

import re

# Capture the text between escaped quotes (\"), allowing one nested
# \"...\" pair and optional trailing ! or ` characters.
myre = re.compile(r'\\"([\sa-zA-Z0-9]+\\?"?[\sa-zA-Z0-9]+\\?"?[!`]*)\\"')
s = r'"a:4:{i:0;s:24:\"hello \"tejo krishna\"!!!`\";'\
    r'i:1;s:11:\"hello \"xyz\"\";i:2;s:6:\"defeat\";i:3;'\
    r's:7:\"pattern\";}"'
matches = myre.findall(s)
print(matches)
# results
# ['hello \\"tejo krishna\\"!!!`', 'hello \\"xyz\\"',
#  'defeat', 'pattern']
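As a rough sketch of what "extending the classes" could look like, the snippet below factors the character class into a variable and adds commas, periods, and hyphens to it. The names `WORD`, `extended`, and `sample`, and the choice of extra characters, are assumptions made for illustration only; adjust them to whatever your real input contains.

```python
import re

# Hypothetical extension: the original letters, digits, and whitespace,
# plus comma, period, and hyphen (the hyphen goes last so it is literal).
WORD = r"[\sa-zA-Z0-9,.-]"

# Same structure as the pattern above, with the wider class substituted in.
extended = re.compile(r'\\"(' + WORD + r'+\\?"?' + WORD + r'+\\?"?[!`]*)\\"')

# Made-up sample input in the same style as the question's string.
sample = r's:13:\"hello, \"a-b\"\";s:4:\"x.yz\";'
found = extended.findall(sample)
print(found)
```

Keeping the class in one place means both occurrences in the pattern stay in sync when you add more characters.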

Note: in Python, the backslash (\) is an escape character, so it is shown escaped when a string is displayed, hence the double backslashes in the output. The backslash is also an escape character in regex, hence the double backslashes in the pattern. Because the pattern is defined as a raw string (note the r prefix in r'...'), Python itself needs no extra escaping; the escaping is only for the regex engine. Without a raw string, you would need four backslashes in a normal string: '\\\\"([\\sa-zA-Z0-9]+\\\\?"?[\\sa-zA-Z0-9]+\\\\?"?[!`]*)\\\\"'. You need to do this if your programming language has no raw strings.
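To make the raw-string point concrete, here is a small check, written with `A-Z` ranges in the classes, that the raw pattern and the fully escaped normal-string pattern are the same text and find the same matches. The variable names and the shortened input `s` are illustrative assumptions, not part of the original example.

```python
import re

# Raw string: backslashes reach the regex engine exactly as typed.
raw_pattern = r'\\"([\sa-zA-Z0-9]+\\?"?[\sa-zA-Z0-9]+\\?"?[!`]*)\\"'

# Normal string: every backslash must be doubled for Python first.
plain_pattern = '\\\\"([\\sa-zA-Z0-9]+\\\\?"?[\\sa-zA-Z0-9]+\\\\?"?[!`]*)\\\\"'

# Both literals should produce the identical pattern text.
same_text = raw_pattern == plain_pattern

# Shortened input in the same format as the question's string.
s = r'i:2;s:6:\"defeat\";i:3;s:7:\"pattern\";'
raw_matches = re.findall(raw_pattern, s)
plain_matches = re.findall(plain_pattern, s)
print(same_text, raw_matches, plain_matches)
```

If the two literals ever differ, comparing them with `==` as above is a quick way to spot a missing backslash.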
