Can anybody help with completing my regex?

I have strings that are formatted like this:

<FC012D>{p:19}Ja?<BF093C> Du möchtest<BC>zur Königin?<BC><BF040027><BF07>{p:20}<F8012D>Hmm...<BF093C><BC>Du bist gekommen um den<BC>Titel Kriegerin<BC>zu erhalten?<BD><BC>Verstehe.<BF093C> Das ist ganz<BC>schön tapfer für so<BC>eine junge Dame.<BD><BC>Die Königin wird sicher<BC>auch sehr<BC>überrascht sein.<BD><BC>{t:19}Bitte sehr,<BC>geh direkt hinein.<BD><FF>{t:20}Treibe Dich hier nicht<BC>herum, wenn Du hier<BC>nichts zu suchen hast!<BD><FF>

I need to split them into an array with preg_match_all to get 3 types of array elements:

  • Strings wrapped in <>
  • Strings wrapped in {}
  • Any other text between those two kinds of match, as separate elements.

Here's what I have so far:

preg_match_all("/<[^>]*>|{(.*?)}|(\(.*?)\)/", $input_lines, $output_array);

I need some help with the last option, capturing strings in between. http://www.phpliveregex.com/p/kdW

  • Do you need <> and {} in the results? What is the expected output? Also, do you need to keep empty items in the resulting array? Commented May 26, 2017 at 9:30
  • @WiktorStribiżew Yes, I need them in the results. Have a look at phpliveregex.com/p/kdW - the only thing missing there is the text between the <> and {} matches. Thank you! Commented May 26, 2017 at 9:41

1 Answer

Use preg_split with the PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY flags and the following regex:

'~(<[^<>]*>|{[^{}]*})~'

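See the regex demo. It matches and captures into Group 1 two types of substrings:

  • <[^<>]*> - a <, followed by zero or more characters other than < and >, and then a >
  • {[^{}]*} - a {, followed by zero or more characters other than { and }, and then a }

PREG_SPLIT_DELIM_CAPTURE includes the captured delimiters (the Group 1 matches) in the resulting array, alongside the text between them. PREG_SPLIT_NO_EMPTY drops the empty strings that would otherwise appear where two delimiters are adjacent or where the input starts or ends with a delimiter.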
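To see what each flag contributes, here is a minimal sketch on a shortened input. The sample string is made up for illustration in the same tag style as the question, and the variable names ($plain, $withDelims, $both) are only placeholders.

```php
<?php
// Shortened, made-up sample in the same tag style as the question's input.
$s  = '<FC012D>{p:19}Ja?<BF093C> Du<BC>';
$re = '~(<[^<>]*>|{[^{}]*})~';

// No flags: delimiters are discarded, and empty strings appear wherever
// two delimiters touch or the string starts/ends with a delimiter.
$plain = preg_split($re, $s);

// PREG_SPLIT_DELIM_CAPTURE only: Group 1 matches are kept, empty strings remain.
$withDelims = preg_split($re, $s, -1, PREG_SPLIT_DELIM_CAPTURE);

// Both flags: delimiters are kept and empty strings are dropped.
$both = preg_split($re, $s, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

print_r($both);
```

Only the last call yields the three kinds of elements the question asks for, in their original order and without empty entries.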

See the PHP demo:

$s = '<FC012D>{p:19}Ja?<BF093C> Du möchtest<BC>zur Königin?<BC><BF040027><BF07>{p:20}<F8012D>Hmm...<BF093C><BC>Du bist gekommen um den<BC>Titel Kriegerin<BC>zu erhalten?<BD><BC>Verstehe.<BF093C> Das ist ganz<BC>schön tapfer für so<BC>eine junge Dame.<BD><BC>Die Königin wird sicher<BC>auch sehr<BC>überrascht sein.<BD><BC>{t:19}Bitte sehr,<BC>geh direkt hinein.<BD><FF>{t:20}Treibe Dich hier nicht<BC>herum, wenn Du hier<BC>nichts zu suchen hast!<BD><FF>';
$res = preg_split('~(<[^<>]*>|{[^{}]*})~', $s, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
print_r($res);
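Since the question started from preg_match_all, the same tokens could probably also be collected that way by adding a third alternative for the text in between. This is a sketch of an alternative, not part of the answer above: the pattern's last branch, [^<{]+, is an assumption that plain text never contains a stray < or {. A stray < or { that does not start a complete tag would be skipped silently here, whereas preg_split would keep it as part of the text.

```php
<?php
// Shortened, made-up sample in the same tag style as the question's input.
$s = '<FC012D>{p:19}Ja?<BF093C> Du möchtest<BC>';

// Three alternatives: a <...> tag, a {...} tag, or a run of text
// containing neither < nor { (assumed: text never holds a stray < or {).
preg_match_all('~<[^<>]*>|{[^{}]*}|[^<{]+~', $s, $m);

// $m[0] holds the full matches in the order they occur in the input.
$tokens = $m[0];
print_r($tokens);
```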

4 Comments

If you need to also match (...) substrings, just add an alternative - '~(<[^<>]*>|{[^{}]*}|\([^()]*\))~'
That's great! Thank you very much @wiktor. I can 100% work with that. I wish you a great weekend!
@WiktorStribiżew Now I understand what you were trying to say... sorry, my bad. Perfect answer; today I learned about PREG_SPLIT_DELIM_CAPTURE, +1 for you.
Dang it, I had family time... I was going to experiment with these flags for this question! Well done @WiktorStribiżew as always.
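
The extended pattern from the first comment, which also treats (...) substrings as delimiters, can be tried out with a sketch like the following. The sample string is made up for illustration.

```php
<?php
// Made-up sample containing all three delimiter styles.
$s  = '<BC>Hello (aside){t:1}world';

// Pattern from the comment: <...>, {...} or (...) captured into Group 1.
$re = '~(<[^<>]*>|{[^{}]*}|\([^()]*\))~';

$parts = preg_split($re, $s, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
print_r($parts);
```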