I have the following input, which is passed to my command line interface as parameters to the executable:

-Parameter1=1234 -Parameter2=38518 -param3 "Test \"escaped\"" -param4 10 -param5 0 -param6 "TT" -param7 "Seven" -param8 "secret" "-SuperParam9=4857?--SuperParam10=123"

What I want is to get all of the parameters into a key-value (associative) array in PHP, like this:

$result = [
    'Parameter1' => '1234',
    'Parameter2' => '38518',
    'param3' => 'Test \"escaped\"',
    'param4' => '10',
    'param5' => '0',
    'param6' => 'TT',
    'param7' => 'Seven',
    'param8' => 'secret',
    'SuperParam9' => '4857',
    'SuperParam10' => '123',
];

The difficulties are the following:

  • a parameter's prefix can be - or --
  • a parameter's glue (value assignment operator) can be either an = sign or whitespace
  • some parameters may sit inside a quoted block, where the separators, glues and prefixes can differ again, e.g. a ? mark as the separator.

Since I'm really bad with RegEx and still learning it, all I have so far is this:

/(-[a-zA-Z]+)/gui

With it I can get all the parameter names starting with a -...
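As a rough illustration of what that attempt captures, here is a minimal sketch on a shortened sample input (the shortened string is an assumption made for brevity). PHP's PCRE functions have no g modifier; preg_match_all() plays that role. The second pattern, using \w and an optional second dash, is only one possible way to widen the match so that digits in the names are kept.

```php
<?php
// Shortened sample input (an assumption for illustration only).
$input = '-Parameter1=1234 -param4 10 "-SuperParam9=4857?--SuperParam10=123"';

// The attempt from above, without the "g" flag (not a PHP modifier);
// preg_match_all() already returns every match.
preg_match_all('/-([a-zA-Z]+)/', $input, $m);
print_r($m[1]);      // names are cut off at the first digit

// \w also accepts digits and underscores, and --? allows one or two dashes.
preg_match_all('/--?(\w+)/', $input, $names);
print_r($names[1]);  // full parameter names
```

This only collects the names; the values still need the glue and quoting rules listed earlier.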

I could explode the entire thing and parse it manually, but there are way too many contingencies to think about.

  • Why are you not using getopt()? Commented Jan 30, 2018 at 17:28
  • It's not passed to a PHP CLI. It's a string which I receive from elsewhere and have to parse into an array. Commented Jan 30, 2018 at 17:29
  • Then change the title of the question. Commented Jan 30, 2018 at 17:30
  • What about -(\w+)[= ](?|"((?:[^"\\]|\\.)*)"|(\d+))? Commented Jan 30, 2018 at 18:00

2 Answers

You can try this pattern, which uses the branch reset feature (?|...|...) to handle the different possible formats of the values:

$str = '-Parameter1=1234 -Parameter2=38518 -param3 "Test \"escaped\"" -param4 10 -param5 0 -param6 "TT" -param7 "Seven" -param8 "secret" "-SuperParam9=4857?--SuperParam10=123"';

$pattern = '~ --?(?<key> [^= ]+ ) [ =]
(?|
    " (?<value> [^\\\\"]*+ (?s:\\\\.[^\\\\"]*)*+ ) "
  |
    ([^ ?"]*)
)~x';

preg_match_all ($pattern, $str, $matches);
$result = array_combine($matches['key'], $matches['value']);
print_r($result);

In a branch reset group, the capture groups have the same number or the same name in each branch of the alternation.

This means that (?<value> [^\\\\"]*+ (?s:\\\\.[^\\\\"]*)*+ ) is (obviously) the named capture value, and that ([^ ?"]*) is also exposed as value, because it shares the same group number.
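To make the numbering behaviour concrete, here is a small sketch that compares an ordinary non-capturing group with a branch reset group. The mini-input and the simplified patterns are assumptions for illustration, not the pattern from the answer.

```php
<?php
// Hypothetical mini-input: one quoted value and one bare numeric value.
$input = 'a="x y" b=42';

// Ordinary alternation: the quoted value is group 2, the bare value group 3.
preg_match_all('~(\w+)=(?:"([^"]*)"|(\d+))~', $input, $plain, PREG_SET_ORDER);

// Branch reset: both alternatives are numbered as group 2.
preg_match_all('~(\w+)=(?|"([^"]*)"|(\d+))~', $input, $reset, PREG_SET_ORDER);

print_r($plain[1]);  // for b=42, the value sits at index 3 and index 2 is empty
print_r($reset[1]);  // for b=42, the value sits at index 2
```

With the branch reset version, the value can always be read from one index (or one name), which is what makes the array_combine() call work without further checks.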

You could use

--?
(?P<key>\w+)
(?|
    =(?P<value>[^-\s?"]+)
    |
    \h+"(?P<value>.*?)(?<!\\)"
    |
    \h+(?P<value>\H+)
)

See a demo on regex101.com.


Which in PHP would be:

<?php

$data = <<<DATA
-Parameter1=1234 -Parameter2=38518 -param3 "Test \"escaped\"" -param4 10 -param5 0 -param6 "TT" -param7 "Seven" -param8 "secret" "-SuperParam9=4857?--SuperParam10=123"
DATA;

$regex = '~
            --?
            (?P<key>\w+)
            (?|
                =(?P<value>[^-\s?"]+)
                |
                \h+"(?P<value>.*?)(?<!\\\\)"
                |
                \h+(?P<value>\H+)
            )~x';

if (preg_match_all($regex, $data, $matches)) {
    $result = array_combine($matches['key'], $matches['value']);
    print_r($result);
}
?>


This yields

Array
(
    [Parameter1] => 1234
    [Parameter2] => 38518
    [param3] => Test \"escaped\"
    [param4] => 10
    [param5] => 0
    [param6] => TT
    [param7] => Seven
    [param8] => secret
    [SuperParam9] => 4857
    [SuperParam10] => 123
)
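Note that param3 still contains the backslashes from the input, which matches the result asked for in the question. If unescaped values are wanted instead (an assumption, the question does not ask for this), a possible follow-up step is to run stripslashes() over the array. The sketch below uses a trimmed, hand-written copy of the result rather than the full output.

```php
<?php
// Trimmed copy of the parsed result (written by hand for illustration).
// In a single-quoted PHP string, \\ denotes one literal backslash.
$result = [
    'param3' => 'Test \\"escaped\\"',
    'param4' => '10',
];

// stripslashes() removes one level of backslash escaping from each value.
$clean = array_map('stripslashes', $result);
print_r($clean);
```

Also keep in mind that array_combine() keeps only the last value when the same key appears more than once in the input.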
