You can use this regex to capture each line:
/^(\d+)\s+(.*)$/m
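The m (multiline) modifier makes ^ and $ match at the start and end of every line rather than only at the start and end of the whole string. On each line, the pattern captures one or more digits, skips one or more whitespace characters (\s matches tabs as well as spaces), then captures everything up to the end of the line.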
Then, with preg_match_all() and the PREG_SET_ORDER flag, you get one entry in $matches per matched line:
preg_match_all( '/^(\d+)\s+(.*)$/m', $input, $matches, PREG_SET_ORDER);
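To make the shape of $matches concrete, here is a minimal sketch. The two-line $input is a hypothetical sample, not the original data from the question.

```php
<?php
// Hypothetical sample input; the real $input comes from the question.
$input = "1 foo\n2 ba_r";

preg_match_all('/^(\d+)\s+(.*)$/m', $input, $matches, PREG_SET_ORDER);

// With PREG_SET_ORDER, each element of $matches is one complete match:
// index 0 is the whole matched line, 1 is the number, 2 is the rest of the line.
var_dump($matches[0] === array('1 foo', '1', 'foo'));
var_dump($matches[1] === array('2 ba_r', '2', 'ba_r'));
```

Without the flag, preg_match_all() defaults to PREG_PATTERN_ORDER, which groups results by capture group instead ($matches[1] would hold all the numbers, $matches[2] all the words).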
Then, you can just parse out the data from the $matches array, like this:
$data = array();
foreach( $matches as $match) {
    // Skip index 0 (the full match); keep the two capture groups.
    list( , $num, $word) = $match;
    $data[] = array( $num, $word);
    // Or, to key the result by number: $data[$num] = $word;
}
Calling print_r( $data); then prints:
Array
(
    [0] => Array
        (
            [0] => 1
            [1] => foo
        )

    [1] => Array
        (
            [0] => 2
            [1] => ba_r
        )

    [2] => Array
        (
            [0] => 3
            [1] => foo bar
        )

    [3] => Array
        (
            [0] => 4
            [1] => fo-o
        )

)
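For completeness, here is a self-contained sketch of the keyed variant mentioned in the code comment ($data[$num] = $word). The sample $input is an assumption reconstructed from the output above, written with Windows line endings on purpose. The rtrim() call is my addition, not part of the original approach: on CRLF input, .* can pick up the trailing "\r", so it is worth stripping if the data might come from a Windows source.

```php
<?php
// Assumed sample input, reconstructed from the printed output above.
// CRLF line endings are used here only to illustrate the "\r" caveat.
$input = "1 foo\r\n2 ba_r\r\n3 foo bar\r\n4 fo-o";

preg_match_all('/^(\d+)\s+(.*)$/m', $input, $matches, PREG_SET_ORDER);

$data = array();
foreach ($matches as $match) {
    list(, $num, $word) = $match;
    // rtrim() drops a trailing "\r" that .* may capture on CRLF input.
    $data[(int) $num] = rtrim($word, "\r");
}

print_r($data);
```

Keying by number makes lookups such as $data[3] direct, but it silently overwrites earlier entries if the same number appears twice, so the list-of-pairs form is the safer choice when duplicates are possible.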