
I need to split this data into an array, block by block. How can I do this? Do I need to use regex? My script gives me errors because I cannot separate the data the way I want. Does anyone have any ideas?

Data:

~0 
11111111
~1 
222222222
~2 
3333333333

        ~end 
~0 
aaaaaaaaaaa
~1 
bbbbbbbbbb
~2 
cccccccccc
~3 
ddddddddddd 

        ~end 



~0 
yyyyyyyyyyy
xxxxxxxx
ffffffffff
~1 
rrrrrrrrrrrr
        ~end 

I need it like this:

Array
(
    [0] => Array
        (
            [0] => 11111111
            [1] => 222222222
            [2] => 3333333333
        )

    [1] => Array
        (
            [0] => aaaaaaaaaaa
            [1] => bbbbbbbbbb
            [2] => cccccccccc
            [3] => ddddddddddd
        )

    [2] => Array
        (
            [0] => yyyyyyyyyyy
xxxxxxxx
ffffffffff
            [1] => rrrrrrrrrrrr
        )

)

My code (it fails):

$texto = "~0 
11111111
~1 
222222222
~2 
3333333333

        ~end 
~0 
aaaaaaaaaaa
~1 
bbbbbbbbbb
~2 
cccccccccc
~3 
ddddddddddd 

        ~end 



~0 
yyyyyyyyyyy
xxxxxxxx
ffffffffff
~1 
rrrrrrrrrrrr
        ~end";

preg_match_all("/(?ms)^~0.*?~end/", $texto, $coincidencias);

foreach ( $coincidencias[0] as $bloque ){
    preg_match_all("/\~.*\n/", $bloque, $sub_bloques);
    $hola[] = $sub_bloques;
}
  • I'm not sure I understood the requirements correctly; could you please confirm? "Each non-empty line NOT starting with the character ~ should be one entry in the array" Commented Dec 21, 2016 at 15:31
  • @Dragos Everything from "~0" to "~end" is one block (there are 3 blocks now), and within each block the text under ~0, ~1, ~2 goes to its own array position (text only) Commented Dec 21, 2016 at 15:32
  • I'd rather work in 2 steps: 1. $level1 = explode('~end', $data) 2. foreach ($level1 as $subItem) { $matches = preg_match_all('^(\w*)$', $subItem) } Commented Dec 21, 2016 at 15:41
  • @Dragos print_r($matches) => 0 Commented Dec 21, 2016 at 16:01

2 Answers


Here is one non-regex way: split the string into lines and iterate over them. Check for the conditions you've specified and add each line to a sub-array if it meets the conditions. Then when you get to an ~end line, append the sub-array to the main array.

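$sub_bloques = [];
$hola = [];

foreach (array_map('trim', explode("\n", $texto)) as $line) {
    // Keep non-empty lines that are not markers (~0, ~1, ..., ~end).
    if ($line !== '' && substr($line, 0, 1) !== '~') {
        $sub_bloques[] = $line;
    }
    // An ~end line closes the current block.
    if ($line === '~end') {
        $hola[] = $sub_bloques;
        $sub_bloques = [];
    }
}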
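
The question's third block expects the consecutive lines under one marker to end up as a single entry, which the loop above does not do. A minimal sketch of one way to adapt it, assuming that every ~N marker starts a new entry and that ~end closes the block; parse_blocks and its variable names are hypothetical and not part of the original answer:

```php
<?php
// Hypothetical helper (not from the original answer): groups lines by block and
// joins consecutive content lines under one ~N marker into a single entry.
function parse_blocks(string $texto): array
{
    $bloques = [];   // finished blocks
    $bloque = [];    // entries of the block being read
    $entrada = null; // lines of the open entry, or null when no entry is open

    foreach (preg_split('/\R/', $texto) as $linea) {
        $linea = trim($linea);
        if ($linea === '') {
            continue; // skip blank lines
        }
        if ($linea[0] === '~') {
            // Any marker closes the open entry, if it collected content.
            if ($entrada) {
                $bloque[] = implode("\n", $entrada);
            }
            $entrada = [];
            if ($linea === '~end') {
                // ~end also closes the block itself.
                $bloques[] = $bloque;
                $bloque = [];
                $entrada = null;
            }
            continue;
        }
        if ($entrada !== null) {
            $entrada[] = $linea; // content line belonging to the open entry
        }
    }
    return $bloques;
}

$texto = "~0 \nyyyyyyyyyyy\nxxxxxxxx\nffffffffff\n~1 \nrrrrrrrrrrrr\n        ~end";
print_r(parse_blocks($texto));
```

Lines that appear before the first marker of a block are ignored in this sketch; whether that is the right behavior depends on the real data.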

For a regex solution, start by exploding on ~end to break the main text into sections, then preg_match_all on the sections to find lines that meet your conditions.

$result = [];

foreach (explode('~end', $texto, -1) as $section) {
    // Capture the first word of each line that does not start with ~.
    preg_match_all('/\n *(?!~)(\w+)/', $section, $matches);
    if ($matches[1]) $result[] = $matches[1];
}

(?!~) is a negative lookahead that excludes lines starting with ~. Maybe there's some way to do the whole thing with one big cool regex, but I'm not that good at it.
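
To see what the (?!~) assertion does in isolation, here is a small sketch that runs the same pattern on a shortened input. The $section value is made up for illustration and is not taken from the question:

```php
<?php
// Illustrative input only: two marker lines, each followed by one content line.
$section = "~0 \n11111111\n~1 \n222222222\n";

// (?!~) is zero-width: it succeeds only when the next character is not "~",
// so the lines "~0" and "~1" are never captured by (\w+).
preg_match_all('/\n *(?!~)(\w+)/', $section, $matches);

print_r($matches[1]);
```

The assertion consumes no characters, so (\w+) starts matching at the same position where the check was made.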


5 Comments

This is great, but (~0 yyyyyyyyyyy xxxxxxxx ffffffffff) is one text, not separate lines; the text contains more \n ...
I'm sorry, I'm not sure I understand what you mean. Would you mind trying to explain it to me a bit more?
In the last block, ~0 yyyyyyyyyyy xxxxxxxx ffffffffff is 1 text, not 3 positions
@dont-panic [2] => Array ( [0] => yyyyyyyyyyy xxxxxxxx ffffffffff [1] => rrrrrrrrrrrr )
Hmm, okay. I thought that's what you meant, but that's not what I'm getting. 3v4l.org/ro6Kb

Because you want your sub-blocks grouped by block in the output array, the method needs two steps. The reason is that your blocks contain differing numbers of sub-blocks, and a single regex pattern has a fixed number of capture groups, so it cannot accommodate this variability.

Code:

// This delivers the sub-blocks in their relative blocks as requested in the OP
foreach (preg_split('/\s+~end\s*/',$texto) as $bloque) {
    if(preg_match_all('/(?:\~\d+\s+)\K.+?(?:\s+\S+)*?(?=\s+\~|$)/',$bloque,$sub_bloques)){
        $hola[]=$sub_bloques[0];
    }
}
var_export($hola);

Output (reformatted and condensed to save space on this page):

array(
    array('11111111','222222222','3333333333'),
    array('aaaaaaaaaaa','bbbbbbbbbb','cccccccccc','ddddddddddd'),
    array('yyyyyyyyyyy
xxxxxxxx
ffffffffff','rrrrrrrrrrrr')
)

Alternatively, if you want all sub-blocks listed in a one-dimensional array (not divided by blocks), the output array can be built in one step:

if(preg_match_all("/(?:\~\d+\s*)\K.+?(?:\s+\S+)*?(?=\s+\~)/s", $texto, $coincidencias)){
    var_export($coincidencias[0]);
}

1-dim output:

array (
    0 => '11111111',
    1 => '222222222',
    2 => '3333333333',
    3 => 'aaaaaaaaaaa',
    4 => 'bbbbbbbbbb',
    5 => 'cccccccccc',
    6 => 'ddddddddddd',
    7 => 'yyyyyyyyyyy
xxxxxxxx
ffffffffff',
    8 => 'rrrrrrrrrrrr',
)
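
The same one-dimensional list can also be obtained by flattening the nested result of the two-step method. A minimal sketch, assuming $hola holds the nested array shown earlier in this answer; $plano is a hypothetical variable name:

```php
<?php
// Nested result copied from the two-step output shown earlier in this answer.
$hola = [
    ['11111111', '222222222', '3333333333'],
    ['aaaaaaaaaaa', 'bbbbbbbbbb', 'cccccccccc', 'ddddddddddd'],
    ["yyyyyyyyyyy\nxxxxxxxx\nffffffffff", 'rrrrrrrrrrrr'],
];

// Spread the blocks into array_merge() to concatenate them in order.
// Note: if $hola could be empty, array_merge() without arguments needs PHP 7.4+.
$plano = array_merge(...$hola);

var_export($plano);
```

This keeps a single parsing pass and derives the flat view from it, which may be easier to maintain than two separate patterns.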

1 Comment

@VictorMoscosoLembcke If my answer satisfies, please award it the green tick (and potentially upvote it for being helpful). If something isn't quite right, please explain with a comment to me and I'll try to fix it up.
