
I am looking for a way to create new arrays in a loop. Not the values, but the array variables. So far, it looks like it's impossible or complicated, or maybe I just haven't found the right way to do it.

For example, I have a dynamic number of values I need to append to arrays.
Let's say it will be 200 000 values. I cannot assign all of these values to one array, for memory reasons on the server, so just skip this part.
I can assign a maximum of 50 000 values per array.
This means I will need to create 4 arrays to fit all the values in different arrays. But next time, I will not know how many values I need to process.

Is there a way to generate the required number of arrays based on a fixed capacity per array and the number of values?
Or must each array be declared manually, with no workaround?

What I am trying to achieve is this:

$required_number_of_arrays = ceil(count($data)/50000);

for ($i = 1; $i <= $required_number_of_arrays; $i++) {

    $new_array$i = array();

    foreach ($data as $val) {
        $new_array$i[] = $val;
    }

}

// Created arrays: $new_array1, $new_array2, $new_array3
  • You may need to start using php.net/manual/en/language.generators.overview.php to overcome memory limitations (>= 5.5.0). Commented Mar 7, 2018 at 21:59
  • @encrypted21 You can retrieve large amounts of data from a database using generators, an example in PDO Commented Mar 7, 2018 at 22:09
  • The concept of generators is that when yielding the next value, the previous one is removed from memory. This way you can easily read a 10 GB text file while only holding one line of it in memory at a time, which is unlikely to ever breach the maximum allotted memory. Commented Mar 7, 2018 at 22:19
  • @encrypted21 Keep in mind, the yielded data from generators needs to be processed immediately. Storing that data in another array will negate the effect it is having. Commented Mar 7, 2018 at 22:39
  • If you have issues building your XML due to memory too, what you can do is have a single document you use as your "scratch pad" for building the XML, then perform a $doc->saveXML($node) using the DOMNode reference to get just that inner XML string, and use fwrite (append) to add those entries to your output file. Once finished, close the outer XML element manually. Commented Mar 7, 2018 at 23:14

6 Answers

1

A possible way to do this is to extend ArrayObject. You can build in a limit on how many values may be assigned, though this means you need to build a class instead of writing $new_array$i = array();
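For illustration, a rough sketch of that idea might look like the following (the class name, the 50 000 limit constant, and the choice of OverflowException are mine, not from this answer):

class LimitedArray extends ArrayObject
{
    const MAX_SIZE = 50000;

    public function offsetSet($key, $value): void
    {
        // block appends ($limited[] = $val) once the limit is reached
        if ($key === null && $this->count() >= self::MAX_SIZE) {
            throw new OverflowException('This array is full, create a new one');
        }
        parent::offsetSet($key, $value);
    }
}

$limited = new LimitedArray();
$limited[] = 'some value';   // throws OverflowException after 50 000 appends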

However, it might be better to look into generators; Scuzzy beat me to that punchline.

The concept of generators is that with each yield, the previous value is no longer accessible unless you loop over the data again. It is, in a way, overwritten, unlike with arrays, where you can always go back to earlier indexes such as $data[4].

This means you need to process the data directly. Storing the yielded data into a new array will negate its effects.

Fetching huge amounts of data is no issue with generators, but one should understand the concept before using them.
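A minimal sketch of what that can look like, assuming the values come from a file; read_values() and values.txt are made-up names for illustration:

function read_values($file)
{
    $handle = fopen($file, 'rb');
    while (($line = fgets($handle)) !== false) {
        yield trim($line);   // only the current line is held in memory
    }
    fclose($handle);
}

foreach (read_values('values.txt') as $val) {
    // process $val right away; collecting it into another array
    // would negate the memory benefit, as described above
}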


4 Comments

Hmm, maybe you're right. As @Scuzzy said, it actually could solve the problem without the use of regular arrays. The problem is that this data causes a single array to exceed the memory limit. I can override this with ini_set, but that's not a solution, since it will just drastically abuse the server. I wanted to process each array, destroy it, and then let the loop build the remaining arrays. But generators sound like the real solution.
Let me know how it works out. I usually script/program as efficiently as possible and have never had these issues to resolve, so I'm curious as well. But I suspect it will resolve the situation.
Sure, I'll let you know, probably tomorrow when I'll be coding again :)
I was able to properly test everything only a few days ago, so I'm reporting now. Generators have worked :)
1

Based on your comments, it sounds like you don't need separate array variables. You can reuse the same one. When it gets to the max size, do your processing and reinitialize it:

$max_array_size = 50000;

$n = 1;
$new_array = [];

foreach ($data as $val) {
    $new_array[] = $val;

    if ($max_array_size == $n++) {
        // process $new_array however you need to, then empty it
        $new_array = [];
        $n = 1;
    }
}
if ($new_array) {
    // process the remainder if the last chunk is smaller than the max size
}

3 Comments

My idea was to generate an array with the first chunk of values, process it in that loop, then get rid of that variable to free the memory, and let the loop repeat to process the remaining arrays. Since generators have been suggested, I will try them out, as they do solve the memory problem.
Oh, well if that's the case you wouldn't really need separate array variables. You could just reuse the same one.
Thanks, I will try this method and decide which works best for my code :)
0

You could create an array and use extract() to get variables from this array:

$required_number_of_arrays = ceil(count($data)/50000);
$new_arrays = array();
for ($i = 1; $i <= $required_number_of_arrays; $i++) {
    // take the next slice of up to 50000 values for this array
    $new_arrays["new_array$i"] = array_slice($data, ($i - 1) * 50000, 50000);
}
extract($new_arrays);

print_r($new_array1);
print_r($new_array2);
//...

1 Comment

I'm guessing that this would leave a huge memory imprint on PHP's variable table. Not just that, all the values still reside in memory. Not resolving the problem, but only making it worse. But again, I'm just guessing, looking at it logically.
0

I think in your case you have to create an array that holds all your generated arrays inside it.

So first, declare a variable before the loop:

$global_array = [];

Inside the loop you can generate the name and fill that array:

$global_array["new_array$i"][] = $val;

After the loop you can work with that array. But I think in the end that won't fix your memory limit problem. If you fill 5 arrays with 200k entries in total, it is the same as filling one array with 200k entries; the amount of data is the same. So it's possible that you run over the memory limit both ways. If you can't change the limit, that could be a problem.

ini_set('memory_limit', '-1');

So you can only prevent that problem by processing your values directly, without storing everything in an array first. For example, if you run a DB query, process the values as they come in and save only the result.
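As a rough sketch of that approach (the connection details, table, and column names are made up here), something like this keeps only the running result in memory instead of all the rows:

$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass', [
    PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => false,   // stream rows instead of buffering them all
]);
$stmt = $pdo->query('SELECT value FROM measurements');

$sum = 0;
while (($row = $stmt->fetch(PDO::FETCH_ASSOC)) !== false) {
    $sum += $row['value'];   // only the aggregate is kept, not the rows themselves
}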

You can try something like this:

foreach ($data as $key => $val) {
    $new_array[] = $val;
    // remove the value from the original array to free that memory
    unset($data[$key]);
}

Then your value is stored in a new array and you delete the value from the original data array. After 50k values you have to create a new array.

An easier way: use array_chunk() to split your array into parts.

https://secure.php.net/manual/en/function.array-chunk.php
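A minimal example of how that could look; note that array_chunk() builds all the chunks up front, so the full data set still sits in memory until the chunks are freed:

$chunks = array_chunk($data, 50000);   // array of arrays, each with at most 50 000 values

foreach ($chunks as $i => $chunk) {
    // process $chunk here, then drop it to release its memory
    unset($chunks[$i]);
}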

3 Comments

Thanks for your answer, however, this is exactly the thing: it will not solve the problem. What I wanted to do is create smaller arrays, process one, destroy it, and let the loop generate the other ones, and so on. It's a bad idea to override the memory limit since it will only overload the server.
Then run through your $data array, save each value in the new array, and pop the value out of the main array. I updated my answer.
Won't it just move all the memory-exceeding data to the new array? I could move 50 000 values to a new array, but it's too complicated to implement. I think generators are a great idea: they process each value and then discard the previous one, so only one value is kept in memory.
0

There's no need for multiple variables. If you want to process your data in chunks, so that you don't fill up memory, reuse the same variable. The previous contents of the variable will be garbage collected when you reassign it.

$chunk_size = 50000;

for ($i = 0; $i < $data_size; $i += $chunk_size) {
    $new_array = array();
    // collect one chunk of at most $chunk_size items
    for ($j = $i; $j < min($i + $chunk_size, $data_size); $j++) {
        $new_array[] = get_data_item($j);
    }
    // process $new_array here before the next pass replaces it
}

If you want to keep every chunk around at once, $new_array[$i] (an array of arrays) serves the same purpose as your proposed $new_array$i.
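For example, assuming $data has sequential integer keys, something like this builds that structure without variable variables (intdiv() needs PHP 7; this is only an illustration):

$new_array = [];
foreach ($data as $index => $val) {
    $new_array[intdiv($index, 50000)][] = $val;   // $new_array[0], $new_array[1], ... hold up to 50 000 values each
}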

7 Comments

Thank you, but it is the same as appending all the data into one single array. The memory usage won't shrink by implementing variable variables.
This is not one single array for all the data. The top-level array is distinct from the arrays that it refers to in its elements.
What is $data? You write ceil($data/50000), so it must be a number. But then you write foreach ($data as $val) so it must be an array. Which is it?
And that foreach loop is no different from writing $new_array$i = $data;, it just makes a copy of the array.
If you're going to destroy the array after processing it, why do they need to be in different variables? Use the same variable and just reinitialize it, which will free up the memory from the previous array.
0

You could do something like this:

$required_number_of_arrays = ceil(count($data)/50000);
for ($i = 1; $i <= $required_number_of_arrays; $i++) {
    $array_name = "new_array_$i";
    $$array_name = [];
    foreach ($data as $val) {
        ${$array_name}[] = $val;
    }
}

4 Comments

I don't quite understand how it works or what exactly it does :D
It's a "variable variable". Basically, $$array_name takes the value of $array_name and converts it to a variable name. So in the first pass, $$array_name will be equivalent to $new_array_1, in the second pass it'll be equivalent to $new_array_2, and so on. Check php.net/manual/en/language.variables.variable.php
I will stick with generators for now and will also try emptying one single array in the process as suggested. Your suggestion looks a little more confusing to me as I'm not very familiar with using double $ etc. But if anything, I will try it too :) Thank you!
Variable variables are, imho, something to avoid. Quite frankly, you never know which variables are defined when reading the source code, nor can any IDE help you figure that out. It also allows for deformed names: $0 is not allowed in PHP unless defined using this method. Nor will it help with the memory issue, since all of the values are still stored in PHP's memory.
