11

I am trying to split an array into n parts. Sometimes these parts are of the same size, sometimes they are of a different size.

I am trying to use:

split = np.split(list, size)

This works fine when size divides the length of the list evenly, but fails otherwise. Is there a way to do this which will 'pad' the final array with the extra 'few' elements?

4 Answers

39

Are you looking for np.array_split? Here is the docstring:

Split an array into multiple sub-arrays.

Please refer to the ``split`` documentation.  The only difference
between these functions is that ``array_split`` allows
`indices_or_sections` to be an integer that does *not* equally 
divide the axis.

See Also
--------
split : Split array into multiple sub-arrays of equal size.

Examples
--------
>>> x = np.arange(8.0)
>>> np.array_split(x, 3)
    [array([ 0.,  1.,  2.]), array([ 3.,  4.,  5.]), array([ 6.,  7.])]

http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.array_split.html
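
To make the difference concrete, here is a small sketch contrasting the two functions on an 8-element array. np.split raises a ValueError when the section count does not divide the length, whereas np.array_split returns l % n sub-arrays of size l//n + 1 followed by the rest of size l//n. The variable names are illustrative only.

```python
import numpy as np

x = np.arange(8.0)

# np.split insists on equal sections and raises ValueError otherwise.
try:
    np.split(x, 3)
    split_failed = False
except ValueError:
    split_failed = True

# np.array_split tolerates the remainder: the first len(x) % 3 chunks
# get one extra element each.
parts = np.array_split(x, 3)
sizes = [len(p) for p in parts]
print(split_failed, sizes)  # True [3, 3, 2]
```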


3
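import numpy as np

def split_padded(a, n):
    # Pad with zeros until len(a) is a multiple of n, then split into n equal parts.
    padding = (-len(a)) % n
    return np.split(np.concatenate((a, np.zeros(padding))), n)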
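
A note on the function above: because np.zeros defaults to float64, integer input comes back as floats. If that matters, one possible variant uses np.pad, which keeps the input dtype. The name split_padded_same_dtype is hypothetical (it is not part of the original answer), and the sketch assumes a one-dimensional input.

```python
import numpy as np

def split_padded_same_dtype(a, n):
    # Hypothetical variant: pad a 1-D array with trailing zeros of its own dtype
    # so its length becomes a multiple of n, then split into n equal parts.
    a = np.asarray(a)
    padding = (-len(a)) % n
    return np.split(np.pad(a, (0, padding)), n)

parts = split_padded_same_dtype(np.arange(10), 4)
for p in parts:
    print(p)
```

With ten elements and n = 4, two zeros are appended, so the last part ends in padding rather than data.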

3 Comments

what is what in this answer?
It doesn't work if the sizes are big.
Writing an answer without any explanation isn't recommended practice on Stack Overflow; I would implore you to learn how to write a good answer.
2

Short method: use numpy.array_split instead of numpy.split.

But the best way to go about this is to compute the split indices yourself and split at those positions:

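def cs(chunksize, fs):
    # Return the interior split indices: chunksize, 2*chunksize, ... below fs.
    # np.split places whatever remains after the last index in the final chunk,
    # so no index at or beyond fs is needed (it would yield an empty trailing chunk).
    return list(range(chunksize, fs, chunksize))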

Pass the list returned by this function as the indices argument of np.split:

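# df, output_folder and split_name come from the caller; fs is assumed to be
# the number of rows in df. enumerate numbers the output files from 0.
for i, chunk in enumerate(np.split(df, cs(chunksize, fs))):
    chunk.to_excel('{}/{}_{:02d}.xlsx'.format(output_folder, split_name, i), index=False)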
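
A quick way to check the split points without writing any files is to run the same idea on a plain NumPy array. The sketch below uses a hypothetical helper, chunk_indices, that returns only the interior boundaries; np.split then places the leftover elements in the last chunk.

```python
import numpy as np

def chunk_indices(chunksize, total):
    # Hypothetical helper: interior boundaries chunksize, 2*chunksize, ...
    # strictly below total.
    return list(range(chunksize, total, chunksize))

data = np.arange(10)
chunks = np.split(data, chunk_indices(3, len(data)))
print([len(c) for c in chunks])  # [3, 3, 3, 1]
```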

1

You can split arrays into unequal chunks by passing the split indices as a list. Example:

>>> x = np.arange(10)
>>> x
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.array_split(x, [4])
[array([0, 1, 2, 3]), array([4, 5, 6, 7, 8, 9])]
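
If you know the desired chunk sizes rather than the split positions, the indices can be derived with a cumulative sum. This is only a small sketch; the sizes list is made up for illustration.

```python
import numpy as np

x = np.arange(10)
sizes = [2, 3, 5]  # illustrative chunk sizes; they sum to len(x)

# The split positions are the running totals, excluding the final one.
indices = np.cumsum(sizes)[:-1]
chunks = np.array_split(x, indices)
print([c.tolist() for c in chunks])
```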
