
I would like to automatically create n NumPy arrays in Python from the columns of my pandas dataframe. I can do this manually, for example:

numpy_array_1 = data_frame.column_1.values
numpy_array_2 = data_frame.column_2.values
...
numpy_array_n = data_frame.column_n.values

But I do not know how to write code that creates those arrays automatically.

4 Answers


You can simply use a for loop to go through the columns. Remember that list(data_frame) returns a list of the column names in the dataframe:

np_array = []
for i in list(data_frame):
    np_array.append(data_frame[i].values)

The result is a list of arrays, where each array's position matches the position of its column in the dataframe. You can also turn it into a tuple, or build a dictionary instead. Dictionary example:

np_array_dict = {}
for i in list(data_frame):
    np_array_dict[i] = data_frame[i].values
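
The same dictionary can be built in one expression. This is only a minimal sketch: the example frame and its column names are placeholders, and it uses to_numpy(), which the pandas documentation recommends over .values.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame; the column names are placeholders.
data_frame = pd.DataFrame({"column_1": [1, 2], "column_2": [3.0, 4.0]})

# One NumPy array per column, keyed by the column name.
np_array_dict = {name: data_frame[name].to_numpy() for name in data_frame.columns}

# Look up an array by its column name.
first = np_array_dict["column_1"]
```

Because each column is converted on its own, every array keeps the dtype of its column.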

1 Comment

Thank you for your answer. A solution with a dictionary is perfect for my needs.

You can get a matrix of all the dataframe's row and column values simply with df.values. Do you really need a distinct array for each column?
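
As a rough illustration of working with that single matrix (the frame and the variable names here are made up for the example), individual columns can still be pulled out by indexing or by transposing:

```python
import pandas as pd

# Hypothetical example frame.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# 2D array with one row per dataframe row and one column per dataframe column.
matrix = df.to_numpy()

# Second column of the dataframe as a 1D array.
col_b = matrix[:, 1]

# After transposing, row i holds the values of column i.
per_column = matrix.T
```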

1 Comment

No, it will be enough to have the arrays inside one array, as in the example by @Celius Stingher.

Suppose we have a simple df:

import numpy as np
import pandas as pd

df = pd.DataFrame({"0":[1,2], "1":[3,4]})
df 
   0  1
0  1  3
1  2  4

Then you can run:

for (key,value) in df.to_dict("list").items():
    exec("numpy_array_{} = np.array({})".format(key, value))

You'll get:

numpy_array_0
array([1, 2])

numpy_array_1
array([3, 4])

and so on.

Alternatively:

for col in list(df):
    # {!r} inserts the quoted column name, so this also works for non-numeric names
    exec("numpy_array_{} = df[{!r}].values".format(col, col))

1 Comment

Thank you for your answer. I will use this solution for another case.

This can be done without using loops:

import numpy as np
import pandas as pd

df = pd.DataFrame({"0":[1,2], "1":[3,4], "2":[5,6]})
print(df)

   0  1  2
0  1  3  5
1  2  4  6

and then:

[*np.transpose(df.values)]

results in:

[array([1, 2]), array([3, 4]), array([5, 6])]

and if a dictionary is desired one just needs to proceed as follows:

dict(zip(range(df.shape[1]), [*np.transpose(df.values)]))

which gives:

{0: array([1, 2]), 1: array([3, 4]), 2: array([5, 6])}
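
If the keys should be the column names rather than their positions, a variant along these lines should work. It is a sketch only, and the second frame is an invented example to show one caveat: converting the whole dataframe at once gives every array the same common dtype, so mixed column types end up as object arrays.

```python
import pandas as pd

df = pd.DataFrame({"0": [1, 2], "1": [3, 4], "2": [5, 6]})

# Pair each column name with the matching row of the transposed matrix.
arrays_by_name = dict(zip(df.columns, df.to_numpy().T))

# Hypothetical frame with mixed column types, to illustrate the dtype caveat.
mixed = pd.DataFrame({"n": [1, 2], "s": ["x", "y"]})
whole_dtype = mixed.to_numpy().dtype        # common dtype for the whole matrix
column_dtype = mixed["n"].to_numpy().dtype  # dtype of the integer column alone
```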
