
Edit: As explained in @floydian's comment below, the problem was that calling a = np.array(a, dtype=d) on the existing 2x2 array creates a doubled array (a['x'] and a['y'] each hold all four values), which is what caused the errors.

I am aware that this has already been asked multiple times, and in fact I am looking at the answer to "Creating a Pandas DataFrame with a numpy array containing multiple types" right now. But I still have a problem with the conversion. It must be something very simple that I am missing, and I hope someone can point it out. Sample code below:

import numpy as np
import pandas as pd

a = np.array([[1, 2], [3, 4]])
d = [('x','float'), ('y','int')]
a = np.array(a, dtype=d)

# Try 1
df = pd.DataFrame(a)
# Result - ValueError: If using all scalar values, you must pass an index

# Try 2
i = [1, 2]
df = pd.DataFrame(a, index=i)
# Result - Exception: Data must be 1-dimensional
  • What is your expected output? Commented Mar 6, 2018 at 19:41
  • DataFrame with specified column names and data types Commented Mar 6, 2018 at 19:42
  • I mean, yeah, but what does it look like? Is the first column int and the second float? Commented Mar 6, 2018 at 19:44
  • For example, are you looking for pd.DataFrame(a.ravel())? Commented Mar 6, 2018 at 19:44
  • oh right, yes that's the idea: the first column int and the second float Commented Mar 6, 2018 at 19:45

2 Answers

I would define the array like this:

a = np.array([(1, 2), (3, 4)], dtype=[('x','float'), ('y', 'int')])
pd.DataFrame(a)

gets what you want.
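If the data already exists as a plain 2-D array, as in the question, one possible way to get the same result is to turn each row into a tuple before applying the structured dtype. This is only a sketch: the name records is introduced here for illustration, and the list comprehension is an assumption about how the rows could be converted, not something taken from the answer.

```python
import numpy as np
import pandas as pd

a = np.array([[1, 2], [3, 4]])            # plain 2-D integer array, as in the question
d = [('x', 'float'), ('y', 'int')]

# One tuple per row -> NumPy builds a 1-D structured array with one record per row
records = np.array([tuple(row) for row in a], dtype=d)

df = pd.DataFrame(records)                # field names become column names
print(df)
print(df.dtypes)                          # x is a float column, y is an integer column
```

Because records is 1-D, pandas treats each record as a row, which avoids both errors shown in the question.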


3 Comments

Unfortunately in my original problem the array is defined in the way I did it above.
a = np.array(a, dtype=d) creates two 2x2 arrays, a['x'] with float as dtype applied to all values and a['y'] with int as dtype applied to all values. I am guessing this is the reason why you're getting all those errors.
That was indeed the reason. Thank you for pointing me in the right direction.
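The behaviour described in the comment above can be checked with a short sketch. The variable name s is introduced here only so that the original array a stays untouched.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
d = [('x', 'float'), ('y', 'int')]

# Casting a plain 2-D array to a structured dtype keeps the 2x2 shape;
# every scalar is copied into both fields of its element.
s = np.array(a, dtype=d)

print(s.shape)   # still 2-D, which is why pandas complains
print(s['x'])    # all four values, stored as float
print(s['y'])    # all four values, stored as int
```

Since s is two-dimensional, pd.DataFrame cannot interpret it as a sequence of records, which is consistent with the "Data must be 1-dimensional" error in the question.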

One option to separate it after the fact could be e.g.

pd.DataFrame(a.astype("float32").T, columns=a.dtype.names).astype({k: v[0] for k, v in a.dtype.fields.items()})

Out[296]: 
     x  y
0  1.0  3
1  2.0  4
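A variant that does not rely on casting the whole structured array to a plain float type is sketched below. It assumes the array was built as in the question, so every field holds the full 2x2 data and any single field can serve as the source. Note that this version does not transpose, so each row of the original data becomes a DataFrame row, which gives a different layout from the output above.

```python
import numpy as np
import pandas as pd

a = np.array([[1, 2], [3, 4]])
d = [('x', 'float'), ('y', 'int')]
a = np.array(a, dtype=d)                  # the 2x2 structured array from the question

names = a.dtype.names                     # ('x', 'y')

# Take one field as a plain 2-D array, then cast each column to its field's dtype
df = pd.DataFrame(a[names[0]], columns=names).astype(
    {name: a.dtype[name] for name in names}
)
print(df)
print(df.dtypes)
```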
