
I have the following numpy arrays, which have different shapes. I want to use pandas to create a dataframe so that I can display them neatly, as shown below:

numpy arrays:

et_arr:  [  8.94668401e+01   1.66449935e+01  -4.44089210e-14]
ea_arr:  [ 100.           21.84087363    1.04031209]
it: 
[[ 0.1728      1.0688      1.4848      1.6008    ]
 [ 1.36746667  1.62346667  1.63946667  0.        ]
 [ 1.64053333  1.64053333  0.          0.        ]
 [ 1.64053333  0.          0.          0.        ]]

resulting dataframe:

[image: desired dataframe layout, not reproduced]

One way is to loop over all 3 arrays and collect the values by index. I have tried numpy.column_stack, zip and map to some extent, but did not get the desired result.

I have always used a pandas dataframe to display results, and it was easy. This one seems a little tricky. How can I achieve this?

  • What information do you have in your data that tells you it's the first entry (column 1) in ea_arr that is missing? Commented Dec 23, 2017 at 18:35
  • It's 100% by default. ea stands for error approximation, so as you can see my ea_arr[0] is 100. Commented Dec 23, 2017 at 19:05

1 Answer

If you put the arrays into a dict called data, you can loop over its keys and build up the dataframe as you go:

import pandas as pd

data = {"et_arr":[8.94668401e+01,1.66449935e+01,-4.44089210e-14],
        "ea_arr":[100.,21.84087363,1.04031209],
        "it":[[0.1728,1.0688,1.4848,1.6008],
              [1.36746667,1.62346667,1.63946667,0.],
              [1.64053333,1.64053333,0.,0.],
              [1.64053333,0.,0.,0.]]}

# To keep track of the order of dict indices we'll capture them as we loop:
indices = []
df = pd.DataFrame()

for k in data.keys():
    df = pd.concat([df, pd.DataFrame(data[k]).T], ignore_index=True).fillna(0)
    if k == "it":
        indices.extend([f"n={i+1}" for i in range(len(data[k]))])
    else:
        indices.append(k)

df.index = indices
df.columns = df.columns + 1

df
                1          2             3         4
et_arr   89.46684  16.644994 -4.440892e-14  0.000000
ea_arr  100.00000  21.840874  1.040312e+00  0.000000
n=1       0.17280   1.367467  1.640533e+00  1.640533
n=2       1.06880   1.623467  1.640533e+00  0.000000
n=3       1.48480   1.639467  0.000000e+00  0.000000
n=4       1.60080   0.000000  0.000000e+00  0.000000

Alternatively, you can mash it all together by hand, but that's less scalable:

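# it, et_arr and ea_arr are the arrays from the question.
df = pd.DataFrame(it)
arr_df = pd.DataFrame([et_arr,ea_arr])
# Shorter rows become NaN after concat; fill those gaps with 0.
df = pd.concat([df, arr_df], ignore_index=True).fillna(0)
df.columns = range(1,5)
df.columns.name = "iter"
df.index = ["n=1","n=2","n=3","n=4","et","ea"]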

2 Comments

Is there a way to display nothing at all instead of using fillna(0)? It gives the wrong impression that et, ea and the others are 0.0%, whereas in reality those values never needed to be calculated.
You can use fillna(''), but note that this will convert any column with NaN values to type object (as '' is a string and not a number).
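
To illustrate the dtype point, here is a minimal sketch on a small made-up frame (the values and labels are only for illustration, not the full data from the question). It also shows one possible alternative: leaving the NaN values in place and blanking them only when rendering, using the na_rep argument of DataFrame.to_string, so the columns stay numeric.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame with a gap left as NaN instead of being filled with 0.
df = pd.DataFrame(
    [[89.46684, 16.644994, np.nan],
     [100.0, 21.840874, 1.040312]],
    index=["et_arr", "ea_arr"],
    columns=[1, 2, 3],
)

# fillna('') blanks the gap, but the affected column is no longer float.
blanked = df.fillna("")
print(blanked.dtypes)

# Keeping NaN in the data and blanking it only at display time
# leaves every column as float64.
rendered = df.to_string(na_rep="")
print(rendered)
print(df.dtypes)
```

The second form is probably the safer choice if the dataframe is still going to be used for calculations after it has been displayed.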
