Suppose I have a data frame df with 5800000 rows, built by concatenating 100 files of 58000 rows each. I also have an array fv of shape (100, 10, 58000) that I want to add to the data frame as 10 new columns. df has two columns, but only its row count, df.shape[0], matters here.

import numpy as np
import pandas as pd

list_ = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
# fv has shape (number of files, number of names, rows per file)
fv = np.zeros((len(data), len(list_), df.shape[0] // len(files)))

def add_fv_to_dataframe(_data, list_):
    for index in range(len(_data)):
        for name_index, name_in_list in enumerate(list_):
            # calculate something (placeholder for the real computation)
            calcs_ = _data[index]
            fv[index, name_index, :] = calcs_
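    # add the calculated values to the dataframe: one column per name,
    # flattened file by file so the rows line up with df
    for name_index, name_in_list in enumerate(list_):
        df['fv_{}'.format(name_in_list)] = pd.Series(fv[:, name_index, :].ravel(), index=df.index)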


I would like my final data frame to have the form:

df[0] df[1] fv_a fv_b fv_c fv_d fv_e fv_f fv_g fv_h fv_i fv_j
1 1 1 1 1 1 1 1 1 1 1 1
: : : : : : : : : : : :
5800000 5800000 5800000 5800000 5800000 5800000 5800000 5800000 5800000 5800000 5800000 5800000

1 Answer

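Use np.swapaxes to move the axis of length 10 to the front. Each slice then has shape (100, 58000) and flattens, file by file, into one column of 5800000 values: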

# pair each name in list_ with its (100, 58000) slice of fv
for name, data in zip(list_, np.swapaxes(fv, 0, 1)):
    df[f"fv_{name}"] = np.ravel(data)
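
As a sanity check, here is a small self-contained sketch of the same idea with toy sizes (3 files, 2 names, 4 rows per file) standing in for the real 100 x 10 x 58000 array. The names n_files, n_feats, n_rows, names and the file_id / row_id columns are made up for illustration. The transpose/reshape variant at the end is an equivalent alternative, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Toy sizes (assumed for illustration); the question uses 100, 10, 58000.
n_files, n_feats, n_rows = 3, 2, 4
names = ['a', 'b']

# df mimics the concatenation of n_files files with n_rows rows each.
df = pd.DataFrame({
    'file_id': np.repeat(np.arange(n_files), n_rows),
    'row_id': np.tile(np.arange(n_rows), n_files),
})

# Fill fv so each entry encodes (file, feature, row) and is easy to check.
f, k, r = np.meshgrid(np.arange(n_files), np.arange(n_feats),
                      np.arange(n_rows), indexing='ij')
fv = f * 100 + k * 10 + r  # shape (n_files, n_feats, n_rows)

# Bring the feature axis to the front, then flatten each
# (n_files, n_rows) slice file by file into one column.
for name, data in zip(names, np.swapaxes(fv, 0, 1)):
    df[f'fv_{name}'] = np.ravel(data)

# Equivalent one-shot variant: an (n_files * n_rows, n_feats) block
# whose k-th column matches the k-th column added above.
block = fv.transpose(0, 2, 1).reshape(-1, n_feats)
```

Because df is ordered file by file, each row ends up with the values for its own file and row position, and block holds the same numbers as df[['fv_a', 'fv_b']].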