I have several dataframes which have the same look but different data.

DataFrame 1

                          bid
                        close
time                         
2016-05-24 00:00:00       NaN
2016-05-24 00:05:00  0.000611
2016-05-24 00:10:00 -0.000244
2016-05-24 00:15:00 -0.000122

DataFrame 2

                          bid
                        close
time                         
2016-05-24 00:00:00       NaN
2016-05-24 00:05:00  0.000811
2016-05-24 00:10:00 -0.000744
2016-05-24 00:15:00 -0.000322

I need to build a list of these dataframes and pass it to a function that converts the list to a NumPy array. Each row of the resulting matrix should hold the values of one dataframe's ('bid', 'close') column. Note that I don't need the 'time' index.

data = np.array([dataFrames])

should return something like this (example values, not the actual data)

[[-0.00114415  0.02502565  0.00507831 ...,  0.00653057  0.02183072
  -0.00194293] `DataFrame` 1 is here ignore that the data doesn't match above
 [-0.01527224  0.02899528 -0.00327654 ...,  0.0322364   0.01821731
  -0.00766773] `DataFrame` 2 is here ignore that the data doesn't match above
 ....]]

2 Answers

Try

master_matrix = pd.concat(list_of_dfs, axis=1)
master_matrix = master_matrix.values.reshape(master_matrix.shape, order='F')

if each row in the final matrix corresponds to the same date

master_matrix = pd.concat(list_of_dfs, axis=1).values

otherwise.
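
To see what the pd.concat(..., axis=1) step produces, here is a minimal sketch. The frames a and b, their values, and the shared index are made up for illustration and are not the asker's data. Concatenating along axis=1 aligns the frames on their index and yields one column per dataframe, so a transpose gives one row per dataframe.

```python
import numpy as np
import pandas as pd

# Hypothetical frames sharing the same 5-minute index.
idx = pd.date_range('2016-05-24', periods=3, freq='5min')
a = pd.DataFrame({'close': [0.1, 0.2, 0.3]}, index=idx)
b = pd.DataFrame({'close': [1.1, 1.2, 1.3]}, index=idx)

# Side-by-side concatenation: rows are timestamps, columns are dataframes.
wide = pd.concat([a, b], axis=1)

# Transpose so that each row holds one dataframe's values.
matrix = wide.values.T
print(matrix.shape)
```

If the frames do not share an index, the concatenation takes the union of the timestamps and fills the gaps with NaN, so this route only makes sense when the dates line up.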

Edit to address the newly added example. In this case, you can use np.vstack on columns returned from each dataframe.

import pandas as pd
import numpy as np
from io import StringIO

df1 = pd.read_csv(StringIO(
'''
time                bid_close
2016-05-24 00:00:00       NaN
2016-05-24 00:05:00  0.000611
2016-05-24 00:10:00 -0.000244
2016-05-24 00:15:00 -0.000122
'''), sep=r' +', engine='python')  # regex separators need the python engine

df2 = pd.read_csv(StringIO(
'''
time                bid_close
2016-05-24 00:00:00       NaN
2016-05-24 00:05:00  0.000811
2016-05-24 00:10:00 -0.000744
2016-05-24 00:15:00 -0.000322
'''), sep=r' +', engine='python')  # regex separators need the python engine

dfs = [df1, df2]

# np.vstack needs a sequence, not a generator, so build a list first
out = np.vstack([df.iloc[:, -1].values for df in dfs])

Result:

In [10]: out
Out[10]:
array([[      nan,  0.000611, -0.000244, -0.000122],
       [      nan,  0.000811, -0.000744, -0.000322]])

5 Comments

that returns a numpy array?
bad naming; fixed. in general, df.values returns a numpy array.
That is pretty cool and close, but not quite what I need. This is the result [ 6.11097531e-04 -7.07217396e-05 -9.88878916e-05 -6.22477917e-05 -1.05367416e-05] I need the entire dataframe across in each row. So it is as wide as there are data in the dataframes, and tall as there are dataframes.
Can you provide one additional input, and the desired result for the case with 2 inputs?
I clarified the original post. The entire bid column of dataframe 1 becomes a row in the numpy array[0]. The entire bid column of dataframe 2 becomes a row in numpy array[1] and so on...

Setup

import pandas as pd
import numpy as np

df1 = pd.DataFrame([1, 2, 3, 4],
                   index=pd.date_range('2016-04-01', periods=4),
                   columns=pd.MultiIndex.from_tuples([('bid', 'close')]))
df2 = pd.DataFrame([5, 6, 7, 8],
                   index=pd.date_range('2016-03-01', periods=4),
                   columns=pd.MultiIndex.from_tuples([('bid', 'close')]))
print(df1)

             bid
           close
2016-04-01     1
2016-04-02     2
2016-04-03     3
2016-04-04     4

print(df2)

             bid
           close
2016-03-01     5
2016-03-02     6
2016-03-03     7
2016-03-04     8

Solution

df = np.concatenate([d.T.values for d in [df1, df2]])

print(df)

[[1 2 3 4]
 [5 6 7 8]]

Note

The indices were not required to line up. This just takes the raw np.array from each dataframe and uses np.concatenate to do the rest.
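
If you want this wrapped in a function that takes a list of dataframes, as the question asks, something like the sketch below might work. The function name frames_to_matrix and its column parameter are hypothetical, not part of pandas or NumPy. It assumes every frame has a ('bid', 'close') column and that all frames have the same number of rows; np.vstack raises a ValueError otherwise.

```python
import numpy as np
import pandas as pd


def frames_to_matrix(frames, column=('bid', 'close')):
    """Stack one column from each frame into a 2-D array, one row per frame.

    Hypothetical helper: the index of each frame is ignored entirely.
    """
    rows = [frame[column].values for frame in frames]
    return np.vstack(rows)


cols = pd.MultiIndex.from_tuples([('bid', 'close')])
df1 = pd.DataFrame([1, 2, 3, 4],
                   index=pd.date_range('2016-04-01', periods=4),
                   columns=cols)
df2 = pd.DataFrame([5, 6, 7, 8],
                   index=pd.date_range('2016-03-01', periods=4),
                   columns=cols)

matrix = frames_to_matrix([df1, df2])
print(matrix)
```

Selecting the column by name rather than transposing the whole frame keeps the result to one row per dataframe even if a frame later gains extra columns.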

1 Comment

Thanks. Not sure which to use above or this.
