Generate 3D "matrix" with Pandas, based on comparing two dataframes [Python]

Question

Good morning everyone. I am working with Python and Pandas.

I have two DataFrames, of the following type:

df_C = pd.DataFrame(data=[[-3,-1,-1], [5,3,3], [3,3,1], [-1,-1,-3], [-3,-1,-1], [2,3,1], [1,1,1]], columns=['C1','C2','C3'])

   C1  C2  C3
0  -3  -1  -1
1   5   3   3
2   3   3   1
3  -1  -1  -3
4  -3  -1  -1
5   2   3   1
6   1   1   1


df_F = pd.DataFrame(data=[[-1,1,-1,-1,-1],[1,1,1,1,1],[1,1,1,-1,1],[1,-1,-1,-1,1],[-1,0,0,-1,-1],[1,1,1,-1,0],[1,1,-1,1,-1]], columns=['F1','F2','F3','F4','F5'])

   F1  F2  F3  F4  F5
0  -1   1  -1  -1  -1
1   1   1   1   1   1
2   1   1   1  -1   1
3   1  -1  -1  -1   1
4  -1   0   0  -1  -1
5   1   1   1  -1   0
6   1   1  -1   1  -1

I would like to "cross" these two DataFrames to generate a new one in 3D, as follows:

Each value in the new data compares a value of df_F with a value of df_C, according to the following rules:

If both values are positive, generate 1
If both values are negative, generate 1
If one value is positive and the other negative, generate 0
If either value is zero, generate None (NaN)

Truth table

Comparison of the data df_C vs df_F

df_C vs df_F = 3D
  +       +     1
  +       -     0
  +       0     None
  -       +     0
  -       -     1
  -       0     None
  0       +     None
  0       -     None
  0       0     None
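
For reference, the rules in the table can be written as a scalar function. This is only a minimal sketch: the name `compare_signs` is hypothetical (it does not appear in the original post), and NaN is used to stand in for None.

```python
import math

def compare_signs(c, f):
    """Return 1 if c and f share a sign, 0 if the signs differ, NaN if either is zero."""
    if c == 0 or f == 0:
        return math.nan
    return 1 if (c > 0) == (f > 0) else 0

# One (df_C value, df_F value) pair per row of the truth table
pairs = [(3, 1), (3, -1), (3, 0), (-3, 1), (-3, -1), (-3, 0), (0, 1), (0, -1), (0, 0)]
for c, f in pairs:
    print(c, f, compare_signs(c, f))
```

Note that the result depends only on the sign of the product c * f: positive gives 1, negative gives 0, and zero gives NaN.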

Could you please guide me on how to generate this matrix by comparing the values? I would like to do it with Pandas. I have done it with loops (for) and conditions (if), but the code is visually unpleasant, and I think a Pandas solution would be more efficient and elegant.

Thank you.

Shubham Sharma · Accepted Answer · 2021-05-22 09:01:42Z

3

Numpy `broadcasting` and `np.select`

Broadcast and multiply the values in df_C with the values from df_F so that the resulting product matrix has shape (3, 7, 5). Then test where the values in the product matrix are positive, negative or zero, and assign the corresponding values 1, 0 and NaN where each condition holds.
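
The shape arithmetic behind the broadcast can be followed step by step. The sketch below uses placeholder arrays of ones with the same shapes as df_C.values and df_F.values; the names `c_vals`, `f_vals`, `c_t`, `c_3d` and `product` are illustrative and not part of the answer.

```python
import numpy as np

# Placeholders with the same shapes as df_C.values (7, 3) and df_F.values (7, 5)
c_vals = np.ones((7, 3))
f_vals = np.ones((7, 5))

c_t = c_vals.T            # (3, 7): one row per C column
c_3d = c_t[:, :, None]    # (3, 7, 1): add a trailing axis of length 1
product = c_3d * f_vals   # (3, 7, 1) * (7, 5) broadcasts to (3, 7, 5)

print(c_t.shape, c_3d.shape, product.shape)
```

Each of the 3 slices along the first axis pairs one df_C column with every column of df_F, row by row.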

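import numpy as np

a = df_C.values.T[:, :, None] * df_F.values    # product of each C column with each F column, shape (3, 7, 5)
a = np.select([a > 0, a < 0], [1, 0], np.nan)  # 1 for same sign, 0 for opposite signs, NaN where either value is zero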

array([[[ 1.,  0.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  0.,  1.],
        [ 0.,  1.,  1.,  1.,  0.],
        [ 1., nan, nan,  1.,  1.],
        [ 1.,  1.,  1.,  0., nan],
        [ 1.,  1.,  0.,  1.,  0.]],

       [[ 1.,  0.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  0.,  1.],
        [ 0.,  1.,  1.,  1.,  0.],
        [ 1., nan, nan,  1.,  1.],
        [ 1.,  1.,  1.,  0., nan],
        [ 1.,  1.,  0.,  1.,  0.]],

       [[ 1.,  0.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  0.,  1.],
        [ 0.,  1.,  1.,  1.,  0.],
        [ 1., nan, nan,  1.,  1.],
        [ 1.,  1.,  1.,  0., nan],
        [ 1.,  1.,  0.,  1.,  0.]]])
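
If the result should stay in Pandas with labels, one possible approach (an assumption on my part, not something the answer proposes) is to stack one 7x5 block per df_C column under a two-level row index with `pd.concat`. The name `labeled` is hypothetical.

```python
import numpy as np
import pandas as pd

df_C = pd.DataFrame(
    data=[[-3, -1, -1], [5, 3, 3], [3, 3, 1], [-1, -1, -3],
          [-3, -1, -1], [2, 3, 1], [1, 1, 1]],
    columns=['C1', 'C2', 'C3'])
df_F = pd.DataFrame(
    data=[[-1, 1, -1, -1, -1], [1, 1, 1, 1, 1], [1, 1, 1, -1, 1],
          [1, -1, -1, -1, 1], [-1, 0, 0, -1, -1], [1, 1, 1, -1, 0],
          [1, 1, -1, 1, -1]],
    columns=['F1', 'F2', 'F3', 'F4', 'F5'])

# Same two steps as in the answer
a = df_C.values.T[:, :, None] * df_F.values
a = np.select([a > 0, a < 0], [1, 0], np.nan)

# One block per df_C column; the dict keys become the outer index level
labeled = pd.concat(
    {c: pd.DataFrame(a[i], index=df_F.index, columns=df_F.columns)
     for i, c in enumerate(df_C.columns)}
)

print(labeled.loc['C1'])  # the 7x5 comparison of column C1 against F1..F5
```

The NumPy array `a` remains the true 3D object; the MultiIndex frame is only a labeled 2D view of it.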

edited May 22, 2021 at 9:01

answered May 22, 2021 at 8:46

Shubham Sharma


1 Comment

Federica F. Over a year ago

Shubham, thank you very much, Your solution is flawless, with a perfect explanation. Very elegant!
