I have a pandas data frame df1:

Time   sat1  sat2 sat3  sat4    val1  val2  val3   val4
10      2     4    2     4       0.1  -1.0   1     2.0
20      3     1    1     3       1.6   0     2.1   -0.7
30      12    8    8     16      0.5   1.1   0.6    2.0
40      2     1    2     12      1.0   1.2   0.4    3.7

I want to compare sat1 and sat2 with sat3 and sat4 at every time instant. If a value in sat1 or sat2 matches a value in sat3 or sat4, I want the number of matched elements in that row and the difference between the corresponding value columns (for example, val1 - val3 when sat1 matches sat3).

Expected Output:

 match_count       Reslt_1           Reslt_2
 2                 val1-val3         val2-val4
 2                 val1-val4         val2-val3
 1                 NaN               val2-val3
 1                 val1-val3         NaN

(Reslt_1 depends on the match found for sat1 and Reslt_2 on the match found for sat2; the result is NaN when there is no match.)

This is sample data and the number of columns may increase. The values in sat1 and sat2 can appear in either sat3 or sat4, which is why the subtraction has to follow the match.

How can I obtain the expected output above using pandas? I built the data frame above with the pandas concat function.

1 Answer

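You can compare with eq. To handle rows with no match, add a helper column with assign: True in the boolean mask and NaN in the values, so that argmax falls back to the NaN column when nothing matches. Then get the position of the matching column with argmax, extract the values from the val columns at those positions, and subtract: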

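import numpy as np

# remove trailing whitespace in column names
df.columns = df.columns.str.strip()

# positional row numbers, so the lookup also works with a non-default index
rows = np.arange(len(df))
# val3/val4 plus an all-NaN fallback column for rows without a match
vals = df[['val3','val4']].assign(no = np.nan).values

# sat1 vs sat3/sat4; argmax picks the helper column only if nothing matches
a = df[['sat3','sat4']].eq(df['sat1'], axis=0).assign(no = True)
a1 = a.values.argmax(axis=1)
df['Reslt_1'] = df['val1'] - vals[rows, a1]

# same for sat2
b = df[['sat3','sat4']].eq(df['sat2'], axis=0).assign(no = True)
b1 = b.values.argmax(axis=1)
df['Reslt_2'] = df['val2'] - vals[rows, b1]

# each mask has one helper column that is always True, so subtract 1 per mask
df['match_count'] = a.sum(axis=1) - 1 + b.sum(axis=1) - 1

print (df)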

   Time  sat1  sat2  sat3  sat4  val1  val2  val3  val4  Reslt_1  Reslt_2  \
0    10     2     4     2     4   0.1  -1.0   1.0   2.0     -0.9     -3.0   
1    20     3     1     1     3   1.6   0.0   2.1  -0.7      2.3     -2.1   
2    30    12     8     8    16   0.5   1.1   0.6   2.0      NaN      0.5   
3    40     2     1     2    12   1.0   1.2   0.4   3.7      0.6      NaN   

   match_count  
0            2  
1            2  
2            1  
3            1  
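
Since the question mentions that the number of columns may increase, here is a rough sketch of how the same idea could be wrapped in a loop. The function name `match_and_subtract` and its parameters are my own and hypothetical, not part of pandas. It assumes the column lists are passed in matching order (sat1 goes with val1, sat3 with val3, and so on) and that the first matching column wins if a value appears more than once on the right-hand side.

```python
import numpy as np
import pandas as pd


def match_and_subtract(df, left_sats, right_sats, left_vals, right_vals):
    """Hypothetical helper: for each left sat column, find the first right sat
    column holding the same value in that row and subtract its val column."""
    n = len(df)
    rows = np.arange(n)  # positional row numbers, independent of df.index
    right_sat = df[right_sats].to_numpy()
    # Append an all-NaN column: it is selected when a row has no match.
    right_val = np.column_stack([df[right_vals].to_numpy(dtype=float),
                                 np.full(n, np.nan)])
    out = pd.DataFrame(index=df.index)
    match_count = np.zeros(n, dtype=int)
    for i, (sat, val) in enumerate(zip(left_sats, left_vals), start=1):
        # hits[r, c] is True where right_sats[c] equals the left sat in row r
        hits = right_sat == df[sat].to_numpy()[:, None]
        # Append an always-True column so argmax falls back to the NaN slot.
        pos = np.column_stack([hits, np.ones(n, dtype=bool)]).argmax(axis=1)
        out[f"Reslt_{i}"] = df[val].to_numpy() - right_val[rows, pos]
        match_count += hits.sum(axis=1)
    out.insert(0, "match_count", match_count)
    return out


# Sample data from the question
df = pd.DataFrame({
    "Time": [10, 20, 30, 40],
    "sat1": [2, 3, 12, 2],
    "sat2": [4, 1, 8, 1],
    "sat3": [2, 1, 8, 2],
    "sat4": [4, 3, 16, 12],
    "val1": [0.1, 1.6, 0.5, 1.0],
    "val2": [-1.0, 0.0, 1.1, 1.2],
    "val3": [1.0, 2.1, 0.6, 0.4],
    "val4": [2.0, -0.7, 2.0, 3.7],
})

result = match_and_subtract(df, ["sat1", "sat2"], ["sat3", "sat4"],
                            ["val1", "val2"], ["val3", "val4"])
print(result)
```

For the sample data this should give the same match_count, Reslt_1 and Reslt_2 as the output above. With more columns you would only extend the four lists.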
7 Comments

Maybe there is a typo in the column names. What does print (df.columns.tolist()) return?
There is a space at the end of sat3. So first try df.columns = df.columns.str.strip()
I added it to the answer.
Super! Instead of subtracting, i.e. val1-val3, I want to subtract its original values.
Thanks. I have understood :)