Adding multiple columns to dataframe and skip empty values

Question

I have a dataframe like this:

from pandas import DataFrame

s = {'B1': ['1C', '3A', '41A'], 'B2':['','1A','28A'], 'B3':['','','3A'],
     'B1_m':['2','2','2'], 'B2_m':['2','4','2'],'B3_m':['2','2','4'],
     'E':['0','0','0']}
s = DataFrame(s)
print(s)

    B1   B2  B3 B1_m B2_m B3_m  E
0   1C             2    2    2  0
1   3A   1A        2    4    2  0
2  41A  28A  3A    2    2    4  0

and I combine these columns into a new column, Results, in this format:

s['Results'] = s['B1']+s['B1_m']+'-'+s['B2']+s['B2_m']+'-'+s['B3']+s['B3_m']+'-'+s['E']
print(s)

    B1   B2  B3  B1_m  B2_m  B3_m   E            Results
0   1C              2     2     2   0          1C2-2-2-0
1   3A   1A         2     4     2   0        3A2-1A4-2-0
2  41A  28A  3A     2     2     4   0    41A2-28A2-3A4-0

But what I want is to skip an item when its value in B1-B3 is empty, like this:

    B1   B2  B3  B1_m  B2_m  B3_m   E            Results
0   1C              2     2     2   0              1C2-0
1   3A   1A         2     4     2   0          3A2-1A4-0
2  41A  28A  3A     2     2     4   0    41A2-28A2-3A4-0

Is there a way to conditionally skip those empty values?
Thanks in advance.

dataista · Accepted Answer · 2018-11-19 04:47:38Z

2

Using numpy.where is the most Pythonic way I can think of to solve this:

import numpy as np

s['Results'] = s['B1']+s['B1_m'] + \
                  np.where(s['B2'], '-'+s['B2']+s['B2_m'], "") + \
                  np.where(s['B3'], '-'+s['B3']+s['B3_m'], "") +'-'+s['E']

This produces the result you want:

print(s)
    B1   B2  B3 B1_m B2_m B3_m  E          Results
0   1C             2    2    2  0            1C2-0
1   3A   1A        2    4    2  0        3A2-1A4-0
2  41A  28A  3A    2    2    4  0  41A2-28A2-3A4-0

(Note that the trailing \ characters are line continuations; they are required to break the long statement across several lines.)
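
If there are more column pairs than B1-B3, the same numpy.where idea can be written as a loop. The following is only a sketch under stated assumptions: the `prefixes` list and the `result` variable are hypothetical names introduced here, and every value column is assumed to have a matching `_m` column. Unlike the statement above, it also skips B1 when B1 is empty.

```python
import numpy as np
import pandas as pd

s = pd.DataFrame({
    'B1': ['1C', '3A', '41A'], 'B2': ['', '1A', '28A'], 'B3': ['', '', '3A'],
    'B1_m': ['2', '2', '2'], 'B2_m': ['2', '4', '2'], 'B3_m': ['2', '2', '4'],
    'E': ['0', '0', '0'],
})

# Hypothetical list of value columns; each is assumed to have a "<name>_m" partner.
prefixes = ['B1', 'B2', 'B3']

result = pd.Series('', index=s.index)
for p in prefixes:
    # Emit "<value><multiplier>-" only where the value column is non-empty.
    piece = np.where(s[p] != '', s[p] + s[p + '_m'] + '-', '')
    result = result + piece

# Column E always closes the string.
s['Results'] = result + s['E']
print(s['Results'].tolist())
# ['1C2-0', '3A2-1A4-0', '41A2-28A2-3A4-0']
```

The comparison `s[p] != ''` is used instead of relying on the truthiness of the strings, which keeps the condition explicit.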


Space Impact · Accepted Answer · 2018-11-19 04:28:16Z

2

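Starting from the Results column built in the question, one way is to strip the lone "-digit" groups with a regex str.replace and then append column E again (pass regex=True explicitly; since pandas 2.0 the pattern is otherwise treated as a literal string):

s['Results'] = s['Results'].str.replace(r'\b\-[0-9]\b','',regex=True)+'-'+s['E']

Or:

s['Results'] = s['Results'].str.replace(r'\b\-\d\b','',regex=True)+'-'+s['E']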

print(s)
    B1   B2  B3 B1_m B2_m B3_m  E          Results
0   1C             2    2    2  0            1C2-0
1   3A   1A        2    4    2  0        3A2-1A4-0
2  41A  28A  3A    2    2    4  0  41A2-28A2-3A4-0

If the leftover numbers can have more than one digit, use:

s['Results'] = s['Results'].str.replace(r'\b\-\d+\b','',regex=True)+'-'+s['E']
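
To see what the pattern actually removes, here is a small self-contained sketch. The names `raw`, `e` and `cleaned` are hypothetical; `raw` holds the strings produced by the plain concatenation in the question.

```python
import pandas as pd

# Results strings as produced by the plain concatenation in the question.
raw = pd.Series(['1C2-2-2-0', '3A2-1A4-2-0', '41A2-28A2-3A4-0'])
e = pd.Series(['0', '0', '0'])

# Remove every "-<digits>" group that stands alone between word boundaries
# (this also strips the trailing "-0"), then append column E again.
cleaned = raw.str.replace(r'\b-\d+\b', '', regex=True) + '-' + e
print(cleaned.tolist())
# ['1C2-0', '3A2-1A4-0', '41A2-28A2-3A4-0']
```

One caveat of this approach: with `\d+`, a non-empty value that consists only of digits (for example a B2 of '5' with a B2_m of '2', giving '-52') would be removed as well, so it is only safe when the B columns always contain a letter.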
