I have a dataframe with two columns and want to modify one column based on the value of the other column.

Example

unit        name
feet        abcd_feet
celcius     abcd_celcius
yard        bcde_yard
yard        bcde

If the unit is feet or yard and the name ends with that unit, I want to remove the unit suffix from the name. Expected result:

unit        name
feet        abcd
celcius     abcd_celcius
yard        bcde
yard        bcde

1 Answer

There are two possible ways of solving your problem:

First method, the faster of the two because it relies on pandas' vectorized column operations:

UNITS_TO_REMOVE = {'feet', 'yard'}

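# Split each name once at its last underscore; names without '_' get a missing unit_
df[['value_', 'unit_']] = df['name'].str.rsplit('_', n=1, expand=True)
# Clean only rows whose suffix is a removable unit and matches the unit column
values_to_clean = (df['unit_'].isin(UNITS_TO_REMOVE)) & (df['unit_'] == df['unit'])
df.loc[values_to_clean, 'name'] = df.loc[values_to_clean, 'value_']
# Drop the temporary helper columns
df.drop(columns=['unit_', 'value_'], inplace=True)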

Here is the result:

    unit    name
0   feet    abcd
1   celcius abcd_celcius
2   yard    bcde
3   yard    bcde

Performance: 20 ms ± 401 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) on a (4000, 2) dataframe.
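For anyone who wants to try the column-based idea end to end, here is a minimal self-contained sketch. It rebuilds the example dataframe from the question and avoids temporary columns on the original frame. The helper name `strip_unit_suffix` is made up for illustration, and splitting at the last underscore (`rsplit` with `n=1`) is my assumption about how names with several underscores should be handled.

```python
import pandas as pd

UNITS_TO_REMOVE = {'feet', 'yard'}


def strip_unit_suffix(df):
    """Return a copy of df with a trailing '_<unit>' removed from 'name'
    for rows whose unit is in UNITS_TO_REMOVE (hypothetical helper)."""
    out = df.copy()
    # Split once from the right: 'abcd_feet' -> ('abcd', 'feet').
    # Names without '_' get a missing value in column 1.
    # reindex guarantees column 1 exists even if no name contains '_'.
    parts = out['name'].str.rsplit('_', n=1, expand=True).reindex(columns=[0, 1])
    # Rows to clean: removable unit AND the suffix equals the unit column.
    mask = out['unit'].isin(UNITS_TO_REMOVE) & (parts[1] == out['unit'])
    out.loc[mask, 'name'] = parts.loc[mask, 0]
    return out


# Example data copied from the question.
df = pd.DataFrame({
    'unit': ['feet', 'celcius', 'yard', 'yard'],
    'name': ['abcd_feet', 'abcd_celcius', 'bcde_yard', 'bcde'],
})
result = strip_unit_suffix(df)
print(result)
```

Returning a copy leaves the input untouched, which makes it easier to compare the before and after frames while experimenting.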


Second method, using apply (which is sometimes the only available solution):

UNITS_TO_REMOVE = {'feet', 'yard'}

def remove_unit(unit, value):
    if unit not in UNITS_TO_REMOVE or '_' not in value:
        return value
    else:
        row_value, row_unit = value.rsplit('_', 1)
        if row_unit == unit:
            return row_value
        else:
            return value

df['name'] = df.apply(lambda row: remove_unit(row['unit'], row['name']), axis=1)

Output:


    unit    name
0   feet    abcd
1   celcius abcd_celcius
2   yard    bcde
3   yard    bcde

Performance: 152 ms ± 3.95 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
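If you want to compare the two approaches on your own machine, a rough timing sketch could look like the one below. The way the (4000, 2) dataframe is built here (repeating the four example rows 1000 times) is an assumption, as are the function names `make_frame`, `vectorized`, and `row_wise`. The row-wise function checks the suffix with `endswith` rather than splitting, which is a small variation on the second method. Timings depend on hardware and pandas version, so no numbers are shown.

```python
import timeit

import pandas as pd

UNITS_TO_REMOVE = {'feet', 'yard'}


def make_frame(repeats=1000):
    """Build a test frame by repeating the question's four rows (assumed setup)."""
    base = pd.DataFrame({
        'unit': ['feet', 'celcius', 'yard', 'yard'],
        'name': ['abcd_feet', 'abcd_celcius', 'bcde_yard', 'bcde'],
    })
    return pd.concat([base] * repeats, ignore_index=True)


def vectorized(df):
    """Column-based variant: returns the cleaned 'name' column."""
    parts = df['name'].str.rsplit('_', n=1, expand=True).reindex(columns=[0, 1])
    mask = df['unit'].isin(UNITS_TO_REMOVE) & (parts[1] == df['unit'])
    # Keep the original name unless the row is flagged for cleaning.
    return df['name'].where(~mask, parts[0])


def row_wise(df):
    """apply-based variant: returns the cleaned 'name' column."""
    def remove_unit(unit, value):
        suffix = '_' + unit
        if unit in UNITS_TO_REMOVE and value.endswith(suffix):
            return value[:-len(suffix)]
        return value

    return df.apply(lambda row: remove_unit(row['unit'], row['name']), axis=1)


frame = make_frame()
# Sanity check: both variants should agree before timing them.
same = vectorized(frame).tolist() == row_wise(frame).tolist()
print('results agree:', same)

for fn in (vectorized, row_wise):
    seconds = timeit.timeit(lambda: fn(frame), number=10) / 10
    print(f'{fn.__name__}: {seconds * 1000:.1f} ms per call')
```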
