There is a list of values

weather = ['cloudy', 'sunny']

I've got a dataframe with an old column "weather". We switched to 2 newer columns with boolean values, so every value in the old column needs to be carried over into them.

Here is my dataframe now:

[In]
import pandas as pd

data = [['cloudy', False, False], ['sunny', False, False]]
df = pd.DataFrame(data, columns=['old', 'cloudbool', 'sunbool'])
df
[Out]
      old  cloudbool  sunbool
0  cloudy      False    False
1   sunny      False    False

Desired output:

[In]
data = [['cloudy', True, False], ['sunny', False, True]]
df = pd.DataFrame(data, columns=['old', 'cloudbool', 'sunbool'])
[Out]
      old  cloudbool  sunbool
0  cloudy       True    False
1   sunny      False     True

I know I could do something like what I've got below, but I've got a list of "weather types" much longer than 2.

df.loc[df['old'] == 'cloudy', ['cloudbool']] = True

I hope I conveyed that properly. Thank you

2 Answers


Let's try str.get_dummies to create dummy indicator variables, then join them with the original dataframe:

df[['old']].join(df['old'].str.get_dummies().astype(bool).add_suffix('bool'))

      old  cloudybool  sunnybool
0  cloudy        True      False
1   sunny       False       True
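Note that the result above names the columns cloudybool and sunnybool rather than the existing cloudbool and sunbool. If the existing names have to stay, one option is to rename the dummy columns through an explicit mapping. This is only a sketch: the `column_for` dict is a hypothetical helper that is not part of the question, and you would have to fill it in for your full list of weather types.

```python
import pandas as pd

df = pd.DataFrame(
    [['cloudy', False, False], ['sunny', False, False]],
    columns=['old', 'cloudbool', 'sunbool'],
)

# Hypothetical mapping from each weather value to its existing boolean column.
column_for = {'cloudy': 'cloudbool', 'sunny': 'sunbool'}

# One indicator column per distinct value in 'old', converted from 0/1 to bool.
dummies = df['old'].str.get_dummies().astype(bool)

# Ensure every mapped value gets a column, even if it never occurs in the data.
dummies = dummies.reindex(columns=list(column_for), fill_value=False)

# Rename the indicator columns to the existing column names.
dummies = dummies.rename(columns=column_for)

result = df[['old']].join(dummies)
print(result)
```

The reindex step is there so that a weather type missing from the data still produces an all-False column instead of being dropped.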

  1. I know that the get_dummies method is built for this, but another way is to use a list comprehension that checks, for each row, whether the weather value in your old column appears in each of your bool column names (assuming they already exist, as in your example). Then convert the result to a list in preparation for building a dataframe from it.
  2. The values don't match the column names exactly, so I have dropped the last two characters of each value, e.g. cloudy becomes clou and sunny becomes sun. I don't think any weather value would have a suffix longer than 2 characters, but that assumption is why this isn't as robust as get_dummies. You could also make your column names match your values, e.g. cloudybool and sunnybool:

s = df.apply(lambda x: [x['old'][:-2] in col for col in df.columns[1:]], axis=1).to_list()
df1 = pd.concat([df['old'],pd.DataFrame(s, columns=df.columns[1:])], axis=1)
df1
Out[1]: 
      old  cloudbool  sunbool
0  cloudy       True    False
1   sunny      False     True
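If you do go with column names that match the values, as suggested in point 2, the substring trick isn't needed at all. Here is a rough sketch that loops over the weather list from the question and compares the old column against each value directly. It assumes the new columns are named value + 'bool', which is not how they are named in the original dataframe.

```python
import pandas as pd

weather = ['cloudy', 'sunny']
df = pd.DataFrame({'old': ['cloudy', 'sunny']})

# Assumed naming scheme: one boolean column per weather value, '<value>bool'.
for value in weather:
    # Element-wise equality yields a boolean Series aligned with df's index.
    df[value + 'bool'] = df['old'].eq(value)

print(df)
```

This scales to a longer weather list without touching the code, at the cost of one pass over the column per value.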
