I'm very new to Python, so there may be a simple solution here. I'm trying to clean a data set of rent prices and square footage in a pandas DataFrame. The bedrooms column contains information about bedrooms AND square feet. Most of the entries are formatted like "/ 1br - 950ft²", but some are "/ 1br" and some are "/950ft²". I'm trying to create a clean column with just bedrooms, but because the formatting is inconsistent I can't just split the string after a certain character.
I've decided I need a function that tests whether the string contains "br", but I'm getting an error.
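To make the three formats concrete, here is a minimal sketch of the data. The rows are made up to match the formats described above, and the name `has_br` is only for illustration; the real data set is much larger.

```python
import pandas as pd

# Hypothetical sample rows, one per format described above
df = pd.DataFrame({"bedrooms": ["/ 1br - 950ft²", "/ 1br", "/950ft²"]})

# Boolean mask: True for rows whose string mentions bedrooms
has_br = df["bedrooms"].str.contains("br")
print(has_br.tolist())  # [True, True, False]
```

Note that `str.contains` returns one True/False value per row, not a single answer for the whole column.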
Here's my code:
def cleaned_bedrooms(x):
    if df[df['bedrooms'].str.contains('br')]:
        df['bedrooms'] = df['bedrooms'].str.split('-').str[0]
    else:
        return None

df['bedrooms'].map(cleaned_bedrooms)
I seem to have set up a boolean function, though (I assume it's triggered by the if statement), because the error I'm getting is "ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()." on the line containing .map(cleaned_bedrooms).
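For context, the error most likely comes from the `if` line: indexing `df` with a boolean mask returns a DataFrame, and pandas refuses to reduce a whole DataFrame to a single True/False. Since `map` passes each cell to the function as `x`, one possible approach is to test `x` itself. The following is a sketch under assumptions, not a definitive fix: the sample rows and the column names `bedrooms_clean` and `bedroom_count` are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows matching the three formats in the question
df = pd.DataFrame({"bedrooms": ["/ 1br - 950ft²", "/ 1br", "/950ft²"]})

def cleaned_bedrooms(x):
    # x is a single cell value (a string), not the whole column
    if isinstance(x, str) and "br" in x:
        # Keep the part before the dash, then drop the leading "/" and spaces
        return x.split("-")[0].strip("/ ")
    return None

# map calls cleaned_bedrooms once per cell and collects the return values
df["bedrooms_clean"] = df["bedrooms"].map(cleaned_bedrooms)

# Vectorized alternative: capture the digits directly before "br";
# rows without a match come back as missing values
df["bedroom_count"] = df["bedrooms"].str.extract(r"(\d+)\s*br", expand=False)

print(df)
```

The `map` version stays close to the original function, while `str.extract` avoids writing a function at all and gives just the number as a string, which could then be converted to a numeric type if needed.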