def Clean_Data(df):
    df.replace({ r'\A\s+|\s+\Z': '', '\n' : ' ', '\w\s+\w|\w\n\w': '\w\s\w'}, regex=True, inplace=True)
    return df
I would like to clean my dataframe before I work on it. I need to get rid of:
double whitespace
whitespace + linebreak
-> and replace it with a single whitespace.
I also want to check whether there is more than one whitespace character between two words (letters or numbers) and reduce it to a single whitespace.
And lastly, I want to check whether there is whitespace between a word and a punctuation mark (, or .) and replace it with ''.
But I have practically no knowledge of regex, and I am already getting an error: bad escape \w
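
For reference, here is a minimal sketch of one way the rules listed above could be written, not a definitive solution. The "bad escape \w" error most likely comes from the replacement value '\w\s\w': \w and \s are pattern classes and are not valid in a replacement string, so captured text has to be reinserted with a group reference such as \1. The function name clean_data, the three-pass ordering, and the sample data are my own assumptions. I am also assuming that only string cells need cleaning and that a cleaned copy is acceptable instead of inplace=True.

```python
import pandas as pd


def clean_data(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with whitespace normalized in its string cells."""
    # Pass 1: collapse every run of whitespace (spaces, tabs, line breaks)
    # into a single space.
    out = df.replace(r"\s+", " ", regex=True)
    # Pass 2: drop whitespace left at the start or end of a cell.
    out = out.replace(r"^\s+|\s+$", "", regex=True)
    # Pass 3: remove whitespace directly before a comma or a period.
    # \1 reinserts the captured punctuation mark.
    out = out.replace(r"\s+([,.])", r"\1", regex=True)
    return out


# Hypothetical sample data, only for illustration.
sample = pd.DataFrame({
    "text": ["  foo \n bar ,  baz .  ", "already clean"],
    "number": [1, 2],
})
print(clean_data(sample))
```

The passes are chained as separate calls so the order of the substitutions is explicit. Collapsing whitespace first means the later patterns only ever see single spaces.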