I have quite a complicated problem, and I wondered if any of you coding wizards would be able to give me a hand :p
I want to apply two regex patterns from a single lambda expression.
The code is applied to a column of a pandas DataFrame.
We loop over all the elements in the column. If the string contains a square bracket ('['), one regex pattern should be applied. If the string doesn't contain a square bracket, the other regex pattern should be applied.
The two working regex patterns can be found below.
For the moment they are separate, but I want to combine them.
I have the following code, which works fine:
chunk['http'] = chunk.loc[chunk['Protocol'] == 'HTTP', 'Information'].apply(
    lambda x: re.sub(r'\b[^A-Z\s]+\b', '', x))
chunk['http'] = chunk.loc[chunk['Protocol'] == 'HTTP', 'Information'].apply(
    lambda x: re.sub(r'\[(.*?)\]', '', x))
The first expression is meant to keep only the values in CAPS. The second expression is meant to keep only the values between square brackets.
I have tried to combine both of them in the next piece of code:
chunk['http'] = chunk.loc[chunk['Protocol'] == 'HTTP', 'Information'].apply(
lambda x: re.sub(r'\b[^A-Z\s]+\b', '', x)) \
if '[' in x == False\
else re.sub(r'\[(.*?)\]', '', x)
However, this raises the following error:
NameError: free variable 'x' referenced before assignment in enclosing scope
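For reference, here is a minimal sketch of how the two branches might be combined, not a definitive fix. In the attempt above, the closing parenthesis after the first re.sub ends the apply( call, so the if/else sits outside the lambda, where x is not defined. Also, '[' in x == False is a chained comparison, equivalent to ('[' in x) and (x == False), so '[' not in x is probably what was intended. The sample rows below are made up purely for illustration; only the column names and the two patterns come from the code above.

```python
import re

import pandas as pd

# Hypothetical sample data; the real 'chunk' comes from elsewhere.
chunk = pd.DataFrame({
    'Protocol': ['HTTP', 'HTTP', 'TCP'],
    'Information': ['GET abc', 'foo [SYN] bar', 'ignored'],
})

# The whole conditional expression lives inside the lambda, so 'x' is in scope
# for both branches. The branch order mirrors the attempt above:
# no '[' -> first pattern, otherwise -> second pattern.
chunk['http'] = chunk.loc[chunk['Protocol'] == 'HTTP', 'Information'].apply(
    lambda x: re.sub(r'\b[^A-Z\s]+\b', '', x)
    if '[' not in x
    else re.sub(r'\[(.*?)\]', '', x))

print(chunk)
```

Rows whose Protocol is not 'HTTP' end up as NaN in the new column, because the assignment aligns on the index of the filtered selection. If the lambda gets hard to read, a small named function passed to apply would do the same thing.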