
I am trying to mask the IP addresses stored in a data frame column using regex.

For example, this is what I have and what I expect to get:

123.123.123.123 should become 12X.XXX.XXX.X23

23.123.123.123 should become 23.XXX.XXX.X23

So I always keep the first 2 and the last 2 characters of each IP address and replace every other digit with X, leaving the dots in place.

2 Answers


You can use a regular expression to replace every character that is not a dot with X, skipping the first two and the last two characters of the string.

import pandas as pd
import re

df = pd.DataFrame({'ip': ['123.123.123.123', '23.123.123.123']})
# Raw string avoids invalid-escape warnings; a dot needs no escaping inside [...]
df['ip_masked'] = [re.sub(r'(?<!^)(?<!^.)[^.](?=.{2,}$)', 'X', x) for x in df.ip]
print(df)

                ip        ip_masked
0  123.123.123.123  12X.XXX.XXX.X23
1   23.123.123.123   23.XXX.XXX.X23
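
To make the lookarounds in the pattern above easier to follow, here is a minimal sketch of the same expression written with `re.VERBOSE` and wrapped in a helper. The names `MASK_PATTERN` and `mask_ip_regex` are hypothetical and chosen only for illustration; the sketch uses the standard library alone and assumes each value is a plain string.

```python
import re

# Same pattern as above, split into commented parts (names are illustrative).
MASK_PATTERN = re.compile(
    r"""
    (?<!^)      # not the first character of the string
    (?<!^.)     # not the second character of the string
    [^.]        # any single character except a literal dot
    (?=.{2,}$)  # at least two more characters follow before the end
    """,
    re.VERBOSE,
)


def mask_ip_regex(ip: str) -> str:
    """Replace every non-dot character with X, keeping the first and last two."""
    return MASK_PATTERN.sub("X", ip)


print(mask_ip_regex("123.123.123.123"))
print(mask_ip_regex("23.123.123.123"))
```

With a data frame like the one above, the helper could be applied per value, for example with `df['ip'].map(mask_ip_regex)`.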


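Alternatively, replace every digit with X and then restore the first two and the last two characters by slicing (this also needs import re):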

df['ip_masked'] = df.ip.str[:2] + df.ip.apply(lambda x: re.sub(r'\d', 'X', x)[2:-2]) + df.ip.str[-2:]
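
The slicing idea can also be sketched as a small standalone function, which may be easier to test than a one-liner. The name `mask_ip_slices` is hypothetical, and the guard for very short strings is an assumption added here, not part of the answer above: without it, the head and tail slices would overlap and duplicate characters.

```python
import re


def mask_ip_slices(ip: str) -> str:
    """Keep the first two and last two characters; mask digits in between."""
    # Assumption: strings of four characters or fewer are returned unchanged,
    # because the head and tail slices would otherwise overlap.
    if len(ip) <= 4:
        return ip
    head, middle, tail = ip[:2], ip[2:-2], ip[-2:]
    # Only digits are replaced, so the dots in the middle part survive.
    return head + re.sub(r"\d", "X", middle) + tail


print(mask_ip_slices("123.123.123.123"))
print(mask_ip_slices("23.123.123.123"))
```

It could then be applied to the column with something like `df['ip'].map(mask_ip_slices)`.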
