1

I am writing a program to list the IP addresses and users connected to the LAN. I get the data by running nmap. Next, I want to convert the resulting text into a DataFrame using pandas, or any other way. How can I do that?

Here's the code:

import pandas as pd
import subprocess
from subprocess import Popen, PIPE
import re

def ipget():
    i = 'nmap -sP 192.168.1.*'
    output = subprocess.getoutput(i)
    a = str(output).replace("Nmap","").replace("Starting  7.01 ( https://nmap.org ) at","").replace("scan report for","").replace("Host is up","").replace("latency","").replace("done: 256 IP addresses ","")
    data = re.sub(r"(\(.*?\)\.)", "", a)
    print(data)
#df = pd.DataFrame(data, columns = ['User', 'IP_Address']) 

#print (df) 
ipget()

The output is stored in data as a string:

2019-05-21 18:19 IST 
android-eb20919729f10e96 (192.168.1.8)

smackcoders (192.168.1.9)

princes-mbp (192.168.1.10)

shiv-mbp (192.168.1.15)

(4 hosts up) scanned in 18.35 seconds

Required output, as a DataFrame:

User                            IP_Address
android-eb20919729f10e96        192.168.1.8
smackcoders                     192.168.1.9
princes-mbp                     192.168.1.10
shiv-mbp                        192.168.1.15

3 Answers

4

Say you have this text:

2019-05-21 18:19 IST 
android-eb20919729f10e96 (192.168.1.8)

smackcoders (192.168.1.9)

princes-mbp (192.168.1.10)

shiv-mbp (192.168.1.15)

(4 hosts up) scanned in 18.35 seconds

Use regex to find the data you need:

>>> ms = re.findall(r'\n([^\s]*)\s+\((\d+\.\d+\.\d+\.\d+)\)', text)
>>> ms

[('android-eb20919729f10e96', '192.168.1.8'),
 ('smackcoders', '192.168.1.9'),
 ('princes-mbp', '192.168.1.10'),
 ('shiv-mbp', '192.168.1.15')]

>>> df = pd.DataFrame(ms, columns=['User', 'IP_Address'])

Comparison to other answers:

  1. Regex is short.
  2. Regex only runs through your text once.

Each str.replace call scans the whole text, so a chain of replaces makes one pass per call. The regex extracts everything in a single pass, which can be noticeably faster on long logs.
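Putting the steps above together, a minimal sketch might look like this. The names HOST_RE and parse_hosts and the shortened sample text are illustrative assumptions, not part of the original answer. The pattern is also a variation on the one above: it anchors on line starts with re.MULTILINE instead of a preceding newline, so a host on the very first line would match too.

```python
import re

import pandas as pd

# Hypothetical names. [ \t] is used instead of \s so a match cannot
# spill across a line break.
HOST_RE = re.compile(
    r"^(\S+)[ \t]+\((\d+\.\d+\.\d+\.\d+)\)[ \t]*$", re.MULTILINE
)


def parse_hosts(text):
    """Return a DataFrame with one row per 'name (ip)' line in text."""
    rows = HOST_RE.findall(text)  # list of (name, ip) tuples
    return pd.DataFrame(rows, columns=["User", "IP_Address"])


# Shortened sample in the same shape as the cleaned nmap output.
sample = """2019-05-21 18:19 IST
android-eb20919729f10e96 (192.168.1.8)

smackcoders (192.168.1.9)

(4 hosts up) scanned in 18.35 seconds
"""

df = parse_hosts(sample)
print(df)
```

The date line and the summary line are ignored because neither has the shape "name (a.b.c.d)".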


Comments

3

Use StringIO

import sys
if sys.version_info[0] < 3: 
    from StringIO import StringIO
else:
    from io import StringIO

import pandas as pd
a="""
android-eb20919729f10e96 (192.168.1.8)

smackcoders (192.168.1.9)

princes-mbp (192.168.1.10)

shiv-mbp (192.168.1.15)"""

TESTDATA = StringIO(a)

df = pd.read_csv(TESTDATA, sep=" ",names=['User','IP_Address'])

Add the lines below to remove ( and ):

import re
# raw string, so the backslashes reach the regex engine unchanged
df.IP_Address = df.IP_Address.map(lambda x: re.sub(r'\(|\)', "", x))
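As an alternative to mapping re.sub over the column, pandas string methods can strip the parentheses in one vectorized call. This is a sketch under the same assumption as above, namely that the header and summary lines have already been removed from the text. The variable name raw and the shortened sample are illustrative.

```python
from io import StringIO

import pandas as pd

# Shortened sample; assumes the date and summary lines are already gone.
raw = """android-eb20919729f10e96 (192.168.1.8)

smackcoders (192.168.1.9)
"""

# read_csv skips blank lines by default (skip_blank_lines=True).
df = pd.read_csv(StringIO(raw), sep=" ", names=["User", "IP_Address"])

# Strip leading and trailing "(" / ")" from every value in the column.
df["IP_Address"] = df["IP_Address"].str.strip("()")
print(df)
```

Since str.strip only touches the ends of each value, it would leave any parentheses inside a value untouched, which is fine for this format.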

2 Comments

However, this does include the parentheses in the IP.
@knh190 I was still editing the post. Thanks for the comment.
2

Assuming your string is named s, the following code does what you want:

line_list = []

# iterate over each line
for line in s.split("\n"):
    #remove empty lines
    if line == '':
        continue

    #replace ( and ) with empty strings 
    line = line.replace("(", "").replace(")", "")

    line_list.append(line)

# remove first and last line
line_list = line_list[1:-1]

array = []
# split lines by " "
for line in line_list:
    array.append(line.split(" "))

# create dataframe
df = pd.DataFrame(array, columns=["User", "IP_Address"])

Using a list comprehension, you can do the same as a one-liner:

df = pd.DataFrame([line.replace("(", "").replace(")", "").split(" ") for line in s.split("\n") if line != ""][1:-1], columns=["User", "IP_Address"])
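Dropping the first and last line by position depends on the exact layout of the text. A variation that keeps only lines shaped like "name (ip)" could be sketched as follows. The function name parse_host_lines is hypothetical, and the validation with the standard-library ipaddress module is an addition that the original answer does not use.

```python
import ipaddress


def parse_host_lines(s):
    """Return [name, ip] pairs for lines shaped like 'name (ip)'."""
    rows = []
    for line in s.splitlines():
        line = line.strip()
        # Split on the last " (" in the line: name before it, "ip)" after it.
        name, sep, rest = line.rpartition(" (")
        if not sep or not name or not rest.endswith(")"):
            continue  # not a 'name (ip)' line (date, summary, blank)
        ip = rest[:-1]  # drop the closing ")"
        try:
            ipaddress.ip_address(ip)  # raises ValueError for a non-IP string
        except ValueError:
            continue
        rows.append([name, ip])
    return rows


# Shortened sample in the same shape as the cleaned nmap output.
s = """2019-05-21 18:19 IST
android-eb20919729f10e96 (192.168.1.8)

shiv-mbp (192.168.1.15)

(4 hosts up) scanned in 18.35 seconds
"""

rows = parse_host_lines(s)
print(rows)
```

The resulting list can be passed straight to pd.DataFrame(rows, columns=["User", "IP_Address"]).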

3 Comments

There's definitely no need for multiple lines. A one-line regex is enough and more efficient.
I got it as a one-liner ;). Still, your solution is way more elegant! +1 for your answer.
Upvoted, but I had to say that str.replace runs through the whole text once per call, while the regex produces the result in one pass.
