0

I have log data like this:

2017/06/07 10:42:35,THREAT,url,192.168.1.100,52.25.xxx.xxx,Rule-VWIRE-03,13423523,,web-browsing,80,tcp,block-url
2017/06/07 10:43:35,THREAT,url,192.168.1.101,52.25.xxx.xxx,Rule-VWIRE-03,13423047,,web-browsing,80,tcp,allow
2017/06/07 10:43:36,THREAT,end,192.168.1.100,52.25.xxx.xxx,Rule-VWIRE-03,13423047,,web-browsing,80,tcp,block-url
2017/06/07 10:44:09,TRAFFIC,end,192.168.1.101,52.25.xxx.xxx,Rule-VWIRE-03,13423111,,web-browsing,80,tcp,allow
2017/06/07 10:44:09,TRAFFIC,end,192.168.1.103,52.25.xxx.xxx,Rule-VWIRE-03,13423111,,web-browsing,80,tcp,block-url

How can I parse it so that I get only columns 4, 5, 7, and 12 from every row?

This is my code:

import csv

file=open('filename.log', 'r')
f=open('fileoutput', 'w')
lines = file.readlines()

        for line in lines:
        result.append(line.split(' ')[4,5,7,12])
        f.write (line)

f.close()
file.close()
1
  • You meant to write result, not line, right? And your input doesn't match your code-- it cannot be split on ' '. And ... Commented Sep 23, 2017 at 10:06

5 Answers

3

The right way with csv.reader and csv.writer objects:

import csv

with open('filename.log', 'r') as fr, open('filoutput.csv', 'w', newline='') as fw:
    reader = csv.reader(fr)
    writer = csv.writer(fw)
    for l in reader:
        writer.writerow(v for k,v in enumerate(l, 1) if k in (4,5,7,12))

filoutput.csv contents:

192.168.1.100,52.25.xxx.xxx,13423523,block-url
192.168.1.101,52.25.xxx.xxx,13423047,allow
192.168.1.100,52.25.xxx.xxx,13423047,block-url
192.168.1.101,52.25.xxx.xxx,13423111,allow
192.168.1.103,52.25.xxx.xxx,13423111,block-url

2 Comments

I've tried it, but it doesn't work. When I run the .py file I get this error: TypeError: 'newline' is an invalid keyword argument for this function
@baharudinyusuf, It seems that you are on Python 2.7, you should have mentioned that in your question
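For context on the error above: the newline keyword exists only on Python 3's built-in open(). The following is a minimal sketch of one way to open the output file on either version. The helper names open_csv_output and select_columns are hypothetical, not part of the csv module; on Python 2 the usual advice is to open the file in binary mode instead.

```python
import csv
import sys

def open_csv_output(path):
    """Hypothetical helper: open `path` for csv.writer on Python 2 or 3."""
    if sys.version_info[0] < 3:
        # Python 2: open() has no newline argument; use binary mode instead.
        return open(path, 'wb')
    # Python 3: text mode; newline='' lets csv.writer control line endings.
    return open(path, 'w', newline='')

def select_columns(in_path, out_path, columns=(4, 5, 7, 12)):
    """Copy the given 1-based columns of each row from in_path to out_path."""
    with open(in_path, 'r') as fr:
        fw = open_csv_output(out_path)
        try:
            writer = csv.writer(fw)
            for row in csv.reader(fr):
                # Column numbers are 1-based, list indices are 0-based.
                writer.writerow([row[c - 1] for c in columns])
        finally:
            fw.close()
```

Calling select_columns('filename.log', 'filoutput.csv') would then behave like the answer's loop, assuming every row has at least 12 fields.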
1

This is wrong:

line.split(' ')[4,5,7,12]

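You want this (note that the sample data is comma-separated, and that list indices start at 0, so columns 4, 5, 7 and 12 are indices 3, 4, 6 and 11):

fields = line.rstrip('\n').split(',')
fields[3], fields[4], fields[6], fields[11]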

4 Comments

I've tried, but can not. this my code import csv file=open('file.log', 'r') f=open('file', 'w') lines = file.readlines() for line in lines: fields = line.split(' ') fields[4], fields[5] f.write (line) f.close() file.close()
Cannot ah? Where got cannot?
File "parsingcolom.py", line 7 for line in lines: ^ IndentationError: unexpected indent
So you've entered code in the wrong format (unexpected indent). But then you post the code in a comment where we cannot see your indentation. We won't be able to help you like this.
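The IndentationError reported above is raised because the for line is indented although nothing before it opens a block. A minimal sketch of the loop with consistent indentation, assuming comma-separated input as in the question's sample; the function name extract_fields is illustrative only.

```python
def extract_fields(lines, indices=(3, 4, 6, 11)):
    """Return the selected fields of each comma-separated line, re-joined."""
    result = []
    # The `for` statement sits at the same level as the code before it;
    # only the loop body is indented (four spaces here).
    for line in lines:
        fields = line.rstrip('\n').split(',')
        result.append(','.join(fields[i] for i in indices))
    return result

sample = [
    "2017/06/07 10:42:35,THREAT,url,192.168.1.100,52.25.xxx.xxx,"
    "Rule-VWIRE-03,13423523,,web-browsing,80,tcp,block-url\n",
]
print(extract_fields(sample))
```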
1

A solution using pandas:

import pandas as pd

df = pd.read_csv('filename.log', sep=',', header=None, index_col=False)
df[[3, 4, 6, 11]].to_csv('fileoutput.csv', header=False, index=False)

Note the use of [3, 4, 6, 11] instead of [4, 5, 7, 12] to account for 0-indexing in the dataframe's columns.

Content of fileoutput.csv:

192.168.1.100,52.25.xxx.xxx,13423523,block-url
192.168.1.101,52.25.xxx.xxx,13423047,allow
192.168.1.100,52.25.xxx.xxx,13423047,block-url
192.168.1.101,52.25.xxx.xxx,13423111,allow
192.168.1.103,52.25.xxx.xxx,13423111,block-url


0

You're on the right path, but your syntax is off. Here's an example using the csv module:

import csv
log = open('filename.log')
# newline='\n' disables newline translation, so csv.writer does not produce
# extra blank lines between rows (the csv docs suggest newline='' for this)
log_write = open('fileoutput', 'w', newline='\n')
csv_log = csv.reader(log, delimiter=',')
csv_writer = csv.writer(log_write, delimiter=',')

for line in csv_log:
    csv_writer.writerow([line[0], line[1], line[2], line[3]]) # output first 4 columns

log.close()
log_write.close()

5 Comments

You forgot to add field separators and newlines. (Also, that's what the csv module is for)
@alexis Thanks. Now I want to display only the rows whose second column is THREAT and whose fourth column is the IP 192.168.1.100 or 192.168.1.101. How can I code that?
@yusuf, seriously? Learn how this site works. And ask questions that'll help you learn how to program, don't just slice your project into pieces and try to delegate it here.
@alexis I've tried and I've also asked here stackoverflow.com/questions/46380386/…
I checked your question. I'm sorry to say it but you can't expect a very friendly reception on this site with this kind of question. You need to help yourself, learn the basics so you can ask coherent questions. It's ok not to know things and to try to learn, but if there are multiple things wrong with your code that are not even what you're asking about, there's no good way to help you.
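For the follow-up raised in the comments above (rows whose second column is THREAT and whose fourth column is 192.168.1.100 or 192.168.1.101), one possible sketch with the csv module follows. The function name filter_rows and the WANTED_IPS set are assumptions made for illustration, and an in-memory file stands in for the real log.

```python
import csv
import io

WANTED_IPS = {'192.168.1.100', '192.168.1.101'}

def filter_rows(fileobj, log_type='THREAT', ips=WANTED_IPS):
    """Yield rows whose 2nd column equals log_type and whose 4th is in ips."""
    for row in csv.reader(fileobj):
        # Rows with fewer than 4 fields (e.g. blank lines) are skipped.
        if len(row) >= 4 and row[1] == log_type and row[3] in ips:
            yield row

# The question's sample data, held in memory instead of filename.log.
sample = io.StringIO(
    "2017/06/07 10:42:35,THREAT,url,192.168.1.100,52.25.xxx.xxx,Rule-VWIRE-03,13423523,,web-browsing,80,tcp,block-url\n"
    "2017/06/07 10:43:35,THREAT,url,192.168.1.101,52.25.xxx.xxx,Rule-VWIRE-03,13423047,,web-browsing,80,tcp,allow\n"
    "2017/06/07 10:43:36,THREAT,end,192.168.1.100,52.25.xxx.xxx,Rule-VWIRE-03,13423047,,web-browsing,80,tcp,block-url\n"
    "2017/06/07 10:44:09,TRAFFIC,end,192.168.1.101,52.25.xxx.xxx,Rule-VWIRE-03,13423111,,web-browsing,80,tcp,allow\n"
    "2017/06/07 10:44:09,TRAFFIC,end,192.168.1.103,52.25.xxx.xxx,Rule-VWIRE-03,13423111,,web-browsing,80,tcp,block-url\n"
)
matches = list(filter_rows(sample))
for row in matches:
    print(','.join(row))
```

With the sample data, only the first three rows pass both conditions; the two TRAFFIC rows are dropped.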
0

Using a list comprehension (here, a generator expression), you could do something like this without necessarily using the csv module:

file=open('filename.log','r') 
f=open('fileoutput', 'w')
lines = file.readlines()
for line in lines:
    f.write(','.join(line.split(',')[i] for i in [3,4,6,11]))

f.close()
file.close()

Notice that the indices are 3, 4, 6, 11 because Python lists are zero-indexed.

Output:

cat fileoutput 
192.168.1.100,52.25.xxx.xxx,13423523,block-url
192.168.1.101,52.25.xxx.xxx,13423047,allow
192.168.1.100,52.25.xxx.xxx,13423047,block-url
192.168.1.101,52.25.xxx.xxx,13423111,allow
192.168.1.103,52.25.xxx.xxx,13423111,block-url
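The loop above relies on the last field still carrying the line's trailing newline. A variant that strips it and writes the newline explicitly may be a little more robust, for example when the final line of the log has no newline. This is only a sketch: copy_columns is a hypothetical name, and in-memory files stand in for open('filename.log') and open('fileoutput', 'w').

```python
import io

def copy_columns(src, dst, columns=(4, 5, 7, 12)):
    """Write the given 1-based columns of each line in src to dst."""
    for line in src:  # iterate lazily instead of calling readlines()
        fields = line.rstrip('\r\n').split(',')
        # Subtract 1 to turn a column number into a list index.
        dst.write(','.join(fields[c - 1] for c in columns) + '\n')

# Two rows from the question's sample; the last one has no trailing newline.
src = io.StringIO(
    "2017/06/07 10:44:09,TRAFFIC,end,192.168.1.101,52.25.xxx.xxx,Rule-VWIRE-03,13423111,,web-browsing,80,tcp,allow\n"
    "2017/06/07 10:44:09,TRAFFIC,end,192.168.1.103,52.25.xxx.xxx,Rule-VWIRE-03,13423111,,web-browsing,80,tcp,block-url"
)
dst = io.StringIO()
copy_columns(src, dst)
print(dst.getvalue())
```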

