Extract data from a file using Python

Question

Input File:

["abc","on time","date","<a href='link'>11111</a>","time","2","2"],

["abc","on time","date","<a href='link'>11111</a>","time","2","6"],

["abc","on time","date","<a href='link'>11111</a>","time","2","9"],

["abc","on time","date","<a href='link'>11111</a>","time","2","0"],

["abc","on time","date","<a href='link'>11111</a>","time","2","5"]

Desired output:

abc,on time,date,<a href='link'>11111</a>,time,2,2

abc,on time,date,<a href='link'>11111</a>,time,2,6

abc,on time,date,<a href='link'>11111</a>,time,2,9

abc,on time,date,<a href='link'>11111</a>,time,2,0

abc,on time,date,<a href='link'>11111</a>,time,2,5

Code tried:

import sys
import re

Lines = [Line.strip() for Line in open (sys.argv[1],'r').readlines()]



for EachLine in Lines:
    Parts = EachLine.split(",")
    for EachPart in Parts:

        EachPart = re.sub(r'[', '', EachPart)
        EachPart = re.sub(r']', '', EachPart)
print ' '.join(Parts)

Can anyone help me with this? I am not getting the output I want. Thanks in advance.

Anything is fine; I could also redirect the output to a file.
– blackfury, Commented Sep 14, 2015 at 3:37

Azmi Kamis · Accepted Answer · 2015-09-14 03:59:36Z

1

I modified your initial solution to:

import sys
import re

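# Read the file named on the command line, one stripped line per entry.
with open(sys.argv[1]) as f:
    Lines = [Line.strip() for Line in f]

for EachLine in Lines:
    # Collect every double-quoted field on the line.
    matches = re.findall(r'"(.+?)"', EachLine)
    print(','.join(matches))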

My approach is to use a regex to capture every string enclosed in double quotes, then join the captured strings with commas.
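To make the pattern's behavior concrete, here is a small self-contained sketch that applies the same `re.findall` call to one line from the question. The helper name `extract_fields` is made up for illustration. Note that the non-greedy `.+?` needs at least one character, so the sketch assumes no field is an empty string.

```python
import re

def extract_fields(line):
    # Capture the text between each pair of double quotes (non-greedy).
    return re.findall(r'"(.+?)"', line)

# One line copied from the question's input file.
sample = '["abc","on time","date","<a href=\'link\'>11111</a>","time","2","2"],'
print(','.join(extract_fields(sample)))
# prints: abc,on time,date,<a href='link'>11111</a>,time,2,2
```

The single quotes inside the `<a href='link'>` field are left alone because the pattern only looks for double quotes.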


Sait · Accepted Answer · 2015-09-14 04:09:55Z

0

As already mentioned, you can use eval().

with open('a.txt') as f:
    for line in f:
        line = line.strip().rstrip(',')  # drop a trailing `,` if present
        if line:                         # skip empty lines
            print(','.join(eval(line)))

Input:

["abc","on time","date","<a href='link'>11111</a>","time","2","2"],

["abc","on time","date","<a href='link'>11111</a>","time","2","6"],

["abc","on time","date","<a href='link'>11111</a>","time","2","9"],

["abc","on time","date","<a href='link'>11111</a>","time","2","0"],

["abc","on time","date","<a href='link'>11111</a>","time","2","5"]

Output:

abc,on time,date,<a href='link'>11111</a>,time,2,2
abc,on time,date,<a href='link'>11111</a>,time,2,6
abc,on time,date,<a href='link'>11111</a>,time,2,9
abc,on time,date,<a href='link'>11111</a>,time,2,0
abc,on time,date,<a href='link'>11111</a>,time,2,5
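Because `eval()` executes whatever expression the file contains, a more cautious variant is to use `ast.literal_eval` from the standard library, which only accepts Python literals. The sketch below is one possible way to do that, not part of the original answer; the helper name `parse_line` is hypothetical, and it assumes each non-empty line holds one complete list.

```python
import ast

def parse_line(line):
    # Drop surrounding whitespace and an optional trailing comma.
    line = line.strip().rstrip(',')
    if not line:
        return None  # blank line: nothing to parse
    # literal_eval only evaluates literals, so no arbitrary code is run.
    return ast.literal_eval(line)

# One line copied from the input above, including its trailing comma.
sample = '["abc","on time","date","<a href=\'link\'>11111</a>","time","2","2"],\n'
fields = parse_line(sample)
print(','.join(fields))
# prints: abc,on time,date,<a href='link'>11111</a>,time,2,2
```

To process a whole file, call `parse_line` on each line and skip the `None` results.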

edited Sep 14, 2015 at 4:09

answered Sep 14, 2015 at 4:04


2 Comments

blackfury Over a year ago

I am getting the desired output, but also an error:

    print(','.join(eval(line.strip())))
  File "<string>", line 1
    ]
    ^
SyntaxError: unexpected EOF while parsing

Sait Over a year ago

@blackfury It works on my machine. Can you check your input text file again and see whether it is the same as in the original post? You can also print each line inside the for loop to see which one triggers the error.

qwertyuip9 · Accepted Answer · 2015-09-14 04:17:42Z

0

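Another option without using regex, assuming `lines` holds the raw lines read from the file, is:

for line in lines:
  # Drop the trailing comma and the surrounding brackets, then remove the quotes.
  formatted = line.strip().rstrip(',').strip('[]').replace('"', '')
  print(formatted)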
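A further regex-free possibility, offered only as a sketch: if the file looks exactly like the input in the question (comma-separated lists, no comma after the last one), wrapping the whole text in brackets yields valid JSON that the standard `json` module can parse. The helper name `rows_from_text` is hypothetical, and the sample below is shortened for illustration.

```python
import json

def rows_from_text(text):
    # Wrap the whole input in brackets so it becomes one JSON array of arrays.
    return json.loads('[' + text + ']')

# Shortened sample in the same shape as the question's input file.
text = '''["abc","on time","2","2"],

["abc","on time","2","6"]
'''
for row in rows_from_text(text):
    print(','.join(row))
```

This relies on the last list having no trailing comma; a trailing comma would make `json.loads` raise an error.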

