Input File:

["abc","on time","date","<a href='link'>11111</a>","time","2","2"],

["abc","on time","date","<a href='link'>11111</a>","time","2","6"],

["abc","on time","date","<a href='link'>11111</a>","time","2","9"],

["abc","on time","date","<a href='link'>11111</a>","time","2","0"],

["abc","on time","date","<a href='link'>11111</a>","time","2","5"]

Desired output:

abc,on time,date,<a href='link'>11111</a>,time,2,2

abc,on time,date,<a href='link'>11111</a>,time,2,6

abc,on time,date,<a href='link'>11111</a>,time,2,9

abc,on time,date,<a href='link'>11111</a>,time,2,0

abc,on time,date,<a href='link'>11111</a>,time,2,5

Code tried:

import sys
import re

Lines = [Line.strip() for Line in open (sys.argv[1],'r').readlines()]



for EachLine in Lines:
    Parts = EachLine.split(",")
    for EachPart in Parts:

        EachPart = re.sub(r'[', '', EachPart)
        EachPart = re.sub(r']', '', EachPart)
print ' '.join(Parts)

Can anyone help me with this? I am not getting the output I want. Thanks in advance.

  • Do you just want to print it in that format, or save it to a file? Commented Sep 14, 2015 at 3:27
  • Either is fine. I could also redirect the output to a file. Commented Sep 14, 2015 at 3:37
  • What output are you getting, then? Commented Sep 14, 2015 at 5:15

3 Answers

1

I modified your initial solution to:

import sys
import re

# Read the file named on the command line, one record per line.
with open(sys.argv[1]) as f:
    for line in f:
        # Capture the text between each pair of double quotes.
        matches = re.findall(r'\"(.+?)\"', line)
        print(','.join(matches))

My approach is to use a regex to extract every string enclosed in double quotes, then join those strings with commas.
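
To see what a pattern of this kind captures, here is a minimal sketch run against one line from the question. The helper name `quoted_fields` is made up for illustration, and the sketch assumes every field is non-empty and contains no double quote of its own.

```python
import re

def quoted_fields(line):
    """Return the text found between each pair of double quotes in line."""
    return re.findall(r'"(.+?)"', line)

# One input line from the question, including its trailing comma.
sample = '["abc","on time","date","<a href=\'link\'>11111</a>","time","2","2"],'

print(','.join(quoted_fields(sample)))
# prints: abc,on time,date,<a href='link'>11111</a>,time,2,2
```

The brackets and the trailing comma sit outside the quotes, so they are never captured and need no separate cleanup.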

0

As already mentioned, you can use eval().

with open('a.txt') as f:
    for line in f:
        line = line.strip().rstrip(',')  # drop surrounding whitespace and any trailing `,`
        if line:                         # skip empty lines
            # Caution: eval() executes the text, so only use it on trusted input.
            print(','.join(eval(line)))

Input:

["abc","on time","date","<a href='link'>11111</a>","time","2","2"],

["abc","on time","date","<a href='link'>11111</a>","time","2","6"],

["abc","on time","date","<a href='link'>11111</a>","time","2","9"],

["abc","on time","date","<a href='link'>11111</a>","time","2","0"],

["abc","on time","date","<a href='link'>11111</a>","time","2","5"]

Output:

abc,on time,date,<a href='link'>11111</a>,time,2,2
abc,on time,date,<a href='link'>11111</a>,time,2,6
abc,on time,date,<a href='link'>11111</a>,time,2,9
abc,on time,date,<a href='link'>11111</a>,time,2,0
abc,on time,date,<a href='link'>11111</a>,time,2,5
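
Because `eval()` runs whatever the file contains, a more cautious variant of the same idea is `ast.literal_eval`, which only accepts Python literals. The sketch below is an illustration under assumptions, not part of the answer above: the helper name `parse_line` is hypothetical, and it assumes each non-empty line is a list of quoted strings, optionally followed by a comma. Lines that do not parse are skipped instead of raising an error.

```python
import ast

def parse_line(line):
    """Return the line's fields joined by commas, or None if it cannot be parsed."""
    text = line.strip().rstrip(',')  # drop whitespace and a trailing comma
    if not text:
        return None  # blank line
    try:
        fields = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None  # malformed line, for example a lone "]"
    if not isinstance(fields, list):
        return None  # a literal, but not a list of fields
    return ','.join(str(field) for field in fields)

# Sample lines modelled on the question's input, plus one broken line.
sample_lines = [
    '["abc","on time","date","<a href=\'link\'>11111</a>","time","2","2"],\n',
    '\n',
    ']\n',
]

for raw in sample_lines:
    result = parse_line(raw)
    if result is not None:
        print(result)
```

Skipping unparseable lines is a design choice for this sketch. Printing or logging them instead would make it easier to find the line in the input file that triggers a `SyntaxError`.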

2 Comments

I am getting the desired output, but also an error: print(','.join(eval(line.strip()))) File "<string>", line 1 ] ^ SyntaxError: unexpected EOF while parsing
@blackfury It works on my machine. Can you check your input text file again and see whether it is the same as in the original post? You can also print each line inside the for loop to find the one that triggers the error.
0

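Another option without using regex, where lines holds the raw lines read from the file, is:

for line in lines:
    # Strip the trailing comma and the enclosing brackets, then drop the quotes.
    formatted = line.strip().rstrip(',').strip('[]').replace('"', '')
    print(formatted)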
