2

I'm reading a file in Python that isn't well formatted: values are separated by multiple spaces and some tabs, so the lists returned contain a lot of empty items. How do I remove or avoid those?

This is my current code:

import re

f = open('myfile.txt','r') 

for line in f.readlines(): 
    if re.search(r'\bDeposit', line):
        print line.split(' ')

f.close()

Thanks

4 Answers

11

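Don't explicitly specify ' ' as the delimiter. line.split() with no argument splits on runs of any whitespace and ignores leading and trailing whitespace. It's equivalent to stripping the line and then using re.split on \s+: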

>>> line = '  a b   c \n\tg  '
>>> line.split()
['a', 'b', 'c', 'g']
>>> import re
>>> re.split(r'\s+', line)
['', 'a', 'b', 'c', 'g', '']
>>> re.split(r'\s+', line.strip())
['a', 'b', 'c', 'g']
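
As a rough side-by-side of the three approaches, here is a small sketch in Python 3 syntax. The sample line is made up for illustration and is not taken from the asker's file.

```python
import re

# Hypothetical sample line mixing runs of spaces and tabs.
line = "  Deposit   100.00\t\tUSD  \n"

# Splitting on a single space yields an empty string for every extra space,
# and leaves the tabs inside a field.
by_space = line.split(" ")

# No argument: split on runs of any whitespace, ignoring leading/trailing runs.
by_whitespace = line.split()

# Regex version: strip first, otherwise empty strings appear at both ends.
by_regex = re.split(r"\s+", line.strip())

print(by_space)
print(by_whitespace)
print(by_regex)
```

The last two lists come out identical, so the regex is only worth it when the delimiter is something other than plain whitespace.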

5 Comments

forgot that string.split() with no argument split on runs of whitespace. +1
Great, it did remove almost all whitespace from the lines, except one at the beginning and one at the end, strange.
re.split('\s+', line.strip()) will fix that
@jcoon: or line.strip().split()
Nimbuz, line.split() will take care of stripping whitespace at the start/end.
2
for line in open("file"):
    if " Deposit" in line:
        line = line.rstrip()
        print line.split()

Update:

for line in open("file"):
    if "Deposit" in line:
        line = line.rstrip()
        print line[line.index("Deposit"):].split()
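
To make the effect of the slice in the update visible, here is a minimal sketch in Python 3 syntax. The sample line and its layout are assumptions made up for illustration.

```python
# Hypothetical sample line with fields before the keyword.
line = "2010-01-05   ref42 \t Deposit    250.00   USD\n"

# str.index raises ValueError when the substring is missing,
# which is why the answer checks `"Deposit" in line` first.
start = line.index("Deposit")

# Everything before "Deposit" is dropped, then the rest is split on whitespace.
fields = line[start:].split()
print(fields)
```

Because split() with no argument already ignores trailing whitespace, the rstrip() call in the answer is harmless but not required.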

1 Comment

Note that " Deposit" in line is not equivalent to re.search(r'\bDeposit', line). The latter will match "this,Deposit", while the former won't.
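
The difference between the two checks can be seen with a few made-up strings. This is only a sketch in Python 3 syntax, and the sample values are hypothetical.

```python
import re

# Hypothetical sample lines, made up for illustration.
samples = ["Cash Deposit 100", "this,Deposit 50", "Deposit 10", "PreDeposit 5"]

results = {}
for text in samples:
    # Substring test: requires a literal space right before "Deposit".
    substring_hit = " Deposit" in text
    # Regex test: requires a word boundary before "Deposit"
    # (start of string, punctuation, or whitespace all qualify).
    boundary_hit = re.search(r"\bDeposit", text) is not None
    results[text] = (substring_hit, boundary_hit)
    print(repr(text), substring_hit, boundary_hit)
```

Only the first sample satisfies both checks. The comma-separated and start-of-line cases match the regex alone, and "PreDeposit" matches neither.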
1
linesAsLists = [line.split() for line in open('myfile.txt', 'r') if 'Deposit' in line]
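
The one-liner never closes the file explicitly. A possible variant that does, sketched in Python 3 syntax, is shown here. The function name deposit_rows is made up for illustration.

```python
def deposit_rows(path):
    """Return the whitespace-split fields of every line containing 'Deposit'.

    `deposit_rows` is a hypothetical helper name, not part of the answer.
    """
    # The with-block closes the file even if an error occurs while reading.
    with open(path) as f:
        return [line.split() for line in f if "Deposit" in line]
```

Calling deposit_rows('myfile.txt') would then give the same list of lists as the comprehension above.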

Comments

0

Why not do line.strip() before handling it? Also, you could use re.split to use a regex like '\s+' as your delimiter.

3 Comments

for line in f.readlines(): line = line.strip(); continue_processing... SO comments aren't friendly to Python code.
this will only remove whitespace from the head/tail of the string
If you do line.split() with no arguments, you get strip for free. ' a b c d '.split() == ['a', 'b', 'c', 'd']
