Removing all but one newline character from text file using python

Question

I have some data printed out by some software, and the output contains too many extra newlines. I'm trying to remove all the extra newline characters while maintaining the column format of the following data:

[atRA]_0    [Cyp26A1_mRNA]_0    
1   0   0

1.999   0   0

2.998   0   0

3.997   0   0

4.996   0   0

This code simply doesn't work:

def remove_newline_from_copasi_report(self,copasi_data):
    with open(copasi_data) as f:
        lines=[]
        data = f.read()
        return data.rstrip()

Whereas this code removes all new lines and ruins the format:

def remove_newline_from_copasi_report(self,copasi_data):
    with open(copasi_data) as f:
        lines=[]
        data = f.read()
        return data.replace('\n','')

Does anybody know how to remove all but one newline character from each line of my text file?

Thanks

data.replace('\n\n','') maybe?

– lynn, Commented Jul 20, 2015 at 15:03

John Coleman · Accepted Answer · 2015-07-20 15:12:56Z

3

lines = data.split('\n')
data = '\n'.join(line for line in lines if len(line) > 0)

should work
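A minimal sketch of how the two lines above could be wrapped so they run end to end. The helper name remove_blank_lines and the sample string are hypothetical, chosen only for illustration; the function name and the copasi_data path argument are borrowed from the question.

```python
def remove_blank_lines(text):
    """Drop empty lines from text, joining the remaining lines with one newline."""
    lines = text.split('\n')
    return '\n'.join(line for line in lines if len(line) > 0)


def remove_newline_from_copasi_report(copasi_data):
    """Read the file at path copasi_data and return its text without empty lines."""
    with open(copasi_data) as f:
        return remove_blank_lines(f.read())


# Hypothetical sample with one and two blank lines between rows.
sample = "a\tb\n1   0   0\n\n1.999   0   0\n\n\n2.998   0   0\n"
print(remove_blank_lines(sample))
```

Note that the result has no trailing newline, because the empty string after the final '\n' is filtered out as well. Lines that contain only spaces or tabs are kept, since their length is greater than zero.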

edited Jul 20, 2015 at 15:12

answered Jul 20, 2015 at 15:09

John Coleman


1 Comment

karthikr Over a year ago

I like this solution better, as it is not limited to just \n\n

Padraic Cunningham · Accepted Answer · 2015-07-20 15:44:24Z

2

You can iterate over the file object and keep only the lines for which line.strip() is truthy. There is no need to read all the content into memory and then try to replace; just filter as you iterate:

with open(copasi_data) as f:  # copasi_data is the file path from the question
    lines = "".join([line for line in f if line.strip()])
print(lines)

[atRA]_0    [Cyp26A1_mRNA]_0    
1   0   0
1.999   0   0
2.998   0   0
3.997   0   0
4.996   0   0

To store only one line at a time, iterate in a loop applying the same logic, or turn the list comprehension into a generator expression and iterate over that:

for line in f:
    if line.strip():
        print(line, end="")  # line already ends with a newline
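The same line-by-line filter can also write the cleaned data to a new file instead of printing it. This is only a sketch: the names non_blank_lines and copy_without_blank_lines, and both path parameters, are hypothetical and do not come from the answer.

```python
import io


def non_blank_lines(lines):
    """Yield only the lines that contain something other than whitespace."""
    for line in lines:
        if line.strip():
            yield line


def copy_without_blank_lines(src_path, dst_path):
    """Stream src_path into dst_path, holding one line in memory at a time."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(non_blank_lines(src))


# In-memory stand-in for a file, so the filter can be tried without touching disk.
demo = io.StringIO("h1\th2\n1   0   0\n\n1.999   0   0\n   \n2.998   0   0\n")
kept = list(non_blank_lines(demo))
print("".join(kept), end="")
```

Each kept line still carries its own trailing newline, so nothing needs to be added when writing or printing. Unlike a length check, line.strip() also discards lines made up only of spaces or tabs.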

edited Jul 20, 2015 at 15:44

answered Jul 20, 2015 at 15:14

Padraic Cunningham


5 Comments

Finwood Over a year ago

But now you are creating the whole list in memory, you can join() a generator expression as well: "".join(line for line in f if line.strip())

Padraic Cunningham Over a year ago

@Finwood, no. Python creates a list internally if you pass a generator to join, so that is actually less efficient. The OP can iterate over the file object in a loop, getting one line at a time, using the same logic; I only used a list comprehension and join to show the output. The code above is still more efficient than read and split followed by '\n'.join(line for line in lines if len(line) > 0)

Finwood Over a year ago

Oh, I didn't know that. Nice to know! :-)

Padraic Cunningham Over a year ago

@Finwood, join makes two passes over the data, which it cannot do with a generator, so Python first constructs a list if you pass it a generator expression.

Padraic Cunningham Over a year ago

@Finwood, github.com/python/cpython/blob/master/Objects/stringlib/…

Finwood · Accepted Answer · 2015-07-20 15:06:01Z

2

Simply look for double new lines and replace them with single new lines:

In [1]: data = """[atRA]_0    [Cyp26A1_mRNA]_0    
   ...: 1   0   0
   ...: 
   ...: 1.999   0   0
   ...: 
   ...: 2.998   0   0
   ...: 
   ...: 3.997   0   0
   ...: 
   ...: 4.996   0   0"""

In [2]: print(data.replace('\n\n', '\n'))
[atRA]_0    [Cyp26A1_mRNA]_0    
1   0   0
1.999   0   0
2.998   0   0
3.997   0   0
4.996   0   0
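One caveat: replace('\n\n', '\n') collapses newlines pairwise in a single pass, so a run of three or more newlines still leaves a blank line behind. A possible alternative, swapped in here and not part of the answer above, is a regular expression that matches a whole run at once. The function name collapse_newlines and the sample data are hypothetical.

```python
import re


def collapse_newlines(text):
    """Replace every run of one or more newlines with a single newline."""
    return re.sub(r'\n+', '\n', text)


# Sample with two blank lines after the first row and one after the second.
data = "header\n1   0   0\n\n\n1.999   0   0\n\n2.998   0   0"

pairwise = data.replace('\n\n', '\n')  # a blank line survives after the first row
collapsed = collapse_newlines(data)    # every row separated by exactly one newline

print(pairwise)
print("---")
print(collapsed)
```

This sketch assumes Unix-style '\n' line endings and truly empty lines; blank lines that contain spaces or tabs would need a different pattern or the line.strip() filter.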

answered Jul 20, 2015 at 15:06

Finwood
