Some software has printed out my data with too many newlines. I'm trying to remove the extra newline characters while keeping the column format of the following data:

[atRA]_0    [Cyp26A1_mRNA]_0    
1   0   0

1.999   0   0

2.998   0   0

3.997   0   0

4.996   0   0

This code simply doesn't work:

def remove_newline_from_copasi_report(self,copasi_data):
    with open(copasi_data) as f:
        lines=[]
        data = f.read()
        return data.rstrip()

Whereas this code removes all new lines and ruins the format:

def remove_newline_from_copasi_report(self,copasi_data):
    with open(copasi_data) as f:
        lines=[]
        data = f.read()
        return data.replace('\n','')

Does anybody know how to remove all but one newline character from each line of my text file?

Thanks

  • data.replace('\n\n','') maybe? Commented Jul 20, 2015 at 15:03

3 Answers

lines = data.split('\n')
data = '\n'.join(line for line in lines if len(line) > 0)

should work
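
As a minimal sketch of the same idea wrapped in a function: the name remove_blank_lines and the sample string are illustrative, not from the original post. This variant assumes that whitespace-only lines should also count as blank, so it tests line.strip() instead of len(line) > 0, and it uses splitlines() so that "\r\n" line endings are handled too.

```python
def remove_blank_lines(text):
    """Return text with empty and whitespace-only lines removed."""
    # splitlines() splits on "\n" and "\r\n"; strip() makes whitespace-only lines falsy
    return "\n".join(line for line in text.splitlines() if line.strip())


# Hypothetical sample modelled on the data in the question
sample = "[atRA]_0    [Cyp26A1_mRNA]_0    \n1   0   0\n\n1.999   0   0\n\n2.998   0   0\n"
print(remove_blank_lines(sample))
```

Note that the result has no trailing newline, because join() only puts "\n" between the kept lines.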

1 Comment

I like this solution better, as it is not limited to just \n\n

You can iterate over the file object and keep only the lines for which line.strip() is truthy. There is no need to read all the content into memory and then try to replace; just filter as you iterate:

with open(copasi_data) as f:
    # keep only lines that contain something other than whitespace
    lines = "".join([line for line in f if line.strip()])
print(lines)

[atRA]_0    [Cyp26A1_mRNA]_0    
1   0   0
1.999   0   0
2.998   0   0
3.997   0   0
4.996   0   0

To hold only one line in memory at a time, iterate in a loop applying the same logic, or turn the list into a generator expression and iterate over that:

for line in f:
    if line.strip():
        # each line already ends with '\n', so suppress print's own newline
        print(line, end="")
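
A rough sketch of the line-at-a-time approach writing to a second file instead of printing. The function name copy_without_blank_lines, the file names, and the temporary directory are assumptions made for the demonstration, not part of the original answer.

```python
import os
import tempfile


def copy_without_blank_lines(src_path, dst_path):
    """Copy src_path to dst_path, skipping blank lines, one line at a time."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if line.strip():
                dst.write(line)  # line keeps its own trailing "\n"


# Small demonstration on a temporary file (hypothetical paths)
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "report.txt")
dst = os.path.join(tmp_dir, "report_clean.txt")
with open(src, "w") as f:
    f.write("[atRA]_0    [Cyp26A1_mRNA]_0    \n1   0   0\n\n1.999   0   0\n\n")
copy_without_blank_lines(src, dst)
with open(dst) as f:
    print(f.read(), end="")
```

Only one line is held in memory at any time, which matters mostly for very large reports.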

5 Comments

But now you are creating the whole list in memory; you can join() a generator expression as well: "".join(line for line in f if line.strip())
@Finwood, no: Python creates a list internally if you pass a generator to join, so it is actually less efficient. The OP can iterate over the file object in a loop, getting a line at a time, using the same logic; I only used a list comp and join to show the output. The code above is still more efficient than read and split followed by '\n'.join(line for line in lines if len(line) > 0).
Oh, I didn't know that. Nice to know! :-)
@Finwood, join does two passes over the data, which it could not do with a generator, so Python first constructs a list if you pass a generator expression.

Simply look for double new lines and replace them with single new lines:

In [1]: data = """[atRA]_0    [Cyp26A1_mRNA]_0    
   ...: 1   0   0
   ...: 
   ...: 1.999   0   0
   ...: 
   ...: 2.998   0   0
   ...: 
   ...: 3.997   0   0
   ...: 
   ...: 4.996   0   0"""

In [2]: print(data.replace('\n\n', '\n'))
[atRA]_0    [Cyp26A1_mRNA]_0    
1   0   0
1.999   0   0
2.998   0   0
3.997   0   0
4.996   0   0
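
One caveat, sketched below with a made-up sample string: a single replace('\n\n', '\n') pass only halves each run of newlines, so three or more consecutive newlines still leave a blank line behind. Assuming the blank lines are truly empty (no spaces, no "\r"), collapsing every run with re.sub is one way around that.

```python
import re

# Hypothetical data with one double and one quadruple newline
data = "1   0   0\n\n1.999   0   0\n\n\n\n2.998   0   0"

# A single replace pass turns "\n\n\n\n" into "\n\n", so a blank line survives
once = data.replace("\n\n", "\n")

# Collapsing every run of newlines handles any number of blank lines
collapsed = re.sub(r"\n+", "\n", data)

print(repr(once))
print(repr(collapsed))
```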
