How to get rid of \n and \r in a string using python

Question

I have written a Python (2.7) program to retrieve data from a table in a database and copy it into a CSV file. Some of the data is in a non-printable format (Unicode) and contains \n and \r. Because of the \n and \r, I am not able to retrieve the data as it appears in the table.

I have tried the following

str.replace('\n','').replace('\r',' ')
str.replace('\n','\\n').replace('\r', '\\r')

but it did not work out

csv code

import csv

# cur is an open database cursor (connection setup not shown)
cur.execute('select * from db.table_name')
count = 0
with open('test.csv', 'w') as csv_file:
    csv_writer = csv.writer(csv_file)
    for row in cur:
        print "row = ", count
        count = count + 1
        newrow = []
        for index in range(0, len(row)):
            value = row[index]
            # replace line breaks in string columns only
            if type(row[index]) is str:
                value = row[index].replace("\n", " ").replace("\r", " ")
            newrow.append(value)
        csv_writer.writerow(newrow)

I'm confused with that second replace line, what exactly do you want to happen there?
– OneCricketeer, Commented Jun 12, 2016 at 17:52
Why would you want to get rid of \r\n (they are linebreaks) and why wouldn't the replace work? Please post some examples too.
– noteness, Commented Jun 12, 2016 at 17:54
Show a small sample of code that generates your CSV incorrectly and we can likely show you how to fix it so these replacements are not needed.
– Mark Tolonen, Commented Jun 12, 2016 at 17:57
Add a print(repr(value)) and add the output, does .replace("\\r"," ") have a different effect?
– Padraic Cunningham, Commented Jun 12, 2016 at 18:18

Mark Tolonen · Accepted Answer · 2016-06-12 17:54:03Z

3

str.replace() returns a new string, so you have to assign it to the original string to change it:

s = s.replace('\n','').replace('\r','')
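A minimal sketch of the point above, using a made-up sample string (the variable names are illustrative only): calling replace() without assigning the result leaves the original string untouched, because Python strings are immutable.

```python
# Sample text with a Windows-style line break (illustrative data only).
s = "line one\r\nline two"

# The result of replace() is discarded here, so s is unchanged.
s.replace("\n", "").replace("\r", "")
unchanged = s

# Assigning the result keeps the cleaned string.
cleaned = s.replace("\n", "").replace("\r", "")

print(repr(unchanged))
print(repr(cleaned))
```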

answered Jun 12, 2016 at 17:54

Mark Tolonen


5 Comments

kaikadi-bella Over a year ago

I am really sorry, I have used the same as above

kaikadi-bella Over a year ago

c'mon that was a simple thing that's why I didn't mention.

zondo Over a year ago

@kickbhatwoski: You'd be surprised how many times the problem is something simple like that.

Mark Tolonen Over a year ago

@kickbhatwoski You won't be surprised that people post very incomplete questions with insufficient information.

kaikadi-bella Over a year ago

Yes Sir my bad, but I corrected my question by editing it.

tdelaney · Accepted Answer · 2016-06-12 18:29:10Z

2

Unicode has external serialized representations such as UTF-8 and UTF-16 and language-dependent internal implementations such as WCHAR. Your database read appears to have given you a UTF-16 serialized version of the string and all you have to do is decode it. You certainly don't want to remove the \r and \n because they are part of the multi-byte sequence and not really carriage return or newline at all.

As a simple example, I can remove all the database and looping stuff and just work with the string you posted:

>>> value = '\r\xaeJ\x92>J\xe7\x1d\n\x89`\xc6\xf8\x9c<\x18'
>>> decoded = value.decode('UTF-16')
>>> print repr(decoded)
u'\uae0d\u924a\u4a3e\u1de7\u890a\uc660\u9cf8\u183c'
>>> print decoded
긍鉊䨾ᷧ褊왠鳸ᠼ
>>>
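To illustrate why stripping bytes can damage multi-byte data, here is a small sketch written for Python 3. It assumes, purely for the sake of the example, that the bytes really are little-endian UTF-16 (the comments below dispute that for this particular string), and it only shows the byte-level effect of removing 0x0D and 0x0A.

```python
# The byte string from the session above, as a Python 3 bytes literal.
value = b'\r\xaeJ\x92>J\xe7\x1d\n\x89`\xc6\xf8\x9c<\x18'

# Assumption for illustration: treat the bytes as little-endian UTF-16.
decoded = value.decode('utf-16-le')

# Removing the 0x0D and 0x0A bytes shifts every following byte pair,
# so the remaining data decodes to different characters.
damaged_bytes = value.replace(b'\r', b'').replace(b'\n', b'')
damaged = damaged_bytes.decode('utf-16-le')

print(len(decoded), len(damaged))
print(decoded == damaged)
```

Under that assumption, the 16 input bytes decode to 8 code units, while the 14 bytes left after stripping decode to 7 different ones.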

answered Jun 12, 2016 at 18:29

tdelaney

8 Comments

kaikadi-bella Over a year ago

Thank you, but @Padraic Cunningham gave the same answer a few minutes ago.

tdelaney Over a year ago

Padraic asked you to post the result of print(repr(value)) which is important to figure out how to interpret the back-slashes in the example string you gave us. You mentioned you are reading unicode data and I'm not convinced that you will solve the problem without decoding the unicode into a python unicode string.

kaikadi-bella Over a year ago

sorry I didn't get you

Mark Tolonen Over a year ago

That's a nonsense string. Definitely not UTF-16.

Mark Tolonen Over a year ago

It's a mix of Korean, Chinese, Mongolian and undefined codepoints...nothing coherent.

Brian Tompsett - 汤莱恩 · Accepted Answer · 2018-12-26 19:19:02Z

2

You can use a regular expression to simplify your code. For example:

import re
s = "Salut \n Comment ca va ?"
s = re.sub("\n|\r|\t", "",  s)

print(s)

The output will be (note the two spaces left where the newline was removed):

Salut  Comment ca va ?
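A hedged variation on the same idea: removing the characters with a character class leaves both spaces that surrounded the newline, so if a single space is wanted, one option is to collapse every whitespace run instead. The variable names are illustrative.

```python
import re

s = "Salut \n Comment ca va ?"

# Character class equivalent of "\n|\r|\t": the line break is removed,
# but the spaces on either side of it remain.
no_breaks = re.sub(r"[\r\n\t]", "", s)

# Alternative: collapse each run of whitespace into one space.
collapsed = re.sub(r"\s+", " ", s).strip()

print(repr(no_breaks))
print(repr(collapsed))
```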

edited Dec 26, 2018 at 19:19

Brian Tompsett - 汤莱恩

answered Dec 26, 2018 at 19:10

Sabita Nadar


Hemanth B · Accepted Answer · 2019-02-19 05:01:10Z

1

You can do this by calling .strip() on the input, e.g. n = input().strip(). Note that this removes only leading and trailing whitespace (including '\r' and '\n'), not line breaks inside the string.
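As a quick check of what .strip() does and does not remove, here is a short sketch with a made-up sample string: whitespace at both ends is dropped, while the embedded line break needs a separate replace().

```python
# Sample input with line breaks at both ends and in the middle (illustrative).
raw = "\r\n  first line\r\nsecond line \r\n"

# strip() trims whitespace only at the start and end of the string.
stripped = raw.strip()

# The embedded "\r\n" is still there and must be replaced explicitly.
no_breaks = stripped.replace("\r", "").replace("\n", " ")

print(repr(stripped))
print(repr(no_breaks))
```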

answered Feb 19, 2019 at 5:01

Hemanth B
