I have an input file that contains a list of inputs, one per line. Each line is enclosed in double quotes. Some inputs contain backslashes or escaped double quotes inside the enclosing double quotes (see the examples below).

Sample inputs:

"each line is enclosed in double-quotes"
"Double quotes inside a \"double-quoted\" string!"
"This line contains backslashes \\not so cool\\"
"too many double-quotes in a line \"\"\"too much\"\"\""
"too many backslashes \\\\\\\"horrible\"\\\\\\"

I would like to take the above inputs and replace each escaped double quote (\") inside a line with a backtick (`).

I assume that there is a straightforward one-line solution to this. I tried the following but it doesn't work. Any other one-liner solution or a fix to the below code would be greatly appreciated.

import re

def fix(line):
    return re.sub(r'\\"', '`', line)

It fails for input lines 3 and 5, where the closing double quote is replaced as well. The output is:

"each line is enclosed in double-quotes"
"Double quotes inside a `double-quoted` string!"
"This line contains backslashes \\not so cool\`
"too many double-quotes in a line ```too much```"
"too many backslashes \\\\\\`horrible`\\\\\`

Any fix I can think of breaks other lines. Please help!

2 Answers

This is not quite what you asked for, since it converts \" to " rather than to `, but I'll mention it anyway: you could always use the csv module to do the \" conversion correctly for you:

>>> import csv
>>> for line in csv.reader(["each line is enclosed in double-quotes",
...                         "Double quotes inside a \"double-quoted\" string!",
...                         "This line contains backslashes \\not so cool\\",
...                         "too many double-quotes in a line \"\"\"too much\"\"\"",
...                         "too many backslashes \\\\\\\"horrible\"\\\\\\",
...                         ]):
...         print(line)
...     
['each line is enclosed in double-quotes']
['Double quotes inside a "double-quoted" string!']
['This line contains backslashes \\not so cool\\']
['too many double-quotes in a line """too much"""']
['too many backslashes \\\\\\"horrible"\\\\\\']

If it is important that they be actual backticks, you could simply do a replace on the text returned by the csv module.
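For lines read raw from the file (rather than typed as Python string literals), a minimal sketch of this idea might look like the following. It assumes the reader is given escapechar='\\' so that the backslash is treated as the escape character; the function name unescape_and_backtick is made up for illustration. Note that this approach also strips the enclosing double quotes and collapses each escaped backslash pair into a single backslash, which may or may not be what you want.

```python
import csv

def unescape_and_backtick(lines):
    # escapechar makes csv treat a backslash as escaping the next character;
    # doublequote=False stops "" from being read as an escaped quote.
    reader = csv.reader(lines, escapechar='\\', doublequote=False)
    # Assumes every line is one non-empty quoted field, so row[0] is the text.
    return [row[0].replace('"', '`') for row in reader]

# Raw strings stand in for lines exactly as they appear in the input file.
sample = [
    r'"Double quotes inside a \"double-quoted\" string!"',
    r'"This line contains backslashes \\not so cool\\"',
]
for text in unescape_and_backtick(sample):
    print(text)
```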

Add a + after the backslash:

return re.sub(r'\\+"', '`', line)

1 Comment

Still breaks for input line 3
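The reason both patterns break on line 3 seems to be that the escaped backslash pair \\ sitting right before the closing double quote gets read as a backslash escaping that quote. One possible fix, sketched below, is to consume escape sequences as whole pairs (a backslash plus the character after it) and only substitute when the escaped character is a double quote. The name fix_escaped_quotes is hypothetical, and the sketch assumes every backslash in the input starts a two-character escape sequence.

```python
import re

# One escape sequence: a backslash followed by the character it escapes.
ESCAPE = re.compile(r'\\(.)')

def fix_escaped_quotes(line):
    # Scanning left to right in pairs means an escaped backslash is consumed
    # whole, so its second backslash can never pair with a following quote.
    def repl(match):
        return '`' if match.group(1) == '"' else match.group(0)
    return ESCAPE.sub(repl, line)

# Raw strings stand in for lines exactly as they appear in the input file.
print(fix_escaped_quotes(r'"This line contains backslashes \\not so cool\\"'))
print(fix_escaped_quotes(r'"too many backslashes \\\\\\\"horrible\"\\\\\\"'))
```

With this approach the enclosing double quotes and the escaped backslashes are left untouched, and it still fits in a single re.sub call if you inline the lambda.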
