I have a text file of the following form:
('1', '2')
('3', '4')
.
.
.
and I'm trying to get it to look like this:
1 2
3 4
etc...
I've been trying to do this using the re module in Python, by chaining together re.sub calls like so:
for line in file:
    s = re.sub(r"\(", "", line)
    s1 = re.sub(r",", "", s)
    s2 = re.sub(r"'", "", s1)
    s3 = re.sub(r"\)", "", s2)
    output.write(s3)
output.close()
It seems to work great until I get near the end of my output file; then it becomes inconsistent and stops working. I am thinking it is because of the sheer SIZE of the file I am working with: 300 MB, or approximately 12 million lines.
Can anyone help me confirm that I'm simply running out of memory? Or if it is something else? Suitable alternatives, or ways around this?
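For reference, here is a minimal sketch of the same conversion done as a stream. It rests on two assumptions: that every line has exactly the form shown above, and that the cause might be an output file that was never flushed or closed, which the question does not confirm. Iterating over a file object yields one line at a time, so memory use should stay roughly constant regardless of file size. The names convert, in_path and out_path are hypothetical.

```python
import re

# Matches a line such as "('1', '2')" and captures the two numbers.
PAIR = re.compile(r"\(\s*'(\d+)',\s*'(\d+)'\s*\)")

def convert(in_path, out_path):
    """Stream in_path to out_path, rewriting "('1', '2')" as "1 2"."""
    # Leaving the with block closes both files, which flushes any
    # output still sitting in the write buffer.
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:  # one line at a time; the file is never fully loaded
            dst.write(PAIR.sub(r"\1 \2", line))
```

Lines that do not match the pattern are written through unchanged, which makes malformed input easy to spot in the output.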
Use ast.literal_eval on each line and use csv to write it back out.
You could do it in one call: output.write(re.sub(r"\(\s*'(\d+)',\s*'(\d+)'\s*\)", r"\1 \2", line)). But as I say, that's not your problem. You might need to show more of your code to get an answer to that particular issue.
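The ast.literal_eval and csv suggestion could look roughly like the sketch below. It assumes each non-blank line is a valid Python tuple literal; the function name and the choice of a space delimiter are illustrative and do not come from the comment.

```python
import ast
import csv

def convert_with_literal_eval(in_path, out_path):
    """Parse each line as a tuple literal and write its items space-separated."""
    # newline="" lets the csv module control line endings itself.
    with open(in_path) as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter=" ", lineterminator="\n")
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # literal_eval safely parses "('1', '2')" into the tuple ('1', '2');
            # it raises ValueError or SyntaxError on malformed input.
            writer.writerow(ast.literal_eval(line))
```

Unlike the regex versions, this one fails loudly on a malformed line, which may help locate where the output starts going wrong.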