I am new to Python programming.
My task is the following:
I have a huge text file (20+ GB) containing a lot of data. Its structure is this:
Crap
Crap
Crap
...
Crap
Crap
Useful Data = x y z
Useful Data 2 = x2 y2 z2
Crap
Crap
...
Crap
Crap
Useful Data = x' y' z'
Useful Data 2 = x2' y2' z2'
Crap
Crap...
And so on, for 5000 objects.
I have to take every x, y and z and put them in a file which should look like the following:
x y z x2 y2 z2
x' y' z' x2' y2' z2'
x'' y'' z'' x2'' y2'' z2''
...and so on (I should end up with 5000 rows).
I thought regular expressions would be a good fit for this task. I've written the following, but I'm a real beginner and can't get any further:
import re

f_in_name = "starout.txt"  # input file
f_out_name = "cmposvel"    # output file
f_in = open(f_in_name)
for l in f_in:
    if "system_time" in l:
        time = re.compile(r'^ system_time =\s+(\S+)')
    elif "com_pos" in l:
        poscm = re.compile(r'^ com_pos =\s+(\S+)\s+(\S+)\s+(\S+)')
    elif "com_vel" in l:
        velcm = re.compile(r'^ com_vel =\s+(\S+)\s+(\S+)\s+(\S+)')
# how do I write t,x,y,z,vx,vy,vz in the output?
How do I write the values captured by (\S+) to the output? Also, does re.compile search only the current line or the whole document? I'm confused. Can someone help me? I really need this to make a plot and have no idea how to do it.
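For reference, here is a minimal sketch of one way this could be done, not a definitive solution. On the second question: re.compile searches nothing by itself. It only builds a pattern object, and the search happens when .match() is called on a particular string, here one line at a time. The sketch assumes that within each object the lines appear in the order system_time, com_pos, com_vel, so a row is emitted when com_vel is seen. The names extract_rows and convert are made up for illustration, and the patterns allow flexible whitespace around the "=" sign, which is also an assumption about the file.

```python
import re

# Compile each pattern once, outside the loop. Compiling does not search;
# the search happens in .match(), applied to one line at a time.
TIME_RE = re.compile(r"^\s*system_time\s*=\s*(\S+)")
POS_RE = re.compile(r"^\s*com_pos\s*=\s*(\S+)\s+(\S+)\s+(\S+)")
VEL_RE = re.compile(r"^\s*com_vel\s*=\s*(\S+)\s+(\S+)\s+(\S+)")


def extract_rows(lines):
    """Yield one 't x y z vx vy vz' string per complete record.

    Assumes system_time and com_pos come before com_vel in each record.
    """
    t = pos = None
    for line in lines:
        m = TIME_RE.match(line)
        if m:
            t = m.group(1)          # the text captured by (\S+)
            continue
        m = POS_RE.match(line)
        if m:
            pos = m.groups()        # tuple of the three captured values
            continue
        m = VEL_RE.match(line)
        if m and t is not None and pos is not None:
            yield " ".join((t, *pos, *m.groups()))
            t = pos = None          # start collecting the next record


def convert(f_in_name="starout.txt", f_out_name="cmposvel"):
    # Iterating over the file object reads it line by line, so the
    # 20+ GB input is never loaded into memory at once.
    with open(f_in_name) as f_in, open(f_out_name, "w") as f_out:
        for row in extract_rows(f_in):
            f_out.write(row + "\n")
```

Calling convert() with the file names from the question would write one row per object. The captured values are kept as strings, so they are copied to the output exactly as they appear in the input.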