I'm very new to programming and Python and I'm trying to convert a DLPOLY HISTORY file to an arc file. What I need to do is extract the lattice vectors (the 3x3 array under the word timestep), the x, y and z coordinates (the three entries on the line underneath each element) and the charge (the fourth entry on the line with the element).
Ideally I'd like to eventually be able to convert files of arbitrary size and frame length.
The two header lines and the first two frames of the DLPOLY HISTORY file look like this:
File Title
0 3 5 136 1906
timestep 0 5 0 3 0.000500 0.000000
3.5853000000 0.0000000000 0.0000000000
-1.7926500000 3.1049600000 0.0000000000
0.0000000000 0.0000000000 4.8950000000
Ca 1 40.078000 1.050000 0.000000
0.000000000 0.000000000 0.000000000
O 2 15.999400 -0.950000 0.000000
1.792650000 -1.034986100 1.140535000
H 3 1.007940 0.425000 0.000000
1.792650000 -1.034986100 1.933525000
O 4 15.999400 -0.950000 0.000000
-1.792650000 1.034987000 -1.140535000
H 5 1.007940 0.425000 0.000000
-1.792650000 1.034987000 -1.933525000
timestep 10 5 0 3 0.000500 0.005000
3.5853063513 0.0000000000 0.0000000000
-1.7926531756 3.1049655004 0.0000000000
0.0000000000 0.0000000000 4.8950086714
Ca 1 40.078000 1.050000 0.020485
-0.1758475885E-01 0.1947928245E-04 -0.1192033544E-01
O 2 15.999400 -0.950000 0.051020
1.841369991 -1.037431082 1.120698646
H 3 1.007940 0.425000 0.416965
1.719029690 -1.029327936 2.355541077
O 4 15.999400 -0.950000 0.045979
-1.795057186 1.034993005 -1.093028694
H 5 1.007940 0.425000 0.373772
-1.754959531 1.067269072 -2.320776528
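To make the field layout in the sample concrete, here is a minimal sketch of pulling the wanted values out of one frame. It assumes, as in the sample, that every atom occupies exactly two lines (an element line followed by a coordinate line). The helper names `parse_lattice` and `parse_atom_record` are made up for illustration and are not part of any DLPOLY tooling.

```python
def parse_lattice(lattice_lines):
    """Turn the three lines under 'timestep' into a 3x3 list of floats."""
    return [[float(value) for value in line.split()] for line in lattice_lines]


def parse_atom_record(element_line, coord_line):
    """Return (element, x, y, z, charge) from one two-line atom record."""
    fields = element_line.split()
    element = fields[0]
    charge = float(fields[3])  # fourth entry on the element line
    # float() also accepts the exponent notation seen in the sample data
    x, y, z = (float(value) for value in coord_line.split()[:3])
    return element, x, y, z, charge


lattice = parse_lattice([
    "3.5853063513 0.0000000000 0.0000000000",
    "-1.7926531756 3.1049655004 0.0000000000",
    "0.0000000000 0.0000000000 4.8950086714",
])
atom = parse_atom_record(
    "Ca 1 40.078000 1.050000 0.020485",
    "-0.1758475885E-01 0.1947928245E-04 -0.1192033544E-01",
)
```

Converting to `float` early means the later lattice maths works on numbers rather than strings.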
So far the code I have is:
fileList = history_file.readlines()
number_of_frames = int(fileList[1].split()[3])
number_of_lines = int(fileList[1].split()[4])
frame_length = (number_of_lines - 2) / number_of_frames
number_of_atoms = int(fileList[1].split()[2])
lines_per_atom = frame_length / number_of_atoms
for i in range(3, number_of_lines+1, frame_length):
    #maths for converting lattice vectors
    #print statement to write out converted lattice vectors
    for j in range(i+3, frame_length+1, lines_per_atom):
        atom_type = fileList[j].split()[0]
        atom_x = fileList[j+1].split()[0]
        atom_y = fileList[j+1].split()[1]
        atom_z = fileList[j+1].split()[2]
        charge = fileList[j].split()[3]
        print atom_type, atom_x, atom_y, atom_z, charge
I can extract and convert the lattice vectors, so that's not a problem. However, the second for loop only executes once. I think that my range ending statement
frame_length+1
is incorrect, but if I change it to
i+3+frame_length+1
I get the following error:
charge = fileList[j].split()[3]
IndexError: list index out of range
I think this means that I'm indexing past the end of a list.
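For what it's worth, the index arithmetic can be checked by hand against the sample. With 5 atoms a frame is 14 lines, and the atom lines of the frame whose first lattice line is `i` run from `i+3` up to, but not including, `i+3+number_of_atoms*lines_per_atom`. A stop of `frame_length+1` is an absolute index (15), so it only suits the first frame. A stop of `i+3+frame_length+1` runs five lines into the next frame, where the lattice lines have only three entries, so `split()[3]` fails. The sketch below uses a stop relative to each frame. It assumes two lines per atom as in the sample, and `iter_atom_lines` is a hypothetical name.

```python
def iter_atom_lines(file_list, number_of_atoms, lines_per_atom=2):
    """Yield (element_line, coordinate_line) pairs for every atom in every frame."""
    # One frame = 1 'timestep' line + 3 lattice lines + the atom lines.
    frame_length = 4 + number_of_atoms * lines_per_atom
    # i is the index of the first lattice line of each frame, as in the question.
    for i in range(3, len(file_list), frame_length):
        first_atom = i + 3
        # The stop value is relative to this frame, not to the start of the file.
        stop = first_atom + number_of_atoms * lines_per_atom
        for j in range(first_atom, stop, lines_per_atom):
            yield file_list[j], file_list[j + 1]
```

Computing `frame_length` from the atom count avoids relying on division results; if it is derived by dividing header values instead, `//` keeps the result an integer so it can be used as a `range` step.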
I'm sure that I've overlooked something very simple but any help would be greatly appreciated.
I'm also wondering if there is a more efficient way of reading the file because, as I understand it, readlines reads the entire file into memory, and HISTORY files can easily reach several GB in size.
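On the memory point, one possible approach is to iterate over the file object itself, which reads one line at a time, and to pull the lines of each frame with `next()`. The following is only a sketch under the same assumptions as the sample (two header lines, the atom count as the third entry of the second header line, two lines per atom); `iter_frames` is a hypothetical name.

```python
def iter_frames(handle):
    """Lazily yield (lattice, atoms) for each frame of an open HISTORY file."""
    lines = iter(handle)           # works for file objects and lists alike
    next(lines)                    # skip the title line
    number_of_atoms = int(next(lines).split()[2])
    for line in lines:
        if not line.startswith("timestep"):
            continue               # tolerate blank or unexpected lines
        # The three lines under 'timestep' hold the lattice vectors.
        lattice = [[float(v) for v in next(lines).split()] for _ in range(3)]
        atoms = []
        for _ in range(number_of_atoms):
            fields = next(lines).split()                        # element line
            x, y, z = (float(v) for v in next(lines).split()[:3])
            atoms.append((fields[0], x, y, z, float(fields[3])))
        yield lattice, atoms
```

It could be used as `with open("HISTORY") as history_file: for lattice, atoms in iter_frames(history_file): ...`, where the file name is a placeholder. Only one frame is held in memory at a time, so the file size should not matter much.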