I have a very large XYZ coordinate file that looks like this:

C -0.847463930 1.503191118 0.986935030
N -0.849494834 0.360945118 1.290183500
- - - -
- - - -
- - - -
- - - -
C -0.409837378 -0.781300882 0.986935030
C -0.474783893 -0.837401882 -0.407860970
H -0.679839030 0.360945118 -2.206546970

I read this file into a NumPy array (the script below uses plain lists instead), where the first column is x, the second is y, and the third is z. Now I want to write Python code that subtracts rows in the following pattern: 100th - 1st, 200th - 101st, 300th - 201st, and so on until the end of the file. I tried iterating over the rows with a step of 100, but had no luck. Can someone give me an idea?


filename = 'file.xyz'
atoms = []
coordinates = []

# the with-block closes the file once reading is done
with open(filename, 'r') as xyz:
    # skip the two header lines at the top of the file
    xyz.readline()
    xyz.readline()

    for line in xyz:
        atom, x, y, z = line.split()
        atoms.append(atom)
        coordinates.append([float(x), float(y), float(z)])

# iterate over rows
'''How can I do the iteration?'''

  • The code shown does not use numpy. Is there some other part that sets up the numpy arrays? Commented Jul 1, 2021 at 23:52
  • Not explicitly. But let's say I work with lists instead. Does that make sense? Commented Jul 1, 2021 at 23:58

2 Answers

A suggestion: load this file into numpy with loadtxt rather than parsing it manually:

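import numpy as np

# skiprows=2 skips the two header lines, as the question's code does;
# drop it if your file starts directly with atom rows
data = np.loadtxt('file.xyz', dtype=str, skiprows=2)
atoms = data[:, 0]
coordinates = data[:, 1:4].astype(float)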

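For your question, if you want "wrap-around" functionality, you can use numpy.roll to create another array shifted by 100 places, then simply subtract the two. Row i of the result is then row i minus row i + 100, and the last 100 rows wrap around to the start of the array: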

coordinates_shifted = np.roll(coordinates, -100, axis=0)
result = coordinates - coordinates_shifted

Here is a very simple example:

import numpy as np

coordinates = [
  [0, 0, 0],
  [1, 1, 1],
  [2, 2, 2],
  [3, 3, 3]
]

# Shift the rows with "wrap-around"
coordinates_shifted = np.roll(coordinates, -2, axis=0)
result = coordinates - coordinates_shifted

print("### ORIGINAL COORDINATES")
print(coordinates)
print("### SHIFTED COORDINATES")
print(coordinates_shifted)
print("### RESULT")
print(result)

The output is

### ORIGINAL COORDINATES
[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]]
### SHIFTED COORDINATES
[[2 2 2]
 [3 3 3]
 [0 0 0]
 [1 1 1]]
### RESULT
[[-2 -2 -2]
 [-2 -2 -2]
 [ 2  2  2]
 [ 2  2  2]]
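
If the goal is literally the pattern from the question (row 100 minus row 1, row 200 minus row 101, and so on), a rolling difference is not quite the same thing. One possible sketch, assuming the rows should be treated as consecutive blocks of 100 and that an incomplete trailing block can be ignored, uses strided slicing. The helper name block_differences and the small demo array are hypothetical, chosen only for illustration:

```python
import numpy as np


def block_differences(coordinates, block=100):
    """Return (last row - first row) for each consecutive block of `block` rows.

    Hypothetical helper: rows left over after the last full block are ignored.
    """
    coords = np.asarray(coordinates, dtype=float)
    n_blocks = len(coords) // block
    stop = n_blocks * block
    firsts = coords[0:stop:block]          # rows 1, 101, 201, ... (1-based)
    lasts = coords[block - 1:stop:block]   # rows 100, 200, 300, ... (1-based)
    return lasts - firsts


# Small stand-in for the real data: 7 rows, block size 3 (one leftover row)
demo = np.arange(21, dtype=float).reshape(7, 3)
diffs = block_differences(demo, block=3)
print(diffs)
```

With the real file, block_differences(coordinates) would use the default block size of 100 and return one difference row per full block.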

If you don't want the "wrap-around" behavior, you can read your data into a numpy array as suggested by @bpgeck:

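import numpy as np

# skiprows=2 skips the two header lines, as the question's code does;
# drop it if your file starts directly with atom rows
data = np.loadtxt('file.xyz', dtype=str, skiprows=2)
atoms = data[:, 0]
coordinates = data[:, 1:4].astype(float)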

Then compute the difference between each row and the row 100 positions before it, like so:

# subtract 100th lag row from current row
delta = coordinates[100:,:] - coordinates[:-100,:]

The delta array will have 100 fewer rows than the coordinates array because of the lag, but you can fill in the first (or last) 100 rows with any value you like using the numpy.pad function.
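
As a rough sketch of that padding step, the snippet below prepends NaN rows so the result lines up with the original array again. The lag of 2 and the small demo array are stand-ins for 100 and the real coordinates, and NaN is only one possible fill value:

```python
import numpy as np

lag = 2  # stand-in for 100, to keep the demo small
coordinates = np.arange(15, dtype=float).reshape(5, 3)

# each row minus the row `lag` positions before it
delta = coordinates[lag:] - coordinates[:-lag]

# prepend `lag` rows of NaN; (0, 0) means no columns are added
padded = np.pad(delta, ((lag, 0), (0, 0)), mode="constant", constant_values=np.nan)

print(padded.shape)
```

To pad at the end instead, the first pair would become (0, lag).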
