I have a file like this:

system
1000
    1VEA      C    1   9.294  11.244  11.083
    1VEA     C1    2   9.324  11.375  11.161
    1VEA      H    3   9.243  11.396  11.232
...
 1203VEA    H2092601  20.738  16.293   7.837
 1203VEA    H2192602  20.900  16.225   7.869
 1203VEA    H2292603  20.822  16.330   7.989

I want to build a DataFrame with 6 columns. I tried the following command:

    df = pd.read_csv('system.gro', skiprows=[0,1], delim_whitespace=True, header=None)

However, in the rows starting with 1203 there is no whitespace between H20 and 92601, so splitting on whitespace merges those two columns. I used to split each line at fixed character positions, like this:

    f1 = open(fileName, 'r')
    for line in f1.readlines():
         atomName = line[8:15].strip(' ')
         globalIdx = int(line[15:20].strip(' '))

But this takes a really long time on the whole file. Does anyone have an idea how to handle this with a DataFrame?

  • This looks more like a data quality issue, or a problem with the settings used when exporting the file. Can't you ask for a file with an actual delimiter, for example | ? Commented May 8, 2019 at 0:07
  • Instead of pd.read_csv, use pd.read_fwf. I am not sure how the .strip() would work, though. Commented May 8, 2019 at 0:47

1 Answer

As suggested by SRT HellKitty in the comments, use pd.read_fwf (see docs) like this:

from io import StringIO

import pandas as pd

data = """
   1VEA      C    1   9.294  11.244  11.083
   1VEA     C1    2   9.324  11.375  11.161
   1VEA      H    3   9.243  11.396  11.232
1203VEA    H2092601  20.738  16.293   7.837
1203VEA    H2192602  20.900  16.225   7.869
1203VEA    H2292603  20.822  16.330   7.989
"""

# Make sure the column boundaries match your file!
# header=None keeps the first data row from being used as column names.
df = pd.read_fwf(
    StringIO(data),
    colspecs=[(0, 8), (8, 14), (14, 20), (20, 28), (28, 36), (36, 44)],
    header=None,
)
print(df)
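
The rows in the question carry one more leading character than the sample above, so the boundaries shift when reading the real file. Here is a minimal sketch for that case, assuming the file follows the usual GRO fixed-column layout (residue number 5, residue name 5, atom name 5, atom index 5, then three 8-character coordinates). The helper name `read_gro_atoms`, the constant `GRO_WIDTHS`, and the column names are hypothetical, chosen only for illustration; check the widths against your own file.

```python
from io import StringIO

import pandas as pd

# Rows copied from the question, keeping the 5-character residue-number field.
sample = (
    "    1VEA      C    1   9.294  11.244  11.083\n"
    "    1VEA     C1    2   9.324  11.375  11.161\n"
    " 1203VEA    H2092601  20.738  16.293   7.837\n"
)

# Assumed widths: residue number + name (5 + 5), atom name (5),
# atom index (5), then x, y, z (8 characters each).
GRO_WIDTHS = [10, 5, 5, 8, 8, 8]
# Hypothetical column names, not part of the file format.
GRO_NAMES = ["residue", "atom_name", "atom_index", "x", "y", "z"]


def read_gro_atoms(source, skiprows=0):
    """Read fixed-width atom rows into a 6-column DataFrame."""
    return pd.read_fwf(
        source,
        widths=GRO_WIDTHS,
        names=GRO_NAMES,
        header=None,
        skiprows=skiprows,
    )


df = read_gro_atoms(StringIO(sample))
print(df)
```

For the file in the question, the call would be `read_gro_atoms('system.gro', skiprows=2)` to skip the title and atom-count lines. If the file ends with a line that is not an atom row, that line would need to be excluded as well, for example by also passing `nrows` to `pd.read_fwf`.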

2 Comments

@jezrael Thanks, I wasn't aware of that, as I seldom read from a string :) I've updated my answer accordingly.
This is exactly what I wanted! I really appreciate the answers!