2

I'm trying to read a text file the way I usually do with Pandas, but for some reason the whole line is getting read as one column:

import pandas as pd
from StringIO import StringIO

a='''
TRE-G3T- Triumph-        0.000 11/06/2013 313585.10 1765.00000 11/06/2013 313600.10   41 20 54.57907  -70 38 14.25924      -30.400       -1.379   893059.006  2588821.543     2834.294   -19545.615      -45.849        0.985        1.058        3.399        3.694      -15.203        1.099   1.0000 6   6.37  4        0.000 I             -0.084     0.086    -0.059   0.000   0.000   0.000   363026.471  4578737.512      -30.400
TRE-G3T- Triumph-        0.000 11/06/2013 313585.20 1765.00000 11/06/2013 313600.20   41 20 54.61145  -70 38 14.22044      -30.332       -1.311   893061.933  2588824.850     2835.196   -19544.617      -45.779        0.944        1.015        3.313        3.592      -15.135       -3.365   1.4883 6   6.35  4        0.001 I              0.833    -0.485    -1.570   0.000   0.000   0.000   363027.391  4578738.493      -30.332
TRE-G3T- Triumph-        0.000 11/06/2013 313585.30 1765.00000 11/06/2013 313600.30   41 20 54.48685  -70 38 14.10862      -29.190       -0.169   893070.589  2588812.325     2837.797   -19548.465      -44.651        0.950        1.017        3.254        3.539      -13.994       -8.197   1.0000 6   5.70  4        0.001 I             -0.158     0.003     0.061   0.000   0.000   0.000   363029.917  4578734.602      -29.190
'''

df = pd.read_csv(StringIO(a),delimiter='r\s+')

shape(df)

(3,1)

I'm sure it's something simple, but I've been looking at the docs and examples and I can't figure it out!

1
  • grrr.... don't know how long I would have looked at this and not seen that. Commented Nov 29, 2013 at 19:51

1 Answer 1

2

I think your r is in the wrong place: you probably want delimiter=r'\s+'. :^)

(Although I think in this case it would have worked without the r prefix, it's a good habit.)

You should also be able to use delim_whitespace=True:

>>> df = pd.read_csv(StringIO(a.strip()), delimiter=r"\s+", header=None)
>>> df2 = pd.read_csv(StringIO(a.strip()), delim_whitespace=True, header=None)
>>> df2
         0         1   2           3         4     5           6         7   \
0  TRE-G3T-  Triumph-   0  11/06/2013  313585.1  1765  11/06/2013  313600.1   
1  TRE-G3T-  Triumph-   0  11/06/2013  313585.2  1765  11/06/2013  313600.2   
2  TRE-G3T-  Triumph-   0  11/06/2013  313585.3  1765  11/06/2013  313600.3   

   8   9         10  11  12        13      14     15          16           17  \
0  41  20  54.57907 -70  38  14.25924 -30.400 -1.379  893059.006  2588821.543   
1  41  20  54.61145 -70  38  14.22044 -30.332 -1.311  893061.933  2588824.850   
2  41  20  54.48685 -70  38  14.10862 -29.190 -0.169  893070.589  2588812.325   

         18         19      
0  2834.294 -19545.615 ...  
1  2835.196 -19544.617 ...  
2  2837.797 -19548.465 ...  

[3 rows x 42 columns]
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.