Parsing text file in pandas

Question

I'm trying to read a text file the way I usually do with Pandas, but for some reason the whole line is getting read as one column:

import pandas as pd
from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO

a='''
TRE-G3T- Triumph-        0.000 11/06/2013 313585.10 1765.00000 11/06/2013 313600.10   41 20 54.57907  -70 38 14.25924      -30.400       -1.379   893059.006  2588821.543     2834.294   -19545.615      -45.849        0.985        1.058        3.399        3.694      -15.203        1.099   1.0000 6   6.37  4        0.000 I             -0.084     0.086    -0.059   0.000   0.000   0.000   363026.471  4578737.512      -30.400
TRE-G3T- Triumph-        0.000 11/06/2013 313585.20 1765.00000 11/06/2013 313600.20   41 20 54.61145  -70 38 14.22044      -30.332       -1.311   893061.933  2588824.850     2835.196   -19544.617      -45.779        0.944        1.015        3.313        3.592      -15.135       -3.365   1.4883 6   6.35  4        0.001 I              0.833    -0.485    -1.570   0.000   0.000   0.000   363027.391  4578738.493      -30.332
TRE-G3T- Triumph-        0.000 11/06/2013 313585.30 1765.00000 11/06/2013 313600.30   41 20 54.48685  -70 38 14.10862      -29.190       -0.169   893070.589  2588812.325     2837.797   -19548.465      -44.651        0.950        1.017        3.254        3.539      -13.994       -8.197   1.0000 6   5.70  4        0.001 I             -0.158     0.003     0.061   0.000   0.000   0.000   363029.917  4578734.602      -29.190
'''

df = pd.read_csv(StringIO(a),delimiter='r\s+')

shape(df)

(3,1)

I'm sure it's something simple, but I've been looking at the docs and examples and I can't figure it out!

grrr.... don't know how long I would have looked at this and not seen that.
– Rich Signell, Commented Nov 29, 2013 at 19:51

DSM · Accepted Answer · 2013-11-29 18:59:38Z

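I think your r is in the wrong place: you probably want delimiter=r'\s+'. :^) As written, 'r\s+' is a pattern that matches a literal letter r followed by whitespace, which never occurs in your data, so each line comes back as a single column.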

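(Although I think in this case it would have worked without the r prefix, because '\s' is not an escape sequence Python recognizes and the backslash is kept, it's a good habit. Recent Python versions also warn about unrecognized escapes like this in non-raw strings.)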
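To see the difference between the two patterns without pandas, here is a minimal sketch using the standard re module on the first few fields of one row from the question. The variable names are only for illustration.

```python
import re

# First few fields of one row from the question.
line = "TRE-G3T- Triumph-        0.000 11/06/2013 313585.10"

misplaced = 'r\\s+'  # the same characters as 'r\s+': the letter r, then \s+
intended = r'\s+'    # raw string: just \s+

# The misplaced pattern needs a literal "r" followed by whitespace.
# The only lowercase r here is inside "Triumph-", so nothing is split.
fields_misplaced = re.split(misplaced, line)

# The intended pattern splits on every run of whitespace.
fields_intended = re.split(intended, line)

print(len(fields_misplaced))  # 1
print(fields_intended)
# ['TRE-G3T-', 'Triumph-', '0.000', '11/06/2013', '313585.10']
```

The unsplit result is the regex-level version of the (3,1) shape in the question: with no match for the delimiter, each whole line stays in one piece.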

You should also be able to use delim_whitespace=True (note that pandas 2.2 and later deprecate it in favor of sep=r"\s+"):

>>> df = pd.read_csv(StringIO(a.strip()), delimiter=r"\s+", header=None)
>>> df2 = pd.read_csv(StringIO(a.strip()), delim_whitespace=True, header=None)
>>> df2
         0         1   2           3         4     5           6         7   \
0  TRE-G3T-  Triumph-   0  11/06/2013  313585.1  1765  11/06/2013  313600.1   
1  TRE-G3T-  Triumph-   0  11/06/2013  313585.2  1765  11/06/2013  313600.2   
2  TRE-G3T-  Triumph-   0  11/06/2013  313585.3  1765  11/06/2013  313600.3   

   8   9         10  11  12        13      14     15          16           17  \
0  41  20  54.57907 -70  38  14.25924 -30.400 -1.379  893059.006  2588821.543   
1  41  20  54.61145 -70  38  14.22044 -30.332 -1.311  893061.933  2588824.850   
2  41  20  54.48685 -70  38  14.10862 -29.190 -0.169  893070.589  2588812.325   

         18         19      
0  2834.294 -19545.615 ...  
1  2835.196 -19544.617 ...  
2  2837.797 -19548.465 ...  

[3 rows x 42 columns]
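For anyone on Python 3, here is a self-contained sketch of the same fix. It assumes a reasonably recent pandas, imports StringIO from io, and uses sep, which delimiter is an alias for. The sample keeps only the first eight fields of each row from the question so it stays readable.

```python
from io import StringIO  # Python 3 location of StringIO

import pandas as pd

# First eight fields of the three rows from the question (truncated for brevity).
sample = """\
TRE-G3T- Triumph-        0.000 11/06/2013 313585.10 1765.00000 11/06/2013 313600.10
TRE-G3T- Triumph-        0.000 11/06/2013 313585.20 1765.00000 11/06/2013 313600.20
TRE-G3T- Triumph-        0.000 11/06/2013 313585.30 1765.00000 11/06/2013 313600.30
"""

# header=None keeps the first row as data instead of using it for column names.
df = pd.read_csv(StringIO(sample), sep=r"\s+", header=None)

print(df.shape)  # (3, 8)
```

Without header=None the first row would be consumed as column names, leaving two data rows instead of three.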
