Python pandas give comma separated values new column

Question

Hello, I have a CSV file that currently has 2 columns and more than 1000 rows. One column holds comma-separated values, and I want each of those values to become its own new column.

Here is an example of my csv:

print df4

 keys                                                env
0         FIT-2990                                          3000.0010
1         FIT-2918                                          3000.0004
2         FIT-2854                               2110.0070, 2110.0071
3    UXSCIENCE-640                                          1808.0001
4         FIT-2814                    1135.0017, 1135.0018, 1135.0019
5         FIT-2766                               1908.0043, 1908.0044
6         FIT-2760  1901.0012, 1903.0045, 1906.0020, 1922.0032, 19...
7         FIT-2725                                          0147.0001
8         FIT-2706                               1903.0045, 1922.0032
9         FIT-2554                               1802.0024, 1805.0028
10        FIT-2383                                             , 1910
11        FIT-2339                                          2113.0021
12   UXSCIENCE-438                    4000.0237, 4000.0238, 4000.0339
13        FIT-2201                    2023.0013, 2016.0013, 2019.0013

For example, I want to split 2110.0070, 2110.0071 into separate columns (2110.0070 | 2110.0071), and do this for the entire CSV.

What I have so far:

df5 = df4.join(df4.apply(lambda x: Series(x.split(', '))))
print df5

jezrael · Accepted Answer · 2016-02-04 19:16:50Z

You can try str.split and concat:

import pandas as pd
import numpy as np
import io

temp1=u"""keys;env
FIT-2990;3000.0010
FIT-2918;3000.0004
FIT-2854;2110.0070, 2110.0071
UXSCIENCE-640;1808.0001
FIT-2814;1135.0017, 1135.0018, 1135.0019
FIT-2766;1908.0043, 1908.0044
FIT-2760;1901.0012, 1903.0045, 1906.0020, 1922.0032, 19...
FIT-2725;0147.0001
FIT-2706;1903.0045, 1922.0032
FIT-2554;1802.0024, 1805.0028
FIT-2383;, 1910
FIT-2339;2113.0021
UXSCIENCE-438;4000.0237, 4000.0238, 4000.0339
FIT-2201;2023.0013, 2016.0013, 2019.0013"""

# after testing, replace io.StringIO(temp1) with the filename
df = pd.read_csv(io.StringIO(temp1), sep=";", index_col=None)
print(df)

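# faster: split each string in a plain Python list comprehension
df1 = pd.DataFrame([x.split(',') for x in df['env'].tolist()])
# slower alternative: pandas str.split with expand=True
df1 = df['env'].str.split(',', expand=True)

# put the keys column next to the split columns
print(pd.concat([df['keys'], df1], axis=1))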
             keys          0           1           2           3       4
0        FIT-2990  3000.0010        None        None        None    None
1        FIT-2918  3000.0004        None        None        None    None
2        FIT-2854  2110.0070   2110.0071        None        None    None
3   UXSCIENCE-640  1808.0001        None        None        None    None
4        FIT-2814  1135.0017   1135.0018   1135.0019        None    None
5        FIT-2766  1908.0043   1908.0044        None        None    None
6        FIT-2760  1901.0012   1903.0045   1906.0020   1922.0032   19...
7        FIT-2725  0147.0001        None        None        None    None
8        FIT-2706  1903.0045   1922.0032        None        None    None
9        FIT-2554  1802.0024   1805.0028        None        None    None
10       FIT-2383                   1910        None        None    None
11       FIT-2339  2113.0021        None        None        None    None
12  UXSCIENCE-438  4000.0237   4000.0238   4000.0339        None    None
13       FIT-2201  2023.0013   2016.0013   2019.0013        None    None
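
Note that splitting on ',' alone leaves the space after each comma in the pieces, and the new columns are just numbered 0, 1, 2, and so on. A minimal sketch of one way to tidy that up, assuming a shortened copy of the sample data and hypothetical column names env_0, env_1, ... (neither is from the original post):

```python
import io
import pandas as pd

# Shortened sample in the same shape as the data above (semicolon-delimited).
sample = u"""keys;env
FIT-2990;3000.0010
FIT-2854;2110.0070, 2110.0071
FIT-2814;1135.0017, 1135.0018, 1135.0019
FIT-2383;, 1910"""

# dtype=str keeps values such as 0147.0001 as text instead of parsing them as floats.
df = pd.read_csv(io.StringIO(sample), sep=";", dtype=str)

# Split on the comma, then strip the whitespace left around each piece.
parts = df["env"].str.split(",", expand=True)
parts = parts.apply(lambda col: col.str.strip())

# Hypothetical column names chosen for illustration: env_0, env_1, ...
parts.columns = ["env_{}".format(i) for i in range(parts.shape[1])]

result = pd.concat([df["keys"], parts], axis=1)
print(result)
```

Rows with fewer values than the widest row are padded with missing values, as in the output above, so result.fillna('') can be applied afterwards if empty strings are preferred.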
