
I am trying to clean a list of URLs that have garbage characters at the end, as shown:

  1. /gradoffice/index.aspx(
  2. /gradoffice/index.aspx-
  3. /gradoffice/index.aspxjavascript$
  4. /gradoffice/index.aspx~

I have a CSV file with over 190k records of different URLs. I loaded the CSV into a pandas DataFrame and took the entire column of URLs with the statement

str = df['csuristem']

It clearly gave me all the values in the column. However, when I use the following code, it only prints about 40k records, and the output starts somewhere in the middle. I don't know where I am going wrong. The program runs without errors but shows only a partial set of results. Any help would be much appreciated.

import pandas
table = pandas.read_csv("SS3.csv", dtype=object)
df = pandas.DataFrame(table)
str = df['csuristem']
for s in str:
    s = s.split(".")[0]
    print s

I am looking to get output like this:

  1. /gradoffice/index.
  2. /gradoffice/index.
  3. /gradoffice/index.
  4. /gradoffice/index.

Thank you, Santhosh.

1 Answer


You need to call .str.split on the column and then use .str[0] to access the first portion of each split string:

In [6]:

df['csuristem'].str.split('.').str[0]
Out[6]:
0    /gradoffice/index
1    /gradoffice/index
2    /gradoffice/index
3    /gradoffice/index
Name: csuristem, dtype: object
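
Applied to your file, a minimal sketch of the whole fix might look like the following (it assumes the column is named csuristem as in your code; the output filename is just an example):

import pandas as pd

# read_csv already returns a DataFrame, so no separate
# pd.DataFrame(...) call is needed
df = pd.read_csv("SS3.csv", dtype=object)

# vectorised split on '.' and take the first piece for every row,
# instead of looping over the column in Python
df['csuristem'] = df['csuristem'].str.split('.').str[0]

# optionally write the cleaned data back out (hypothetical filename)
df.to_csv("SS3_clean.csv", index=False)

The vectorised string methods operate on the whole column at once, so there is no need for the explicit for loop, and rebinding the name str is avoided (in your snippet it shadows the built-in str type).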