I am filtering a list for records that contain a keyword in one column. The overall list, outputs, is given as:

outputs = 
sent_name   Name    Lat Lng type
    Abbey Road Station, London, UK  Abbey Road, London E15, UK  51.53193    0.00376 [u'transit_station', u'point_of_interest', u'establishment']
    Abbey Wood Station, London, UK  Abbey Wood, London SE2, UK  51.49106    0.12142 [u'transit_station', u'point_of_interest', u'establishment']

I search output[3] for the string 'station' and append the records where it is found to an initially empty list, results:

results = []

for output in outputs:
    if "station" in output[3]:
        results.append(output)

I wish to use Pandas for future analysis but do not know how to recreate a DataFrame after filtering these results.

OD = pd.read_csv('./results.csv', header=0)

where results.csv again contains:

sent_name   Name    Lat Lng type
Abbey Road Station, London, UK  Abbey Road, London E15, UK  51.53193    0.00376 [u'transit_station', u'point_of_interest', u'establishment']
Abbey Wood Station, London, UK  Abbey Wood, London SE2, UK  51.49106    0.12142 [u'transit_station', u'point_of_interest', u'establishment']

Using iterrows, I can iterate over the rows of the pandas DataFrame and pick out those where 'station' appears in the type column.

    for index, row in OD.iterrows():
        if "station" in row['type']:
            pass  # this row matches the filter

However, I have not been able to create a new DataFrame from this. My ultimate aim is to create a new CSV file, containing only the records that feature 'station' in the type column, using the .to_csv method in Pandas.

I tried creating a new DataFrame with the column names passed as its index, then filtering as above and appending the matching rows to it:

OD_filtered = pd.DataFrame(index=['sent_name','Name','Lat', 'Lng', 'type'])

for index, row in OD.iterrows():
    if "station" in row['type']:
        OD_filtered.append([row['sent_name'], row['Name'], row['Lat'], row['Lng'], row['type']])

pprint(OD_filtered)

However, nothing is written to the new DataFrame and it remains empty. print(OD_filtered) gives:

Empty DataFrame
Columns: []
Index: [sent_name, Name, Lat, Lng, type]
  • Your read_csv code shouldn't work as your csv has multiple commas, but aside from that you should be able to do new_df = OD[OD.apply(lambda x: 'station' in x['type'], axis=1)], I think. Commented Sep 3, 2015 at 9:34
  • Very elegant. I missed the OD.apply method. Please put that as an answer and I can mark it correct. Commented Sep 3, 2015 at 9:36

1 Answer

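You can build a boolean mask by calling apply on the 'type' column and use it to index the DataFrame, which gives you the new df. Because read_csv loads each 'type' value as a string, 'station' in x is a substring test here: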

In [37]:
import io
import pandas as pd
t="""sent_name;Name;Lat;Lng;type
Abbey Road Station, London, UK;Abbey Road, London E15, UK;51.53193;0.00376;[u'transit_station', u'point_of_interest', u'establishment']
Abbey Wood Station, London, UK;Abbey Wood, London SE2, UK;51.49106;0.12142;[u'transit_station', u'point_of_interest', u'establishment']"""
df = pd.read_csv(io.StringIO(t), sep=';')
df

Out[37]:
                        sent_name                        Name       Lat  \
0  Abbey Road Station, London, UK  Abbey Road, London E15, UK  51.53193   
1  Abbey Wood Station, London, UK  Abbey Wood, London SE2, UK  51.49106   

       Lng                                               type  
0  0.00376  [u'transit_station', u'point_of_interest', u'e...  
1  0.12142  [u'transit_station', u'point_of_interest', u'e...  

In [39]:    
# filter the df
df[df['type'].apply(lambda x: 'station' in x)]

Out[39]:
                        sent_name                        Name       Lat  \
0  Abbey Road Station, London, UK  Abbey Road, London E15, UK  51.53193   
1  Abbey Wood Station, London, UK  Abbey Wood, London SE2, UK  51.49106   

       Lng                                               type  
0  0.00376  [u'transit_station', u'point_of_interest', u'e...  
1  0.12142  [u'transit_station', u'point_of_interest', u'e...  
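
One caveat, as a hedged aside: the 'type' cells are strings such as "[u'transit_station', ...]", not lists, so the filter above matches any text containing 'station'. If exact tag matching is ever needed, a minimal sketch could parse the strings first. It assumes every cell is a valid Python list literal; the names parsed, has_transit and exact are illustrative only.

```python
import ast
import io

import pandas as pd

t = """sent_name;Name;Lat;Lng;type
Abbey Road Station, London, UK;Abbey Road, London E15, UK;51.53193;0.00376;[u'transit_station', u'point_of_interest', u'establishment']
Abbey Wood Station, London, UK;Abbey Wood, London SE2, UK;51.49106;0.12142;[u'transit_station', u'point_of_interest', u'establishment']"""
df = pd.read_csv(io.StringIO(t), sep=';')

# Turn each string representation into a real list
# (assumes every cell is a valid Python literal)
parsed = df['type'].apply(ast.literal_eval)

# Exact membership test against the parsed lists
has_transit = parsed.apply(lambda tags: 'transit_station' in tags)
exact = df[has_transit]
```

With parsed lists, 'station' alone would no longer match; the full tag 'transit_station' has to be given.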

So in your case the following should work:

new_df = OD[OD['type'].apply(lambda x: 'station' in x)]
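
Since the stated goal is a new CSV, the filtered frame can then be written out with to_csv. The sketch below is one possible way to do it and uses the vectorised Series.str.contains in place of apply. The third sample row is made up to show a record being dropped, and the output location and file name (stations.csv in the temp directory) are assumptions for illustration, not from the question.

```python
import io
import os
import tempfile

import pandas as pd

# Sample data; the last row is a hypothetical non-station record
t = """sent_name;Name;Lat;Lng;type
Abbey Road Station, London, UK;Abbey Road, London E15, UK;51.53193;0.00376;[u'transit_station', u'point_of_interest', u'establishment']
Abbey Wood Station, London, UK;Abbey Wood, London SE2, UK;51.49106;0.12142;[u'transit_station', u'point_of_interest', u'establishment']
Example Park (hypothetical);Example Road;0.0;0.0;[u'park', u'establishment']"""
OD = pd.read_csv(io.StringIO(t), sep=';')

# Plain substring match on the 'type' strings; missing values count as no match
mask = OD['type'].str.contains('station', regex=False, na=False)
new_df = OD[mask]

# Write only the matching rows; index=False leaves out the row labels
out_path = os.path.join(tempfile.gettempdir(), 'stations.csv')
new_df.to_csv(out_path, sep=';', index=False)
```

A semicolon separator is kept here because the fields themselves contain commas.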