1

I have a very simple for loop:

# Keep or Drop Rows from Ad Servers

dataframes = [atlas_df, flashtalking_df, innovid_df, ias_viewability_df, ias_fraud_df]

for df in dataframes:
    df = df[df['Placement Name'].str.contains("»")]

When I run the for loop, though, nothing gets filtered.

However, if I write it out manually as:

ias_fraud_df = ias_fraud_df[ias_fraud_df['Placement Name'].str.contains("»")]

The filter works.

Any ideas on what I am missing?

1
  • I test it by counting the rows of the df. For example, when I apply the for loop to ias_fraud_df and then call ias_fraud_df.count(), the number of rows is unchanged. If I apply the filter manually, the number of rows changes to the correct value. Commented Mar 31, 2017 at 14:36

2 Answers

3

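Inside the loop, df is only a local name that is bound to each list element in turn. Assigning to it rebinds that name and leaves the list untouched, so you need to write the result back into the list by index: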

for i in range(len(dataframes)):
    df = dataframes[i]    
    dataframes[i] = df[df['Placement Name'].str.contains("»")]

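This way each entry in the list is replaced with its filtered version. Note that names such as ias_fraud_df still refer to the unfiltered frames; the filtered ones are in dataframes[i].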

Example:

In [108]:
l = list('abcd')
for i in range(len(l)):
    l[i] = 'new_' + l[i]
l

Out[108]:
['new_a', 'new_b', 'new_c', 'new_d']

Versus:

In [110]:
l = list('abcd')
for x in l:
    x = 'new_' + x
l

Out[110]:
['a', 'b', 'c', 'd']

So the latter, which is semantically the same as your code, never modifies the elements of the list, whereas the index-based version does.
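The same effect can be checked directly with pandas. The sketch below uses made-up data (the sample values and the name `frames` are illustrative assumptions, not the asker's data). Boolean indexing returns a new DataFrame, so assigning it to the loop variable only rebinds that local name:

```python
import pandas as pd

# Hypothetical sample data, not the asker's real frames.
atlas_df = pd.DataFrame({'Placement Name': ['a»', 'b', 'c']})
frames = [atlas_df]

for df in frames:
    # Rebinds the local name `df` to a new, filtered DataFrame;
    # the list and `atlas_df` still refer to the unfiltered object.
    df = df[df['Placement Name'].str.contains('»')]

rebound_len = len(df)                 # rows in the frame the loop variable ended on
list_len = len(frames[0])             # rows in the list entry, which was never replaced
same_object = frames[0] is atlas_df   # the list still holds the original object
```

Here `rebound_len` is 1 while `list_len` is still 3, which matches the unchanged row count reported in the question.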


4 Comments

I just tested the example you provided and it works perfectly. However, when I apply the code to my actual data, it still doesn't filter.
Then you may have some data integrity issues that we can't reproduce unless you post raw data and code. I'd test in your loop that your filtering actually produces a df
I double-checked, and it works perfectly, but the new dataframe can no longer be accessed as, e.g., ias_viewability_df; it has to be accessed as dataframes[3]. Is there any way to rename dataframes[3] to ias_viewability_df automatically, and likewise match all the remaining dfs to their entries in the dataframes list?
No. For that kind of behaviour you'd need a dict where each key is the string name of a df and each value is the df itself.
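The dict approach mentioned in the last comment might look like the sketch below. The sample frames and the name `frames_by_name` are assumptions for illustration; the keys are simply the original variable names as strings.

```python
import pandas as pd

# Hypothetical sample data standing in for the real ad-server frames.
frames_by_name = {
    'atlas_df': pd.DataFrame({'Placement Name': ['deu_a»', 'deu_b']}),
    'ias_fraud_df': pd.DataFrame({'Placement Name': ['fra_c', 'fra_d»', 'fra_e»']}),
}

# Build a new dict with the same keys and filtered values.
frames_by_name = {
    name: df[df['Placement Name'].str.contains('»')]
    for name, df in frames_by_name.items()
}

# Each frame is now looked up by name instead of by list position.
fraud_rows = len(frames_by_name['ias_fraud_df'])
```

The trade-off is that every later use has to go through `frames_by_name['...']` rather than a bare variable name.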
1

You can use a list comprehension; the output is a list of filtered DataFrames:

dataframes = [df[df['Placement Name'].str.contains(u"»")] for df in dataframes]

Sample:

import pandas as pd

atlas_df = pd.DataFrame({'Placement Name':['deu_gathf»', 'deu_gahf', 'fra_gagg'],
                         'another_col':[1,2,3]})
flashtalking_df = pd.DataFrame({'Placement Name':['deu_gahf»','fra_ga', 'deu_gatt'],
                         'another_col':[4,5,6]})

dataframes = [atlas_df, flashtalking_df]
print (dataframes)
[  Placement Name  another_col
0     deu_gathf»            1
1       deu_gahf            2
2       fra_gagg            3,   Placement Name  another_col
0      deu_gahf»            4
1         fra_ga            5
2       deu_gatt            6]

dataframes = [df[df['Placement Name'].str.contains(u"»")] for df in dataframes]
print (dataframes)
[  Placement Name  another_col
0     deu_gathf»            1,   Placement Name  another_col
0      deu_gahf»            4]
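If the original variable names should end up holding the filtered frames, one option is to unpack the comprehension's result back into them. This is a sketch on the same sample data; the helper name `keep_marked` is hypothetical.

```python
import pandas as pd

atlas_df = pd.DataFrame({'Placement Name': ['deu_gathf»', 'deu_gahf', 'fra_gagg'],
                         'another_col': [1, 2, 3]})
flashtalking_df = pd.DataFrame({'Placement Name': ['deu_gahf»', 'fra_ga', 'deu_gatt'],
                                'another_col': [4, 5, 6]})

def keep_marked(df):
    """Return only the rows whose 'Placement Name' contains '»'."""
    return df[df['Placement Name'].str.contains('»')]

# Unpacking rebinds each name to its filtered frame; the order on the left
# must match the order of the sequence on the right.
atlas_df, flashtalking_df = [keep_marked(df) for df in (atlas_df, flashtalking_df)]
```

This stays readable for a handful of frames; with many frames, keeping them in a list or dict is likely easier to maintain.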

1 Comment

Who was the guy looking for canonical answers?
