I have multiple dataframes, and I want to filter each one so that it keeps only the columns whose names contain the word "Overall". I have the following for-loop, but it doesn't have the same effect as doing it manually, i.e. y15 = y15.filter(like='Overall').

pit_dfs = [y15,y16,y17]

for i in pit_dfs:
    i = i.filter(like='Overall')

Replicable example:

import pandas as pd

y15 = pd.DataFrame({'Col1-Overall': ['a','b','c','d'],
              'Col2': ['a','b','c','d'],
              'Col3': ['a','b','c','d'],
              'Col4': ['a','b','c','d']})

y16 = pd.DataFrame({'Col1-Overall': ['a','b','c','d'],
              'Col2': ['a','b','c','d'],
              'Col3': ['a','b','c','d'],
              'Col4': ['a','b','c','d']})

y17 = pd.DataFrame({'Col1-Overall': ['a','b','c','d'],
              'Col2': ['a','b','c','d'],
              'Col3': ['a','b','c','d'],
              'Col4': ['a','b','c','d']})

Expected output:

y15
+--------------+
| Col1-Overall |
+--------------+
| a            |
+--------------+
| b            |
+--------------+
| c            |
+--------------+
| d            |
+--------------+

y16
+--------------+
| Col1-Overall |
+--------------+
| a            |
+--------------+
| b            |
+--------------+
| c            |
+--------------+
| d            |
+--------------+

y17
+--------------+
| Col1-Overall |
+--------------+
| a            |
+--------------+
| b            |
+--------------+
| c            |
+--------------+
| d            |
+--------------+

I know this is a simple one, but I have been looking through Stack Overflow for the past hour and can't find a similar example. What am I missing? Thanks!

2 Answers

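See this answer and this example about Python for loops. The loop variable is not a pointer into the list: assigning to i inside the loop only rebinds the name i to a new, filtered dataframe, so neither the list nor the actual dataframes change.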
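
To make the point above concrete, here is a minimal sketch. The names df_a and dfs are hypothetical stand-ins for the question's dataframes and list; the idea is only to show that the assignment inside the loop leaves everything outside it untouched.

```python
import pandas as pd

# Hypothetical stand-in for y15, y16, ...
df_a = pd.DataFrame({'Col1-Overall': ['a', 'b'], 'Col2': ['a', 'b']})
dfs = [df_a]

for df in dfs:
    # filter() returns a NEW dataframe; this line only rebinds the local name `df`
    df = df.filter(like='Overall')

# The list element is still the original, unfiltered object
unchanged_cols = list(dfs[0].columns)
print(unchanged_cols)
print(dfs[0] is df_a)
```

After the loop, only the name df refers to the filtered result; the list still holds the original object with all of its columns.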

You can do (I haven't tested this):

pit_dfs = [y15,y16,y17]

for idx in range(len(pit_dfs)):
    pit_dfs[idx] = pit_dfs[idx].filter(like='Overall')

1 Comment

Thanks so much for this! While this code does not change the original dataframes, it does change the dataframes within the list 'pit_dfs', which can then be accessed using pit_dfs[0], pit_dfs[1], etc. I appreciate it!!
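
If the original names should reflect the filtered result as well, one possible approach is to unpack the filtered dataframes back into those names. Another is to keep the dataframes in a dict instead of separate variables. The sketch below is not from the answer above; make_df and frames are hypothetical names used only for illustration.

```python
import pandas as pd

def make_df():
    # Hypothetical helper: same shape as the question's example dataframes
    return pd.DataFrame({'Col1-Overall': ['a', 'b', 'c', 'd'],
                         'Col2': ['a', 'b', 'c', 'd']})

y15, y16, y17 = make_df(), make_df(), make_df()

# Option 1: rebind the original names by unpacking the filtered results
y15, y16, y17 = [df.filter(like='Overall') for df in (y15, y16, y17)]

# Option 2 (hypothetical name `frames`): keep the dataframes in a dict keyed by year
frames = {'y15': make_df(), 'y16': make_df(), 'y17': make_df()}
frames = {name: df.filter(like='Overall') for name, df in frames.items()}
```

The dict version tends to scale better when more years are added, since nothing has to be unpacked by hand.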

Here's an alternative that modifies each dataframe in place, so the original variables (y15, etc.) change as well:

pit_dfs = [y15,y16,y17]

def filter_cols_like(df, like):
    cols_not_like = [col for col in df.columns if like not in col]
    df.drop(columns=cols_not_like,inplace=True)

for i in pit_dfs:
    filter_cols_like(i,like='Overall')
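
A quick way to check the in-place behaviour is sketched below, assuming every column label is a string (the like not in col test would raise a TypeError on integer labels). The dataframe here is a shortened, hypothetical version of the question's y15.

```python
import pandas as pd

def filter_cols_like(df, like):
    # Drop, in place, every column whose label does not contain `like`
    cols_not_like = [col for col in df.columns if like not in col]
    df.drop(columns=cols_not_like, inplace=True)

# Shortened, hypothetical version of the question's y15
y15 = pd.DataFrame({'Col1-Overall': ['a', 'b'],
                    'Col2': ['a', 'b'],
                    'Col3': ['a', 'b']})
pit_dfs = [y15]

for df in pit_dfs:
    filter_cols_like(df, like='Overall')

# y15 itself was mutated: the list element and the name refer to the same object
remaining = list(y15.columns)
print(remaining)
```

This works where the plain loop did not because drop(..., inplace=True) mutates the existing object instead of binding a name to a new one.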
