I have searched for similarly worded questions but haven't found one that answers my question.

I have 2 dataframes showing the results on exam_1 and exam_2:

import pandas as pd

exam_1 = pd.DataFrame([['Jim', 87], ['Toby', 44], ['Joe', 71], ['Tom', 59]], columns=['Name', 'Score'])

exam_2 = pd.DataFrame([['Tom', 88], ['Joe', 55], ['Jim', 62], ['Toby', 70]], columns=['Name', 'Score'])

I want to iterate through these dataframes such that we:

  • remove subjects with Score less than 60
  • create new dataframes (J and T) based on the first letter of the subject's name:

for df in (exam_1, exam_2):
    # Remove subjects with score less than 60
    df = df[df['Score'] >= 60]

    # Dataframe of names starting with 'J' or 'T'
    J = df[df['Name'].str.startswith('J')]
    T = df[df['Name'].str.startswith('T')]

I want to add a suffix to the names of the dataframes J and T based on the iteration number of the for loop.

For example, exam_1 is the 1st iteration, so the dataframes would be J_1 and T_1.

exam_2 is the 2nd iteration, so the dataframes would be J_2 and T_2.

Is this possible to do?

2 Answers

One option is to combine enumerate with globals():

for idx, df in enumerate((exam_1, exam_2), start=1):
    # Remove subjects with score less than 60
    df = df[df['Score'] >= 60]

    # Dataframe of names starting with 'J' or 'T'
    globals()[f'J_{idx}'] = df[df['Name'].str.startswith('J')]
    globals()[f'T_{idx}'] = df[df['Name'].str.startswith('T')]

NB: This creates the variables J_1, T_1, J_2 and T_2 as pandas DataFrames in the global (module-level) scope.
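Because these names are only built at runtime, reading them back in a loop needs the same globals() lookup. A minimal sketch, assuming the exam_1 and exam_2 dataframes from the question (names_by_exam is a hypothetical name used only for this illustration):

```python
import pandas as pd

exam_1 = pd.DataFrame([['Jim', 87], ['Toby', 44], ['Joe', 71], ['Tom', 59]], columns=['Name', 'Score'])
exam_2 = pd.DataFrame([['Tom', 88], ['Joe', 55], ['Jim', 62], ['Toby', 70]], columns=['Name', 'Score'])

# Create J_1, T_1, J_2 and T_2 as module-level variables
for idx, df in enumerate((exam_1, exam_2), start=1):
    df = df[df['Score'] >= 60]
    globals()[f'J_{idx}'] = df[df['Name'].str.startswith('J')]
    globals()[f'T_{idx}'] = df[df['Name'].str.startswith('T')]

# Read the J_* dataframes back by rebuilding each name as a string
# (names_by_exam is a hypothetical helper dict, not part of the question)
names_by_exam = {}
for idx in (1, 2):
    j_frame = globals()[f'J_{idx}']
    names_by_exam[idx] = j_frame['Name'].tolist()

print(names_by_exam)
```

If the dataframes have to be looped over again later, this string-based lookup is needed every time, which is worth weighing before choosing this approach.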

Use a dict in your case, keyed by the iteration number, instead of creating separately named variables:

J = {}
T = {}

for i, df in enumerate((exam_1, exam_2), 1):
    
    # Remove subjects with score less than 60
    df = df[df['Score'] >= 60]

    # Dataframe of names starting with 'J' or 'T'
    J[i] = df[df['Name'].str.startswith('J')]
    T[i] = df[df['Name'].str.startswith('T')]

Output:

>>> J[1]
  Name  Score
0  Jim     87
2  Joe     71

>>> J[2]
  Name  Score
2  Jim     62

>>> T[1]  # all scores < 60
Empty DataFrame
Columns: [Name, Score]
Index: []

>>> T[2]
   Name  Score
0   Tom     88
3  Toby     70
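If you would rather keep the exact labels from the question (J_1, T_1, ...), a single dict keyed by those strings might also work. This is only a sketch under the same assumptions about exam_1 and exam_2; frames and passed are hypothetical names introduced here:

```python
import pandas as pd

exam_1 = pd.DataFrame([['Jim', 87], ['Toby', 44], ['Joe', 71], ['Tom', 59]], columns=['Name', 'Score'])
exam_2 = pd.DataFrame([['Tom', 88], ['Joe', 55], ['Jim', 62], ['Toby', 70]], columns=['Name', 'Score'])

# One dict holding every result, keyed by labels such as 'J_1' or 'T_2'
frames = {}

for i, df in enumerate((exam_1, exam_2), start=1):
    # Keep only subjects with a score of at least 60
    passed = df[df['Score'] >= 60]

    # Split by first letter; empty results are stored too
    for letter in ('J', 'T'):
        frames[f'{letter}_{i}'] = passed[passed['Name'].str.startswith(letter)]

print(sorted(frames))
print(frames['T_2'])
```

Supporting another first letter then only means extending the ('J', 'T') tuple, and frames.items() iterates over all results without any name lookups.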
