0

From the file output_hc_v1.csv with headers :emplid pay_status count location deptid grade_desc date version

I would like via python save a new csv filtered on "pay_status=Active". with this code:

a1=a.query ('"pay_status" == ["Active"]')

=========================================

but this part is not working.

My full code:
import csv 
import pandas as pd
import numpy as np

df= pd.read_csv ("C:/Users/e1087353/Documents/CAPCO/HC/python_data/output_hc_v1.csv")
df.head()
a= pd.pivot_table(df, index =["location","deptid","grade_desc","pay_status"] ,values=["Count"],aggfunc=np.sum)
a1=a.query ('"pay_status" == ["Active"]')
a1.to_csv ("C:/Users/e1087353/Documents/CAPCO/HC/python_data/Output_final2.csv")

2 Answers 2

1

You do not need to quote column names in the query string. Therefore you could use,

a1 = a.query('pay_status == "Active"')
Sign up to request clarification or add additional context in comments.

Comments

0

Don't use query. To filter simply do:

a1=a[a['pay_status']=='Active']

If you create a pivot table you may want to reset the index before you filter: a=a.reset_index()

Alternatively, for more complicated queries, if you want to filter for items in a list use isin:

filter_list=['Active','Inactive','Something']
a1=a[a['pay_status'].isin(filter_list)]

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.