
I now know how to check a dataframe for specific values across multiple columns. However, I can't work out how to write an if statement based on the boolean result.

For example:

Walk directories using os.walk and read in a specific file into a dataframe.

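import fnmatch
import os
import pandas as pd
for root, dirs, files in os.walk(main):
    filters = '*specificfile.csv'
    for filename in fnmatch.filter(files, filters):
        # on_bad_lines='skip' (pandas >= 1.3) replaces the removed error_bad_lines=False
        df = pd.read_csv(os.path.join(root, filename), on_bad_lines='skip')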

Next I check that dataframe across multiple columns. I check one column (column1) for the specific value I am looking for (banana), and then another column (colour) for a second value (green). If both are true I want to carry out a specific task. If not, I want to do something else.

So something like:

if (df['column1']=='banana') & (df['colour']=='green'):
    do something
else:
    do something else
  • What is the specific task? Is it to update something in the same row? Commented Sep 22, 2015 at 9:27
  • Do you want to check whether any row in the df satisfies your condition, or check each row individually? Commented Sep 22, 2015 at 9:27
  • I do not see any question in your question. Commented Sep 22, 2015 at 9:32
  • No, I don't want to modify the data. There are plenty of comprehension-style statements on Google for modifying the data, but I can't seem to find any with just a normal if. We can assume the task will be a system command action, os.makedirs for example. Commented Sep 22, 2015 at 9:33
  • Apologies Alex.S, I will modify it to try to make myself clearer. I've just reread it a few times and I can see the question in two forms. The first is in words: 'I cant seem to work out how to carry out an if statement based on a boolean response.' The second is in code form (the snippet at the bottom of the post). Commented Sep 22, 2015 at 9:35

2 Answers


If you want to check whether any row of the DataFrame meets your conditions, you can call .any() on the combined condition. For example:

if ((df['column1']=='banana') & (df['colour']=='green')).any():

A demo on a small DataFrame:

In [16]: df
Out[16]:
   A  B
0  1  2
1  3  4
2  5  6

In [17]: ((df['A']==1) & (df['B'] == 2)).any()
Out[17]: True

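The .any() call is needed because your condition, ((df['column1']=='banana') & (df['colour']=='green')), returns a Series of True/False values rather than a single boolean. Using a Series directly in an if statement raises a ValueError, because its truth value is ambiguous.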
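As a minimal sketch of that failure, assuming a small hypothetical DataFrame that reuses the column names from the question, the bare condition raises an error in an if statement, while the reduced form gives a single boolean:

```python
import pandas as pd

# Hypothetical two-row DataFrame; only the column names come from the question.
df = pd.DataFrame({"column1": ["banana", "apple"], "colour": ["green", "red"]})
condition = (df["column1"] == "banana") & (df["colour"] == "green")

try:
    if condition:  # ambiguous: the Series holds one boolean per row
        pass
    raised = False
except ValueError as exc:
    raised = True
    print(exc)  # message begins "The truth value of a Series is ambiguous"

# Reducing the Series to one boolean makes the test well defined.
any_match = bool(condition.any())
print(any_match)
```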

When you compare a Series against a scalar value in pandas, each element of the Series is compared against the scalar, and the result is a Series of True/False values, one per row. For example:

In [19]: (df['A']==1)
Out[19]:
0     True
1    False
2    False
Name: A, dtype: bool

In [20]: (df['B'] == 2)
Out[20]:
0     True
1    False
2    False
Name: B, dtype: bool

The & operator then performs a row-wise logical AND of the two Series. For example:

In [18]: ((df['A']==1) & (df['B'] == 2))
Out[18]:
0     True
1    False
2    False
dtype: bool

To check whether any value in this Series is True, use .any(). To check whether all of the values are True, use .all().
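Putting the pieces together, here is a rough sketch of the if/else from the question. The data is made up for illustration, and the two branches just set a string where the real task (for example an os.makedirs call) would go:

```python
import pandas as pd

# Hypothetical data using the column names from the question.
df = pd.DataFrame({
    "column1": ["banana", "apple", "banana"],
    "colour": ["green", "green", "yellow"],
})

# Row-wise AND of two boolean Series: one True/False per row.
mask = (df["column1"] == "banana") & (df["colour"] == "green")

if mask.any():
    # At least one row is a green banana.
    result = "found"
else:
    result = "not found"

# .all() is stricter: it is True only if every row matches.
every_row_matches = bool(mask.all())

print(result)
print(every_row_matches)
```

Here only the first row matches, so the .any() branch runs while .all() would be False.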


2 Comments

Great, thanks Anand S Kumar. Can I ask how you worked that out? I didn't see anything in the docs that explained that.
Updated the answer with a small explanation.

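You can also use a lambda function with apply to evaluate a quick expression for each row. The result is still a Series of booleans, so reduce it with .any() or .all() before using it in an if statement.

Example: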

total_votes_df.apply(lambda x: True if x['VOTES'] < 0.16666666666666668 * x['TOTAL_VOTES'] else False, axis=1)
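For comparison, the same test can usually be written without apply by comparing whole columns. The sketch below assumes a small made-up total_votes_df (only the column names and the constant come from the answer) and shows that both forms give the same boolean Series:

```python
import pandas as pd

# Hypothetical data; only the column names come from the answer above.
total_votes_df = pd.DataFrame({
    "VOTES": [10, 50, 5],
    "TOTAL_VOTES": [100, 100, 20],
})

threshold = 0.16666666666666668  # constant taken from the answer (about 1/6)

# Row-by-row version, equivalent to the lambda in the answer.
by_apply = total_votes_df.apply(
    lambda x: x["VOTES"] < threshold * x["TOTAL_VOTES"], axis=1
)

# Vectorized version: compares the two columns element by element.
vectorized = total_votes_df["VOTES"] < threshold * total_votes_df["TOTAL_VOTES"]

if vectorized.any():
    print("at least one row is below the threshold")
```

The vectorized form avoids calling a Python function once per row, which tends to matter on larger DataFrames.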
