2

The question of selecting pandas dataframe rows on the basis of column values was addressed in:

Select rows from a DataFrame based on values in a column in pandas

without regard to ordering of the rows based on the order of the column values.

As an example, consider:

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
               'B': 'one one two three two two one three'.split(),
               'C': np.arange(8), 'D': np.arange(8) * 2})
print(df.loc[df['D'].isin([0,2])])

yields:

     A    B  C  D
0  foo  one  0  0
1  bar  one  1  2

as does:

print(df.loc[df['D'].isin([2,0])])

where the order of the column values has been reversed.

I'd like to know how I can modify this expression to respect the order of the desired column values such that the output would be:

     A    B  C  D
1  bar  one  1  2
0  foo  one  0  0

2 Answers 2

1

You can put the values in a data frame, then do an inner join (merge by default) which should preserve the order on the left data frame:

D = pd.DataFrame({"D": [2, 0]})
D.merge(df)

#   D     A   B C
#0  2   bar one 1
#1  0   foo one 0

Or a more sure way to do this:

D.reset_index().merge(df).sort_values("index").drop("index", 1)
#   D     A   B C
#0  2   bar one 1
#1  0   foo one 0
Sign up to request clarification or add additional context in comments.

Comments

0
df.loc[df['D'].isin([2,0])].sort_values(by='A')

Where 'A' can be any column the and ascending=False of you want reverse order.

3 Comments

How would this change if I wanted to specify an arbitrary order on the column values (not necessarily monotonically increasing or decreasing)? Also as a variant, is there a way to modify the isin (call it requiredisin) to require each element in the list of columns? In other words, the code should barf if I write df.loc[df['D'].requiredisin([2,0, 2342])].sort_values(by='A') since 2342 is not one of the values in the D column?
I think you want data type category. isin returns boolean for each element, no requirement flag.
df['A'] = df['A'].astype('category') df.A.cat.set_categories(['bar','foo'], ordered=True)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.