
I have a DataFrame in which I'm attempting to reverse the row order if a condition is met.

The DataFrame (df) is:

    name    id  description
0   a   String 1    lion
1   b   String 1    snake
2   c   String 1    bear
3   d   String 1    tiger
4   e   String 1    dog
5   f   String 2    cat
6   g   String 2    bird
7   h   String 2    whale
8   i   String 2    eagle
9   j   String 2    rhino
10  k   String 3    monkey
11  l   String 3    lamb
12  m   String 3    horse
13  n   String 3    goat
14  o   String 3    rabbit
15  p   String 4    frog
16  q   String 4    jaguar
17  r   String 4    fox
18  s   String 4    sloth
19  t   String 4    beaver
20  u   String 5    parrot
21  v   String 5    dolphin
22  w   String 5    seal
23  x   String 5    spider
24  y   String 5    panda

For the rows where df['id'] equals String 2 or String 4 (essentially every id whose number satisfies % 2 == 0), I would like to reverse the order of the rows within that group. The output DataFrame I am looking for would be:

    name    id  description
0   a   String 1    lion
1   b   String 1    snake
2   c   String 1    bear
3   d   String 1    tiger
4   e   String 1    dog
5   j   String 2    rhino
6   i   String 2    eagle
7   h   String 2    whale
8   g   String 2    bird
9   f   String 2    cat
10  k   String 3    monkey
11  l   String 3    lamb
12  m   String 3    horse
13  n   String 3    goat
14  o   String 3    rabbit
15  t   String 4    beaver
16  s   String 4    sloth
17  r   String 4    fox
18  q   String 4    jaguar
19  p   String 4    frog
20  u   String 5    parrot
21  v   String 5    dolphin
22  w   String 5    seal
23  x   String 5    spider
24  y   String 5    panda

I can do this for a single group with:

df.loc[df['id'] == 'condition'][::-1]

I am struggling with how to apply it to the DataFrame so that it modifies it. I have tried the following function to no avail:

def reversal(row):    
    for row in df.id:
        if row == 'condition':
            return df.loc[df['id'] == 'condition'][::-1]

I intend to use this on a DataFrame of about 30K rows, which really isn't that much, but I'd still like to use the most efficient approach.

It is equally important to me to understand the logic that is behind the solution as I'm just really starting to learn Python. I think the code above is a good tell of that being so.

Thank you for any help, I'm kinda stumped on this one.

  • Add your code. What you added was only data, not python code. Google what you want to do and get a basic code sample of it. Even if the code is flawed, posting code makes it way easier to answer a question. So you will also profit from getting better feedback if you add code ^^ Commented May 13, 2020 at 4:11
  • Yes, thank you. I inadvertently submitted the question before I got to finish and format it. I've edited it. Commented May 13, 2020 at 4:26

2 Answers

Use:

# extract the number from each id and flag the even ones (% 2 == 0)
mask = df['id'].str.extract(r'(\d+)', expand=False).astype(int) % 2 == 0
# lambda function that reverses the row order
f = lambda x: x.iloc[::-1]
# apply only to the groups that match the condition
df[mask] = df[mask].groupby(df['id']).transform(f)
print(df)
   name        id description
0     a  String 1        lion
1     b  String 1       snake
2     c  String 1        bear
3     d  String 1       tiger
4     e  String 1         dog
5     j  String 2       rhino
6     i  String 2       eagle
7     h  String 2       whale
8     g  String 2        bird
9     f  String 2         cat
10    k  String 3      monkey
11    l  String 3        lamb
12    m  String 3       horse
13    n  String 3        goat
14    o  String 3      rabbit
15    t  String 4      beaver
16    s  String 4       sloth
17    r  String 4         fox
18    q  String 4      jaguar
19    p  String 4        frog
20    u  String 5      parrot
21    v  String 5     dolphin
22    w  String 5        seal
23    x  String 5      spider
24    y  String 5       panda
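
To see what the mask line is doing, it may help to run it one step at a time. This is only a minimal sketch on a small made-up Series; the names `ids` and `numbers` are for illustration and are not part of the answer above.

```python
import pandas as pd

# Made-up ids in the same "String N" format as the question
ids = pd.Series(["String 1", "String 2", "String 3", "String 4"], name="id")

# Step 1: pull the first run of digits out of each string.
# expand=False returns a Series instead of a one-column DataFrame.
numbers = ids.str.extract(r"(\d+)", expand=False).astype(int)

# Step 2: True where the extracted number is even
mask = numbers % 2 == 0

print(numbers.tolist())  # [1, 2, 3, 4]
print(mask.tolist())     # [False, True, False, True]
```

The resulting boolean Series lines up row by row with the DataFrame, which is why it can be used as `df[mask]` to pick out only the rows of the even-numbered groups.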

6 Comments

Yes, that is correct. All columns except for the index. Thank you.
Thank you @jezrael. It looks like I'm getting a value error with the mask. Here is the traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3701-44a732ab229d> in <module>
      1 #extract numbers from id and compare by % 2 == 0
----> 2 mask = df['round'].str.extract('(\d+)', expand=False).astype(int) % 2 == 0
      3 #lambda function for change order
      4 f = lambda x: x.iloc[::-1]
      5 #apply only for groups match condition
Specifically: ValueError: cannot convert float NaN to integer. Thank you again though, I will work on resolving it.
@sooted - Are there always numbers in id? Is it possible that some groups have no numbers? If a value has no number, a missing value is created, so try changing the solution to .astype(float) instead of .astype(int).
Yes, the only difference is the word 'round' substituted for 'String' but otherwise all the same. I will modify to type float as suggested. Thanks! Update: changing the astype to float did the trick. Thanks again.
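
A small sketch of why the float conversion helps, assuming at least one value in the column contains no digits. The label "final" below is a made-up example, since the thread does not show which value caused the error. `str.extract` returns a missing value when nothing matches, and a missing value can be stored as a float but not as an int.

```python
import pandas as pd

# Made-up labels; "final" has no digits, so the regex finds nothing for it
rounds = pd.Series(["round 1", "round 2", "final", "round 4"], name="round")

# A row without digits becomes NaN, which float can hold
numbers = rounds.str.extract(r"(\d+)", expand=False).astype(float)

# NaN % 2 is NaN, and NaN == 0 is False, so that row is simply not selected
mask = numbers % 2 == 0

print(mask.tolist())  # [False, True, False, True]
```

With this approach, rows that have no number are treated as "not even" and are left in their original order.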
(
    df.groupby('id')
    .apply(lambda x: x.iloc[::-1] if int(x.id.iloc[0].strip('String '))%2==0 else x)
    .reset_index(drop=True)
)
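
Two things are worth knowing about the one-liner above: `str.strip('String ')` removes any of those characters from both ends rather than the literal prefix, and `groupby` sorts the keys by default, so as strings "String 10" would come before "String 2". A more explicit sketch of the same idea is below. The helper name `reverse_even_groups` and the small frame are made up for illustration, and it assumes every id ends with a number.

```python
import pandas as pd

# Small made-up frame; "String 10" is included to show that group order is kept
df = pd.DataFrame({
    "name": list("abcdef"),
    "id": ["String 1"] * 2 + ["String 2"] * 2 + ["String 10"] * 2,
})

def reverse_even_groups(frame):
    """Return a new frame with the rows of even-numbered id groups reversed."""
    pieces = []
    # sort=False keeps the groups in order of first appearance
    for key, group in frame.groupby("id", sort=False):
        number = int(key.split()[-1])  # "String 10" -> 10
        pieces.append(group.iloc[::-1] if number % 2 == 0 else group)
    # glue the pieces back together and renumber the index 0..n-1
    return pd.concat(pieces).reset_index(drop=True)

result = reverse_even_groups(df)
print(result["name"].tolist())  # ['a', 'b', 'd', 'c', 'f', 'e']
```

The loop runs once per group rather than once per row, so for a frame of about 30K rows it should stay cheap as long as the number of groups is modest.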
