I am trying to compare EVERY value within a row of a dataframe against EVERY other value in that same row.

Whenever a comparison is true, I want to apply a function that relates to the row before:

> if value1 > value2:  # both in row_x
>     based_on_previous_value(value1)

The function refers to row_x-1, and I am trying to build a new dataframe, df_new, from the resulting values.

example)

df = pandas.DataFrame({"R1": [8,2], "R2": [-21,-24], "R3": [-9,46]})
# second row in df_new for (just a  simple example of a function for clarification reasons)

def based_on_previous_value(x):
    return x*2

df_new = pandas.DataFrame({"R1": [32,2], "R2": [-21,-24], "R3": [-18,46]})

> # 8 --> 32 (because 8 is bigger than -21 & 8 is bigger than -9) --> 8*2*2 = 32
> # -21 --> -21 (because -21 is smaller than 8 & smaller than -9) --> -21 = -21
> # -9 --> -18 (because -9 is smaller than 8 & bigger than -21) --> -9*2 = -18

EDIT: example2)

# I have a dataframe that looks like this:
df = pandas.DataFrame({"R1": [8,2,3], "R2": [-21,-24,4], "R3": [-9,46,6],"R4": [16,-14,-1],"R5": [-3,36,76]})

As above, I want to compare every value within one row against every other value in that row and then apply a function whenever value 1 in row x is bigger than value 2 in row x. I am trying to do something like this:

if value1 in row 1 > value2 in row 1:
    based_on_previous_value(value1)  # put the result in a new dataframe
else:
    value1  # put the unchanged value in a new dataframe

def based_on_previous_value(x):
    x in row_before + 1

--> This code doesn't work (I am just trying to show in code what I want to do).

# results put in a new dataframe
df_new = pandas.DataFrame({"R1": [8,10,11], "R2": [-21,-21,-19], "R3": [-9,-5,-2],"R4": [16,17,17],"R5": [-3,0,4]})

--> "R1" in 2nd row: 2 > -24 and 2 > -14 --> value("R1" in first row) + 2 = 10
--> "R2" in 2nd row: -24 < all the other 4 values --> value("R2" in first row) + 0 = -21
--> "R3" in 2nd row: 46 > all the other 4 values --> value("R3" in first row) + 4 = -5

1 Answer

Yeah, so you'll want to do several things:

If you sort the values of each row in ascending order, the smallest value ends up at the beginning and the largest at the end.

Thanks to that, we can multiply the values by powers of 2 depending on how far along they are on axis=1.

So, your example :

import pandas as pd
import numpy as np

df = pd.DataFrame({"R1": [8,2], "R2": [-21,-24], "R3": [-9,46]})

If we sort each row like this:

val_sorted = np.sort(df.values,axis=1)

it becomes:

array([[-21,  -9,   8],
       [-24,   2,  46]], dtype=int64)

Next, we'll build the multipliers (powers of 2) according to where each value stands along the column axis.

mult = [2**i for i in range(df.shape[1])]

We can then multiply them :

sorted_mult = val_sorted*mult

which outputs :

array([[-21, -18,  32],
       [-24,   4, 184]], dtype=int64)

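and if we want the largest value first, we flip the values (this only reverses the sorted order; it matches the original column order only when a row was in descending order to begin with):

flipped_sorted_mult = np.fliplr(sorted_mult)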

which outputs :

array([[ 32, -18, -21],
       [184,   4, -24]], dtype=int64)

Finally, we put it back in a dataframe :

df_final = pd.DataFrame(flipped_sorted_mult, columns = df.columns)

I think this might be a bit convoluted but each step should be clear.
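
If the results need to stay under their original column names, one option is to look up each value's position within its row instead of physically sorting the row. The following is only a sketch of that idea: `position` and `df_ranked` are names made up for illustration, and breaking ties by column order (`method="first"`) is an assumption, since the question does not say how equal values should be handled.

```python
import pandas as pd

df = pd.DataFrame({"R1": [8, 2], "R2": [-21, -24], "R3": [-9, 46]})

# Position of each value within its own row: 0 = smallest, 1 = next, ...
# method="first" breaks ties by column order (an assumption, see above).
position = df.rank(axis=1, method="first").astype(int) - 1

# Same multipliers as `mult` (1, 2, 4, ...), but looked up per cell,
# so every result stays under its original column name.
df_ranked = df * 2 ** position

print(df_ranked)
```

For the first row this gives 32, -21 and -18 under R1, R2 and R3, which is the order the question asks for.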

Now, this is a way to do it which involves fewer steps but might be more cryptic :

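# sort each row in descending order; the result is a Series of lists
df_sorted = df.apply(sorted, axis=1, reverse=True)
df_sorted = df_sorted.explode().values.reshape(df.shape).astype(int)
# the values are in descending order, so the multipliers are reversed too
df_final = pd.DataFrame(df_sorted*mult[::-1], columns=df.columns)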

What happened ?

We applied the built-in sorted function to each row and told the apply method to pass the reverse argument as True.

Then, we get back a pandas Series in which each row is sorted but, unfortunately, stored as a list. Thus, I used the explode method (new as of pandas 0.25) to break the lists down, and finally I reshaped the array back to its initial shape.

The last step is similar to the one above.
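
For the second example in the question (the EDIT), the expected df_new seems to follow a different rule: each new value is the new value from the row before plus the number of values in the current row that are smaller than the current value. That reading is inferred from the numbers in the question, not stated there outright, so treat the following as a sketch under that assumption. `smaller_count` is a name made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "R1": [8, 2, 3],
    "R2": [-21, -24, 4],
    "R3": [-9, 46, 6],
    "R4": [16, -14, -1],
    "R5": [-3, 36, 76],
})

# For every cell, count how many values in the same row are strictly smaller.
# rank(method="min") returns 1 + that count.
smaller_count = df.rank(axis=1, method="min").astype(int) - 1

# The first row is copied unchanged, so it contributes no increment.
smaller_count.iloc[0] = 0

# Each new value is the previous new value plus the current row's count,
# i.e. a running sum down each column added to the first row.
df_new = smaller_count.cumsum() + df.iloc[0]

print(df_new)
```

On this data the result matches the df_new given in the question, with every value still under its original column name.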

I hope it helps,

Cheers

7 Comments

Thank you so much, I'll try to understand and implement your very detailed and nicely worded answer. I really do appreciate it :-)) Would you be available for any further questions? Best regards
Hey, so I really like the way you solved it, thank you :-) The more cryptic part didn't work for me, as I can't apply .explode(); I guess I have to update. But I do have a question about the first version: df_final slightly changes the order of values within each row compared to df. Is there a way to keep the same order of values within the row?
@DieterMueller You can unnest the dataframe by other means without using explode, but then I suppose the first method is good enough for your problem. Sure, but you'll have to show me how it changes. When we use the values of a DataFrame, we get numpy arrays, and they don't keep the index. But since we know the initial index and, even better, we know the first column does not change, you can use this SO answer
So my new dataframe is supposed to have the updated values, but still in the same order as before. In other words: I am trying to keep the values linked to their respective column names. Maybe there is another way to keep them linked after we sort them in ascending order to determine mult = [2**i for i in range(df.shape[1])]? Or is that not possible at all, as arrays don't refer to column names anyway?
I don't know without seeing the columns; fliplr should keep the ordering, it's just flipping it. numpy arrays don't refer to columns, but you can use a dict to keep the transformation stored and then map or transform on the dataframe. The dict would be {col: multiple for col, multiple in zip(df.columns, [2**i for i in range(df.shape[1])])}