How to create a function based on another dataframe column being True?

Question

I have the dataframe shown below:

  Name      X      Y
0    A  False   True
1    B   True   True
2    C   True  False
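
For anyone who wants to reproduce this, the frame above could be rebuilt roughly as follows. This is a minimal sketch; the variable name df is chosen to match the code used in the rest of the thread.

```python
import pandas as pd

# Rebuild the example frame: one label column and two boolean flag columns
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "X": [False, True, True],
    "Y": [True, True, False],
})
print(df)
```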

I want to create a function that works like this, for example:

example_function("A") = "A is in Y"
example_function("B") = "B is in X and Y"
example_function("C") = "C is in X"

This is my code currently (incorrect and doesn't look very efficient):

def example_function(name):
    for name in df['Name']:
        if df['X'][name] == True and df['Y'][name] == False:
            print(str(name) + "is in X")
        elif df['X'][name] == False and df['Y'][name] == True:
            print(str(name) + "is in Y")
        else:
            print(str(name) + "is in X and Y")

I eventually want to add more Boolean columns so it needs to be scalable. How can I do this? Would it be better to create a dictionary, rather than a dataframe?

Thanks!

@timgeb I have edited it; an example of the output is "A is in Y".
– turtle69, Commented Mar 24, 2022 at 12:21

mozway · Accepted Answer · 2022-03-24 12:41:59Z

1

If you really want a function you could do:

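def example_function(label):
    # select the row for this name: a boolean Series indexed by column name
    s = df.set_index('Name').loc[label]
    # keep only the labels of the columns whose value is True
    l = s[s].index.to_list()
    return f'{label} is in {" and ".join(l)}'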

example_function('A')
'A is in Y'

example_function('B')
'B is in X and Y'

You can also compute all the solutions as a dictionary:

s = (df.set_index('Name').replace({False: pd.NA}).stack()
       .reset_index(level=0)['Name']
     )
out = s.index.groupby(s)

output:

{'A': ['Y'], 'B': ['X', 'Y'], 'C': ['X']}
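
If the stack/groupby chain feels opaque, a similar dictionary could probably be built with a plain comprehension. This is only a sketch, assuming that every column other than Name holds booleans and that the names are unique; df is rebuilt here so the snippet runs on its own, and the names flags and membership are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "X": [False, True, True],
    "Y": [True, True, False],
})

# Rows are labelled by name; the remaining columns are the boolean flags
flags = df.set_index("Name")

# For each name, keep the column labels whose value is True
membership = {
    name: [col for col, is_member in row.items() if is_member]
    for name, row in flags.iterrows()
}
print(membership)
# {'A': ['Y'], 'B': ['X', 'Y'], 'C': ['X']}
```

Adding more boolean columns needs no change to this code, since it iterates over whatever columns are present.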

2 Comments

turtle69 Over a year ago

Thank you very much, this works perfectly. However, if I add more columns, how do I make it say "A is in W, X, Y and Z" instead of "A is in W and X and Y and Z"?

mozway Over a year ago

You need to test the length of the list and handle the last element differently from the rest. There might even be libraries that do that automatically.
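
One way to follow that suggestion is a small helper that comma-separates all but the last label and attaches the last one with "and". This is a minimal sketch: join_with_and is a hypothetical name, not something from the answer, and it assumes the labels are strings.

```python
def join_with_and(items):
    """Join labels as 'W, X, Y and Z' (hypothetical helper, assumes string labels)."""
    items = list(items)
    if len(items) <= 1:
        # Empty input gives an empty string; a single label is returned as-is
        return "".join(items)
    # Comma-separate everything except the last label, then attach it with 'and'
    return ", ".join(items[:-1]) + " and " + items[-1]

print(join_with_and(["Y"]))                 # Y
print(join_with_and(["X", "Y"]))            # X and Y
print(join_with_and(["W", "X", "Y", "Z"]))  # W, X, Y and Z
```

The result could then stand in for the `" and ".join(...)` call in example_function.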

Nicole Zattarin · Accepted Answer · 2022-03-24 12:45:52Z

1

I think you can stay with a DataFrame; the same output can be obtained with a function like this:

import numpy as np

def func(name, df):
    # some checks to verify that the name is actually in the df
    occurrences_name = np.sum(df['Name'] == name)
    if occurrences_name == 0:
        raise ValueError('Name not found')
    elif occurrences_name > 1:
        raise ValueError('More than one name found')

    # get the index label corresponding to the name you're looking for
    # and select the corresponding row by label (.loc), not by position
    index = df[df['Name'] == name].index[0]
    row = df.drop(['Name'], axis=1).loc[index]
    outstring = '{} is in '.format(name)
    for i in range(len(row)):
        if row.iloc[i]:
            # caveat: the separator depends only on the position i, so a
            # False first column still produces a leading ', '
            if i != 0:
                outstring += ', '
            outstring += '{}'.format(row.index[i])
    return outstring

Of course, you can adapt this to the specific shape of your df; I'm assuming that the column containing the names is called 'Name'.

1 Comment

turtle69 Over a year ago

Thanks, this works but there is a small problem. If a name is False for X but True for Y, the output is "A is in , Y". Otherwise it is good.
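
The stray ", " appears because the separator is added based on the column position rather than on whether anything has been appended yet. One possible fix is to collect the True columns first and join them afterwards. This is a sketch, assuming a 'Name' column plus boolean columns as in the question; it is a rewrite for illustration, not the answerer's own code.

```python
import pandas as pd

def func(name, df):
    """Return e.g. 'A is in Y'; assumes a 'Name' column plus boolean columns."""
    matches = df[df["Name"] == name]
    if len(matches) == 0:
        raise ValueError("Name not found")
    if len(matches) > 1:
        raise ValueError("More than one name found")

    # Exactly one matching row remains; drop the label column and take that row
    row = matches.drop(columns="Name").iloc[0]
    # Collect the True columns first, then join, so no separator precedes the first label
    true_columns = [col for col, value in row.items() if value]
    return "{} is in {}".format(name, ", ".join(true_columns))

df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "X": [False, True, True],
    "Y": [True, True, False],
})
print(func("A", df))  # A is in Y
print(func("B", df))  # B is in X, Y
```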
