How to use two variables with one function in Python?

Question

I want to create a new column in a DataFrame that holds one of two kinds of result from a function.

I have two lists:

lista = ["A", "B", "C", "D"]
listb = ["would not", "use to", "manage to", "when", "did not"]

I want to find the first value from lista that appears in the text and return it in a new column called "Effect". If no value from lista is found, search for the values from listb instead and return the first one encountered, along with the next 2 words that follow it.

Example:

I have tried something like this:

def matcher(Description):
    for i in lista:
        if i in Description:
            return i
    return "Not found"

def matcher(Description):
    for j in listb:
        if j in Description:
            return j + 1
    return "Not found"

df["Effect"] = df.apply(lambda i: matcher(i["Description"]), axis=1)
df["Effect"] = df.apply(lambda j: matcher(j["Description"]), axis=1)

score 1 · Accepted Answer · 2022-11-05 04:19:52Z


The code below should do what you want to achieve:

def matcher(sentence):
    match_list = [substr for substr in lista 
                      if substr in [ word 
                                 for word in sentence.replace(',',' ').split(" ")]]
    if match_list: # list with items evaluates to True, empty list to False
        return match_list[0]
    match_list = [substr for substr in listb if ' '+substr+' ' in sentence]
    if match_list:
        substr = match_list[0]
        return substr + " " + sentence.split(substr)[-1].replace(',',' ').strip().split(" ")[0]
    return "Not found"

df["Effect"] = df.Description.apply(matcher)

If the sentences contain punctuation other than ',', consider using a regular-expression replacement instead of .replace(',',' ') to turn every non-letter character into a space, so that words are guaranteed to stay separated. Be aware that some unusual combinations of substrings and sentences can still produce unexpected results.
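The regular-expression replacement mentioned above could look roughly like the sketch below. The helper names words_of and first_from_lista are made up for this illustration and are not part of the answer's code, and the pattern assumes that only ASCII letters count as word characters.

```python
import re

lista = ["A", "B", "C", "D"]

def words_of(sentence):
    # Replace every run of non-letter characters with a single space,
    # then split on whitespace so punctuation never sticks to a word.
    return re.sub(r"[^A-Za-z]+", " ", sentence).split()

def first_from_lista(sentence):
    # Return the first item of lista that occurs as a whole word, else None.
    words = words_of(sentence)
    for item in lista:
        if item in words:
            return item
    return None

print(words_of("noted that A; was present (B?)"))
# ['noted', 'that', 'A', 'was', 'present', 'B']
```

Note that digits and accented letters are also turned into spaces by this pattern, so it would need adjusting if such characters matter in the descriptions.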

UPDATE: code for returning any number of words after the substring matched from listb (requested in the comments), along with an explanation of how the code works:

lista = ['A', 'B', 'C', 'D']
listb = ["unable to", "would not", "was not", "did not", "there is not", "could not", "failed to", "use to", "manage to", "when"]
# ^-- listb extended with phrases from another question on the same subject

# For example, given the following text:
sentence1 = "During procedure it was noted that A, was present and were notified to deparment."
# "A" appears in the text above, so only the value "A" is returned in the new column.
sentence2 = "During procedure it was noted that product did not inject as expected."
# In the text above, "did not" should be found and returned along with the
# next N words ("did not inject" for N=1 and "did not inject as" for N=2).

def matcher(sentence, no_words=1):
    # First find a match from lista: 
    match_list = [substr for substr in lista 
                      if substr in [ word 
                                 for word in sentence.replace(',',' ').split(" ")]]
    if match_list: # list with items evaluates to True, empty list to False
        return match_list[0] # if match found in lista exit function with return

    # There was no match from lista so find a match from listb:
    match_list = [substr for substr in listb if ' '+substr+' ' in sentence]
    if match_list:
        substr = match_list[0]
        # To return substr together with the words that follow it:
        # sentence.split(substr)[-1] splits the sentence on substr and takes the
        # last piece, i.e. the text after the final occurrence of substr (index [1]
        # gives the same result when substr occurs only once).
        # .replace(',',' ') handles words separated by ',' instead of ' '.
        # .strip() removes whitespace at the start and end of the extracted text.
        # .split(" ") creates the list of words that follow substr, and the slice
        # [0:no_words] takes the first 'no_words' words, which ' '.join() joins
        # into one string that is appended to substr:
        return substr + " " + ' '.join(sentence.split(substr)[-1].replace(',',' ').strip().split(" ")[0:no_words])

    # No match from lista or listb (nothing has been returned yet), so:
    return "Not found"

print(matcher(sentence1))
print(matcher(sentence2)) # no_words=1 is default
print(matcher(sentence2, 2))

The code above outputs:

A
did not inject
did not inject as
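
The matcher above returns the first phrase in listb order, not necessarily the phrase that comes first in the sentence. If "first encountered" is meant as the earliest position in the text, a variant along the following lines might be closer. This is only a sketch under that assumption: the name matcher_by_position is hypothetical, it covers the listb step only, and it uses the listb from the question.

```python
listb = ["would not", "use to", "manage to", "when", "did not"]

def matcher_by_position(sentence, phrases=listb, no_words=1):
    # Pad with spaces so phrases at the very start or end can still match.
    padded = " " + sentence + " "
    # Record where each phrase occurs as a space-delimited unit (-1 if absent).
    hits = [(padded.find(" " + p + " "), p) for p in phrases]
    hits = [(pos, p) for pos, p in hits if pos != -1]
    if not hits:
        return "Not found"
    pos, phrase = min(hits)  # smallest position = earliest in the sentence
    rest = padded[pos + len(phrase) + 2:]  # text after the phrase and its trailing space
    following = rest.replace(",", " ").split()[:no_words]
    return " ".join([phrase] + following)

print(matcher_by_position("It did not inject when pressed", no_words=2))
# did not inject when
```

With the question's listb, the list-order approach would pick "when" for this sentence because "when" precedes "did not" in the list, whereas the position-based variant picks "did not".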

edited Nov 5, 2022 at 4:19

answered Nov 3, 2022 at 4:56

user7711283


7 Comments

Victor Leon Over a year ago

I replaced "sentence" with the df column name "Description". This only gives me the result of the second step, for example "did not inject". It is not showing values from lista.

user7711283 Over a year ago

Try again using the current code in the answer, which should work as expected and return 'A' and 'did not inject' for the two sentences you mentioned (I have tested it and it works).

Victor Leon Over a year ago

I have edited my question to make it clearer.

Victor Leon Over a year ago

If I want to return more than one of the following words in the second step, what do I need to modify in these lines?

    substr = match_list[0]
    return substr + " " + sentence.split(substr)[-1].replace(',',' ').strip().split(" ")[0]

user7711283 Over a year ago

Yes, that did not work because I had not mentioned the ' '.join() it requires. See the updated answer, which adds join() and a detailed explanation of how the code works.


Tim Roberts · Accepted Answer · 2022-11-03 04:12:34Z

0

You can do both at once:

def matcher(Description):
    w = [i for i in lista if i in Description]
    w.extend( [i for i in listb if i in Description] )
    if not w:
        return "Not found"
    else:
        return ' '.join(w)

df["Effect"] = df.apply(lambda i: matcher(i["Description"]), axis=1)
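
One thing to keep in mind with `i in Description` is that it is a substring test, so a single letter from lista also matches inside longer words. The short sketch below illustrates the difference on one of the sample sentences from this thread; the variable names are chosen for this illustration only.

```python
lista = ["A", "B", "C", "D"]
sentence = "During procedure it was noted that product did not inject as expected."

# Substring test: "D" is found inside "During".
substring_hits = [i for i in lista if i in sentence]

# Whole-word test: compare against the individual words instead.
word_hits = [i for i in lista if i in sentence.replace(",", " ").split()]

print(substring_hits)  # ['D']
print(word_hits)       # []
```

A whole-word test still cannot tell the article "A" apart from the code "A", so a sentence that starts with "A strange result" would match either way.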

answered Nov 3, 2022 at 4:12

Tim Roberts


6 Comments

user7711283 Over a year ago

The question asks that, if an item of lista is found in Description, that item is returned. So you need to put if not w: directly after w = ... and run the second comprehension only when w == [], extending the found substring with the one or two words that follow it in Description. It is not clear whether Effect should list all the found items of lista/listb or only the first one found ...

Victor Leon Over a year ago

I want the following. For example, take this text: "During procedure it was noted that A was present and were notified to deparment." Here A exists, so only the value A is returned in the new column. Now take this text: "During procedure it was noted that product did not inject as expected" Here I want to find "did not" and return it along with the next 1 word, in this case "did not inject".

user7711283 Over a year ago

What if two or more items from lista or listb are found in the sentence? Return only the first one found? Or all?

Victor Leon Over a year ago

Only the first one

Tim Roberts Over a year ago

What about "A strange result occurred"? How will you tell that's not what you want?
