I have a dataframe whose column contains lists of words; let us call it df1:

Text
-------
["good", "job", "we", "are", "so", "proud"]
["it", "was", "his", "honor", "as", "well", "as", "guilty"]

And also another dataframe, df2:

Word    Value
-------------
good    7.47
proud   8.03
honor   7.66
guilty  2.63

I want to use apply with a lambda function to create df1['score'], where each value is derived by aggregating the df2 values of the words in that row's list that also appear in df2's Word column (the expected output below uses the average). Currently, this is my code:

def score(list_word):
    sum = count = mean = sd = 0
    for word in list_word:
         if word in df2['Word']:
             sum = sum + df2.loc[df2['Word'] == word, 'Value'].iloc[0]
             count = count + 1
    if count != 0:
        return sum/count
    else:
        return 0

df['score'] = df.apply(lambda x: score(x['words']), axis=1)

This is what I envision:

Score
-------
7.75 #average of good (7.47) and proud (8.03)
5.145 #average of honor (7.66) and guilty (2.63)

However, it seems x['words'] is not passed as a list object, and I do not know how to modify the score function to handle that object type. I tried converting it with the tolist() method, but to no avail. Any help is appreciated.

  • from where are you reading the first dataframe? Commented May 7, 2020 at 0:39

1 Answer

Given df1 and df2, you can do this with explode and map. Note that explode requires pandas 0.25 or later.

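# If the lists are stored as strings, convert them back to real lists with ast first:
# import ast
# df1.Text = df1.Text.apply(ast.literal_eval)
# Explode to one word per row, look up each word's value, then average per original row.
s = df1.Text.explode().map(dict(zip(df2.Word, df2.Value))).groupby(level=0).mean()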
0    7.750
1    5.145
Name: Text, dtype: float64
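
As a self-contained sketch of the explode-and-map approach above, the snippet below rebuilds df1 and df2 from the question and assigns the result back to df1['score']. It uses groupby(level=0).mean() for the per-row average, since the level argument of Series.mean was removed in pandas 2.0. The names lookup and exploded are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "Text": [
        ["good", "job", "we", "are", "so", "proud"],
        ["it", "was", "his", "honor", "as", "well", "as", "guilty"],
    ]
})
df2 = pd.DataFrame({
    "Word": ["good", "proud", "honor", "guilty"],
    "Value": [7.47, 8.03, 7.66, 2.63],
})

# Word -> Value lookup; words that are not in df2 map to NaN.
lookup = dict(zip(df2["Word"], df2["Value"]))

# One row per word; the index still carries the original row label.
exploded = df1["Text"].explode()

# mean() skips NaN, so unmatched words do not affect the average.
df1["score"] = exploded.map(lookup).groupby(level=0).mean()
```

A row with no matching words ends up as NaN rather than 0; appending .fillna(0) would reproduce the fallback used by the score function in the question.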

Update

df1.Text.explode().to_frame('Word').reset_index().merge(df2, on='Word', how='left').groupby('index').mean(numeric_only=True)
       Value
index       
0      7.750
1      5.145

2 Comments

This is indeed straight to the point. However, I also want to do some calculations besides the mean and use at least two value columns from df2. Is it possible with map to use another df2 column, so that a second new column maps the words to the second value column?
@rayyar See the update. After the merge, if df2 has more than one value column, they are all attached to the exploded df1; you can then group by the original index and apply your calculation to collapse it back to one row per list.
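
A minimal sketch of that merge-then-group idea with two value columns follows. The Value2 column and its numbers are hypothetical placeholders, not data from the question, and the output column names (value_mean, value_sum, value2_mean) are likewise only illustrative.

```python
import pandas as pd

# df1 as in the question.
df1 = pd.DataFrame({
    "Text": [
        ["good", "job", "we", "are", "so", "proud"],
        ["it", "was", "his", "honor", "as", "well", "as", "guilty"],
    ]
})
df2 = pd.DataFrame({
    "Word": ["good", "proud", "honor", "guilty"],
    "Value": [7.47, 8.03, 7.66, 2.63],
    "Value2": [1.0, 2.0, 3.0, 4.0],  # hypothetical second value column
})

# One row per word, with the original row label kept in an 'index' column,
# then a left merge that attaches every df2 column to each word.
merged = (
    df1["Text"].explode()
    .rename("Word")
    .reset_index()
    .merge(df2, on="Word", how="left")
)

# Named aggregation: any mix of functions over any of the merged columns.
stats = merged.groupby("index").agg(
    value_mean=("Value", "mean"),
    value_sum=("Value", "sum"),
    value2_mean=("Value2", "mean"),
)

# Join the per-row statistics back onto df1 by index.
result = df1.join(stats)
```

Words missing from df2 get NaN after the left merge, and both mean and sum skip NaN by default, so they do not distort the per-row statistics.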
