Dataframe Column is not Read as List in Lambda Function

Question

I have a dataframe with a column whose values are lists; let us call it df1:

Text
-------
["good", "job", "we", "are", "so", "proud"]
["it", "was", "his", "honor", "as", "well", "as", "guilty"]

And also another dataframe, df2:

Word    Value
-------------
good    7.47
proud   8.03
honor   7.66
guilty  2.63

I want to use apply with a lambda function to create df1['score'], where each value is derived by aggregating the df2 values of the words in each df1 list that are found in df2's Word column. Currently, this is my code:

def score(list_word):
    sum = count = mean = sd = 0
    for word in list_word:
         if word in df2['Word']:
             sum = sum + df2.loc[df2['Word'] == word, 'Value'].iloc[0]
             count = count + 1
    if count != 0:
        return sum/count
    else:
        return 0

df['score'] = df.apply(lambda x: score(x['words']), axis=1)

This is what I envision:

Score
-------
7.75 #average of good (7.47) and proud (8.03)
5.145 #average of honor (7.66) and guilty (2.63)

However, it seems x['words'] is not passed as a list object, and I do not know how to modify the score function to handle the actual object type. I tried converting it with the tolist() method, but to no avail. Any help is appreciated.

from where are you reading the first dataframe?

– Rajat Mishra, Commented May 7, 2020 at 0:39

BENY · Accepted Answer · 2020-05-07 01:05:22Z

Given df1 and df2, you can use explode and map. Note that explode requires pandas 0.25 or later.

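# If the lists are stored as strings, parse them back into lists with ast first:
# import ast
# df1.Text = df1.Text.apply(ast.literal_eval)
# Map each word to its value (NaN if absent), then average per original row.
# groupby(level=0).mean() replaces mean(level=0), which pandas 2.0 removed.
s = df1.Text.explode().map(dict(zip(df2.Word, df2.Value))).groupby(level=0).mean()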
0    7.750
1    5.145
Name: Text, dtype: float64
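For anyone who wants to run this end to end, here is a minimal sketch with the sample data from the question. It assumes the lists arrive as strings (for example after reading a CSV), which is why the ast.literal_eval step is included; drop that line if the column already holds real lists. The per-row mean is taken with groupby(level=0).mean(), which works on current pandas versions (the level argument of mean was removed in pandas 2.0). The name lookup is just an illustrative variable.

```python
import ast

import pandas as pd

# Sample data from the question. Storing the lists as strings is an
# assumption made here to show the ast.literal_eval step.
df1 = pd.DataFrame({
    "Text": [
        '["good", "job", "we", "are", "so", "proud"]',
        '["it", "was", "his", "honor", "as", "well", "as", "guilty"]',
    ]
})
df2 = pd.DataFrame({
    "Word": ["good", "proud", "honor", "guilty"],
    "Value": [7.47, 8.03, 7.66, 2.63],
})

# Parse the string representation back into real Python lists.
df1["Text"] = df1["Text"].apply(ast.literal_eval)

# explode() gives one row per word and repeats the original row label.
# map() looks up each word; words missing from df2 become NaN, and
# mean() skips NaN by default.
lookup = dict(zip(df2["Word"], df2["Value"]))
df1["score"] = df1["Text"].explode().map(lookup).groupby(level=0).mean()

print(df1["score"])
```

A row with no matching words ends up as NaN rather than 0; chaining .fillna(0) would reproduce the fallback in the question's score function.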

Update

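# numeric_only=True skips the text column Word, which pandas 2.0+ would otherwise reject
df1.Text.explode().to_frame('Word').reset_index().merge(df2,how='left').groupby('index').mean(numeric_only=True)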
       Value
index       
0      7.750
1      5.145

edited May 7, 2020 at 1:05

answered May 7, 2020 at 0:40

2 Comments

rayyar Over a year ago

This is indeed straight to the point. However, I also want to do some calculations besides the mean and use at least two value columns from df2. Is it possible with map to use another df2 column, so that a second new column maps the words to the second value column?

BENY Over a year ago

@rayyar see the update. After the merge, if df2 has more than one column, they are all appended to the exploded df1; then we group by the index and run your calculation to convert it back to one row per list.
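To make the merge-then-group idea concrete for several value columns, here is a rough sketch. The second column Value2 and its numbers are hypothetical, made up purely for illustration, as are the output column names; the aggregations chosen (mean, max, sum) are just examples of "your calculation".

```python
import pandas as pd

df1 = pd.DataFrame({
    "Text": [
        ["good", "job", "we", "are", "so", "proud"],
        ["it", "was", "his", "honor", "as", "well", "as", "guilty"],
    ]
})
df2 = pd.DataFrame({
    "Word": ["good", "proud", "honor", "guilty"],
    "Value": [7.47, 8.03, 7.66, 2.63],
    # Hypothetical second value column; these numbers are invented.
    "Value2": [1.0, 2.0, 3.0, 4.0],
})

# One row per word, keeping the original row label in the "index" column,
# then a left merge that brings in every df2 column for matching words.
exploded = (
    df1["Text"].explode().to_frame("Word").reset_index()
    .merge(df2, on="Word", how="left")
)

# Named aggregation: one output column per (source column, function) pair.
stats = exploded.groupby("index").agg(
    value_mean=("Value", "mean"),
    value_max=("Value", "max"),
    value2_sum=("Value2", "sum"),
)

# stats is indexed by the original row labels, so it joins straight back.
df1 = df1.join(stats)
print(df1)
```

Words that are not in df2 carry NaN after the left merge and are ignored by these aggregations.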
