Nested for loop using lambda function

Question

I have a nested for loop something like:

for x in df['text']:
  for i in x:
    if i in someList:
      count += 1

Here df['text'] is a Series of lists of words, such as ['word1', 'word2', 'etc'].
I know I can just use the for loop, but I want to convert it into a lambda function.
I tried:
df['in'] = df['text'].apply(lambda x: [count++ for i in x if i in someList])
but that is not valid syntax. How can I modify it to get the result I want?

Can you add sample data and the expected output so we don't have to guess? It will be useful for future readers too.
– anky, Commented Jun 28, 2019 at 14:33

BENY · Accepted Answer · 2019-06-28 14:27:43Z

4

I think you need to expand each row into columns and use isin, since with pandas we usually try to avoid for loops.

df['in']=pd.DataFrame(df['text'].tolist(),index=df.index).isin(someList).sum(1)
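As a minimal sketch of what this one-liner does step by step, assuming a small made-up frame (the sample data and the name `expanded` are illustrative, not from the question):

```python
import pandas as pd

# Hypothetical sample data, not from the original question.
someList = ['A', 'B', 'C', 'D']
df = pd.DataFrame({'text': [['A', 'B'], ['A', 'F'], ['E', 'G'], ['A', 'C', 'D', 'E']]})

# Expand each list into its own row of columns; shorter lists are padded with missing values.
expanded = pd.DataFrame(df['text'].tolist(), index=df.index)

# isin returns a boolean frame; summing across columns (axis=1) counts the matches per row.
df['in'] = expanded.isin(someList).sum(axis=1)

print(df)  # the 'in' column holds 2, 1, 0, 3
```

The padding values never match anything in `someList`, so rows of different lengths are counted correctly.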

chepner · Accepted Answer · 2019-06-28 14:27:59Z

2

You don't need any additional functions. Just create a sequence of ones (one per matching element) to sum.

count = sum(1 for x in df['text'] for i in x if i in someList)
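Note that this expression yields a single total for the whole column. If a per-row count is wanted instead, the same generator idea can be applied to each list. A rough sketch, assuming made-up sample data and a hypothetical `lookup` set built from `someList` (set membership tests are faster on average than list membership tests):

```python
import pandas as pd

# Hypothetical sample data, not from the original question.
someList = ['A', 'B', 'C', 'D']
df = pd.DataFrame({'text': [['A', 'B'], ['A', 'F'], ['E', 'G'], ['A', 'C', 'D', 'E']]})

# Assumed helper: a set copy of someList for faster membership tests.
lookup = set(someList)

# One grand total over every word in every row.
total = sum(1 for x in df['text'] for i in x if i in lookup)

# One count per row: the lambda sums a generator of ones for that row's list.
df['in'] = df['text'].apply(lambda x: sum(1 for i in x if i in lookup))

print(total)  # 6
print(df['in'].tolist())  # [2, 1, 0, 3]
```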

4 Comments

BlackBear Over a year ago

@OP also possibly faster to use a set instead of someList

BENY Over a year ago

Is this output one value or a list?

chepner Over a year ago

It's one value; no new lists are created. A generator expression is passed to sum.

cap Over a year ago

I should note that I have another column in my df, 'count', into which the count for each text is placed. I believe the provided code sums all of the counts, because my 'count' column contains the same number, 60085, in every row.

piRSquared · Accepted Answer · 2019-06-28 14:37:59Z

Setup

import numpy as np
import pandas as pd
someList = [*'ABCD']
df = pd.DataFrame(dict(text=[*map(list, 'AB CD AF EG BH IJ ACDE'.split())]))

df

           text
0        [A, B]
1        [C, D]
2        [A, F]
3        [E, G]
4        [B, H]
5        [I, J]
6  [A, C, D, E]

Numpy and `__contains__`

i = np.arange(len(df)).repeat(df.text.str.len())
a = np.zeros(len(df), int)
np.add.at(a, i, [*map(someList.__contains__, np.concatenate(df.text))])
df.assign(**{'in': a})

           text  in
0        [A, B]   2
1        [C, D]   2
2        [A, F]   1
3        [E, G]   0
4        [B, H]   1
5        [I, J]   0
6  [A, C, D, E]   3
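To make the intermediate steps of the Numpy approach easier to follow, here is a sketch that rebuilds the same result from the Setup data with each piece named. The variable names `hits` and `b` are illustrative, and the `np.bincount` line is an alternative added for comparison, not part of the answer's method:

```python
import numpy as np
import pandas as pd

someList = [*'ABCD']
df = pd.DataFrame(dict(text=[*map(list, 'AB CD AF EG BH IJ ACDE'.split())]))

# Row number of every word once the lists are flattened:
# [0 0 1 1 2 2 3 3 4 4 5 5 6 6 6 6]
i = np.arange(len(df)).repeat(df.text.str.len())

# One boolean per flattened word: True where the word is in someList.
hits = np.array([w in someList for w in np.concatenate(df.text.tolist())])

# np.add.at accumulates without buffering, so repeated row numbers in i all contribute.
a = np.zeros(len(df), int)
np.add.at(a, i, hits)

# Alternative (an assumption, not from the answer): bincount with weights gives the same totals.
b = np.bincount(i, weights=hits.astype(float), minlength=len(df)).astype(int)

print(a)  # [2 2 1 0 1 0 3]
```

A plain `a[i] += hits` would not work here, because fancy-indexed assignment applies only one update per repeated index.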

`map`, `lambda` and `__contains__`

df.assign(**{'in': df.text.map(lambda x: sum(map(someList.__contains__, x)))})

           text  in
0        [A, B]   2
1        [C, D]   2
2        [A, F]   1
3        [E, G]   0
4        [B, H]   1
5        [I, J]   0
6  [A, C, D, E]   3
