find index of element in list in dataframe

Question

I have:

adj           response                                                   

"beautiful"    ["beautiful", "beautiful2", "beautifu3"]
"good1"        ["beautiful1", "beautiful2", "beautifu3"]
"hideous"      ["hideous23r", "hideous", "hidoeous"]

I would like an extra column giving the index of the first occurrence of adj in the response list (None if it is not there):

adj           response                                                   index

"beautiful"    ["beautiful", "beautiful2", "beautifu3"]                    0
"not there"    ["beautiful1", "beautiful2", "beautifu3"]                   None
"hideous"      ["hideous23r", "hideous", "hidoeous"]                       1

The 3rd row does not match between the two dataframes.
– Ank, commented May 21, 2021 at 13:26

Nk03 · Accepted Answer · 2021-05-21 13:26:33Z

3

TRY:

import ast

df['response'] = df['response'].apply(ast.literal_eval)  # only needed if the column holds strings; skip if it already holds lists
df['index'] = df.apply(lambda x: x['response'].index(x['adj']) if x['adj'] in x['response'] else None, axis=1)

OUTPUT:

         adj                             response  index
0  beautiful   [beautiful, beautiful2, beautifu3]    0.0
1      good1  [beautiful1, beautiful2, beautifu3]    NaN
2    hideous       [hideous23r, hideous, hideous]    1.0
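The index column above comes out as floats (0.0, NaN, 1.0) because the missing value forces a float dtype. Below is a minimal, self-contained sketch of the same row-wise idea that casts the result to pandas' nullable integer dtype instead. The sample frame is rebuilt from the question and assumes 'response' already holds real lists; the helper name first_index is made up for illustration.

```python
import pandas as pd

# Sample data modelled on the question; 'response' already holds real lists here.
df = pd.DataFrame({
    "adj": ["beautiful", "not there", "hideous"],
    "response": [
        ["beautiful", "beautiful2", "beautifu3"],
        ["beautiful1", "beautiful2", "beautifu3"],
        ["hideous23r", "hideous", "hidoeous"],
    ],
})


def first_index(row):
    """Return the position of row['adj'] in row['response'], or None if absent."""
    try:
        return row["response"].index(row["adj"])
    except ValueError:
        return None


# "Int64" (capital I) is the nullable integer dtype, so a missing match
# is stored as <NA> and the found positions stay integers.
df["index"] = df.apply(first_index, axis=1).astype("Int64")
print(df)
```

Using try/except around list.index scans each list once, whereas the "in" check followed by .index scans it twice; for short lists the difference is negligible.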


BENY · Accepted Answer · 2021-05-21 13:43:38Z

1

Let us try unpacking the lists into their own DataFrame:

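s = pd.DataFrame(df['response'].tolist()).eq(df['adj'], axis=0)  # True where a list element equals adj
df['new'] = s.idxmax(axis=1).where(s.any(axis=1))  # position of the first match; NaN where the row has none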
df
Out[30]: 
         adj                             response  new
0  beautiful   [beautiful, beautiful2, beautifu3]  0.0
1  not there  [beautiful1, beautiful2, beautifu3]  NaN
2    hideous       [hideous23r, hideous, hideous]  1.0
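To see why the .where(...) mask is needed, here is a small sketch that keeps the intermediate results. It rebuilds the question's data (assuming 'response' holds real lists); the variable names expanded, raw and masked are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "adj": ["beautiful", "not there", "hideous"],
    "response": [
        ["beautiful", "beautiful2", "beautifu3"],
        ["beautiful1", "beautiful2", "beautifu3"],
        ["hideous23r", "hideous", "hidoeous"],
    ],
})

# One column per list position: columns 0, 1, 2.
expanded = pd.DataFrame(df["response"].tolist())

# Compare every position with the adjective of the same row.
s = expanded.eq(df["adj"], axis=0)

# idxmax returns the first column holding the maximum. For a row that is
# all False, that is still column 0, which would look like a real match.
raw = s.idxmax(axis=1)

# Keep the position only where the row contains at least one True.
masked = raw.where(s.any(axis=1))

print(raw.tolist())
print(masked)
```

Without the mask, the "not there" row would be reported as index 0, which is indistinguishable from a genuine match at the first position.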


Achille G · Accepted Answer · 2021-05-21 13:40:00Z

A really naive way to do it:

import pandas as pd

df = pd.read_csv("h.csv", sep=";")
adj = df["adj"].to_list()
response = df["response"].to_list()

# Turn each response string, e.g. '["a", "b"]', into a list of plain strings
remove_char = ["[", "]", "\"", " "]
nresponse = []
for i in response:
    list_response = i.split(",")
    for j in range(len(list_response)):
        for char in remove_char:
            list_response[j] = list_response[j].replace(char, "")
    nresponse.append(list_response)

# Look up the position of each adjective in its parsed list (None if absent)
indexes = []
for i in range(len(nresponse)):
    if adj[i] in nresponse[i]:
        x = nresponse[i].index(adj[i])
        indexes.append(x)
    else:
        indexes.append(None)
df["index"] = indexes

(This assumes each value in "response" is a string representation of a list.) This is much worse, and I assume much slower, than Nk03's solution.
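If the column really does hold strings, a shorter sketch of the same idea is to let ast.literal_eval from the standard library do the parsing instead of stripping characters by hand. The frame below is a made-up stand-in for the contents of h.csv, which are not shown here, so treat the data as an assumption.

```python
import ast

import pandas as pd

# Hypothetical stand-in for the CSV data: 'response' holds strings, not lists.
df = pd.DataFrame({
    "adj": ["beautiful", "not there", "hideous"],
    "response": [
        '["beautiful", "beautiful2", "beautifu3"]',
        '["beautiful1", "beautiful2", "beautifu3"]',
        '["hideous23r", "hideous", "hidoeous"]',
    ],
})

# literal_eval only accepts Python literals, so it parses the list safely.
parsed = df["response"].map(ast.literal_eval)

# Position of each adjective in its own list, or None when it is absent.
indexes = [
    lst.index(a) if a in lst else None
    for a, lst in zip(df["adj"], parsed)
]

# Nullable integer dtype keeps the positions as integers next to <NA>.
df["index"] = pd.array(indexes, dtype="Int64")
print(df)
```

Unlike the manual replace loop, this keeps spaces inside the list elements intact, so an adjective such as "not there" can still be matched.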
