1

I have:

adj           response                                                   

"beautiful"    ["beautiful", "beautiful2", "beautifu3"]
"good1"        ["beautiful1", "beautiful2", "beautifu3"]
"hideous"      ["hideous23r", "hideous", "hidoeous"] 

I would like an extra column holding the position at which each row's adj first appears in its response list, or None if it does not appear:

adj           response                                                   index

"beautiful"    ["beautiful", "beautiful2", "beautifu3"]                    0
"not there"    ["beautiful1", "beautiful2", "beautifu3"]                   None
"hideous"      ["hideous23r", "hideous", "hidoeous"]                       1

Comment (May 21, 2021 at 13:26): The 3rd row in the two dataframes does not match.

3 Answers

3

TRY:

from ast import literal_eval  # safer than eval: only parses Python literals
df['response'] = df['response'].apply(literal_eval)  # skip this line if the column already holds lists
df['index'] = df.apply(lambda x: x['response'].index(x['adj']) if x['adj'] in x['response'] else None, axis=1)

OUTPUT:

         adj                             response  index
0  beautiful   [beautiful, beautiful2, beautifu3]    0.0
1      good1  [beautiful1, beautiful2, beautifu3]    NaN
2    hideous       [hideous23r, hideous, hideous]    1.0
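
The output above shows 0.0 and NaN rather than 0 and None because a missing value forces the column to a float dtype. A minimal sketch of one way to keep whole numbers, assuming the lists are already parsed; the sample frame and the helper name `first_index` are made up for illustration:

```python
import pandas as pd

# Sample data mirroring the question (lists already parsed, not strings).
df = pd.DataFrame({
    "adj": ["beautiful", "good1", "hideous"],
    "response": [
        ["beautiful", "beautiful2", "beautifu3"],
        ["beautiful1", "beautiful2", "beautifu3"],
        ["hideous23r", "hideous", "hidoeous"],
    ],
})

def first_index(row):
    """Return the position of row['adj'] in row['response'], or None if absent."""
    if row["adj"] in row["response"]:
        return row["response"].index(row["adj"])
    return None

# Casting to the nullable "Int64" dtype keeps integers and marks misses as <NA>.
df["index"] = df.apply(first_index, axis=1).astype("Int64")
print(df)
```

Whether the nullable dtype is worth it depends on what happens to the column next; a plain float column with NaN works just as well for filtering.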

1

Let us try unpacking the lists into a DataFrame:

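s = pd.DataFrame(df.response.tolist()).eq(df.adj, axis=0)  # True where an element equals that row's adj
df['new'] = s.idxmax(axis=1).where(s.any(axis=1))  # first True column; NaN when no match (any() needs axis= as a keyword in pandas 2.x)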
df
Out[30]: 
         adj                             response  new
0  beautiful   [beautiful, beautiful2, beautifu3]  0.0
1  not there  [beautiful1, beautiful2, beautifu3]  NaN
2    hideous       [hideous23r, hideous, hideous]  1.0
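
For readers who want to run this step by step, here is a self-contained sketch of the same idea. The sample frame and the names `wide` and `matches` are illustrative, and it assumes `df` has the default 0..n-1 index so the unpacked rows line up with `adj`:

```python
import pandas as pd

df = pd.DataFrame({
    "adj": ["beautiful", "not there", "hideous"],
    "response": [
        ["beautiful", "beautiful2", "beautifu3"],
        ["beautiful1", "beautiful2", "beautifu3"],
        ["hideous23r", "hideous", "hidoeous"],
    ],
})

# One column per list position (columns are labelled 0, 1, 2).
wide = pd.DataFrame(df["response"].tolist())

# Compare every element with the adj value of the same row.
matches = wide.eq(df["adj"], axis=0)

# idxmax picks the first True column in each row; rows without any
# match would wrongly report column 0, so they are masked to NaN.
df["new"] = matches.idxmax(axis=1).where(matches.any(axis=1))
print(df)
```

The masking step matters: without `where`, a row with no match is indistinguishable from a match at position 0.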

0

A really naive way to do it:

import pandas as pd

df = pd.read_csv("h.csv", sep=";")
adj = df["adj"].to_list()
response = df["response"].to_list()
nresponse = []
for i in response:
    list_response = i.split(",")
    remove_char = ["[", "]", "\"", " "]
    for j in range(len(list_response)):
        for char in remove_char:
            list_response[j] = list_response[j].replace(char, "")
    nresponse.append(list_response)
indexes = []
for i in range(len(nresponse)):
    if adj[i] in nresponse[i]:
        x = nresponse[i].index(adj[i])
        indexes.append(x)
    else:
        indexes.append(None)
df["index"] = indexes

(This assumes each list in "response" is stored as a string.) This is far worse, and I assume far slower, than Nk03's solution.
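
If the lists really are stored as strings, a shorter alternative to stripping characters by hand is to parse each string with the standard library. This is only a sketch: the function name `first_index_from_string` is hypothetical, and it assumes every cell is a valid Python list literal.

```python
from ast import literal_eval

def first_index_from_string(adj, response_text):
    """Parse a string such as '["a", "b"]' and return the position of adj, or None."""
    items = literal_eval(response_text)  # accepts literals only, unlike eval
    try:
        return items.index(adj)
    except ValueError:
        # list.index raises ValueError when the item is not present
        return None

print(first_index_from_string("hideous", '["hideous23r", "hideous", "hidoeous"]'))
print(first_index_from_string("good1", '["beautiful1", "beautiful2", "beautifu3"]'))
```

Unlike replacing characters one by one, parsing leaves spaces and quotes inside an item untouched.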
