
I have a pandas DataFrame that looks like this:

This is a paragraph [if-statement, for-loop]

This is a second paragraph [for-loop, java]

To explain: the left column holds the text, and the right column holds a list of tags classifying what the text is about.

I want to access only "java" in the second paragraph. How can I access a list in a DataFrame?

  • Can you add the expected output to the question? Commented Oct 29, 2018 at 6:56

2 Answers


If I understand correctly, you need to keep only the rows whose list contains 'java':

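import pandas as pd

df = pd.DataFrame({'col1': ['This is a paragraph', 'This is a second paragraph'],
                   'col2': [['if-statement', 'for-loop'], ['for-loop', 'java']]})

# keep only the rows whose list in col2 contains 'java'
df = df[df['col2'].apply(lambda x: 'java' in x)]
# alternative solution: build the boolean mask with a list comprehension
# df = df[['java' in x for x in df['col2']]]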

Or convert each list to a set and check that it is a superset of the set containing 'java':

df = df[df['col2'].apply(set) >= set(['java'])]

print(df)
                         col1              col2
1  This is a second paragraph  [for-loop, java]
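
If the goal is to pull out the 'java' element itself rather than to filter rows, the following is a minimal sketch, assuming the same sample df as above and that the wanted row label (1) and list position (1) are known in advance. The variable names tags, last_tag and second_tags are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['This is a paragraph', 'This is a second paragraph'],
                   'col2': [['if-statement', 'for-loop'], ['for-loop', 'java']]})

# Each cell of col2 holds an ordinary Python list.
tags = df.loc[1, 'col2']   # the list stored in the row labelled 1
last_tag = tags[1]         # the second element of that list

# The .str accessor can also index into list elements, row by row.
second_tags = df['col2'].str[1]

print(tags)
print(last_tag)
print(second_tags.tolist())
```

Selecting the cell with .loc returns the list object itself, so normal list indexing applies from there.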

2 Comments

Nice one, as usual :) +1. I have added my own small contribution as a learning exercise; please correct it if anything is needed.
@pygo - Thank you. I am still not 100% sure what the OP needs, so I asked in a comment under the question.

How about using map?

>>> df['col2'].map(str)[1]
"['for-loop', 'java']"
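
Note that the value returned above is the string representation of the list, so indexing into it yields characters rather than tags. As a hedged alternative sketch, assuming the same sample df and pandas 0.25 or later (where DataFrame.explode is available), each tag can be moved onto its own row and then compared directly. The names as_text, exploded and java_rows are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['This is a paragraph', 'This is a second paragraph'],
                   'col2': [['if-statement', 'for-loop'], ['for-loop', 'java']]})

# map(str) turns each list into its text representation.
as_text = df['col2'].map(str)[1]
print(as_text[0])   # first character of the string, not the first tag

# explode gives one row per list element and repeats the original index label.
exploded = df.explode('col2')

# With one tag per row, an ordinary equality test selects 'java'.
java_rows = exploded[exploded['col2'] == 'java']
print(java_rows)
```

The index label of java_rows still points back to the original row, so the matching paragraph can be looked up in df.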
