3

I've searched previous answers on this, but they rely on numpy because the arrays contain numbers. I am trying to search a dataframe for a keyword ('Timeframe') that appears inside a longer sentence, 'Timeframe for wave in ____', and I would like to get back the row and column index of the matching cell. For example:

    df.iloc[34,0] 

returns the string I am looking for, but I want to avoid hard-coding the position because the data changes. Is there a way to get [34, 0] back when I search the dataframe for the keyword 'Timeframe'?

3
  • You can access the corresponding row by using df.index.get_loc as explained in the target. Commented Jul 11, 2017 at 18:38
  • @ayhan - I reopen it, because it seems get_loc is not solution. Commented Jul 11, 2017 at 18:42
  • @jezrael Yes, you are right. Commented Jul 11, 2017 at 18:45

2 Answers

4

EDIT:

To get the index, use str.contains with boolean indexing. There are then three possible outcomes: no match, exactly one match, or several matches:

import pandas as pd

df = pd.DataFrame({'A':['Timeframe for wave in ____', 'a', 'c']})
print (df)
                            A
0  Timeframe for wave in ____
1                           a
2                           c



def check(val):
    a = df.index[df['A'].str.contains(val)]
    if a.empty:
        return 'not found'
    elif len(a) > 1:
        return a.tolist()
    else:
        #only one value - return scalar  
        return a.item()
print (check('Timeframe'))
0

print (check('a'))
[0, 1]

print (check('rr'))
not found
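
One caveat: check returns index labels, which match the positions that df.iloc expects only when the frame has the default 0..n-1 index. A minimal sketch of a positional variant follows. The helper name find_row_positions and the 'x', 'y', 'z' index are made up for illustration, and regex=False is my own choice so the keyword is matched literally.

```python
import numpy as np
import pandas as pd

def find_row_positions(series, keyword):
    """Return the integer positions (usable with iloc) of cells containing keyword."""
    # regex=False matches the keyword literally; na=False treats missing cells as no match
    mask = series.str.contains(keyword, regex=False, na=False)
    # flatnonzero gives the positions of the True entries, whatever the index labels are
    return np.flatnonzero(mask.to_numpy(dtype=bool)).tolist()

# Hypothetical frame with a non-default index, so labels and positions differ
df = pd.DataFrame({'A': ['a', 'Timeframe for wave in ____', 'c']},
                  index=['x', 'y', 'z'])

rows = find_row_positions(df['A'], 'Timeframe')
print(rows)                 # [1]
print(df.iloc[rows[0], 0])  # Timeframe for wave in ____
```

This version always returns a list (empty when nothing matches), so the caller does not have to handle three different return types.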

Old solution:

If a cell is exactly equal to 'Timeframe', you can use numpy.where to find its position:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A':list('abcdef'),
                   'B':[4,5,4,5,5,4],
                   'C':[7,8,9,4,2,'Timeframe'],
                   'D':[1,3,5,7,1,0],
                   'E':[5,3,6,9,2,4],
                   'F':list('aaabbb')})

print (df)
   A  B          C  D  E  F
0  a  4          7  1  5  a
1  b  5          8  3  3  a
2  c  4          9  5  6  a
3  d  5          4  7  9  b
4  e  5          2  1  2  b
5  f  4  Timeframe  0  4  b


a = np.where(df.values == 'Timeframe')
print (a)
(array([5], dtype=int64), array([2], dtype=int64))

b = [x[0] for x in a]
print (b)
[5, 2]
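
Note that the list comprehension above keeps only the first match. If the exact value can appear more than once, np.argwhere returns one [row, column] pair per match. This is only a sketch on a small made-up frame, not the data from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical frame in which the exact value occurs twice
df = pd.DataFrame({'B': [4, 5, 4],
                   'C': [7, 'Timeframe', 'Timeframe']})

# argwhere returns one [row, column] pair for every True cell of the comparison
positions = np.argwhere(df.to_numpy() == 'Timeframe').tolist()
print(positions)  # [[1, 1], [2, 1]]

# Each pair can be passed straight back to iloc
for row, col in positions:
    print(df.iloc[row, col])
```

Like the np.where version, this only finds cells that equal the search value exactly; it will not find a keyword inside a longer sentence.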

3 Comments

Thanks jezrael! One thing I didn't realize would be a problem is that 'Timeframe' is not the whole cell value. I initially tried a df[df == 'Timeframe'] kind of search, expecting it to locate the first instance of Timeframe. However, since Timeframe is part of a sentence, it returns no result. The full sentence is 'Timeframe for wave in____ '. Do you have any tips?
Hmm, and should the output be the positions, or the whole sentence?
Since my df has only one column, I would just need to return the row where the sentence 'Timeframe for wave in___' appears. Does that help clarify? So in my example, df.iloc[34,0] actually corresponds to where the sentence appears.
3

If you need to search across multiple columns, you can use the following code example:

import numpy as np
import pandas as pd
df = pd.DataFrame([[1,2,3,4],["a","b","Timeframe for wave in____","d"],[5,6,7,8]])
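# Boolean mask with one column per DataFrame column; non-string cells count as no match
mask = np.column_stack([df[col].str.contains("Timeframe", na=False) for col in df])
find_result = np.where(mask)
# Row and column position of the first match
result = [find_result[0][0], find_result[1][0]]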

Then output for df and result would be:

>>> df
   0  1                          2  3
0  1  2                          3  4
1  a  b  Timeframe for wave in____  d
2  5  6                          7  8
>>> result
[1, 2]
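
The code above raises an IndexError when the keyword is not found, because find_result[0] is then empty. A hedged sketch of a guarded version follows. The function name first_match is hypothetical, regex=False is my own addition, and it assumes every column has object dtype (as in the example frame), since the .str accessor fails on purely numeric columns.

```python
import numpy as np
import pandas as pd

def first_match(frame, keyword):
    """Return [row, column] positions of the first cell containing keyword, or None."""
    # Assumes object-dtype columns; na=False turns non-string cells into "no match"
    mask = np.column_stack(
        [frame[col].str.contains(keyword, regex=False, na=False) for col in frame]
    )
    rows, cols = np.where(mask)
    if rows.size == 0:
        return None  # nothing matched, so there is no position to report
    return [int(rows[0]), int(cols[0])]

df = pd.DataFrame([[1, 2, 3, 4],
                   ["a", "b", "Timeframe for wave in____", "d"],
                   [5, 6, 7, 8]])

print(first_match(df, "Timeframe"))  # [1, 2]
print(first_match(df, "missing"))    # None
```

Casting to int gives plain Python integers, which print as [1, 2] regardless of the numpy version.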
