I am checking whether any value in a pandas column matches a pre-defined regex, using .any() to test whether a match exists. However, I also need the index/row of the first match so that I can get the value of another column in that row.

I have the below to check whether the reg_ex pattern exists anywhere in df['id_org']:

if df['id_org'].str.contains(pat=reg_ex, regex=True).any():
    ...  # at least one row matched

Once the above evaluates to true, how do I get the index/row that caused the expression to evaluate to true? I would like to use this index so that I can access another column for that same row using pandas df.at[index, 'desired_col'] or .iloc functions.

In the past I have done: df.at[df['id_org'][df['id_org'] == key].index[0], 'desired_col']. However, I can't use this line of code any more, because I am no longer checking for an exact match against the string "key" but rather for a regex match in that column.

1 Answer

You can use idxmax combined with any:

reg_ex = 'xxx'

s = df['id_org'].str.contains(pat=reg_ex, regex=True)
out = s.idxmax() if s.any() else None

Or first_valid_index:

s = df['id_org'].str.contains(pat=reg_ex, regex=True)
out = s[s].first_valid_index()
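
One caveat, as a hedged sketch: if the column can contain missing values, str.contains returns NaN for those rows by default, and the mask is then not purely boolean. Passing na=False (an existing str.contains parameter) treats missing values as non-matches. The sample Series below is invented for illustration and is not the asker's data.

```python
import pandas as pd

# Hypothetical column with a missing value in the middle.
ids = pd.Series(["abc", None, "def"])

# na=False makes missing values count as "no match", so the mask is all booleans.
mask = ids.str.contains(pat="e", regex=True, na=False)

# Keep only the True rows, then take the first remaining index label.
first_hit = mask[mask].first_valid_index()

# With a pattern that matches nothing, the filtered Series is empty
# and first_valid_index() returns None.
no_mask = ids.str.contains(pat="z", regex=True, na=False)
no_hit = no_mask[no_mask].first_valid_index()

print(first_hit, no_hit)
```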

Example of outputs:

# reg_ex = 'e'
1

# reg_ex = 'z'
None

Used input:

  id_org
0    abc
1    def
2    ghi
3    cde
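
To tie this back to the question, here is a minimal self-contained sketch that rebuilds the input above and uses the first matching label with df.at. The desired_col values and the first_match_index helper are made up for illustration; na=False is added on the assumption that missing values should count as non-matches.

```python
import pandas as pd

# Input from the answer; the 'desired_col' values are invented for this example.
df = pd.DataFrame({
    "id_org": ["abc", "def", "ghi", "cde"],
    "desired_col": ["w", "x", "y", "z"],
})


def first_match_index(frame, column, pattern):
    """Return the index label of the first row whose `column` matches `pattern`, else None."""
    mask = frame[column].str.contains(pat=pattern, regex=True, na=False)
    # idxmax gives the label of the first True value. The any() guard is needed
    # because idxmax on an all-False mask would still return the first label.
    return mask.idxmax() if mask.any() else None


idx = first_match_index(df, "id_org", "e")
# Look up the other column only when a match was found.
value = df.at[idx, "desired_col"] if idx is not None else None
print(idx, value)
```

The helper returns an index label rather than a position, so it pairs with df.at or df.loc; with .iloc you would need a positional offset instead.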

To get the index labels of all matches:

s = df['id_org'].str.contains(pat=reg_ex, regex=True)
out = s.index[s]

Example for the regex 'e': Int64Index([1, 3], dtype='int64') (pandas 2.0 and later display this as Index([1, 3], dtype='int64'))
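
If the goal is the values of another column for every matching row, the boolean mask can be passed to df.loc directly. This is a sketch under the same assumptions as the sample input in this answer, with an invented desired_col column.

```python
import pandas as pd

# Same sample input as in the answer, plus a made-up 'desired_col'.
df = pd.DataFrame({
    "id_org": ["abc", "def", "ghi", "cde"],
    "desired_col": ["w", "x", "y", "z"],
})

# Boolean mask of rows whose id_org matches the regex.
mask = df["id_org"].str.contains(pat="e", regex=True, na=False)

# Index labels of all matching rows.
matching_labels = df.index[mask].tolist()

# Values of the other column for those same rows.
matching_values = df.loc[mask, "desired_col"].tolist()

print(matching_labels, matching_values)
```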
