I know we can use the str.contains method to select rows by a partial string match.

My column looks like this:

col1
V2648   
V9174.
V9071
V0021;+
V7615***
()()
random
words

I want to select all rows that contain the pattern "V followed by a 4-digit number". I think this needs more than one condition on the strings.

My expected output is:

col1
V2648   
V9174.
V9071
V0021;+
V7615***
  • Are you familiar with regular expressions? Commented Oct 29, 2019 at 15:15
  • @Derek_6424246 Oh yes, I forgot that. Commented Oct 29, 2019 at 15:27

2 Answers

5

You could do:

mask = df.col1.str.startswith('V') & df.col1.str.contains(r'\d+')
print(df[mask])

Output

       col1
0     V2648
1    V9174.
2     V9071
3   V0021;+
4  V7615***

The mask df.col1.str.startswith('V') selects every value that starts with 'V', and the str.contains condition with \d+ selects every value that contains at least one digit. Note that \d+ does not check for four digits specifically. If you want to match a V at the start of the string followed by four digits, use:

mask = df.col1.str.contains(r'^V\d{4}')
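
To make the difference between the two masks concrete, here is a minimal, self-contained sketch. The DataFrame is rebuilt from the sample column in the question; the rows 'V12' and 'xV1234' are hypothetical additions, included only to show where a "one or more digits" check and a "four digits" check disagree.

```python
import pandas as pd

# Sample column from the question, plus two hypothetical rows
# ('V12' and 'xV1234') added only to expose the difference.
df = pd.DataFrame({'col1': ['V2648   ', 'V9174.', 'V9071', 'V0021;+',
                            'V7615***', '()()', 'random', 'words',
                            'V12', 'xV1234']})

# Starts with 'V' and contains at least one digit anywhere.
loose = df.col1.str.startswith('V') & df.col1.str.contains(r'\d+')

# 'V' at the start of the string, immediately followed by four digits.
strict = df.col1.str.contains(r'^V\d{4}')

print(df[loose])   # also keeps 'V12'
print(df[strict])  # keeps only the five rows from the expected output
```

The raw-string prefix (r'...') keeps Python from treating \d as a string escape sequence.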

1 Comment

Hi @JiayuZhang, glad I could help. That one will match 5 digits, if I'm not mistaken.
3

Use str.match, which anchors the pattern at the start of each string:

df[df.col1.str.match(r'V\d{4}')]
Out[135]: 
       col1
0     V2648
1    V9174.
2     V9071
3   V0021;+
4  V7615***
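
As a rough sketch of how the anchoring choices differ, the snippet below compares str.match, str.contains, and a variant with a negative lookahead that rejects a fifth digit. The Series values are hypothetical edge cases, not data from the question, and the lookahead pattern is one possible way to enforce "exactly four digits", not something proposed in the answers above.

```python
import pandas as pd

# Hypothetical values chosen to show edge cases.
s = pd.Series(['V2648', 'V9071', 'V12345', 'abcV1234', 'V123'])

# str.match anchors at the start; 'V12345' still passes (first four digits).
starts_with = s.str.match(r'V\d{4}')

# Negative lookahead (?!\d) rejects a value whose fifth character is a digit.
exactly_four = s.str.match(r'V\d{4}(?!\d)')

# str.contains searches anywhere, so 'abcV1234' passes as well.
anywhere = s.str.contains(r'V\d{4}')

print(pd.DataFrame({'value': s,
                    'match': starts_with,
                    'exactly_four': exactly_four,
                    'contains': anywhere}))
```

Which variant is appropriate depends on whether values such as 'V12345' or 'abcV1234' should count as matches.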
