22

I use R a lot more and it is easier for me to do it in R:

> test <- c('bbb', 'ccc', 'axx', 'xzz', 'xaa')
> test[grepl("^x",test)]
[1] "xzz" "xaa"

But how do I do the same in Python if test is a list?

P.S. I am learning Python using Google's Python exercises, and I would prefer to use regular expressions.

5 Answers

24

In general, you may use

import re                                  # Add the re import declaration to use regex
test = ['bbb', 'ccc', 'axx', 'xzz', 'xaa'] # Define a test list
reg = re.compile(r'^x')                    # Compile the regex
matches = list(filter(reg.search, test))   # Filter lazily, then build a list; test itself stays unchanged
# => ['xzz', 'xaa']

Or, to invert the result and get all items that do not match the regex:

list(filter(lambda x: not reg.search(x), test))
# >>> ['bbb', 'ccc', 'axx']

See the Python demo.
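Both directions can be wrapped in one small helper, loosely mirroring R's grepl-based subsetting. This is only a sketch: the name grep and the invert parameter are made up for illustration and are not part of the re module.

```python
import re

def grep(pattern, items, invert=False):
    """Return the items that match pattern, or those that do not if invert=True.

    Hypothetical helper for illustration; not part of the standard library.
    """
    reg = re.compile(pattern)
    # bool(...) turns the match object / None into True / False
    return [s for s in items if bool(reg.search(s)) != invert]

test = ['bbb', 'ccc', 'axx', 'xzz', 'xaa']
print(grep(r'^x', test))               # ['xzz', 'xaa']
print(grep(r'^x', test, invert=True))  # ['bbb', 'ccc', 'axx']
```

A list comprehension is used here instead of filter so that the result is a list right away.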

USAGE NOTE:

  • re.search finds the first regex match anywhere in a string and returns a match object, or None if there is no match.
  • re.match looks for a match only at the start of the string; it does NOT require a full string match. So, re.search(r'^x', text) is equivalent to re.match(r'x', text).
  • re.fullmatch only returns a match if the full string matches the pattern. So, re.fullmatch(r'x', text) is equivalent to re.match(r'x\Z', text) and to re.search(r'^x\Z', text).
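To make the difference concrete, here is a small sketch that runs the same pattern 'x' through all three functions. The helper name compare is made up for this example.

```python
import re

def compare(s):
    """Return (search, match, fullmatch) results for the pattern 'x' as booleans."""
    return (
        bool(re.search(r'x', s)),     # 'x' anywhere in s
        bool(re.match(r'x', s)),      # 'x' at the start of s
        bool(re.fullmatch(r'x', s)),  # s is exactly 'x'
    )

for s in ['x', 'xaa', 'axx', 'bbb']:
    print(s, compare(s))
# x (True, True, True)
# xaa (True, True, False)
# axx (True, False, False)
# bbb (False, False, False)
```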

If you wonder what the r'' prefix means, see Python - Should I be using string prefix r when looking for a period (full stop or .) using regex? and Python regex - r prefix.


6

You can use the following to extract the strings in the list that start with 'x', or to check whether any of them does:

>>> [e for e in test if e.startswith('x')]
['xzz', 'xaa']
>>> any(e.startswith('x') for e in test)
True

5 Comments

I want to extract the strings that start with "x", but I can't see how your answer gives the output I expect.
Can I use re.match or a similar function in the re library instead?
@lokheart You could definitely use re.match in place of startswith in the list comprehension above.
@squiguy I tried [x for x in test if re.match("^x",x)] and it works.
@lokheart Cool :). Have fun with Python!
2

You could use filter. I am assuming you want a new list with certain elements from the old one.

new_test = list(filter(lambda x: x.startswith('x'), test))  # list() is needed in Python 3, where filter returns an iterator

Or if you want to use a regular expression in the filter function you could try the following. It requires the re module to be imported.

new_test = list(filter(lambda s: re.match("^x", s), test))  # list() is needed in Python 3, where filter returns an iterator
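One thing to keep in mind: in Python 3, filter returns a lazy iterator rather than a list, and that iterator can only be consumed once. A minimal sketch of that behaviour, reusing the test list from the question (the name lazy is just for illustration):

```python
import re

test = ['bbb', 'ccc', 'axx', 'xzz', 'xaa']

lazy = filter(lambda s: re.match("^x", s), test)  # a filter object, not a list
new_test = list(lazy)                             # consume the iterator into a list

print(new_test)    # ['xzz', 'xaa']
print(list(lazy))  # [] because the iterator is already exhausted
```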


1

An example when you want to extract more than one datapoint from each string in the list:

Input:

2021-02-08 20:43:16 [debug] : [RequestsDispatcher@_execute_request] Requesting: https://test.com&uuid=1623\n

Code:

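import re
pat = r'(.* \d\d:\d\d:\d\d) .*_execute_request\] (.*?):.*uuid=(.*?)[\.\n]'  # raw string, so \d and \] reach the regex engine unchanged
new_list = [re.findall(pat, s) for s in my_list]                             # my_list holds the input lines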

Output:

[[('2021-02-08 20:43:16', 'Requesting', '1623')]]
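If a flat list of tuples is more convenient than the nested lists that re.findall produces, a variant with search and groups might look like this. It is a sketch only: my_list is filled with the single input line shown above as hypothetical sample data, and rows is a name chosen for the example.

```python
import re

# Hypothetical sample data: the one input line shown above, ending in a newline
my_list = [
    "2021-02-08 20:43:16 [debug] : [RequestsDispatcher@_execute_request] "
    "Requesting: https://test.com&uuid=1623\n",
]

pat = re.compile(r'(.* \d\d:\d\d:\d\d) .*_execute_request\] (.*?):.*uuid=(.*?)[.\n]')

rows = []
for s in my_list:
    m = pat.search(s)
    if m:                        # skip lines that do not match the pattern
        rows.append(m.groups())  # one tuple of captured fields per line

print(rows)  # [('2021-02-08 20:43:16', 'Requesting', '1623')]
```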


0

Here is an improvised alternative that also works and may be helpful:

import re
l = ['bbb', 'ccc', 'axx', 'xzz', 'xaa']  # list
s = " ".join(l)                          # flatten the list into one space-separated string
re.findall(r'\bx\S*', s)                 # find whitespace-delimited words starting with x

['xzz', 'xaa']
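Note that joining on spaces assumes no item contains whitespace itself. The sketch below, using a made-up list with embedded spaces, shows how the joined approach can differ from testing each item separately:

```python
import re

items = ['xzz', 'x y', 'ab xc']  # hypothetical list whose items contain spaces

joined = re.findall(r'\bx\S*', " ".join(items))     # item boundaries are lost after the join
per_item = [s for s in items if re.match(r'x', s)]  # each item is tested as a whole

print(joined)    # ['xzz', 'x', 'xc']
print(per_item)  # ['xzz', 'x y']
```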
