Python - All occurrences of substring in string
A substring is a contiguous sequence of characters within a string. Finding every occurrence of a substring is useful for tasks such as text search, parsing, and validation. In this article, we will find all occurrences of a substring in a string.
Using re.finditer()
re.finditer() returns an iterator yielding match objects for all non-overlapping matches of a pattern in a string. Because its first argument is a regular expression, it can also search for general patterns, such as digits, throughout the string.
import re
# Define the input string and substring
s = "hello world, hello universe"
substring = "hello"
# Find all occurrences using re.finditer; re.escape() makes the substring match literally
positions = [match.start() for match in re.finditer(re.escape(substring), s)]
print(positions)
Output
[0, 13]
Explanation:
- Use re.finditer() to find matches: The re.finditer() function searches for all occurrences of the substring "hello" in the string "hello world, hello universe", returning an iterator of match objects.
- Extract start positions: A list comprehension is used to extract the starting position of each match using match.start().
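Because re.finditer() interprets its first argument as a regular expression, a substring containing characters such as "." or "$" is not matched literally unless it is escaped. The following is a minimal sketch of that difference; the sample string "a.b a.b axb" is an assumption chosen for illustration and is not part of the example above.

```python
import re

# Illustrative input (an assumption, not from the article's example)
s = "a.b a.b axb"
substring = "a.b"

# Unescaped: "." matches any character, so "axb" is also reported
unescaped = [m.start() for m in re.finditer(substring, s)]

# Escaped: re.escape() makes every character match literally
escaped = [m.start() for m in re.finditer(re.escape(substring), s)]

print(unescaped)  # [0, 4, 8]
print(escaped)    # [0, 4]
```

For plain alphanumeric substrings such as "hello", both forms return the same positions.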
Using str.find() in a loop
Calling str.find() in a loop finds all occurrences of a substring by repeatedly searching for the next match, starting just past the previous one. The loop continues until no more matches are found (when find() returns -1).
s = "hello world, hello universe"
substring = "hello"
# Find all occurrences using str.find in a loop
positions = []
start = 0
while True:
    start = s.find(substring, start)
    if start == -1:
        break  # No more matches
    positions.append(start)
    # Move past the current match (non-overlapping search)
    start += len(substring)
print(positions)
Output
[0, 13]
Explanation:
- Use find() in a loop: The find() method is called repeatedly, starting just past the last found position, to locate each occurrence of the substring "hello" in the string "hello world, hello universe".
- Track positions: Each found position is added to the positions list, and the start index is updated to move past the current match to continue searching for subsequent occurrences.
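The loop can be wrapped in a reusable function. The sketch below is one possible way to do that; the function name find_all and its overlapping parameter are hypothetical names introduced here for illustration. Advancing by one character instead of len(substring) makes the search report overlapping occurrences, and the empty-substring check avoids a loop that would never advance.

```python
def find_all(s, substring, overlapping=False):
    """Return the start index of every occurrence of substring in s.

    Hypothetical helper for illustration; not part of the standard library.
    """
    if not substring:
        # An empty substring would never advance the search position
        raise ValueError("substring must be non-empty")

    # Step past the whole match, or by one character to allow overlaps
    step = 1 if overlapping else len(substring)

    positions = []
    start = s.find(substring)
    while start != -1:
        positions.append(start)
        start = s.find(substring, start + step)
    return positions


print(find_all("hello world, hello universe", "hello"))  # [0, 13]
print(find_all("aaaa", "aa"))                            # [0, 2]
print(find_all("aaaa", "aa", overlapping=True))          # [0, 1, 2]
```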
Using list comprehension with range()
A list comprehension with range() can generate all starting positions of a substring by iterating over the string indices. It checks each index to see whether the substring starts there, so overlapping occurrences are reported as well.
s = "hello world, hello universe"
substring = "hello"
# Find all occurrences using list comprehension
positions = [i for i in range(len(s)) if s.startswith(substring, i)]
print(positions)
Output
[0, 13]
Explanation:
- Use list comprehension with startswith(): The list comprehension iterates over each index i in the string, checking if the substring "hello" starts at that position using s.startswith(substring, i).
- Store starting positions: If the substring matches at index i, that index is added to the positions list.
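The methods can give different results when occurrences overlap. The sketch below compares them on the string "banana" and the substring "ana", both chosen here as illustrative assumptions. It also shows a lookahead pattern, a regex technique not covered in the sections above, as one way to get overlapping matches from re.finditer().

```python
import re

# Illustrative input (an assumption): "ana" occurs at 1 and 3, overlapping
s = "banana"
substring = "ana"

# re.finditer() reports non-overlapping matches only
non_overlapping = [m.start() for m in re.finditer(re.escape(substring), s)]

# startswith() is tested at every index, so overlaps are included
with_startswith = [i for i in range(len(s)) if s.startswith(substring, i)]

# A zero-width lookahead (?=...) matches without consuming characters,
# so re.finditer() can report overlapping positions too
lookahead = "(?=" + re.escape(substring) + ")"
with_lookahead = [m.start() for m in re.finditer(lookahead, s)]

print(non_overlapping)  # [1]
print(with_startswith)  # [1, 3]
print(with_lookahead)   # [1, 3]
```

Which result is appropriate depends on whether overlapping occurrences should count for the task at hand.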