Most Occurring Number in a String Using Regex - Python
Given a string containing both words and numbers, the task is to find the number that appears most frequently in the string using Regex.
For example:
Input: "I have 3 apples, 12 oranges, 3 bananas, and 15 3"
Output: 3
Let's explore different methods to find the most frequently occurring number in a given string in Python.
Using re.findall() and max() Function
Use re.findall() to extract all numbers from the given string, count them with Counter, and then apply max() with a custom key to pick the most frequently occurring number.
import re
from collections import Counter
s = 'geek55of55gee4ksabc3dr2x'
a = re.findall(r'\d+', s)
freq = Counter(a)
res = max(freq, key=lambda x: (freq[x], int(x)))
print(res)
Output
55
Explanation:
- The key lambda x: (freq[x], int(x)) ensures that if two numbers have equal frequency, the larger number is chosen.
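- max() makes a single pass over the distinct numbers and returns the one with the highest (frequency, value) key. The result is the matched text, so res is the string '55'.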
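The same idea can be wrapped in a reusable function. The sketch below is an illustration rather than part of the method above: the name most_frequent_number is hypothetical, and returning None when the string contains no digits is an assumed convention (calling max() on an empty Counter would otherwise raise ValueError).

```python
import re
from collections import Counter


def most_frequent_number(s):
    """Return the most frequent number in s as an int, or None if s has no digits.

    Hypothetical helper; ties are broken in favour of the larger number.
    """
    freq = Counter(re.findall(r'\d+', s))
    if not freq:
        # Assumed convention: no numbers found -> None
        return None
    best = max(freq, key=lambda x: (freq[x], int(x)))
    return int(best)


print(most_frequent_number("I have 3 apples, 12 oranges, 3 bananas, and 15 3"))  # 3
print(most_frequent_number("no digits here"))  # None
```

Converting the result to int is a design choice made here so that callers can do arithmetic on it; keep the string if leading zeros matter.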
Using re.findall() and collections.Counter
Use re.findall() to extract all the numbers and Counter from the collections module to count their frequencies. Then loop over the counts explicitly to find the most common number, resolving ties by choosing the larger number.
import re
from collections import Counter
s = 'geek55of55gee4ksabc3dr2x'
a = re.findall(r'\d+', s)
freq = Counter(a)
mx, res = 0, 0
for x in freq:
    # Take x if it is more frequent, or equally frequent but larger
    if freq[x] > mx or (freq[x] == mx and int(x) > res):
        mx = freq[x]
        res = int(x)
print(res)
Output
55
Explanation:
- re.findall(r'\d+', s) finds all continuous digit sequences and returns them as a list like ['55', '55', '4', '3', '2'].
- Counter(a) counts occurrences of each number, producing a dictionary-like Counter such as {'55': 2, '4': 1, '3': 1, '2': 1}.
- The loop compares frequencies to track the maximum.
- In case of a tie (same frequency), the larger number is chosen.
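To see the tie rule in action, consider a string in which two numbers occur equally often. This is a minimal sketch with a made-up input; the helper name pick_most_frequent is hypothetical, and the comparison is written out explicitly so that the result does not depend on the order in which the numbers first appear.

```python
import re
from collections import Counter


def pick_most_frequent(s):
    """Hypothetical helper: most frequent number in s, larger number wins ties."""
    freq = Counter(re.findall(r'\d+', s))
    best_count, best_num = 0, None
    for text, count in freq.items():
        num = int(text)
        # Replace the current best if this number is more frequent,
        # or equally frequent but numerically larger.
        if count > best_count or (count == best_count and num > best_num):
            best_count, best_num = count, num
    return best_num


# 12 and 7 both appear twice; 12 is seen first, but is still chosen.
print(pick_most_frequent('x12y7z12w7'))  # 12
```

A plain `>=` comparison on the frequency alone would instead return whichever tied number is seen last (7 here), so the explicit value check is what enforces the "larger number" rule.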
Using re.finditer()
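Use re.finditer() to iterate over matches one at a time instead of collecting them all at once. This can reduce memory use on very large strings when the matches are consumed lazily, but it brings no real benefit for small inputs and may be slightly slower there.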
import re
from collections import Counter
s = 'geek55of55gee4ksabc3dr2x'
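# Feed matches to Counter lazily instead of building a list first
freq = Counter(m.group() for m in re.finditer(r'\d+', s))
res = max(freq, key=lambda x: (freq[x], int(x)))
print(res)
Output
55
Explanation:
- re.finditer(r'\d+', s) returns an iterator that yields match objects one at a time; passing a generator expression to Counter avoids storing all matches in an intermediate list.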
- Each match’s .group() extracts the number text.
- Counter and max() are used the same way as in previous methods.
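The memory benefit of re.finditer() shows up when matches are consumed lazily. As a rough sketch, assuming the text arrives as an iterable of lines (for example, an open file object), a Counter can be updated line by line so that neither the full text nor a full list of matches is held in memory. The function name most_frequent_in_lines and the sample lines are illustrative, not part of the original method.

```python
import re
from collections import Counter

NUMBER = re.compile(r'\d+')


def most_frequent_in_lines(lines):
    """Hypothetical helper: most frequent number across an iterable of lines."""
    freq = Counter()
    for line in lines:
        # The generator hands matches to Counter one at a time
        freq.update(m.group() for m in NUMBER.finditer(line))
    if not freq:
        return None  # assumed convention when no number is present
    return int(max(freq, key=lambda x: (freq[x], int(x))))


sample = ['geek55of55gee4k', 'sabc3dr2x', 'and 4 more']
print(most_frequent_in_lines(sample))  # 55
```

Splitting on lines is safe here because \d never matches a newline, so a number cannot straddle two lines. In the sample, 55 and 4 both occur twice and the larger number wins the tie.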