How to split a string using regex in Python?

Question

My strings have the following format:

test_string = "test (11 MHz - 11 MHz)"
test1_string = 'test1 (11 MHz - 11 MHz)'

I need output like the following (shown here for test1_string), using regex in Python:

output = ["test1", "11 MHz", "11 MHz"]

Give a minimal reproducible example illustrating the specific problem with your attempt.
– jonrsharpe, Commented Dec 6, 2019 at 12:02
@rts He probably meant: please show your current regex pattern / attempt, so it is easier to help by seeing where it failed.
– bobble bubble, Commented Dec 6, 2019 at 12:09
Why would you expect otherwise? You're just splitting on whitespace, you might as well write "A1-A4 US (430 Mhz - 780 Mhz)".split().
– jonrsharpe, Commented Dec 6, 2019 at 12:14
@bobblebubble Working as I expected, thanks. If you post it as an answer, I will upvote.
– nrs, Commented Dec 6, 2019 at 13:01
Using the PyPI regex module you might also use: (?:^(\w+(?:-\w+)+(?: [A-Z]+)?) \(|\G(?!^))(\d+ MHz)(?: - (?!\)))?(?=[^()]*\)) (see regex101.com/r/rkYclW/1)
– The fourth bird, Commented Dec 6, 2019 at 13:43

bobble bubble · Accepted Answer · 2019-12-06 15:30:24Z

2

One idea: match either the run of non-parenthesis characters at the start of the string, or any digits followed by "mhz" anywhere in it.

res = re.findall(r'(?i)^[^)(]+\b|\d+ mhz', test_string)

See this demo at regex101 or a Python demo at tio.run

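(?i) is the ignore-case flag, so mhz matches both lower- and upper-case spellings (Mhz, MHz)
^[^)(]+\b the first alternative matches one or more non-parenthesis characters from the ^ start of the string, backing off to the last \b word boundary (which drops the trailing space)
| OR
\d+ mhz the second alternative matches one or more digits, a space, and the substring mhz

This will work as long as your input follows this format.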
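As a quick check, here is a minimal sketch that runs the pattern above against the two strings from the question and the sample used in the comments. The names `pattern`, `samples`, and `results` are only illustrative.

```python
import re

# Pattern from the answer: leading non-parenthesis text, or digits + " mhz"
pattern = r'(?i)^[^)(]+\b|\d+ mhz'

samples = [
    "test (11 MHz - 11 MHz)",
    "test1 (11 MHz - 11 MHz)",
    "A1-A4 US (430 Mhz - 780 Mhz)",
]

# findall returns every non-overlapping match as a flat list of strings,
# because the pattern contains no capturing groups
results = [re.findall(pattern, s) for s in samples]

for s, res in zip(samples, results):
    print(s, "->", res)
```

Because `^` is used without `re.MULTILINE`, the first alternative can only match once, at the very start of the string.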

edited Dec 6, 2019 at 15:30

answered Dec 6, 2019 at 13:06

bobble bubble


WayToDoor · Accepted Answer · 2019-12-06 12:13:03Z

This regex seems to do the job: ([^(\n]*) \((\d* Mhz) - (\d* Mhz)\)

You can try it online

The website gives some code you can use for matching with Python

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"([^(\n]*) \((\d* Mhz) - (\d* Mhz)\)"

test_str = ("A1-A4 US (430 Mhz - 780 Mhz)\n"
    "A7-A8 PS (420 Mhz - 180 Mhz)\n")

matches = re.finditer(regex, test_str, re.MULTILINE)

for matchNum, match in enumerate(matches, start=1):

    print("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum=matchNum, start=match.start(), end=match.end(), match=match.group()))

    # Groups are numbered from 1; group 0 is the whole match
    for groupNum in range(1, len(match.groups()) + 1):

        print("Group {groupNum} found at {start}-{end}: {group}".format(groupNum=groupNum, start=match.start(groupNum), end=match.end(groupNum), group=match.group(groupNum)))

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
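One thing to watch: this pattern spells the unit as "Mhz", while the strings in the question use "MHz". The sketch below, which assumes the question's `test1_string` as input, shows that the pattern only matches it when `re.IGNORECASE` is passed. The names `question_str`, `strict`, and `relaxed` are illustrative.

```python
import re

regex = r"([^(\n]*) \((\d* Mhz) - (\d* Mhz)\)"
question_str = "test1 (11 MHz - 11 MHz)"

# Case-sensitive search: "Mhz" in the pattern does not match "MHz"
strict = re.search(regex, question_str)

# Case-insensitive search: the three capture groups are filled
relaxed = re.search(regex, question_str, re.IGNORECASE)

print(strict)
print(list(relaxed.groups()))
```

`match.groups()` returns a tuple, so it is wrapped in `list()` to get the list shape asked for in the question.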

Mark M · Accepted Answer · 2019-12-06 12:15:47Z

Using named groups:

import re
sample = "A1-A4 US (430 Mhz - 780 Mhz)"

split_pat = r"""
    (?P<first>.+)               # Capture everything before the space and opening parenthesis
    \s\(                        # Skip the space and the opening parenthesis
    (?P<second>\d+\s\bMhz\b)    # Capture numeric value, space, and Mhz
    \s+?\-\s+?                  # Skip the hyphen and the whitespace around it
    (?P<third>\d+\s\bMhz\b)     # Capture numeric value, space, and Mhz
    \)                          # Require the closing parenthesis
    """

# Use re.X flag to handle verbose pattern string
p = re.compile(split_pat, re.X)

# Search once and reuse the match object
m = p.search(sample)
first_text = m.group('first')
second_text = m.group('second')
third_text = m.group('third')
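`search` returns `None` when the input does not fit the pattern, so calling `.group(...)` on the result directly would raise `AttributeError`. Here is a minimal sketch of a guard around the same named-group pattern. The helper name `split_fields` is hypothetical and not part of the answer.

```python
import re

# Same verbose pattern as in the answer, compiled once with re.X
split_pat = r"""
    (?P<first>.+)
    \s\(
    (?P<second>\d+\s\bMhz\b)
    \s+?\-\s+?
    (?P<third>\d+\s\bMhz\b)
    \)
    """
p = re.compile(split_pat, re.X)


def split_fields(text):
    """Return the three named groups as a list, or None if text does not match."""
    m = p.search(text)
    if m is None:
        return None
    return [m.group("first"), m.group("second"), m.group("third")]


print(split_fields("A1-A4 US (430 Mhz - 780 Mhz)"))
print(split_fields("no parentheses here"))
```

The pattern is case-sensitive as written, so an input spelled "MHz" would also return `None` unless `re.I` is added to the flags.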

Andrej Kesely · Accepted Answer · 2019-12-06 12:32:32Z

0

You can use re.findall to search the text:

import re

text = "A1-A4 US (430 Mhz - 780 Mhz)"

first_text, second_text, third_text = re.findall(r'(.*?US).*?(\d+.Mhz).*?(\d+.Mhz)', text)[0]
print(first_text)
print(second_text)
print(third_text)

Prints:

A1-A4 US
430 Mhz
780 Mhz
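The pattern above hardcodes "US", so it only fits labels that end in those letters. As a sketch of one possible generalization (an assumption, not the answer's own pattern), the label can be captured lazily up to the opening parenthesis, with `re.IGNORECASE` added to accept both "Mhz" and "MHz". The function name `split_band` is hypothetical.

```python
import re

# Lazy label up to "(", then two "<digits> MHz" values separated by a hyphen
BAND_PATTERN = re.compile(
    r'(.*?)\s*\((\d+\s*MHz)\s*-\s*(\d+\s*MHz)\)',
    re.IGNORECASE,
)


def split_band(text):
    """Return [label, low, high] for the first match, or an empty list."""
    found = BAND_PATTERN.findall(text)
    # findall yields one tuple per match because the pattern has three groups
    return list(found[0]) if found else []


print(split_band("A1-A4 US (430 Mhz - 780 Mhz)"))
print(split_band("test1 (11 MHz - 11 MHz)"))
```

Returning an empty list for non-matching input avoids the `IndexError` that indexing `[0]` on an empty `findall` result would raise.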

answered Dec 6, 2019 at 12:32

Andrej Kesely
