Python regex Get first element after specific string

Question

I'm trying to get the first number (int and float) after a specific pattern:

strings = ["Building 38 House 10",
           "Building : 10.5 house 900"]
for x in strings:
    print(<rule>)

Wanted result:

'38'
'10.5'

I tried:

for x in strings:
    print(re.findall(r"(?<=Building).+\d+", x))
    print(re.findall(r"(?<=Building).+(\d+.?\d+)", x))

Output:

[' 38 House 10']
['10']
[' : 10.5 house 900']
['00']

But I'm missing something.

Building.*?([\d.]+) or simply [\d.]+ with re.search().

– Olvin Roght, Commented Jul 12, 2022 at 9:16

The fourth bird · Accepted Answer · 2022-07-12 09:30:17Z

2

You could use a capture group:

\bBuilding[\s:]+(\d+(?:\.\d+)?)\b

Explanation

\bBuilding Match the word Building
[\s:]+ Match 1+ whitespace chars or colons
(\d+(?:\.\d+)?) Capture group 1, match 1+ digits with an optional decimal part
\b A word boundary
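The same breakdown can be kept next to the pattern itself by compiling it with re.VERBOSE, which ignores whitespace in the pattern and allows inline comments. This is only a sketch of an alternative way to write the pattern above, not a change to it:

```python
import re

# Same pattern as above, written with re.VERBOSE so each part is annotated.
pattern = re.compile(r"""
    \bBuilding          # the literal word, starting at a word boundary
    [\s:]+              # one or more whitespace characters or colons
    (\d+(?:\.\d+)?)     # group 1: digits with an optional decimal part
    \b                  # word boundary after the number
""", re.VERBOSE)

m = pattern.search("Building : 10.5 house 900")
if m:
    print(m.group(1))
```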

Regex demo

import re
strings = ["Building 38 House 10",
           "Building : 10.5 house 900"]
pattern = r"\bBuilding[\s:]+(\d+(?:\.\d+)?)\b"
for x in strings:
    m = re.search(pattern, x)
    if m:
        print(m.group(1))

Output

38
10.5
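If the keyword varies, the same idea could be wrapped in a small helper. The sketch below is an assumption for illustration: the name first_number_after and the int/float conversion are not part of the answer, and re.escape is used so that a keyword containing regex metacharacters is matched literally.

```python
import re

def first_number_after(keyword, text):
    """Return the first number following `keyword` in `text`, or None.

    Hypothetical helper: the name and the int/float conversion are
    illustrative assumptions.
    """
    # re.escape keeps any regex metacharacters in the keyword literal.
    pattern = rf"\b{re.escape(keyword)}[\s:]+(\d+(?:\.\d+)?)\b"
    m = re.search(pattern, text)
    if m is None:
        return None
    value = m.group(1)
    # Keep whole numbers as int and decimals as float.
    return float(value) if "." in value else int(value)

for s in ["Building 38 House 10", "Building : 10.5 house 900", "House 10"]:
    print(first_number_after("Building", s))
```

Returning None when nothing matches lets the caller decide how to handle strings that do not contain the keyword.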



bobble bubble · Accepted Answer · 2022-07-12 11:41:09Z

1

One idea is to use \D (the negation of \d) to match any non-digits in between, then capture the number:

Building\D*\b([\d.]+)

See this demo at regex101 or Python demo at tio.run

Just to mention, use word boundaries \b around Building to match the full word.
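A minimal Python sketch of this pattern, with the word boundaries around Building added as suggested (the variable names are only illustrative):

```python
import re

strings = ["Building 38 House 10",
           "Building : 10.5 house 900"]

# \D* skips any non-digit characters between the word and the number.
pattern = re.compile(r"\bBuilding\b\D*\b([\d.]+)")

results = []
for x in strings:
    m = pattern.search(x)
    if m:
        results.append(m.group(1))

print(results)  # ['38', '10.5']
```

Note that [\d.]+ is permissive: it would also accept something like 1.2.3, so \d+(?:\.\d+)? may be a safer capture if the input is less regular.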



Hidi Eric · Accepted Answer · 2022-07-12 09:22:13Z

0

re.findall(r"(?<![a-zA-Z:])[-+]?\d*\.?\d+", x)

This will find all numbers in the given string.

If you want the first number only you can access it simply through indexing:

re.findall(r"(?<![a-zA-Z:])[-+]?\d*\.?\d+", x)[0]
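Indexing with [0] raises IndexError when the string contains no number, and the pattern is not tied to the word Building. As a hedged sketch, re.search could be used instead, since it stops at the first match and returns None when there is none. The helper name first_number is an assumption for illustration:

```python
import re

number = re.compile(r"(?<![a-zA-Z:])[-+]?\d*\.?\d+")

def first_number(text):
    # Hypothetical helper: re.search stops at the first match and
    # returns None when there is none, avoiding an IndexError.
    m = number.search(text)
    return m.group(0) if m else None

print(first_number("Building 38 House 10"))   # 38
print(first_number("no digits here"))         # None
print(first_number("House 10 Building 38"))   # 10 (not anchored to "Building")
```

The last call shows the limitation: the first number in the string is returned even when it precedes Building.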

edited Jul 12, 2022 at 9:22

answered Jul 12, 2022 at 9:18

Hidi Eric


2 Comments

slothrop Over a year ago

That would include 10 and 900, which OP doesn't want.

Hidi Eric Over a year ago

You are right, edited the answer.
