Trouble transforming a string into an array with regex

Question

I'm having trouble transforming a string into an array. The numbers are separated by spaces, and spaces are also used as the thousands separator.

My string is this:

'25 127,4 17 588,6 16 264,3 324,4 8,7'

I want it to look like this:

['25 127,4', '17 588,6', '16 264,3', '324,4', '8,7']

or better like this:

['25127,4', '17588,6', '16264,3', '324,4', '8,7']

I tried using a regex with findall, but it only captures the 5-digit numbers.

My code looks like this:

import re

a = '25 127,4 17 588,6 16 264,3 324,4 8,7'
print(re.findall(r'\d+\s\d+,\d{1}', a))

which gives me this output:

['25 127,4', '17 588,6', '16 264,3', '4 8,7']

How can I solve this?

Try (?! )[\d ]+,\d
– Hao Wu, Commented Jul 7, 2021 at 3:06

Maybe re.findall(r'[0-9\s]+,\d+', s) given a string s?
– Mark, Commented Jul 7, 2021 at 3:07

Thanks a lot @HaoWu and Mark. It worked!!
– Samantha Guillén, Commented Jul 7, 2021 at 3:44
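
For reference, the two patterns from the comments can be checked against the sample string from the question. This is only a quick sketch: the variable names are made up for illustration, and the strip() call is an addition, since Mark's character class includes \s and a match can therefore begin with the space that separates two numbers.

```python
import re

s = "25 127,4 17 588,6 16 264,3 324,4 8,7"

# Hao Wu's pattern: the negative lookahead (?! ) prevents a match
# from starting on a space.
hao = re.findall(r"(?! )[\d ]+,\d", s)

# Mark's pattern: \s is part of the character class, so a match may
# start with the separating space; strip() removes it (added here).
mark = [m.strip() for m in re.findall(r"[0-9\s]+,\d+", s)]

print(hao)
print(mark)
```

Both lists should contain the five numbers in the first form asked for in the question, with the thousands space still inside each item.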

mb4329 · Accepted Answer · 2021-07-07 03:18:12Z

1

The regex recommended in the comments seems to work for your example, but using spaces as both the delimiter between numbers and the thousands separator will cause a lot of problems. If possible, see whether a different delimiter can be used!
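
As a rough illustration of why a distinct delimiter helps, here is a minimal sketch that assumes the data could be exported with a semicolon between numbers. The semicolon and the sample string are assumptions made for this example, not something taken from the question.

```python
# Hypothetical input: the same values, but with ';' between numbers
# and the space kept only as the thousands separator.
s = "25 127,4;17 588,6;16 264,3;324,4;8,7"

# Splitting on the delimiter needs no regex; the remaining spaces
# can only be thousands separators, so they are safe to drop.
values = [part.replace(" ", "") for part in s.split(";")]

print(values)  # ['25127,4', '17588,6', '16264,3', '324,4', '8,7']
```

With a separate delimiter, values without a decimal part would also stay unambiguous, which is not the case when a space plays both roles.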


Thank you for the recommendations. I will certainly change the delimiter.
– Samantha Guillén, over a year ago

The fourth bird · Accepted Answer · 2021-07-07 07:58:33Z

1

Match 1+ digits, then optionally repeat a group consisting of a space followed by 1+ digits. (Use a literal space rather than \s, since \s can also match a newline.)

Then remove the spaces from the matches.

\b\d+(?: \d+)*,\d\b

import re

pattern = r"\b\d+(?: \d+)*,\d\b"
s = "25 127,4 17 588,6 16 264,3 324,4 8,7"
result = [m.replace(" ", "") for m in re.findall(pattern, s)]
print(result)

Output

['25127,4', '17588,6', '16264,3', '324,4', '8,7']
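
The remark about \s can be seen with a small sketch. The two-line input below is hypothetical and chosen only so that an integer ends the first line; with \s in the repeated group, that integer is absorbed into the number on the next line.

```python
import re

# Hypothetical two-line input: an integer ends the first line.
text = "count 7\n25 127,4"

# Literal space in the repeated group: a match cannot cross the line break.
with_space = re.findall(r"\b\d+(?: \d+)*,\d\b", text)

# \s in the repeated group: the newline is accepted as a separator.
with_ws = re.findall(r"\b\d+(?:\s\d+)*,\d\b", text)

print(with_space)  # ['25 127,4']
print(with_ws)     # ['7\n25 127,4']
```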

edited Jul 7, 2021 at 7:58

answered Jul 7, 2021 at 7:52

