How to split a string using Python 3

Question

How can I split this string using a regex?

Input:
result = '1,000.03AM2,97.2323,089.301,903.230.0034,928.9911,24.30AM'

I want to split this so that I can store the pieces in separate strings for further use, like the following.

Expected output:
a = 1,000.03AM, b = 2,97.23, c = 23,089.30, d = 1,903.23, e = 0.00, f = 34,928.99, g = 11,24.30AM

I tried the following, but it gives the wrong output:

import re
print(re.findall(r'[0-9.]+|[^0-9.]', result))

What does AM stand for? AM/PM? It looks like the values should be parsed as floats, but including AM/PM would make them strings, unless they are times.
– XCanG, Commented Dec 6, 2019 at 8:54
@Abhi Your expected result does not match the output of the regex Wiktor mentioned above.
– shaik moeed, Commented Dec 6, 2019 at 8:55
@shaikmoeed But my answer contains the solution that matches what is expected.
– Wiktor Stribiżew, Commented Dec 6, 2019 at 9:06

Wiktor Stribiżew · Accepted Answer · 2019-12-06 09:05:16Z

2

You may extract the strings using

re.findall(r'\d+(?:,\d+)*(?:\.\d{2})?[^,\d]*', text)

See the regex demo

Details

\d+ - 1+ digits
(?:,\d+)* - 0 or more repetitions of a comma and 1+ digits
(?:\.\d{2})? - an optional occurrence of a dot and 2 digits
[^,\d]* - any 0 or more chars other than a comma and digit.

Python demo:

import re
text = "1,000.03AM2,97.2323,089.301,903.230.0034,928.9911,24.30AM"
print( re.findall(r'\d+(?:,\d+)*(?:\.\d{2})?[^,\d]*', text) )
# => ['1,000.03AM', '2,97.23', '23,089.30', '1,903.23', '0.00', '34,928.99', '11,24.30AM']
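
Since the question asks for the pieces to be stored in separate variables, the list returned by re.findall can be unpacked directly. This is only a minimal sketch: it assumes the input always yields exactly seven fields, and the names a to g are taken from the question.

```python
import re

text = "1,000.03AM2,97.2323,089.301,903.230.0034,928.9911,24.30AM"
pattern = re.compile(r'\d+(?:,\d+)*(?:\.\d{2})?[^,\d]*')

# Unpack the matches into the names used in the question.
# Assumption: exactly seven fields are found; any other count
# makes the unpacking raise ValueError.
a, b, c, d, e, f, g = pattern.findall(text)

print(a, g)  # 1,000.03AM 11,24.30AM
```

If the number of fields can vary, keeping the list (or building a dict from it) is probably safer than unpacking into fixed names.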

edited Dec 6, 2019 at 9:05

answered Dec 6, 2019 at 8:54

Wiktor Stribiżew


7 Comments

shaik moeed Over a year ago

This gives the first element as '1,000.03AM2' where it should be '1,000.03AM' as mentioned by OP.

Wiktor Stribiżew Over a year ago

@shaikmoeed I have reverted to the original suggestion.

shaik moeed Over a year ago

But it still does not match the OP's expected output.

shaik moeed Over a year ago

After the . there should be only two digits (plus two letters, if present). But this gives 928.9911.

Wiktor Stribiżew Over a year ago

Ok, now it does.


XCanG · Accepted Answer · 2019-12-06 09:14:38Z

For your expected result you need the following regex:

re.findall(r"[\d,]+\.\d{2}(?:AM)?", result)

This produces the following:

['1,000.03AM', '2,97.23', '23,089.30', '1,903.23', '0.00', '34,928.99', '11,24.30AM']

Regex explanation:

[\d,] - matches a digit or a comma
[\d,]+\.\d{2} - matches the whole float value (with two digits after the dot)
(?:AM)? - matches an optional AM in a non-capturing group; in the examples below I use (?=AM)? so that AM is not included in the result (an optional lookahead consumes nothing, so dropping the group entirely matches the same text)
If something other than AM can appear in that position, you may change (?:AM) to (?:AM|Other|...)
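
As an illustration of extending the suffix group, here is a small sketch that also accepts PM. The sample string is made up for this example (the question's input contains only AM), so treat both the string and the PM suffix as assumptions.

```python
import re

# Hypothetical input: same layout as in the question, but the last
# field carries a PM suffix, which does not occur in the original data.
sample = "1,000.03AM2,97.2311,24.30PM"

# The optional non-capturing group now accepts either suffix.
pattern = re.compile(r"[\d,]+\.\d{2}(?:AM|PM)?")
matches = pattern.findall(sample)

print(matches)  # ['1,000.03AM', '2,97.23', '11,24.30PM']
```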

If you need to parse the values as floats, I have two suggestions. The first is to remove the commas:

list(map(lambda x: float(x.replace(",", "")), re.findall(r"[\d,]+\.\d{2}(?=AM)?", result)))

Result:

[1000.03, 297.23, 23089.3, 1903.23, 0.0, 34928.99, 1124.3]

Another variant is to use locale:

>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'en_US.UTF8')
'en_US.UTF8'
>>> list(map(lambda x: locale.atof(x), re.findall(r"[\d,]+\.\d{2}(?=AM)?", result)))
[1000.03, 297.23, 23089.3, 1903.23, 0.0, 34928.99, 1124.3]
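
Both variants above discard the AM marker. If the marker matters, one possible approach is to capture the number and the suffix in separate groups and convert only the number. This is a sketch rather than part of the original answer, and it assumes AM is the only suffix that occurs, as in the question's input.

```python
import re

result = '1,000.03AM2,97.2323,089.301,903.230.0034,928.9911,24.30AM'

# Group 1 captures the number, group 2 the optional suffix.
# re.findall returns '' for group 2 when no suffix is present.
# Assumption: AM is the only suffix in the data.
pairs = [
    (float(number.replace(",", "")), suffix)
    for number, suffix in re.findall(r"([\d,]+\.\d{2})(AM)?", result)
]

print(pairs[0])  # (1000.03, 'AM')
print(pairs[1])  # (297.23, '')
```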

Amrish Mishra · Accepted Answer · 2019-12-06 09:07:49Z

0

Provided the string length and the field positions always stay the same, the most efficient solution would be plain slicing:

a = result[0:10]
b = result[10:17]
c = result[17:26]
d = result[26:34]
e = result[34:38]
f = result[38:47]
g = result[47:57]
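
The same fixed-width idea can be written without hard-coded index pairs by deriving the slice boundaries from a list of field widths. This is a sketch under the same assumption as the slicing above: the widths below were measured on the question's sample string and only hold while every field keeps exactly that length.

```python
from itertools import accumulate

result = '1,000.03AM2,97.2323,089.301,903.230.0034,928.9911,24.30AM'

# Assumed fixed field widths, measured on the sample input.
widths = [10, 7, 9, 8, 4, 9, 10]

# Running totals give the end index of each field; each start index
# is the end index of the previous field.
ends = list(accumulate(widths))
starts = [0] + ends[:-1]
fields = [result[s:e] for s, e in zip(starts, ends)]

print(fields[0], fields[-1])  # 1,000.03AM 11,24.30AM
```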

Hope this helps.

answered Dec 6, 2019 at 9:07

Amrish Mishra


3 Comments

Wiktor Stribiżew Over a year ago

I suspect the AM optional part may be missing or present in arbitrary comma-separated fields, so this is not likely to help in the end.

Amrish Mishra Over a year ago

If the alphabetic characters aren't important, you can try this: re.findall(r"[\d,]+\.\d{2}", result)

Amrish Mishra Over a year ago

This would be perfect: re.findall(r"([\d,]+\.\d{2}[A-Z]{2}?|[\d,]+\.\d{2})", result)
