7

I'm making a program that takes currency from a string and converts it in to other currencies. For example, if the string was 'the car cost me $13,250' I would need to get $ and 13250. I have this regex already (?:\£|\$|\€)(?:.{1,}) that sort of does it, however there is a reasonably large possibility that the string might have more than one price, all using different currencies. This is something that I do no know how to do effectively.

What I need to know is how to extract all of the prices from a string. I think even if the regex just returns something like ['$12,250,000','£14,500,123','£120.25'] then it is fine because I can use something like this to get the number:

prices = ['$12,250','£14,500','£120']
for value in prices:
    value.replace(',','')

And something like this to get the currency:

for c in prices:
     currency = c[0]

Then there is the problem that the price might not be a whole number, and might be something like $12.54. Any help on how to get that initial list of prices would be great.

4
  • Possible duplicate of Parse currency into numbers in Python Commented Sep 11, 2017 at 20:35
  • No I am fine with converting currency in to numbers as I showed, but I need to know how I can extract all of the price values from a string first. Commented Sep 11, 2017 at 20:41
  • use re.findall? Something like re.findall(r"(?:\£|\$|\€)(?:[\d\.\,]{1,})",s) It won't be perfect, but maybe it will be easier to just filter false-positives later Commented Sep 11, 2017 at 20:43
  • This is a duplicate. The problem you are having, and the solution to your problem (with a working example) are both in the link I posted. Commented Sep 11, 2017 at 20:49

4 Answers 4

12

This regular expression will work better for your purposes:

(?:[\£\$\€]{1}[,\d]+.?\d*)

Try it out here.

Then as sainoba notes, you can use re.findall or re.finditer to get the matches.

Then you can extract the currency from the first character, remove commas, and finally split on a decimal point if needed.

Sign up to request clarification or add additional context in comments.

3 Comments

Thanks, this is exactly what I needed.
Works great when no other solutions worked and saved me time in having to write a regex, thank you for this!
The above is almost correct: characters after the number will be captured, too, if the value does not have a decimal point. I added a backslash to match only that: (?:[\£\$\€]{1}[,\d]+\.?\d*)
1

When dealing with currencies, you cannot use simple approaches like replacing commas and periods. There are a multitude of language and regional differences. The Euro may use commas or periods as the decimal separator. Some locales may have two or three digits between grouping separators. The currency symbol may be on the left or right. A symbol may represent any one of a dozen different currencies, depending on the user's locale.

Make use of a library to handle this work for you. This issue has been discussed in detail in other posts, such as this one.

Comments

0
import re
re.findall('£{1}[,0-9]{1,10}',values)

Comments

-2

You could use re.findall or re.finditer:

re.findall(pattern, string) returns a list of matching strings.

re.finditer(pattern, string) returns an iterator.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.