I am trying to use regex to isolate the numeric values within the strings of a list and append only the numbers to a new list. Yes, I am aware of this post (Regular Expressions: Search in list) and am using one of its answers, but for some reason the new list still includes the text part of each value.

[IN]:
['0.2 in', '1.3 in']

import re

snowamt = ['0.2 in', '1.3 in']
r = re.compile(r"\d*\.\d*")
newlist = list(filter(r.match, snowamt))  # keeps every string that r.match accepts
print(newlist)

[OUT]:
['0.2 in', '1.3 in']

I have tried so many combinations of regex and I just can't get it. Can someone please correct what I know is a stupid mistake? Here are just a few of the regexes I've tried:

"(\d*\.\d*)"
"\d*\.\d*\s"
"\d*\.\d*\s$"
"^\d*\.\d*\s$"
"^\d*\.\d*\s"

My end goal is to sum all the values in the list generated above. I was initially able to get around using re.compile by using re.split:

inches_n = []
for n in snowamt:
    # Split on spaces and keep the leading token, e.g. '0.2' from '0.2 in'
    parts = re.split(" ", n)
    inches_n.append(parts[0])

print(inches_n)

The problem is that the value '-- in' may show up in the original list, because I am getting the numbers by scraping a website (Weather Underground, which is okay to scrape). It would take fewer steps if I could select only the numbers with regex up front; with re.split I have to add an extra step that iterates through the new list again and keeps only the numbers.

Anyway can someone please correct my regex so I can move on with my life from this problem, thank you!

  • In the first code example, what do you want the output to be instead? Commented Mar 31, 2021 at 22:23
  • Just to explain the correct answer below, what YOUR code is doing is asking "Does this string CONTAIN a number? If so, keep it". You aren't EXTRACTING the number. Commented Mar 31, 2021 at 22:23
  • 1
    So your basic problem was not your regex but your use of filter which passed all strings that contained a number. To get just the numbers from the string you could use map rather than filter as in list(map(lambda x: r.match(x).group(), snowamt)). (using your definition of r). But, its simpler to use list comprehension as in the posted answer. Commented Mar 31, 2021 at 22:40
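
The difference the comments describe can be seen side by side. This is only a minimal sketch using the question's own pattern and list; the names kept and extracted are made up for illustration. Note that r.match only tests the start of each string, and that calling .group() directly would raise AttributeError for a value such as '-- in', where r.match returns None.

```python
import re

snowamt = ["0.2 in", "1.3 in"]
r = re.compile(r"\d*\.\d*")

# filter() uses r.match only as a yes/no test, so the original strings survive.
kept = list(filter(r.match, snowamt))

# Calling .group() on each match object returns just the matched text.
# Assumes every string matches; a None result would raise AttributeError.
extracted = [r.match(s).group() for s in snowamt]

print(kept)       # ['0.2 in', '1.3 in']
print(extracted)  # ['0.2', '1.3']
```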

1 Answer

To get only the numbers from the list, you can use this example:

import re

snowamt = ["0.2 in", "1.3 in"]
r = re.compile(r"(\d+\.?\d*)")

newlist = [m.group(1) for i in snowamt if (m := r.match(i))]
print(newlist)

Prints:

['0.2', '1.3']
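
Since the stated end goal is a sum, and '-- in' may appear in the scraped list, the same comprehension can probably be extended to convert each match to float. The following is a sketch under those assumptions, not part of the original answer: the '-- in' entry in the sample list and the names amounts and total are added for illustration.

```python
import re

# Sample data; the '-- in' entry is added here to mimic a missing reading.
snowamt = ["0.2 in", "-- in", "1.3 in"]
r = re.compile(r"(\d+\.?\d*)")

# r.match returns None for '-- in', so the walrus test skips that entry.
amounts = [float(m.group(1)) for s in snowamt if (m := r.match(s))]
total = sum(amounts)

print(amounts)          # [0.2, 1.3]
print(round(total, 1))  # 1.5
```

Entries that do not start with a digit are dropped silently, which suits a sum but would hide unexpected formats in the scraped data.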

1 Comment

I like this use case for the := operator.
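
The := operator requires Python 3.8 or later. On older versions, one possible equivalent is to build the match objects first and then filter out the None results. This is a sketch only; the '-- in' entry and the name matches are added here for illustration.

```python
import re

snowamt = ["0.2 in", "-- in", "1.3 in"]
r = re.compile(r"(\d+\.?\d*)")

# Generator of match objects (or None where the string does not start with a number).
matches = (r.match(s) for s in snowamt)

# Keep only successful matches and pull out the captured number text.
newlist = [m.group(1) for m in matches if m]

print(newlist)  # ['0.2', '1.3']
```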
