1

First, I am using Python 2.7.

The string can take any of these forms:

3 paths

12  paths

12 path rooms

Question

What is the regular expression to get the number without the text?

Thanks

10
  • 1
    Why not just split on the space? Or do you have other examples that mean that's not viable... Commented Mar 6, 2014 at 18:14
  • @JonClements Since I am using Scrapy, I am only allowed to use XPath and regular expressions; I can't use Python functions. Commented Mar 6, 2014 at 18:15
  • errr no... once you've extracted the string, there's nothing to stop you using builtin str operations Commented Mar 6, 2014 at 18:16
  • @JonClements I do know that :). I mean that the policy where I work doesn't allow me to use Python functions. Commented Mar 6, 2014 at 18:19
  • okay... I've put the scrapy tag back - you might want to edit your question to include these comments and the fact that this is not a standalone regular expression with standalone strings... Commented Mar 6, 2014 at 18:20

5 Answers

2

You say you can only use scrapy methods, so I guess you're after:

hxs.select('//some/xpath/expression/text()').re(r'(\d+).*')
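
To see what that pattern captures without running a spider, here is a minimal sketch using only the standard `re` module. It does not call Scrapy at all; `samples` and `first_number` are made-up names for illustration, and the assumption is that the selector's `.re()` hands back the captured group in much the same way.

```python
import re

# Sample strings taken from the question.
samples = ["3 paths", "12  paths", "12 path rooms"]

# Same pattern as above: capture the leading digits, ignore the rest.
pattern = re.compile(r"(\d+).*")


def first_number(text):
    """Return the digits captured at the start of text, or None if there are none."""
    m = pattern.match(text)
    return m.group(1) if m else None


numbers = [first_number(s) for s in samples]
print(numbers)
```

Note that the captured values are strings; convert with `int()` if a number is needed.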

2

You can use this: Regex = [\d]*
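
One thing worth checking with `*` is that it also succeeds on zero digits. The sketch below compares `[\d]*` with `\d+` using the standard `re` module; the variable names and the "paths 3" test string are only for illustration and are not part of the original question.

```python
import re

star = re.compile(r"[\d]*")  # zero or more digits
plus = re.compile(r"\d+")    # one or more digits

# `*` still "matches" when the string does not start with a digit,
# but the match is the empty string.
empty = star.match("paths 3").group()

# `+` reports no match at all in that case.
nothing = plus.match("paths 3")

# Both patterns agree when the string starts with digits.
same = (star.match("12  paths").group(), plus.match("12  paths").group())
```

So for the strings in the question both work, but `\d+` makes a missing number easier to detect.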


2

An alternative way would be to use [0-9] instead of \d:

import re

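def extract_number(string):
    # Match one or more digits at the start of the string.
    r = re.compile(r'[0-9]+')
    # Note: match() returns None if the string does not start with a digit.
    return r.match(string).group()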


1

(\d+).*\n for pulling the numbers and then skipping the rest of the line.

number_finder = re.compile(r'(\d+).*\n')
number_finder.findall(mystr)

This returns a list of the numbers found, as strings.

Example:

In [3]: r = re.compile('(\d+).*\n')
In [4]: r.findall('12 a \n 12 a \n')
Out[4]: ['12', '12']
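
Because the pattern ends in `\n`, a final line without a trailing newline is skipped. The sketch below shows that, next to one possible alternative that anchors on the start of each line with `^` and `re.M` instead. The `text` value is assembled from the question's examples, and the variable names are only illustrative.

```python
import re

# The three example strings, with no newline after the last one.
text = "3 paths\n12  paths\n12 path rooms"

# Requires a newline after each line, so the last line is missed.
with_newline = re.findall(r"(\d+).*\n", text)

# Anchors at the start of every line instead (re.M makes ^ match per line).
per_line = re.findall(r"^[ \t]*(\d+)", text, re.M)

print(with_newline)
print(per_line)
```

The second form also tolerates leading spaces or tabs, as in the `' 12 a \n'` line of the example above.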

1 Comment

For a search across multiple lines, something like re.findall(r'\d+', multiple_line_string, re.M) works (re.M only matters if the pattern uses ^ or $) :-)
1

The regex pattern to look for is \d (a digit), repeated one or more times. In Python you would write it as:

import re
pattern = re.compile(r'\d+')
result = pattern.search(input_string)  # a match object, or None if no digits are found
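
The search gives back a match object rather than the digits themselves, so one more step is needed to get the number. A minimal sketch, assuming a helper called `find_number` (a hypothetical name, not from the answer):

```python
import re

pattern = re.compile(r"\d+")


def find_number(input_string):
    """Return the first run of digits anywhere in the string, or None."""
    m = pattern.search(input_string)
    # search() returns None when there is no match, so guard before .group().
    return m.group() if m else None


print(find_number("12 path rooms"))
print(find_number("rooms: 7 left"))
print(find_number("no digits here"))
```

Unlike `match()`, `search()` scans the whole string, so it also finds a number that is not at the start.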

2 Comments

You need to write [\d]* with the * so that it catches the whole number and not just the first digit.
@ifryed You are right. We could alternatively do a \d+
