0

I am trying to split my string into a list, separating by whitespace and characters but leaving numbers together.
For example, the string:

"1 2 +="  

would end up as:

["1", " ", "2", " ", "+", "="]

The code I currently have is

temp = re.findall(r'\d+|\S', input)

This separates the string as intended, but it also removes the whitespace. How do I stop this?

2
  • perhaps you need \s? Commented Nov 18, 2013 at 22:00
  • Are you writing a postfix parser? Commented Nov 18, 2013 at 22:05

2 Answers

3

Just add \s or \s+ to your current regular expression (use \s+ if you want consecutive whitespace characters to be grouped together). For example:

>>> s = "1 2 +="
>>> re.findall(r'\d+|\S|\s+', s)
['1', ' ', '2', ' ', '+', '=']

If you don't want consecutive whitespace to be grouped together, then instead of r'\d+|\S|\s' it would probably make more sense to use r'\d+|\D'.
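
To make the difference between the two patterns concrete, here is a small sketch on an input containing a run of two spaces. The helper names `tokenize_grouped` and `tokenize_single` are made up for this illustration and are not part of the `re` module.

```python
import re

def tokenize_grouped(text):
    # Digit runs stay together, and so do runs of whitespace.
    return re.findall(r'\d+|\S|\s+', text)

def tokenize_single(text):
    # Digit runs stay together; every other character, including each
    # individual whitespace character, becomes its own token.
    return re.findall(r'\d+|\D', text)

print(tokenize_grouped("12  +"))  # ['12', '  ', '+']
print(tokenize_single("12  +"))   # ['12', ' ', ' ', '+']
```

On the string from the question, which has only single spaces, both helpers return the same list.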

1

You can use \D to find anything that is not a digit:

\d+|\D

Python:

temp = re.findall(r'\d+|\D', input)
# Output: ['1', ' ', '2', ' ', '+', '=']

It would also work if you just used . since the \d+ alternative is tried first anyway (as long as the input contains no newlines, which . does not match by default). But it's probably cleaner not to.

\d+|.
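
A quick sketch of how the two patterns behave on multi-line input, assuming a made-up sample string `text`; the variable names are only for illustration. By default `.` skips newlines, so the `\n` is silently dropped unless `re.DOTALL` is passed.

```python
import re

text = "1\n2"  # sample input containing a newline

with_not_digit = re.findall(r'\d+|\D', text)
with_dot = re.findall(r'\d+|.', text)
with_dot_dotall = re.findall(r'\d+|.', text, re.DOTALL)

print(with_not_digit)   # ['1', '\n', '2']
print(with_dot)         # ['1', '2']
print(with_dot_dotall)  # ['1', '\n', '2']
```

If the input may contain newlines that need to be kept as tokens, `\D` is probably the safer choice.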
