I am writing a compiler (for a practice language) in Python and I want to split my text into tokens on whitespace or comments. I tried /\*.*?\*/|/{2}.*?\n|\s : the first pattern is supposed to find comments of the form /** text */ or /* text */, possibly multiline. The second is supposed to find comments of the form // text that end with a newline character. The last one finds whitespace.

My question:

I checked my regex here and it seems to work, but when I call

temp = file.read()

temp = temp.split('/\*.*?\*/|\/{2}.*?\n|\s',flags=DOTALL)

print temp 

it returns a list with only one element, which is the entire text I'm parsing.

Any ideas about where I am going wrong? Thanks!

  • Shouldn't the 2nd regex be \/{2}.*\n instead of /\{2}.*\n? Commented May 29, 2012 at 8:12
  • But that's how it is right now... Commented May 29, 2012 at 8:16
  • In the second regex, you escape the forward slash. Is that a typo? Commented May 29, 2012 at 8:16
  • @Vikas: why are you escaping / ? Commented May 29, 2012 at 8:17
  • @yotamoo, Yeah, looks like you edited the typo. Commented May 29, 2012 at 8:17

1 Answer

The problem is not with the regex but with split. You are calling the split method of str, which splits on a literal substring and does not interpret its argument as a regex. Use re.split instead:

>>> code = open('file').read()
>>> code
'/* comment */\ntext1\n// comment\n\ntest2\n\ntext3 // comment\n\ntext4 /* comment */\n'
>>> import re
>>> re.split
<function split at 0x10d9c6320>
>>> re.split(r'/\*.*?\*/|\/{2}.*?\n|\s', code)
['', '', 'text1', '', '', 'test2', '', 'text3', '', '', 'text4', '', '', '']

More information is available in the Python re module documentation.
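To make the difference concrete, here is a minimal sketch that runs the same pattern through both methods. The sample string and the variable names are made up for illustration, and the pattern is written as a raw string so the backslashes reach the regex engine unchanged.

```python
import re

# Hypothetical sample input, not taken from the original question.
code = "a /* note */ b\n// line comment\nc"

# Raw string: the regex engine sees the backslashes exactly as written.
pattern = r"/\*.*?\*/|//.*?\n|\s"

# str.split searches for the pattern text literally. That text never
# occurs in the input, so the whole input comes back as one element.
literal_parts = code.split(pattern)

# re.split interprets the same text as a regular expression.
regex_parts = re.split(pattern, code)

print(literal_parts)  # ['a /* note */ b\n// line comment\nc']
print(regex_parts)    # ['a', '', '', 'b', '', 'c']
```

The empty strings in the second result appear wherever two separators sit next to each other, for example a comment followed by a space.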


2 Comments

OK, this is great, thanks. One small problem remains: the list I get now includes a lot of empty elements, like ['', '', '', '', '', 'class', 'main'] and so on.
@yotamoo, filter those out: [i for i in re.split(r'/\*.*?\*/|\/{2}.*?\n|\s', code) if i]
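Putting the two pieces together, a small helper might look like the sketch below. The names SEPARATORS and tokenize and the sample input are hypothetical. It also passes re.DOTALL, which the question intended, so that a /* ... */ comment can span several lines.

```python
import re

# Compiled once. re.DOTALL lets "." match newlines, so a /* ... */
# comment may span several lines. (SEPARATORS is a made-up name.)
SEPARATORS = re.compile(r"/\*.*?\*/|//.*?\n|\s", re.DOTALL)


def tokenize(source):
    """Split source on comments and whitespace, dropping empty strings."""
    return [tok for tok in SEPARATORS.split(source) if tok]


# Hypothetical input with a multi-line comment and a trailing // comment.
sample = "class /* multi\nline */ main // trailing\n{ }\n"
print(tokenize(sample))  # ['class', 'main', '{', '}']
```

One limitation of this pattern: a // comment on the last line with no trailing newline is not matched, so its text would be split into ordinary tokens.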
