Regex to check if a string is a valid variable name

Question

I'm working on a script that determines whether a string would be a valid variable name. It's very basic, but I can't seem to figure out how to use regular expressions for it.

So basically I want:

A-Z
a-z
0-9
no whitespace anywhere
no special char except _

Is that possible? This is what I tried:

re.match("[a-zA-Z0-9_,/S]*$", char_s):

You need to anchor your pattern with ^ at the front so that a letter (not a number) occurs at the beginning...
– beroe, Commented Sep 30, 2013 at 23:17

p.s.w.g · Accepted Answer · 2013-09-30 22:47:52Z

4

A pattern like this should work:

^[a-zA-Z_][a-zA-Z0-9_]*$

Or more simply:

^(?!\d)\w+$

In both cases, it will match a string that consists of one or more letters, digits or underscores, as long as it doesn't start with a digit.

The (?!…) in the second pattern is a negative look-ahead assertion. It ensures the first character is not a digit. More information can be found in the manual.
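As a quick check of the two patterns above, here is a minimal sketch for Python 3. The helper name looks_like_name and the sample strings are illustrative and not part of the original answer. It passes re.ASCII so that \w and \d stay limited to the ASCII characters listed in the question.

```python
import re

# The two patterns from the answer. re.ASCII keeps \w and \d limited to
# [a-zA-Z0-9_] and [0-9]; without it, Python 3 str patterns also accept
# Unicode letters and digits.
EXPLICIT = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*$", re.ASCII)
LOOKAHEAD = re.compile(r"^(?!\d)\w+$", re.ASCII)


def looks_like_name(s):
    """Return True if both patterns accept the string (hypothetical helper)."""
    return bool(EXPLICIT.match(s)) and bool(LOOKAHEAD.match(s))


# Print the verdict for a few illustrative inputs.
for sample in ["foo", "_bar1", "1abc", "has space", "a-b", ""]:
    print(repr(sample), looks_like_name(sample))
```

One caveat: $ also matches just before a trailing newline, so "foo\n" would be accepted by either pattern. If that matters, re.fullmatch (without the ^ and $ anchors) is one way to avoid it.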

edited Sep 30, 2013 at 22:47

answered Sep 30, 2013 at 22:42


Emil Davtyan · Accepted Answer · 2013-09-30 23:23:27Z

3

On top of the regular expressions mentioned, you also need to make sure the string is not one of the reserved keywords (this is the Python 2 list):

and       del       from      not       while    
as        elif      global    or        with     
assert    else      if        pass      yield    
break     except    import    print              
class     exec      in        raise              
continue  finally   is        return             
def       for       lambda    try

So something like this:

import re

reserved = ["and", "del", "from", "not", "while", "as", "elif", "global", "or", "with", "assert", "else", "if", "pass", "yield", "break", "except", "import", "print", "class", "exec", "in", "raise", "continue", "finally", "is", "return", "def", "for", "lambda", "try"]

def is_valid(keyword):
    # Valid only if it is not a reserved word and it matches the identifier pattern
    return (keyword not in reserved and
            re.match(r"^(?!\d)\w+$", keyword) is not None)  # pattern from p.s.w.g's answer

Or, as @nofinator suggests, you can (and probably should) just use keyword.iskeyword().
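A minimal sketch of that variant, assuming Python 3 and swapping the hand-written list for keyword.iskeyword(). The names NAME_RE and is_valid_name are illustrative. Because the standard library tracks the keyword list for the running interpreter, "print" and "exec" are no longer rejected on Python 3.

```python
import keyword
import re

# Same look-ahead pattern as above; re.ASCII restricts \w and \d to ASCII.
NAME_RE = re.compile(r"^(?!\d)\w+$", re.ASCII)


def is_valid_name(candidate):
    """Hypothetical helper: pattern check plus the stdlib keyword check."""
    return bool(NAME_RE.match(candidate)) and not keyword.iskeyword(candidate)
```

For example, is_valid_name("total_1") is true, while is_valid_name("while") and is_valid_name("2x") are both false.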

edited Sep 30, 2013 at 23:23

answered Sep 30, 2013 at 22:50


3 Comments

nofinator Over a year ago

You could also use keyword.iskeyword(). See docs.python.org/2/library/keyword.html#keyword.iskeyword

Emil Davtyan Over a year ago

@nofinator Nice, did not know that.

SKTLZ Over a year ago

yeah this is what I used for keyword

John Kugelman · Accepted Answer · 2013-09-30 22:41:51Z

1

re.match(r"^[^\W\d]\w*$", char_s):

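The \w word character class is equivalent to [a-zA-Z0-9_] (in Python 3 this holds for str patterns only with the re.ASCII flag; otherwise \w also matches Unicode word characters). Identifiers cannot start with a digit, so match [^\W\d] (a word character that is not a digit) for the first character and \w* for the rest.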
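The double negative in [^\W\d] can be hard to read, so here is a small sketch (Python 3, with illustrative names FIRST_CHAR and accepted) that lists which printable ASCII characters the class accepts. It should come out as the letters plus the underscore.

```python
import re
import string

# "Not a non-word character and not a digit", i.e. a word character
# that is not a digit. re.ASCII keeps the class limited to ASCII.
FIRST_CHAR = re.compile(r"[^\W\d]", re.ASCII)

# Collect every printable ASCII character the class accepts on its own.
accepted = [c for c in string.printable if FIRST_CHAR.fullmatch(c)]
print("".join(accepted))
```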

answered Sep 30, 2013 at 22:41


Veedrac · Accepted Answer · 2013-09-30 23:39:37Z

1

The correct methods:

Python 2

import re
import keyword
import tokenize

re.match(tokenize.Name+"$", char_s) and not keyword.iskeyword(char_s)

Python 3

import keyword

char_s.isidentifier() and not keyword.iskeyword(char_s)

Note that the Python 2 method fails silently on Python 3: it runs without raising an error but can give wrong answers.

When you see this kind of question, the first thing you should ask is "how does Python do it?", because almost all of the time Python exposes a method for it to the user.
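Wrapped up as a function, the Python 3 approach might look like the sketch below. The function name is_valid_variable and the ascii_only parameter are assumptions added for illustration. str.isidentifier() accepts Unicode identifiers such as "café", which goes beyond the A-Z, a-z, 0-9 and _ set in the question, so the optional flag uses str.isascii() (Python 3.7+) to narrow it again.

```python
import keyword


def is_valid_variable(name, ascii_only=False):
    """Hypothetical helper built on str.isidentifier() and keyword.iskeyword()."""
    # Optionally reject anything outside ASCII before the identifier check.
    if ascii_only and not name.isascii():
        return False
    return name.isidentifier() and not keyword.iskeyword(name)
```

With this sketch, is_valid_variable("café") is true by default and false with ascii_only=True, while "for" and "1x" are rejected either way.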

answered Sep 30, 2013 at 23:39
