I need to determine if a string represents a valid python identifier. Since python 3 identifiers support obscure unicode functionality, and python syntax might change across releases, I decided to avoid manual parsing. Unfortunately my attempts at utilizing python's internal interfaces don't seem to work:
I. function compile
>>> string = "a = 5; b "
>>> test = "{} = 5"
>>> compile(test.format(string), "<string>", "exec")
<code object <module> at 0xb71b4d90, file "<string>", line 1>
Clearly test can't force compile to use ast.Name as the root of the AST.
Next I attempt using the modules ast and parser. These modules are intended to derive a string, rather than determining if a string matches a particular derivation, but I figure they might be helpful anyway.
II. module ast
>>> a=ast.Module(body=[ast.Expr(value=ast.Name(id='1a', ctx=ast.Load()))])
>>> af = ast.fix_missing_locations(a)
>>> c = compile(af, "<string>", "exec")
>>> exec(c)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 1, in <module>
NameError: name '1a' is not defined
OK, clearly Name isn't parsing '1a' for correctness. Perhaps this step happens earlier, in the parse phase.
III. module parser
>>> p = parser.suite("a")
>>> t = parser.st2tuple(p)
>>> t
(257, (268, (269, (270, (271, (272, (302, (306, (307, (308, (309, (312, (313, (314, (315, (316, (317, (318, (319, (320, (1, 'a')))))))))))))))))), (4, ''))), (4, ''), (0, ''))
>>>
>>> t = (257, (268, (269, (270, (271, (272, (302, (306, (307, (308, (309, (312, (313, (314, (315, (316, (317, (318, (319, (320, (1, '1a')))))))))))))))))), (4, ''))), (4, ''), (0, ''))
>>> p = parser.sequence2st(t)
>>> c = parser.compilest(p)
>>> exec(c)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<syntax-tree>", line 0, in <module>
NameError: name '1a' is not defined
OK, still not being checked... why? Quick check of python's full grammar specification shows that NAME is not defined. If these checks are performed by the bytecode compiler, shouldn't 1a have been caught?
I'm starting to suspect python exposes no functionality towards this goal. I'm also curious why some attempts failed.