feat($parse): support unicode identifier names #3848
FYI, the Unicode code points are generated by this Python script:
Support Unicode identifier names as defined in Section 7.6, "Identifier Names and Identifiers", of the ECMAScript Language Specification (http://www.ecma-international.org/ecma-262/5.1/#sec-7.6), except for Unicode escape sequences, which are hard to implement without changing too much existing code. Closes #3847
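For readers unfamiliar with Section 7.6, here is a minimal sketch of what the rule amounts to. It is not the code in this PR, which ships generated code point tables; it assumes a modern engine with Unicode property escapes in regular expressions, and the function names are made up for illustration. Like the PR, it ignores Unicode escape sequences.

```javascript
// Illustrative only: ES5.1 section 7.6 expressed with Unicode property escapes.
// The PR itself uses generated code point tables instead of regexes.

// IdentifierStart: UnicodeLetter (Lu, Ll, Lt, Lm, Lo, Nl), "$" or "_".
const IDENT_START = /^[$_\p{Lu}\p{Ll}\p{Lt}\p{Lm}\p{Lo}\p{Nl}]$/u;

// IdentifierPart: IdentifierStart, combining marks (Mn, Mc), digits (Nd),
// connector punctuation (Pc), ZWNJ (U+200C) or ZWJ (U+200D).
const IDENT_PART =
  /^[$_\p{Lu}\p{Ll}\p{Lt}\p{Lm}\p{Lo}\p{Nl}\p{Mn}\p{Mc}\p{Nd}\p{Pc}\u200C\u200D]$/u;

// Hypothetical helper names, chosen for this sketch.
function isIdentStart(ch) {
  return IDENT_START.test(ch);
}

function isIdentPart(ch) {
  return IDENT_PART.test(ch);
}

// True if `name` is a non-empty IdentifierName (escape sequences not handled).
function isIdentifierName(name) {
  const chars = Array.from(name); // iterate by code point, not UTF-16 unit
  return chars.length > 0 &&
    isIdentStart(chars[0]) &&
    chars.slice(1).every(isIdentPart);
}
```

With this sketch, names such as `имя` or `이름` are accepted, while `1abc` and `a-b` are rejected.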
|
Sorry for repeatedly closing and reopening this. Travis is behaving strangely. Everything passes locally, and one of the builds has actually passed; the problem is that there are two builds for the same build number.
|
The overhead of this feature is now 3.2kb for minified code and 1.9kb for minified & gzipped code, which is about a 6% increase over the master branch.
|
The file size increase is a concern. I wonder if one could ship an "international" version of the library that supports Unicode identifiers? @IgorMinar - what do you think?
|
@clee704 - Can you ensure that you have signed the CLA? Thanks.
|
@petebacondarwin Yes, I have. My name is Choongmin Lee.
|
If the file size is a concern, maybe we could make a separate module for i18n and put this code there. I guess that would not increase the size of $parse much, though I'm not sure, as I'm completely new to how Angular is built.
This is not strictly correct, although it was not correct before this PR either. After a dot character, a new identifier should start, so after a dot we should check `isIdent()` instead of `ch == '.' || isIdentPart(ch) || isNumber(ch)`. If this eventually gets into the core, I'll fix it.
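To illustrate the point about dots, here is a rough sketch of the stricter rule: a dot is consumed only when the next character can start a new identifier. This is not the lexer code in `$parse`; the helper names and the ASCII-only character tests are assumptions made to keep the example short.

```javascript
// ASCII-only stand-ins for the lexer's character tests (assumed names).
function isIdentStart(ch) {
  return /^[A-Za-z_$]$/.test(ch);
}

function isIdentPart(ch) {
  return /^[A-Za-z0-9_$]$/.test(ch);
}

// Hypothetical helper: reads a dotted path such as "a.b.c" beginning at
// `start` and returns the text it consumed ('' if no identifier starts there).
function readDottedIdent(text, start) {
  let i = start;
  if (i >= text.length || !isIdentStart(text[i])) {
    return '';
  }
  i++;
  while (i < text.length) {
    const ch = text[i];
    if (isIdentPart(ch)) {
      i++;
    } else if (ch === '.' && i + 1 < text.length && isIdentStart(text[i + 1])) {
      // A dot is only part of the path if a new identifier starts after it.
      i += 2;
    } else {
      break;
    }
  }
  return text.slice(start, i);
}
```

Under this rule `a.b.c` is read as a whole, while for `a.1b` or `a..b` only `a` is consumed, whereas a check that accepts `.` or any identifier part or digit unconditionally would swallow the malformed tail as well.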
|
@clee704 - I think moving this into an optional |
|
I agree with @petebacondarwin that it is a good idea to make this (un)pluggable and maybe even customizable. It is ~4K of code that will be called a lot, and some people may not want to incur the speed and size penalty. For my project I only need 10 more letters; Russian needs 66 letters, and I guess the same holds for many European languages. These cases would have a considerably smaller footprint than the all-in-one version. I'm wondering what's holding this back from getting into the public release. //cc @clee704
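As a rough idea of what "customizable" could look like, here is a sketch in which the caller supplies only the extra letters it needs on top of ASCII. Nothing like `makeIdentTests` exists in Angular; the name and shape are invented for this example.

```javascript
// Hypothetical factory: ASCII identifier rules plus a caller-supplied set
// of extra letters. Not an Angular API.
function makeIdentTests(extraLetters) {
  const extra = new Set(Array.from(extraLetters));
  const isAsciiStart = (ch) => /^[A-Za-z_$]$/.test(ch);
  const isIdentStart = (ch) => isAsciiStart(ch) || extra.has(ch);
  const isIdentPart = (ch) => isIdentStart(ch) || /^[0-9]$/.test(ch);
  return { isIdentStart, isIdentPart };
}

// Russian alphabet: U+0410..U+044F (64 letters) plus Ё (U+0401) and ё (U+0451).
let russian = '\u0401\u0451';
for (let cp = 0x0410; cp <= 0x044F; cp++) {
  russian += String.fromCharCode(cp);
}

const ru = makeIdentTests(russian);
```

Here `ru.isIdentStart('ж')` is true while `ru.isIdentStart('é')` is false, so the table stays at 66 entries instead of covering every Unicode letter.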
|
I'm closing this in favor of #4747.