
I'm trying to write a regex that will replace all invalid characters in a JavaScript variable name with underscores (in Java).

What I'm wanting to do is:

String jsVarName = "1inva>idName".replaceAll("[a-zA-Z_$][0-9a-zA-Z_$]", "_");

and end up with a variable named _inva_idName.

What I'm struggling to do is figure out how to make the first character different to the others.

[a-zA-Z_$][0-9a-zA-Z_$] are the characters I want, but I can't figure out how to hook them into the correct syntax. I know JS variable names can be full Unicode, but I only care about ASCII.

  • related: stackoverflow.com/questions/1661197/… Commented Oct 30, 2014 at 3:37
  • Since the title is somewhat confusing, note that a JavaScript variable name can contain far more characters than just 0-9a-zA-Z_$ Commented Aug 15, 2021 at 15:21

1 Answer

String jsVarName = "1inva>idName".replaceAll("^[^a-zA-Z_$]|[^0-9a-zA-Z_$]", "_");

Note that since \w is [a-zA-Z_0-9], it can be simplified:

String jsVarName = "1inva>idName".replaceAll("^[^a-zA-Z_$]|[^\\w$]", "_");

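^[^a-zA-Z_$] matches a single character that is not in [a-zA-Z_$] and sits at the very start of the input (in Java, ^ only matches at line starts when Pattern.MULTILINE is set). | is OR. [^0-9a-zA-Z_$] matches any character, at any position, that is not in [0-9a-zA-Z_$]. A leading digit passes the second alternative but is caught by the first, which is why the 1 becomes _.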
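If you call this a lot, it may be worth compiling the pattern once. Here is a minimal sketch of that, assuming ASCII-only names as in the question; the class name JsIdentifierSanitizer and the method sanitize are just illustrative names, not anything from a library:

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names here are made up for illustration.
final class JsIdentifierSanitizer {
    // First alternative: a first character that is not a letter, _ or $.
    // Second alternative: any character that is not a letter, digit, _ or $.
    private static final Pattern INVALID =
            Pattern.compile("^[^a-zA-Z_$]|[^\\w$]");

    private JsIdentifierSanitizer() {}

    // Replaces every invalid character with an underscore.
    static String sanitize(String name) {
        return INVALID.matcher(name).replaceAll("_");
    }

    public static void main(String[] args) {
        System.out.println(sanitize("1inva>idName")); // _inva_idName
        System.out.println(sanitize("valid_$name9")); // valid_$name9
    }
}
```

Note that this only fixes the characters. It does not check for reserved words such as var or class, and an empty string stays empty, so you may want to handle those cases separately.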

See regex tutorial for more info.


2 Comments

Oh you were talking in relation with my answer, sorry I'm a little slow atm. I thought the character class after | would also match the beginning of the string without negating the ^ but apparently it works without it.
Oh damn, I didn't realize the first group is contained in the second group. +1, your answer is much cleaner.
