I am using Java to parse through a JavaScript file. Because the scope is different than expected in the environment in which I am using it, I am trying to replace every instance of, e.g.,

test = value

with

window.test = value

Previously, I had just been using

writer.append(js.getSource().replaceAll("test", "window.test"));

which obviously isn't generalizable, but for a fixed dataset it was working fine.

However, in the new files I'm supposed to work with, an updated version of the old ones, I now have to deal with

window['test'] = value

and

([[test]])

I don't want to match test in either of those cases, and it seems like those are the only two cases where there's a new format. So my plan was to now do a regex to match anything except ' and [ as the first character. That would be ([^'\[])test; however, I don't actually want to replace the first character - just make sure it's not one of the two I don't want to match.

This was a new situation for me because I haven't worked with replacement with RegExps that much, just pattern matching. So I looked around and found what I thought was the solution, something called "non-capturing groups". The explanation on the Oracle page sounded like what I was looking for, but when I re-wrote my Regular Expression to be (?:[^'\\[])test, it just behaved exactly the same as if I hadn't changed anything - replacing the character preceding test. I looked around StackOverflow, but what I discovered just made me more confident that what I was doing should work.

What am I doing wrong that it's not working as expected? Am I misusing the pattern?

  • Can you post an SSCCE Commented Dec 10, 2012 at 19:56
  • regexplanet.com/advanced/java/index.html, along with the examples of Regexes, expressions to match, and the results, are an example. Commented Dec 10, 2012 at 19:58
  • You can refer to this question stackoverflow.com/questions/632204/… Commented Dec 10, 2012 at 19:59
  • That was exactly the example I was looking at; however, when I used it in replaceAll, it did not behave as I wanted. For example, if I wanted to replace "stackoverflow.com" with "google.com" and also wanted to catch ftp, I would use str.replaceAll("(?:http|ftp)://...", "google.com") but the result would just be "google.com" Commented Dec 10, 2012 at 20:04
  • Non-capturing groups ((?:...)) only affect the groups, not the match itself. See an excellent example at stackoverflow.com/questions/3512471/non-capturing-group, in which the non-capturing group matching http is still part of the overall match, but is not stored as a group. Commented Dec 10, 2012 at 20:04

2 Answers

If you include an expression for the character in your regex, it will be part of what is matched.

The trick is to use what you match in the replacement String, so you replace that bit by itself.

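Try:

replaceAll("([^'\\[])test", "$1window.test");

The $1 in the replacement String is a back reference to what capturing group 1 matched. In this case that is the character preceding test. Note that in a Java string literal the backslash must be doubled, so the regex \[ is written as \\[.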
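As a rough illustration of the idea above, here is a minimal sketch that compares the capturing-group approach with a negative lookbehind, which checks the preceding character without consuming it. The class name, method names, and sample input are made up for this example, and the lookbehind variant is an alternative not proposed in the original answer.

```java
public class PrefixReplaceDemo {
    // Capturing group: the preceding character is consumed by the match,
    // then written back into the result through the $1 back reference.
    static String withBackreference(String src) {
        return src.replaceAll("([^'\\[])test", "$1window.test");
    }

    // Negative lookbehind (alternative, not from the answer): asserts that the
    // preceding character is neither ' nor [ without making it part of the match.
    static String withLookbehind(String src) {
        return src.replaceAll("(?<!['\\[])test", "window.test");
    }

    public static void main(String[] args) {
        // Hypothetical sample input covering the three cases from the question.
        String src = "x = test.a; window['test'] = 1; ([[test]])";
        System.out.println(withBackreference(src));
        System.out.println(withLookbehind(src));
        // Both print: x = window.test.a; window['test'] = 1; ([[test]])
    }
}
```

One difference worth knowing: the capturing-group pattern needs a character in front of test, so it does not match when test is the very first thing in the input, whereas the lookbehind version does.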

Why not simply match on "(test)(\s*)=(\s*)([\w\d]+)"? That way you only match "test", then optional whitespace, followed by an '=' sign, optional whitespace, and a value (in this case consisting of digits, letters, and the underscore character). You can then use the groups (between parentheses) to copy the value, and even the whitespace if required, to your new text.
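A minimal sketch of that suggestion, assuming the replacement should simply prefix the matched name with window. and copy everything else through the groups. The class and method names are invented for this example.

```java
public class AssignmentReplaceDemo {
    // Groups: 1 = "test", 2 = whitespace before '=', 3 = whitespace after '=',
    // 4 = the assigned value (word characters only).
    static String prefixAssignments(String src) {
        return src.replaceAll("(test)(\\s*)=(\\s*)([\\w\\d]+)", "window.$1$2=$3$4");
    }

    public static void main(String[] args) {
        System.out.println(prefixAssignments("test = value"));           // window.test = value
        System.out.println(prefixAssignments("window['test'] = value")); // unchanged
        System.out.println(prefixAssignments("([[test]])"));             // unchanged
    }
}
```

Because the pattern requires an '=' right after test (apart from whitespace), it skips window['test'] and ([[test]]), but it also leaves property accesses such as test.n = 5 untouched.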

2 Comments

The example I gave isn't really comprehensive - there are also places with, e.g., test.n = 5 or x = test.a.b.c.d.substring(4, 2);. In that instance I would want it to become window.test.a.b.c.d...
OK, that's fine then. I just wanted to mention that it is sometimes easier to match what you actually want to match - the opposite regex, more or less.
