
I have strings like this <p0=v0 p1=v1 p2=v2 ....> and I want to swap pX with vX to have something like <v0=p0 v1=p1 v2=p2 ....> using regexps. I want only pairs in <> to be swapped.

I wrote:

Pattern pattern = Pattern.compile("<(\\w*)=(\\w*)>");
Matcher matcher = pattern.matcher("<p1=v1>");
System.out.println(matcher.replaceAll("$2=$1"));

But it works only with a single pair pX=vX. Could someone explain how to write a regexp that works for multiple pairs?

  • Maybe get rid of < and > from pattern? Commented Feb 25, 2014 at 17:05
  • I want it to swap only those pairs between < and > Commented Feb 25, 2014 at 17:23
  • Your example input string contains only data inside <...>. Can there be x=y pairs outside of it, like <..>p4=v4, which you want to leave unswapped? Commented Feb 25, 2014 at 17:28

5 Answers

2

Simple, use groups:

String input = "<p0=v0 p1=v1 p2=v2>";
//                                   |group 1
//                                   ||matches "p" followed by one digit
//                                   ||      |... followed by "="
//                                   ||      ||group 2
//                                   ||      |||... followed by "v", followed by one digit
//                                   ||      |||          |replaces group 2 with group 1,
//                                   ||      |||          |re-writes "=" in the middle
System.out.println(input.replaceAll("(p[0-9])=(v[0-9])", "$2=$1"));

Output:

<v0=p0 v1=p1 v2=p2>

4 Comments

but your solution swaps all pairs, and I want it to swap only those pairs between < and >
Do you think he needs to use ordered 'p' and 'v' and digits constants, or \w variability as was in his regex?
Sorry, from my tablet right now - will take a closer look and adjust tomorrow.
Looks like the question is already answered now... Sorry I couldn't finish this.
0

You can use this pattern:

"((?:<|\\G(?<!\\A))\\s*)(p[0-9]+)(\\s*=\\s*)(v[0-9]+)"

To ensure that the pairs come after an opening angle bracket, the pattern starts with:

(?:<|\\G(?<!\\A))

That means: an opening angle bracket OR the position where the last match ended.

\\G is an anchor for the position immediately after the last match, or the beginning of the string if there has been no match yet (in other words, it is the position where the regex engine last stopped, which is zero at the start of the string). To avoid a match at the start of the string I added a negative lookbehind (?<!\\A), i.e. "not at the start of the string".

This trick forces each pair to be preceded by another pair or by a <.

example:

String subject = "p5=v5 <p0=v0 p1=v1 p2=v2 p3=v3> p4=v4";
String pattern = "((?:<|\\G(?<!\\A))\\s*)(p[0-9]+)(\\s*=\\s*)(v[0-9]+)";
String result = subject.replaceAll(pattern, "$1$4$3$2");
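
To see which pairs the \\G chain actually reaches, here is a small self-contained sketch that runs the pattern above on the same subject. The class and method names (GAnchorDemo, swap) are made up for illustration; the pattern and replacement are copied unchanged.

```java
class GAnchorDemo {
    // Pattern and replacement are the ones from the example above;
    // only the wrapper class and method names are hypothetical.
    static String swap(String subject) {
        String pattern = "((?:<|\\G(?<!\\A))\\s*)(p[0-9]+)(\\s*=\\s*)(v[0-9]+)";
        return subject.replaceAll(pattern, "$1$4$3$2");
    }

    public static void main(String[] args) {
        // The pair before "<" and the pair after ">" are not part of a
        // chain that starts at "<", so they stay as they are.
        System.out.println(swap("p5=v5 <p0=v0 p1=v1 p2=v2 p3=v3> p4=v4"));
        // prints: p5=v5 <v0=p0 v1=p1 v2=p2 v3=p3> p4=v4
    }
}
```

The chain stops at ">" because the next position reachable through \\G is not followed by a pair, and nothing after it is preceded by a <.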

If you need p and v to have the same number you can change it to:

String pattern = "((?:<|\\G(?<!\\A))\\s*)(p([0-9]+))(\\s*=\\s*)(v\\3)";
String result = subject.replaceAll(pattern, "$1$5$4$2");

If parts between angle brackets can contain other things (that are not pairs):

String pattern = "((?:<|\\G(?<!\\A))(?:[^\\s>]+\\s*)*?\\s*)(p([0-9]+))(\\s*=\\s*)(v\\3)";
String result = subject.replaceAll(pattern, "$1$5$4$2");

Note: all these patterns only check that there is an opening angle bracket; they don't check for a closing one. If a closing angle bracket is missing, pairs will be replaced until there are no more contiguous pairs (first two patterns), or until the next closing angle bracket or the end of the string (third pattern).

You can check for the presence of a closing angle bracket by adding (?=[^<>]*>) at the end of each pattern. However, this is costly, because the lookahead scans ahead to the closing bracket again for every pair. It is better to search for the parts between angle brackets with (?<=<)[^<>]++(?=>) and to perform the replacement of pairs in a callback function. You can take a look at this post to implement it.
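
A minimal sketch of that callback approach, assuming Java 9 or later, where Matcher.replaceAll accepts a function. This is not necessarily how the linked post does it; the class, method and constant names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketPairSwapper {
    // Text strictly between "<" and ">" (pattern taken from the paragraph above).
    private static final Pattern INSIDE = Pattern.compile("(?<=<)[^<>]++(?=>)");
    // A single pX=vY pair; the groups are p-part, separator, v-part.
    private static final Pattern PAIR = Pattern.compile("(p[0-9]+)(\\s*=\\s*)(v[0-9]+)");

    static String swapInsideBrackets(String subject) {
        // The callback receives each bracketed part and swaps the pairs in it.
        // quoteReplacement keeps "$" and "\" in the result from being read
        // as group references by the outer replaceAll.
        return INSIDE.matcher(subject).replaceAll(mr ->
                Matcher.quoteReplacement(PAIR.matcher(mr.group()).replaceAll("$3$2$1")));
    }

    public static void main(String[] args) {
        System.out.println(swapInsideBrackets("p5=v5 <p0=v0 p1=v1 p2=v2 p3=v3> p4=v4"));
        // prints: p5=v5 <v0=p0 v1=p1 v2=p2 v3=p3> p4=v4
    }
}
```

Because the outer pattern requires the closing >, a part with no closing bracket is left untouched.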

2 Comments

Works fine, but there is no explanation. I found a tutorial which is useful to understand your solution. javaranch.com/journal/2003/04/RegexTutorial.htm
@user3329098: Sorry but I have connection problems, I will add explanations.
0

Replacing everything between < and > (let's call it a tag) is, IMHO, not possible if the same pattern can occur outside the tag.

Instead of replacing everything at once, I'd go for two regexes:

String str = "<p1=v1 p2=v2> p3=v3 <p4=v4>";
Pattern insideTag = Pattern.compile("<(.+?)>");
Matcher m = insideTag.matcher(str);

while(m.find()) {
    str = str.replace(m.group(1), m.group(1).replaceAll("(\\w*)=(\\w*)", "$2=$1"));
}
System.out.println(str);

//prints: <v1=p1 v2=p2> p3=v3 <v4=p4>

The matcher grabs everything between < and > and for each match it replaces the content of the first capturing group with the swapped one on the original string, but only if it matches (\w*)=(\w*), of course.

Trying it with

<p1=v1 p2=v2 just some trash> p3=v3 <p4=v4>

gives the output

<v1=p1 v2=p2 just some trash> p3=v3 <v4=p4>
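
One thing to watch for: String.replace substitutes every occurrence of the literal text, so if the exact content of a tag also appears outside a tag (for example "<p1=v1> p1=v1"), the outside copy is swapped too. A possible variant, sketched here with hypothetical names, rebuilds the string by match position instead:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagSwapByPosition {
    private static final Pattern INSIDE_TAG = Pattern.compile("<(.+?)>");

    static String swapInTags(String str) {
        Matcher m = INSIDE_TAG.matcher(str);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the untouched text up to the tag content...
            out.append(str, last, m.start(1));
            // ...then the tag content with its pairs swapped.
            out.append(m.group(1).replaceAll("(\\w*)=(\\w*)", "$2=$1"));
            last = m.end(1);
        }
        // Copy whatever follows the last tag content.
        out.append(str, last, str.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(swapInTags("<p1=v1> p1=v1"));
        // prints: <v1=p1> p1=v1
    }
}
```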

0

This should work to swap only those pairs between < and >:

String string = "<p0=v0 p1=v1 p2=v2> a=b c=d xyz=abc <foo=bar baz=bat>";
Pattern pattern1 = Pattern.compile("<[^>]+>");
Pattern pattern2 = Pattern.compile("(\\w+)=(\\w+)");
Matcher matcher1 = pattern1.matcher(string);
StringBuffer sbuf = new StringBuffer();
while (matcher1.find()) {
    Matcher matcher2 = pattern2.matcher(matcher1.group());
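    // quoteReplacement keeps any '$' or '\' inside the tag from being read as a group reference
    matcher1.appendReplacement(sbuf, Matcher.quoteReplacement(matcher2.replaceAll("$2=$1")));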
}
matcher1.appendTail(sbuf);
System.out.println(sbuf);

OUTPUT:

<v0=p0 v1=p1 v2=p2> a=b c=d xyz=abc <bar=foo bat=baz>

0

Java's regex engine supports the \G anchor, so this will work for unnested <>'s:
Find: ((?:(?!\A|<)\G|<)[^<>]*?)(\w+)=(\w+)(?=[^<>]*?>)
Replace (globally): $1$3=$2

Regex explained

 (                     # (1 start)
      (?:
           (?! \A | < )
           \G                    # Start at last match
        |  
           <                     # Or, <
      )
      [^<>]*? 
 )                     # (1 end)
 ( \w+ )               # (2)
 =
 ( \w+ )               # (3)
 (?= [^<>]*? > )       # There must be a closing > ahead
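
For reference, java.util.regex does support \G, so the find/replace above can be tried directly in Java. This is only a sketch; the class and method names are hypothetical, and the pattern is the one above with backslashes doubled for a Java string literal.

```java
class UnnestedSwap {
    static String swap(String input) {
        // Same regex as the "Find" line, escaped for a Java string literal.
        String find = "((?:(?!\\A|<)\\G|<)[^<>]*?)(\\w+)=(\\w+)(?=[^<>]*?>)";
        return input.replaceAll(find, "$1$3=$2");
    }

    public static void main(String[] args) {
        System.out.println(swap("<p0=v0 p1=v1  p2=v2 ....>"));
        // prints: <v0=p0 v1=p1  v2=p2 ....>
        System.out.println(swap("a=b <p0=v0 p1=v1> c=d"));
        // prints: a=b <v0=p0 v1=p1> c=d
    }
}
```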

Perl test case

$/ = undef;
$str = <DATA>;
$str =~ s/((?:(?!\A|<)\G|<)[^<>]*?)(\w+)=(\w+)(?=[^<>]*?>)/$1$3=$2/g;
print $str;
__DATA__
<p0=v0 p1=v1  p2=v2 ....>

Output >>

<v0=p0 v1=p1  v2=p2 ....>
