10

I have this homework problem where I need to use regex to remove every other character in a string.

In one part, I have to delete characters at index 1,3,5,... I have done this as follows:

String s = "1a2b3c4d5";
System.out.println(s.replaceAll("(.).", "$1"));

This prints 12345, which is what I want. Essentially, I match two characters at a time and replace them with the first character, which I capture in a group.

The problem is, I'm having trouble with the second part of the homework, where I need to delete characters at index 0,2,4,...

I have done the following:

String s = "1a2b3c4d5";
System.out.println(s.replaceAll(".(.)", "$1"));

This prints abcd5, but the correct answer must be abcd. My regex is only incorrect if the input string length is odd. If it's even, then my regex works fine.

I think I'm really close to the answer, but I'm not sure how to fix it.

1
  • I'm not sure enough to post this as an answer, and I have no way of testing it, but I wonder if inserting a ? quantifier would work? That is, something like System.out.println(s.replaceAll(".(.)?", "$1")); (EDIT: Dang it, I hate being right and not being sure!) Commented Jul 2, 2010 at 14:07

3 Answers

20

You are indeed very close to the answer: just make matching the second char optional.

String s = "1a2b3c4d5";
System.out.println(s.replaceAll(".(.)?", "$1"));
// prints "abcd"

This works because:

  • Quantifiers are greedy by default, so the ? will take the second character if it's there
    • When the input is of odd length, the second char won't be there at the last replacement, but you'd still match one char (i.e. last char in input)
  • You can still use backreferences in substitution even if the group fails to match
    • It will substitute in the empty string, not "null"
    • This is different from Matcher.group(int), which returns null for failed groups


A closer look at the first part

Let's take a closer look at the first part of the homework:

String s = "1a2b3c4d5";
System.out.println(s.replaceAll("(.).", "$1"));
// prints "12345"

Here you didn't use ? for the second char, yet it still "works": with odd-length input the last char is never matched, but it doesn't have to be! It sits at an even index, so the problem specification wants it left in place, unmatched and unreplaced.

Now suppose that we want to delete chars at index 1,3,5..., and put the chars at index 0,2,4... in brackets.

String s = "1a2b3c4d5";
System.out.println(s.replaceAll("(.).", "($1)"));
// prints "(1)(2)(3)(4)5"

A-ha!! Now you're experiencing the exact same problem with odd-length input! You couldn't match the last char with your regex, because your regex needs two chars, but there's only one char at the end for odd-length input!

The solution, again, is to make matching the second char optional:

String s = "1a2b3c4d5";
System.out.println(s.replaceAll("(.).?", "($1)"));
// prints "(1)(2)(3)(4)(5)"
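To see that the optional second char handles both parities, the same replacement can be run on an odd-length and an even-length input. This is a small sketch; BracketDemo and bracketEvenIndices are hypothetical names chosen for the example.

```java
// Hypothetical demo class; the names are illustrative, not from the question.
class BracketDemo {

    // Wraps each char at an even index in brackets and drops the char after it, if any.
    static String bracketEvenIndices(String input) {
        return input.replaceAll("(.).?", "($1)");
    }

    public static void main(String[] args) {
        System.out.println(bracketEvenIndices("1a2b3c4d5")); // prints "(1)(2)(3)(4)(5)"
        System.out.println(bracketEvenIndices("1a2b3c4d"));  // prints "(1)(2)(3)(4)"
    }
}
```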

2 Comments

I haven't tested this, but does the first part of your homework work for both even and odd lengths? You might need this fix on the first part too, for the same reason. I'm not sure, though.
@glowcoder: Great comment! I've delved deeper into the first part thanks to your remark!
2

my regex is only incorrect if the input string length is odd. if it's even, then my regex works fine.

Change your expression to .(.)? - the question mark makes the second character optional, so it doesn't matter whether the input length is odd or even

0

Your regex needs 2 chars to match, so it fails to match the final char.

This regex:

".(.{0,1})"

will make the second char optional, so it will match your final '5' as well.
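As a rough check, the {0,1} form can be compared with the ? form from the other answers. The class and method names below are hypothetical. One difference worth noting: in ".(.{0,1})" the quantifier sits inside the group, so the group always takes part in the match and can capture the empty string, whereas in ".(.)?" the whole group is optional and can be left unset.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative, not from the question.
class QuantifierComparison {

    // The form from the other answers: the whole group is optional.
    static String withQuestionMark(String input) {
        return input.replaceAll(".(.)?", "$1");
    }

    // The form from this answer: the group is mandatory but may capture nothing.
    static String withBraces(String input) {
        return input.replaceAll(".(.{0,1})", "$1");
    }

    // group(1) when the given regex is applied to the single-char input "5".
    static String groupOneOnSingleChar(String regex) {
        Matcher m = Pattern.compile(regex).matcher("5");
        return m.find() ? m.group(1) : "no match";
    }

    public static void main(String[] args) {
        System.out.println(withQuestionMark("1a2b3c4d5"));      // prints "abcd"
        System.out.println(withBraces("1a2b3c4d5"));            // prints "abcd"
        System.out.println(groupOneOnSingleChar(".(.)?"));      // prints "null"
        System.out.println(groupOneOnSingleChar(".(.{0,1})").isEmpty()); // prints "true"
    }
}
```

For replaceAll the two forms give the same result here, because an unset group and an empty capture both substitute nothing for "$1".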

3 Comments

Isn't {0,1} just a confusing, unnecessary way to write ?
It's a different way to write it, but I'd argue with confusing and unnecessary
