
I'm trying to use Java's Pattern and Matcher to apply input checks. I have a really basic version working, which I am happy with so far: it applies a regex to an argument and then loops through the matching characters.

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class RegexUtil {

   public static void main(String[] args) {

      String    argument;
      Pattern   pattern;
      Matcher   matcher;

      argument = "#a1^b2";
      pattern = Pattern.compile("[a-zA-Z]|[0-9]|\\s");
      matcher = pattern.matcher(argument);

      // find all matching characters
      while(matcher.find()) {
         System.out.println(matcher.group());
      }

   }

}

This is fine for extracting all the good characters; I get the output:

a
1
b
2

Now I want to know whether it's possible to do the same for the characters that don't match the regex, so that I get the output:

#
^

Or, better yet, loop through the argument and get a true or false flag for each index:

false
true
true
false
true
true

I only know how to loop through with matcher.find(). Any help would be greatly appreciated.

  • [^a-zA-Z0-9] will give you all characters not matching the range so: # ^ Commented Mar 9, 2018 at 13:32
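
The negated character class from the comment above could be used roughly as follows. This is only a minimal sketch, not the commenter's code: the class name NonMatchingChars and the helper nonMatching are hypothetical, and \s is added to the negated class on the assumption that it should stay the exact complement of the pattern in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonMatchingChars {

    // Assumed complement of the question's whitelist [a-zA-Z]|[0-9]|\s
    private static final Pattern REJECTED = Pattern.compile("[^a-zA-Z0-9\\s]");

    // Hypothetical helper: collects every rejected character, in input order.
    static List<String> nonMatching(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = REJECTED.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints "#" and "^" on separate lines for the question's input.
        for (String s : nonMatching("#a1^b2")) {
            System.out.println(s);
        }
    }
}
```

This gives the non-matching characters but not the per-index flags, so it only covers the first half of the question.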

5 Answers


You may add a |(.) alternative to your pattern (to match any char but a line break char) and check if Group 1 matched upon each match. If yes, output false, else, output true:

String argument = "#a1^b2";
Pattern pattern = Pattern.compile("[a-zA-Z]|[0-9]|\\s|(.)"); // or "[a-zA-Z0-9\\s]|(.)"
Matcher matcher = pattern.matcher(argument);

while (matcher.find()) {                          // visit every character in turn
    // Group 1 is null when the whitelist part matched, so this prints true for "good" chars
    System.out.println(matcher.group(1) == null);
}

See the Java demo, output:

false
true
true
false
true
true

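Note that you do not need Pattern.DOTALL here, because \s in the "whitelist" part of your pattern already matches the common line break characters (\n and \r), so they never reach the (.) alternative. Rarer line terminators such as \u0085, \u2028 and \u2029 are matched by neither \s nor . under the default flags, so they would simply be skipped.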
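
For illustration, here is a self-contained sketch of the capture-group approach that also exercises a line break. The class name CharFlags and the method flags are hypothetical names, and the pattern is the shorter "[a-zA-Z0-9\\s]|(.)" variant mentioned in the code comment above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharFlags {

    // Whitelist first; the trailing (.) captures any other char except line terminators.
    private static final Pattern TOKENS = Pattern.compile("[a-zA-Z0-9\\s]|(.)");

    // Hypothetical helper: one flag per matched character, true when whitelisted.
    static List<Boolean> flags(String input) {
        List<Boolean> result = new ArrayList<>();
        Matcher matcher = TOKENS.matcher(input);
        while (matcher.find()) {
            // Group 1 stays null when the whitelist alternative matched.
            result.add(matcher.group(1) == null);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(flags("#a1^b2")); // [false, true, true, false, true, true]
        System.out.println(flags("a\n#"));   // [true, true, false]
    }
}
```

In the second call the \n is consumed by \s in the whitelist, which is why it is reported as true without DOTALL.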


7 Comments

Check condition to print boolean can be simplified as basic while (matcher.find()) { System.out.println(matcher.group(1) == null); }
@azro Yes, I just thought that a custom string might be necessary, if true or false are the only values needed then sure.
@Trent Because Group 1 will hold them. The regex is basically tokenizing a string into 2 types of chars, those you want to be labelled as true and those that must be false.
@Trent Sorry, I do not get you. If you just want to print the match print it with matcher.group(0).
It's OK, I'm going to tinker a bit, but this answer is perfect for what I need. Thanks a bunch.

Why not simply remove all matching chars from your string, so that only the non-matching ones are left:

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class RegexUtil {

   public static void main(String[] args) {

      String    argument;
      Pattern   pattern;
      Matcher   matcher;

      argument = "#a1^b2";
      pattern = Pattern.compile("[a-zA-Z]|[0-9]|\\s");
      matcher = pattern.matcher(argument);

      // find all matching characters
      while(matcher.find()) {
         System.out.println(matcher.group());
         argument = argument.replace(matcher.group(), "");
      }

      System.out.println("argument: " + argument);

   }

}
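
The same result can likely be reached in a single call with Matcher.replaceAll, which avoids rebuilding the string inside the loop. A minimal sketch under that assumption; StripMatching and stripAllowed are illustrative names, not part of the answer above.

```java
import java.util.regex.Pattern;

class StripMatching {

    // Same whitelist pattern as in the question.
    private static final Pattern ALLOWED = Pattern.compile("[a-zA-Z]|[0-9]|\\s");

    // Hypothetical helper: deletes every allowed character, leaving only the rejected ones.
    static String stripAllowed(String input) {
        return ALLOWED.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println("argument: " + stripAllowed("#a1^b2")); // argument: #^
    }
}
```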


You can iterate over each char of the String and check them one by one:

//for-each loop, shorter way
for (char c : argument.toCharArray()){
    System.out.println(pattern.matcher(c + "").matches());
}

or

//classic for-i loop, with details
for (int i = 0; i < argument.length(); i++) {
    String charAtI = argument.charAt(i) + "";
    boolean doesMatch = pattern.matcher(charAtI).matches();
    System.out.println(doesMatch);
}

Also, when you don't need separate declarations, you can declare a variable and assign its value in the same statement:

String argument = "#a1^b2";
Pattern pattern = Pattern.compile("[a-zA-Z]|[0-9]|\\s");


Track positions, and for each match, print the characters between last match and current match.

int pos = 0;
while (matcher.find()) {
    for (int i = pos; i < matcher.start(); i++) {
        System.out.println(argument.charAt(i));
    }
    pos = matcher.end();
}
// Print any trailing characters after last match.
for (int i = pos; i < argument.length(); i++) {
    System.out.println(argument.charAt(i));
}
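
The same position tracking can also produce the per-index true/false flags asked for in the question. The following is a sketch only; IndexFlags and matchedIndexes are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IndexFlags {

    // Hypothetical helper: flags[i] is true when index i lies inside some match.
    static boolean[] matchedIndexes(Pattern pattern, String input) {
        boolean[] flags = new boolean[input.length()]; // all false initially
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            // Mark every index covered by this match (end() is exclusive).
            for (int i = matcher.start(); i < matcher.end(); i++) {
                flags[i] = true;
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("[a-zA-Z]|[0-9]|\\s");
        System.out.println(Arrays.toString(matchedIndexes(pattern, "#a1^b2")));
        // [false, true, true, false, true, true]
    }
}
```

Because the flags are indexed by position rather than by match, this variant would keep working if the pattern were later changed to match more than one character at a time.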


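One solution is:

    // Requires: java.util.ArrayList, java.util.Arrays, java.util.List,
    //           java.util.regex.Matcher, java.util.regex.Pattern
    String  argument;
    Pattern pattern;
    Matcher matcher;

    argument = "#a1^b2";
    // Split the argument into one-character strings
    List<String> charList = Arrays.asList(argument.split(""));
    pattern = Pattern.compile("[a-zA-Z]|[0-9]|\\s");
    matcher = pattern.matcher(argument);
    // Holds the characters that DID match the pattern
    ArrayList<String> matchedCharList = new ArrayList<>();
    // find all matching characters
    while (matcher.find()) {
        matchedCharList.add(matcher.group());
    }
    // Print true for each character that matched, false otherwise
    for (String charr : charList) {
        System.out.println(matchedCharList.contains(charr));
    }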

Output

false
true
true
false
true
true
