3

I have this regex:

"((\\s{0,2})\\p{XDigit}{2}\\s{0,2})*"

The user can select the matching string from a byte dump, like this: [image]

The selection above should be allowed, but selecting half of a byte should not be, like this: [image]

Spaces at the beginning or end should not be a problem, like this: [image]

The problem with the given regex is that it takes far too long to match. What is causing this, and what can I improve?

Edit:

I built a solution for this case. The only thing I need to check is the beginning and the end of the string: I remove the leading and trailing spaces, split the string, and check whether the first or last element of the split has length 1. I am splitting it anyway because I parse it into a byte array afterwards.

        String selection = dumpText.getSelectionText();

        //remove spaces at the beginning (the isEmpty check avoids an exception on all-space input)
        while(!selection.isEmpty() && selection.charAt(0) == ' '){
            selection = selection.substring(1);
        }

        //remove spaces at the end
        while(!selection.isEmpty() && selection.charAt(selection.length()-1) == ' '){
            selection = selection.substring(0, selection.length()-1);
        }

        //nothing but spaces was selected
        if (selection.isEmpty()){
            return;
        }

        String[] splitted = selection.split("\\s{1,2}");

        //reject selections that start or end with half a byte
        if(splitted.length == 0 || splitted[0].length()==1 || splitted[splitted.length-1].length()==1){
            return;
        }
2
  • Can you be more specific? What takes too long, the matching? Do you perform this operation a lot? Is the string too big? If it is performed often, you can compile the pattern once and just create a matcher from it each time (if you don't do that already). Commented Aug 19, 2014 at 7:27
  • Or don't use split; use the last pattern in my answer, which returns an array of all two-character occurrences. Check Demo 3. Commented Aug 19, 2014 at 9:01

4 Answers

3

When you are checking something this simple, a basic string comparison will be more efficient. In this case you are only interested in the first 2 and last 2 characters.

So you could test only those (after validating the length):

(s.charAt(0) != ' ' && s.charAt(1) == ' ')
    || (s.charAt(s.length() - 1) != ' ' && s.charAt(s.length() - 2) == ' ')

Although this isn't as fancy, it will be very fast. The expression is true when the selection starts with a single character followed by a space, or ends with a space followed by a single character, which means half a byte was selected at that end.

This only works for basic validation though.
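
A minimal sketch of that check as a complete method, under a few assumptions: the class and method names are made up for illustration, leading and trailing whitespace is trimmed first (the question says it should be tolerated), and bytes are separated by plain spaces. It does not verify that the characters are hex digits.

```java
final class EdgeCheck {
    /**
     * Hypothetical helper: returns true if the selection neither starts
     * nor ends in the middle of a byte. Only the two characters at each
     * edge are inspected; the content in between is not validated.
     */
    static boolean hasWholeBytesAtEdges(String selection) {
        // Leading/trailing whitespace is allowed, so drop it first.
        String s = selection.trim();
        if (s.length() < 2) {
            return false; // empty, or a single digit: no complete byte
        }
        // After trimming, the first and last characters are never spaces,
        // so a space in the second or second-to-last position means the
        // edge holds only one digit.
        boolean loneFirst = s.charAt(1) == ' ';
        boolean loneLast = s.charAt(s.length() - 2) == ' ';
        return !loneFirst && !loneLast;
    }
}
```

For example, `EdgeCheck.hasWholeBytesAtEdges(" 12 AB ")` accepts the selection, while `"2 AB"` and `"12 A"` are rejected.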


1

Try this pattern:

\s{0,2}(?:\p{XDigit}{2}\s{0,2})*

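You're experiencing catastrophic backtracking: in your pattern, the spaces between two bytes can be consumed either by the trailing \s{0,2} of one repetition or by the leading (\s{0,2}) of the next, so a string that ultimately fails to match can be attempted in exponentially many ways before the engine gives up.
The pattern as I've written it is basically the same, but should have only one way of matching the selection:

  • \s{0,2} - optional leading spaces
  • (?:\p{XDigit}{2}\s{0,2})* - zero or more hexadecimal pairs, each followed by optional spaces.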

Note that it is possible for this pattern to match hexadecimal digits without spaces, like 12AB, but it should work for your use case anyway.
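
As a rough sketch of how this pattern could be used from Java (the class and method names here are hypothetical, not from the question), the pattern is compiled once into a constant and applied with `matches()`, which requires the whole selection to match:

```java
import java.util.regex.Pattern;

final class DumpSelectionPattern {
    // Compiled once and reused, so the pattern is not re-parsed per selection.
    private static final Pattern WHOLE_BYTES =
            Pattern.compile("\\s{0,2}(?:\\p{XDigit}{2}\\s{0,2})*");

    /** Hypothetical helper: true if the whole selection consists of complete hex pairs. */
    static boolean isWholeBytes(String selection) {
        // matches() anchors at both ends, so a dangling half byte fails.
        return WHOLE_BYTES.matcher(selection).matches();
    }
}
```

Note that an empty selection also matches, because both parts of the pattern are optional; if that is not wanted, it needs a separate check.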


0

I wouldn't try to match leading or trailing spaces at all, and would keep the regex as simple as this, using word boundaries:

\\b\\p{XDigit}{2}\\b

Use this regex with Matcher#find to match each byte individually.

- RegEx Demo
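
A possible way to combine this with the parsing step mentioned in the question is sketched below; `ByteExtractor` and `extractBytes` are illustrative names only. One thing to be aware of with this approach: a lone digit such as the `1` in `1 23` is silently skipped rather than reported, and pairs written without spaces (`12AB`) are not matched at all, because there is no word boundary between them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ByteExtractor {
    private static final Pattern BYTE = Pattern.compile("\\b\\p{XDigit}{2}\\b");

    /** Hypothetical helper: collects every standalone hex pair as a byte. */
    static byte[] extractBytes(String selection) {
        Matcher m = BYTE.matcher(selection);
        List<Integer> values = new ArrayList<>();
        while (m.find()) {
            // Parse as int first: values above 7F do not fit Byte.parseByte.
            values.add(Integer.parseInt(m.group(), 16));
        }
        byte[] out = new byte[values.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = values.get(i).byteValue();
        }
        return out;
    }
}
```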


0

Another solution: just check whether there is any single character surrounded by spaces.

/^([a-zA-Z0-9]\s+)|(\s+[a-zA-Z0-9]\s+)|(\s+[a-zA-Z0-9])$/gm

Or something like this, in order to match a single character at the beginning or at the end of the sequence

/^([a-zA-Z0-9]\s+)|(\s+[a-zA-Z0-9])$/gm

Or this one, it returns only two character occurrences

/(?:\s*)([a-zA-Z0-9]{2})(?:\s*)/gm

Demo 1 | Demo 2 | Demo 3

Side note: In this case you may use \p{XDigit} instead of [a-zA-Z0-9] as well
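
The patterns above are written in JavaScript-style /.../ notation. A hedged Java translation of the second one (a single character at the beginning or at the end) might look like the sketch below. The names are hypothetical, \p{XDigit} is swapped in as the side note suggests, and the selection is trimmed first because the pattern expects the lone digit to sit directly at the edge. A selection consisting of just one digit is not caught by this pattern and would need its own check.

```java
import java.util.regex.Pattern;

final class HalfByteDetector {
    // Lone hex digit followed by whitespace at the start,
    // or whitespace followed by a lone hex digit at the end.
    private static final Pattern LONE_DIGIT_AT_EDGE =
            Pattern.compile("^\\p{XDigit}\\s+|\\s+\\p{XDigit}$");

    /** Hypothetical helper: true if the selection starts or ends with half a byte. */
    static boolean startsOrEndsWithHalfByte(String selection) {
        String s = selection.trim();
        // find() is used because only the edges need to match, not the whole string.
        return LONE_DIGIT_AT_EDGE.matcher(s).find();
    }
}
```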

1 Comment

That is a good idea. If I understand correctly, the OP attempts to match the selected text only (text that the user selected), out of context in the whole string. It may not have spaces, but this should work: \b\p{XDigit}\b.
