1

I have a regex which finds strings that contain exactly one occurrence of a given character, say P. This works when matching against a single string that contains no whitespace

e.g.

APAXA

The regex being ^[^P]*P[^P]*$

It picks this string out fine. However, what if I have a string like

XPA  DREP EDS

What would be the regex to identify all strings in one line that match the condition (strings are always separated by some kind of whitespace: tab, space, etc.)?

e.g. how would I highlight XPA and DREP?

I am using while(m.find()) to loop over the matches and System.out.println(m.group()) to print each one,

so m.group() has to contain the entire string.

1
  • What type of data is this? Just uppercase ASCII letters and ASCII spaces only? Commented Jan 20, 2011 at 14:39

6 Answers

2

Split it by whitespace and then check each token against your existing regex.
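
A minimal sketch of this approach, reusing the pattern from the question unchanged. The class name SplitThenMatch and the method name tokensWithOneP are made up for illustration and do not come from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class SplitThenMatch {
    // The asker's original pattern: the whole token must contain exactly one P.
    private static final Pattern EXACTLY_ONE_P = Pattern.compile("^[^P]*P[^P]*$");

    // Hypothetical helper: returns every whitespace-separated token with exactly one P.
    static List<String> tokensWithOneP(String line) {
        List<String> hits = new ArrayList<>();
        // trim() avoids an empty leading token when the line starts with whitespace
        for (String token : line.trim().split("\\s+")) {
            if (EXACTLY_ONE_P.matcher(token).matches()) {
                hits.add(token);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(tokensWithOneP("XPA  DREP EDS")); // prints [XPA, DREP]
    }
}
```

Because each token is tested on its own, the ^ and $ anchors keep working and no lookarounds are needed.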


1 Comment

That won't find DREP as the whitespace is part of the match condition.
1

Why must it be an overly complicated regex?

String string = "XPA  DREP EDS";
String[] s = string.split("\\s+");  // split on runs of whitespace
for (String str : s) {
    if (str.contains("P")) {        // true for words with at least one P
        System.out.println(str);
    }
}

Comments

0

You can try using the \s pattern (matches whitespace). Look at this regexp page for Java.

1 Comment

You mean match ASCII whitespace, as opposed to Unicode whitespace.
0

The regex being ^[^P]*P[^P]*$

Such a regex finds only strings containing exactly one P, which may or may not be what you want. I suppose you want .*P.* instead.

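For finding all words containing at least one P you can use \\S+P\\S+, where \S stands for a non-whitespace character. You may consider \w instead. Note that + requires at least one character on each side of the P; \\S*P\\S* also matches words that start or end with P.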

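For finding all words containing exactly one P you can use [^\\sP]+P[^\\sP]+(?=\\s), which is more complicated. Here, \s stands for a whitespace character, [^abc] matches everything except a, b, and c, and (?=...) is a lookahead. Without the lookahead, the pattern would match "APB" inside "APBPC". Note that the + quantifiers require at least one character on each side of the P, and the lookahead requires whitespace after the word, so words that start or end with P, or that sit at the very end of the input, are not matched.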
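
One possible variant that guards both sides of the word, sketched under the assumption that a "word" is any run of non-whitespace characters. It swaps the + quantifiers for * and uses the negative lookarounds (?<!\S) and (?!\S) in place of (?=\s), so it is a different pattern from the one above. The class name ExactlyOnePWords and the method name find are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExactlyOnePWords {
    // (?<!\S) : no non-whitespace character directly before the match
    // (?!\S)  : no non-whitespace character directly after the match
    // Both also succeed at the very start and end of the input.
    private static final Pattern WORD_WITH_ONE_P =
            Pattern.compile("(?<!\\S)[^\\sP]*P[^\\sP]*(?!\\S)");

    // Hypothetical helper: collects every whole word that contains exactly one P.
    static List<String> find(String line) {
        List<String> hits = new ArrayList<>();
        Matcher m = WORD_WITH_ONE_P.matcher(line);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(find("XPA  DREP EDS")); // prints [XPA, DREP]
        System.out.println(find("APBPC PX"));      // prints [PX]
    }
}
```

The two lookarounds force every match to span a whole word, so a fragment such as "APB" inside "APBPC" is rejected.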

1 Comment

You're wrong, or do you really mean the following is ASCII? final String s = "Příliš žluťoučký kůň úpěl ďábelské ódy"; final Pattern p = Pattern.compile("\\S+l\\S+"); final Matcher m = p.matcher(s); while (m.find()) System.out.println(m.group());
0

Try adding whitespace characters (\s) in your negated character classes, and you'll also want to remove the ^ and $ anchors:

[^P\s]*P[^P\s]*

or as a Java String literal:

"[^P\\s]*P[^P\\s]*"

Note that by default \s matches only ASCII whitespace in Java, so the above does not treat non-ASCII Unicode whitespace as a separator (as tchrist mentioned in the comments).
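
A small sketch of that pattern in a find() loop, with and without Pattern.UNICODE_CHARACTER_CLASS (available since Java 7), which makes \s follow the Unicode White_Space property. The class name UnicodeWhitespaceDemo, the method name find, and the EM SPACE test string are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeWhitespaceDemo {
    // The pattern from this answer, as a Java string literal.
    private static final String REGEX = "[^P\\s]*P[^P\\s]*";

    // Hypothetical helper: runs the pattern with the given flags and collects all matches.
    static List<String> find(String line, int flags) {
        Matcher m = Pattern.compile(REGEX, flags).matcher(line);
        List<String> hits = new ArrayList<>();
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // ASCII spaces: the default behaviour of \s is enough.
        System.out.println(find("XPA  DREP EDS", 0)); // prints [XPA, DREP]

        // U+2003 EM SPACE is not matched by the default \s, so the two
        // words are not separated unless the Unicode flag is set.
        String emSpaced = "XPA\u2003DREP";
        System.out.println(find(emSpaced, Pattern.UNICODE_CHARACTER_CLASS)); // prints [XPA, DREP]
    }
}
```

Without the flag, the EM SPACE is consumed by [^P\s]* like any other character, so the first match runs across both words.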

1 Comment

With the proviso that that’s only going to work on ASCII characters, not non-ASCII Unicode characters.
0
\b[^P\s]*P[^P\s]*\b

will match all words that contain exactly one P. Don't forget to double the backslashes when constructing your regex from a Java string.

Explanation:

\b      # Assert position at start/end of a word
[^P\s]* # Match any number of characters except P and whitespace
P       # Match a P
[^P\s]* # Match any number of characters except P and whitespace
\b      # Assert position at start/end of a word
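
The commented layout above can be kept almost verbatim in Java by compiling with Pattern.COMMENTS, which ignores whitespace in the pattern and treats # as the start of a comment. This is only a sketch: the class name WordBoundaryDemo and the method name find are hypothetical, and the ASCII-only behaviour of \b and \s still applies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordBoundaryDemo {
    // Backslashes are doubled because this is a Java string literal.
    private static final Pattern ONE_P_WORD = Pattern.compile(
            "\\b        # assert position at start/end of a word\n"
          + "[^P\\s]*   # any number of characters except P and whitespace\n"
          + "P          # a single P\n"
          + "[^P\\s]*   # any number of characters except P and whitespace\n"
          + "\\b        # assert position at start/end of a word\n",
            Pattern.COMMENTS);

    // Hypothetical helper: collects every match of the pattern in the line.
    static List<String> find(String line) {
        List<String> hits = new ArrayList<>();
        Matcher m = ONE_P_WORD.matcher(line);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(find("XPA  DREP EDS")); // prints [XPA, DREP]
        System.out.println(find("APBPC"));         // prints []
    }
}
```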

Please note that \b doesn't match all word boundaries correctly when dealing with Unicode strings (thanks tchrist for reminding me). If that is the case for you, you might want to replace the \bs with (don't look):

(?:(?<=[\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]])(?![\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]])|(?<![\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]])(?=[\pL\pM\p{Nd}\p{Nl}\p{Pc}[\p{InEnclosedAlphanumerics}&&\p{So}]]))

(taken from this question's winning answer)

4 Comments

It does here. It is supposed to match that word, isn't it?
Ah - you want the entire word to contain only one P? I get it.
No, it's supposed to match any word containing a single P. Sorry, maybe I didn't mention that... :S
That’s another ASCII-only pattern. It does not work “properly” on full Unicode, Java’s native character set. I strongly suggest some sort of commenting about that restriction.
