I have several strings in the rough form:

String s = "Rendering content from websiteNAme using user agent userAgentNameWithSpaces ; for user username ; at time someTime";

I want to extract the values for websiteName, userAgentNameWithSpaces, username and someTime. I have tried the following code.

private static final Pattern USER_NAME_PATTERN = Pattern.compile("for user.*;");
final Matcher matcher = USER_NAME_PATTERN.matcher(s);
final Optional<String> userName = matcher.find() ? Optional.of(matcher.group()) : Optional.empty();

It returns the whole match, "for user username ;", so afterwards I have to strip the "for user" prefix (and the trailing ;) to get the username. Is there a regex that gets just the username directly?

    Have you tried anything? Also is your String format the same every time? Commented Apr 19, 2017 at 19:15

2 Answers

You can use regex groups:

Pattern pattern = Pattern.compile("for user (\\w+)");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
    System.out.println(matcher.group(1));
}

The pair of parentheses ( and ) forms a capturing group, which the matcher returns through its group method (as it's the first pair of parentheses, it's group 1).

\w means a "word character" (letters, digits and _) and + means "one or more occurrences". So \w+ basically means "a word" (assuming your username has only these characters). PS: note that I had to escape the \ inside the Java string literal, so the expression is written as \\w+.

The output of this code is:

username
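
Since the question wraps the result in an Optional, here is a minimal sketch of the same group-based approach as a helper method. The class name UserNameExtractor, the method extractUserName and the john.doe sample are assumptions made up for illustration; the second call shows what happens when the username contains a character that \w does not match.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UserNameExtractor {
    // Group 1 captures the run of word characters that follows "for user ".
    private static final Pattern USER_NAME = Pattern.compile("for user (\\w+)");

    // Hypothetical helper: the captured username, or empty when the line does not match.
    static Optional<String> extractUserName(String line) {
        Matcher matcher = USER_NAME.matcher(line);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Prints Optional[username]
        System.out.println(extractUserName("agent x ; for user username ; at time someTime"));
        // \w+ stops at the dot, so this prints Optional[john]
        System.out.println(extractUserName("agent x ; for user john.doe ; at time someTime"));
        // Prints Optional.empty
        System.out.println(extractUserName("no user here"));
    }
}
```

If usernames can contain dots or other punctuation, the character class would need to be widened accordingly.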


If you want to match all the values (websiteName, userAgentNameWithSpaces and so on), you could do the following:

Pattern pattern = Pattern.compile("Rendering content from (.*) using user agent (.*) ; for user (.*) ; at time (.*)");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
    System.out.println(matcher.group(1));
    System.out.println(matcher.group(2));
    System.out.println(matcher.group(3));
    System.out.println(matcher.group(4));
}

The output will be:

websiteNAme
userAgentNameWithSpaces
username
someTime

Note that if userAgentNameWithSpaces contains spaces, \w+ won't work (because \w doesn't match spaces), which is why .* is used here.


But you can also use [\w ]+. The brackets [] mean "any one of the characters inside", so [\w ] means "a word character or a space" (note the space between w and ]). So the code would be (testing with a user agent that contains spaces):

String s = "Rendering content from websiteNAme using user agent userAgent Name WithSpaces ; for user username ; at time someTime";
Pattern pattern = Pattern.compile("Rendering content from (.*) using user agent ([\\w ]+) ; for user (.*) ; at time (.*)");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
    System.out.println(matcher.group(1));
    System.out.println(matcher.group(2));
    System.out.println(matcher.group(3));
    System.out.println(matcher.group(4));
}

And the output will be:

websiteNAme
userAgent Name WithSpaces
username
someTime

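Note: matcher.groupCount() returns the number of capturing groups in the pattern, not how many of them took part in the match, and matcher.group(n) throws an IndexOutOfBoundsException only when the pattern has no group n. If a group exists but didn't participate in the match (for example, an optional group), matcher.group(n) returns null, and calling group before a successful find() or matches() throws an IllegalStateException. So check the result of find() first, then check for null wherever a group is optional.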
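
To make the group checks concrete, here is a small sketch. As far as the java.util.regex documentation describes it, groupCount() reports the number of capturing groups in the pattern, and a group that did not take part in a successful match comes back as null. The class GroupCheckDemo, the method describe and the optional "at time" part are assumptions invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupCheckDemo {
    // Group 2 sits inside an optional non-capturing group, so it may not take part in a match.
    private static final Pattern PATTERN =
            Pattern.compile("for user (\\w+)(?: ; at time (\\w+))?");

    // Hypothetical helper: summarises what was captured from one line.
    static String describe(String line) {
        Matcher matcher = PATTERN.matcher(line);
        if (!matcher.find()) {
            return "no match"; // calling group() here would throw IllegalStateException
        }
        String time = matcher.group(2); // null when the optional part is absent
        return matcher.group(1) + " / " + (time == null ? "<no time>" : time);
    }

    public static void main(String[] args) {
        // Number of capturing groups in the pattern, independent of any input: prints 2
        System.out.println(PATTERN.matcher("").groupCount());
        // Prints username / someTime
        System.out.println(describe("for user username ; at time someTime"));
        // Prints username / <no time>
        System.out.println(describe("for user username"));
        // Prints no match
        System.out.println(describe("nothing to see"));
    }
}
```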

2 Comments

Thanks for the explanation. Now, I am using this expression to match userAgentWithSpaces. Lemme know if this is right. Pattern.compile("using user agent ([\\w*\\s*]*)");
How about this. Pattern TEMPLATE_LOG_PATTERN = Pattern.compile( "Rendering content from (.*)using user agent (.*) ; for user (.*) ; at time (.*)$"); time = System.out.println(matcher.group(4).trim()); userName = System.out.println(matcher.group(3).trim());

I think you want to use lookaheads and lookbehinds:

String s = "Rendering content from websiteNAme using user agent userAgentNameWithSpaces ; for user username ; at time someTime";
Pattern USER_NAME_PATTERN = Pattern.compile("(?<=for user).*?(?=;)");
final Matcher matcher = USER_NAME_PATTERN.matcher(s);
if (matcher.find()) {
    System.out.println(matcher.group(0).trim());
}

Output:

username

1 Comment

You can replace .*? with [^;]* or even [^;]*+ to save some match attempts.
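
A minimal sketch of that suggestion, using the possessive [^;]*+ between the same lookbehind and lookahead. The class name LookaroundDemo and the method extract are made up for illustration, and returning null on a failed match is just one possible choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaroundDemo {
    // [^;]*+ consumes everything up to the next ';' and never backtracks.
    private static final Pattern USER_NAME =
            Pattern.compile("(?<=for user)[^;]*+(?=;)");

    // Hypothetical helper: the trimmed username, or null when the line does not match.
    static String extract(String line) {
        Matcher matcher = USER_NAME.matcher(line);
        return matcher.find() ? matcher.group().trim() : null;
    }

    public static void main(String[] args) {
        String s = "Rendering content from websiteNAme using user agent userAgentNameWithSpaces"
                + " ; for user username ; at time someTime";
        // Prints username
        System.out.println(extract(s));
        // No ';' after the name, so the lookahead fails: prints null
        System.out.println(extract("for user username"));
    }
}
```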
