I'm pretty new to Java and trying to find a better way to do this, potentially using a regex.

String text = test.get(i).toString();
// text looks like this in string form:
// EnumOption[enumId=test,id=machine]

String checker = text.replace("[","").replace("]","").split(",")[1].split("=")[1];

// checker becomes machine

My goal is to parse that text string and return just machine, which is what the code above does.

But that looks ugly. What kind of regex could make this a little cleaner? Or is there another approach?

  • For clarification: Do you want to get the string that is written behind id= regardless of the following string? Commented Dec 14, 2020 at 17:26
  • String checker = text.replaceFirst("EnumOption\\[enumId=test,id=(.*)\\]", "$1"); but isn’t there a simpler option like test.get(i).getId()? Commented Dec 14, 2020 at 17:31
  • What’s test? As Holger said, can’t you get the object’s ID directly without going the detour via toString()? Commented Dec 14, 2020 at 17:40
  • @Holger, I 100% agree with you, but when I tried that in Eclipse, .getId() was not an option. I don't know much about Java. I just assumed that if Eclipse doesn't show it as available, then it's not available. Commented Dec 14, 2020 at 19:30
  • test is a customTypedList. I loop through it, and each element is an enumOption. Commented Dec 14, 2020 at 19:35

4 Answers


Assuming you’re using the Polarion ALM API, you should use the EnumOption’s getId method instead of deparsing and re-parsing the value via a string:

String id = test.get(i).getId();
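
If getId() does not show up in the IDE, one possible reason is that the list hands its elements back typed as Object. That is an assumption, since the question does not show how test is declared. A minimal sketch of casting before calling getId() follows. The EnumOption interface here is a hypothetical stand-in with a single getId() method, not the real Polarion type, and idOf is an illustrative helper name.

```java
import java.util.ArrayList;
import java.util.List;

class GetIdSketch {
    // Hypothetical stand-in for the real EnumOption type (assumption, not the Polarion interface).
    interface EnumOption {
        String getId();
    }

    // Returns the id when the element is an EnumOption, otherwise null.
    static String idOf(Object element) {
        if (element instanceof EnumOption) {
            return ((EnumOption) element).getId();
        }
        return null;
    }

    public static void main(String[] args) {
        // A list whose elements are only known as Object, as assumed above.
        List<Object> test = new ArrayList<>();
        test.add((EnumOption) () -> "machine");

        for (int i = 0; i < test.size(); i++) {
            System.out.println(idOf(test.get(i)));
        }
    }
}
```

With the real API, the cast would target whatever EnumOption type the library actually exposes.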


Use a regex lookbehind:

(?<=\bid=)[^],]*

See Regex101.

(?<=     )            // Start matching only after what matches inside
    \bid=             // Match "\bid=" (= word boundary then "id="),
          [^],]*      // Match and keep the longest sequence without any ']' or ','

In Java, use it like this:

import java.util.regex.*;

class Main {
  public static void main(String[] args) {
    Pattern pattern = Pattern.compile("(?<=\\bid=)[^],]*");
    Matcher matcher = pattern.matcher("EnumOption[enumId=test,id=machine]");
    if (matcher.find()) {
      System.out.println(matcher.group(0));
    }
  }
}

This results in

machine



Using the replace and split functions doesn't take the structure of the data into account.

If you want to use a regex, you can use a capturing group without any lookarounds, where the enumId value can be anything except =, a comma, or ], and the id value can be anything except ].

The value of id will be in capture group 1.

\bEnumOption\[enumId=[^=,\]]+,id=([^\]]+)\]

Explanation

  • \bEnumOption Match EnumOption preceded by a word boundary
  • \[enumId= Match [enumId=
  • [^=,\]]+, Match 1+ times any char except =, a comma, and ], followed by a literal comma
  • id= Match literally
  • ( Capture group 1
    • [^\]]+ Match 1+ times any char except ]
  • ) Close group 1
  • \] Match ]

Regex demo | Java demo

Pattern pattern = Pattern.compile("\\bEnumOption\\[enumId=[^=,\\]]+,id=([^\\]]+)\\]");
Matcher matcher = pattern.matcher("EnumOption[enumId=test,id=machine]");

if (matcher.find()) {
    System.out.println(matcher.group(1));
}

Output

machine
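
As a sketch, the same capture-group pattern could be wrapped in a small helper that compiles the pattern once and signals "no match" explicitly. The class name EnumOptionIdParser and the method name parseId are made up for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EnumOptionIdParser {
    // Same pattern as above, compiled once and reused.
    private static final Pattern ENUM_OPTION =
        Pattern.compile("\\bEnumOption\\[enumId=[^=,\\]]+,id=([^\\]]+)\\]");

    // Returns the id from capture group 1, or an empty Optional if the text does not match.
    static Optional<String> parseId(String text) {
        Matcher matcher = ENUM_OPTION.matcher(text);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(parseId("EnumOption[enumId=test,id=machine]").orElse("<no id>"));
        System.out.println(parseId("something else").orElse("<no id>"));
    }
}
```

Returning an Optional keeps the caller from having to check matcher.find() at every call site.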

If there can be more comma-separated values, you could also match only id, using the negated character class [^][]* before and after id to stay inside the square bracket boundaries.

\bEnumOption\[[^][]*\bid=([^,\]]+)[^][]*\]

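In Java (with the brackets inside the character classes escaped, because Java treats an unescaped [ inside a class as the start of a nested class)

String regex = "\\bEnumOption\\[[^\\]\\[]*\\bid=([^,\\]]+)[^\\]\\[]*\\]";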

Regex demo
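
A minimal sketch of using this more tolerant pattern in Java follows. The inputs with an extra label field and with id in first position are made-up examples, and the class and method names are illustrative. The square brackets inside the character classes are escaped here because Java would otherwise read an inner [ as a nested class.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlexibleIdMatcher {
    // Matches id anywhere between the brackets; group 1 holds its value.
    private static final Pattern ID_ANYWHERE =
        Pattern.compile("\\bEnumOption\\[[^\\]\\[]*\\bid=([^,\\]]+)[^\\]\\[]*\\]");

    // Returns the id value, or null when the text has no matching EnumOption[...] part.
    static String findId(String text) {
        Matcher matcher = ID_ANYWHERE.matcher(text);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical inputs: an extra field after id, and id listed first.
        System.out.println(findId("EnumOption[enumId=test,id=machine,label=Machine]"));
        System.out.println(findId("EnumOption[id=machine,enumId=test]"));
    }
}
```

The word boundary before id= is what keeps the pattern from matching inside enumId=.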



A regex can of course be used, but sometimes is less performant, less readable and more bug-prone.

I would advise you not to use any regex that you did not come up with yourself, or at least understand completely.

PS: I think your solution is actually quite readable.

Here's another non-regex version:

String text = "EnumOption[enumId=test,id=machine]";
text = text.substring(text.lastIndexOf('=') + 1);
text = text.substring(0, text.length() - 1);
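
The two calls can also be collapsed into a single substring call. This is a sketch that assumes, like the version above, that id is the last key and that the text ends with ]. The class name LastValue and the method name lastValue are illustrative.

```java
class LastValue {
    // Takes everything after the last '=' and drops the final character (the closing ']').
    // Assumes the text is non-empty, ends with ']', and that id is the last key.
    static String lastValue(String text) {
        return text.substring(text.lastIndexOf('=') + 1, text.length() - 1);
    }

    public static void main(String[] args) {
        System.out.println(lastValue("EnumOption[enumId=test,id=machine]"));
    }
}
```

If the fields ever change order or another field is appended, this returns the wrong value without any error, which is the trade-off against the stricter regex versions.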

Not doing you a favor, but the downvote hurt, so here you go:

String input = "EnumOption[enumId=test,id=machine]";
Matcher matcher = Pattern.compile("EnumOption\\[enumId=(.+),id=(.+)\\]").matcher(input);
if(!matcher.matches()) {
  throw new RuntimeException("unexpected input: " + input);
}

System.out.println("enumId: " + matcher.group(1));
System.out.println("id: " + matcher.group(2));

6 Comments

When you talk about performance, I’m wondering why you are unnecessarily doing two substring operations instead of a single text.substring(text.lastIndexOf('=') + 1, text.length() - 1)
I did not mean to imply that my version is more performant. I usually optimize for readability and speed of implementation. It was more of a general comment.
So you think, doing two substring operations instead of one makes the code more readable?
Not really no. About as straightforward as the OP's own solution. Readable, easy to understand and step through with the debugger.
int start = text.lastIndexOf("id=") + 3; int end = text.length() - 1; text = text.substring(start, end); How is that not more readable than the two substrings?