0

I need to extract the data-user_id numbers from a string

input string example

 data-user_id="987654" lorem epsem  lorem epsem  lorem epsem  lorem
 data-user_id="123456-6" lorem epsem epsem  lorem epsem
 <img src="abcd.com"/> lorem epsem  data-user_id="123456"

expected output

987654,123456-6,123456

Code I have (doesn't work)

private static String getIdFromLine(String inputLine) {
    Pattern p = Pattern.compile("(data-user_id=\"[0-9a-z]*\")");
    Matcher m = p.matcher(inputLine);
    if (m.find()) {
        String src = m.group(2);
    }

    return null;
}

3 Answers

4

You should have this regex:

data-user_id=\"([0-9a-z-]+)\"

group(1) will contain the desired output.

Your code has other problems as well: it doesn't loop over the matches. Instead of if, you should use a while loop:

while (m.find()) {
    //build the result here
}
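
Putting the regex and the loop together, a minimal sketch might look like this. The class name UserIdExtractor and the method name extractIds are made up for illustration; the regex is the one shown above, and joining with commas is assumed from the expected output in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UserIdExtractor {

    // Hypothetical helper: returns every data-user_id value in the input,
    // comma-separated, or an empty string if there is no match.
    static String extractIds(String input) {
        // Group 1 captures the value between the quotes.
        Pattern p = Pattern.compile("data-user_id=\"([0-9a-z-]+)\"");
        Matcher m = p.matcher(input);

        List<String> ids = new ArrayList<>();
        while (m.find()) {          // each call advances to the next match
            ids.add(m.group(1));
        }
        return String.join(",", ids);
    }

    public static void main(String[] args) {
        String input = "data-user_id=\"987654\" lorem data-user_id=\"123456-6\" lorem "
                + "<img src=\"abcd.com\"/> lorem data-user_id=\"123456\"";
        System.out.println(extractIds(input)); // prints 987654,123456-6,123456
    }
}
```

Note that the pattern in the question has only one capturing group, so group(2) does not exist there and throws an IndexOutOfBoundsException.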

1

To avoid compiling the pattern on every method call, I would write the method like this:

// Compiled once and shared; the field is final because it is never reassigned.
private static final Pattern DATA_USER_ID_PATTERN =
        Pattern.compile("data-user_id=\"([0-9a-z-]+)\"");

private static String getIdFromLine(String inputLine) {
    String src = null;
    Matcher m = DATA_USER_ID_PATTERN.matcher(inputLine);
    if (m.find()) {
        src = m.group(1);
    }

    return src;
}

If you're sure the method will never be called from more than one thread, you can also reuse a single Matcher:

// A shared Matcher holds mutable state, so this is only safe single-threaded.
private static final Matcher DATA_USER_ID_MATCHER =
        Pattern.compile("data-user_id=\"([0-9a-z-]+)\"").matcher("");

private static String getIdFromLine(String inputLine) {
    String src = null;
    Matcher m = DATA_USER_ID_MATCHER;

    m.reset(inputLine);
    if (m.find()) {
        src = m.group(1);
    }

    return src;
}
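
For context: Pattern instances are immutable and safe to share between threads, while Matcher instances are not, which is why only the first variant is safe in general. Both methods above return just the first match. If you need every id on the line, a sketch along these lines keeps the shared Pattern and creates a fresh Matcher per call. It assumes Java 9 or later for Matcher.results(); the class name SharedPatternExtractor and the method name getIdsFromLine are hypothetical.

```java
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class SharedPatternExtractor {

    // Compiled once; safe to share across threads.
    private static final Pattern DATA_USER_ID_PATTERN =
            Pattern.compile("data-user_id=\"([0-9a-z-]+)\"");

    // Hypothetical helper: all ids in the line, comma-separated.
    static String getIdsFromLine(String inputLine) {
        return DATA_USER_ID_PATTERN.matcher(inputLine) // new Matcher per call
                .results()                             // Java 9+: stream of MatchResult
                .map(r -> r.group(1))                  // value between the quotes
                .collect(Collectors.joining(","));
    }
}
```

On a line with no match this returns an empty string rather than null.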

0

There are a few issues with your attempt:

  1. your regular expression doesn't match the - character (or uppercase letters, in case you need those)
  2. you aren't building up your string to return
  3. you aren't looping using while to get all the matches in a single string

EDIT: you've now edited your question to use a different input string. It is not clear whether a line can now contain multiple matches.

Try:

private static String getIdFromLine(String inputLine) {
    Pattern p = Pattern.compile("data-user_id=\"([0-9a-zA-Z-]*)\"");
    Matcher m = p.matcher(inputLine);
    StringBuilder sb = new StringBuilder("");
    while (m.find()) {
        String src = m.group(1);
        sb.append(src + ",");
    }
    int len = sb.length();
    if (len > 0)
        sb.delete(len - 1, len);

    return sb.toString();
}
