Given that the user can enter values in specific formats only, I need to extract relevant sections of that string into Java variables.

For instance, say the acceptable formats are:

String types[] = {"The quick brown ${animal} jumped over the lazy ${target}.",
                  "${target} loves ${animal}.",
                  "${animal} became friends with ${target}"};

Variables:-

private String animal;
private String target;

Now if the user enters "The quick brown fox jumped over the lazy dog.", the animal variable should be set to "fox" and target variable should be set to "dog".

If the user input matches none of the given types, it should display an error.

Basically I am trying to do the inverse of what org.apache.commons.lang.text.StrSubstitutor does.

My approach (looks inefficient, hence asking for help):-

Create regex patterns to determine the type of the entered string, then write separate extraction logic for each type. For example, for the first type, take the word after "brown" and assign it to the variable animal, and so on.

  • In a rare instance, I actually found a Code Ranch article which is an exact duplicate of your question: coderanch.com/mobile/t/375438/java/appendReplacement-appendTail Commented Mar 3, 2018 at 7:53
  • Just adapt that code sample to your problem. Commented Mar 3, 2018 at 7:54
  • @TimBiegeleisen It's a different problem, I don't think it can be solved simply by using appendReplacement, which gives the matched part of the string. Commented Mar 3, 2018 at 8:46
  • Map<String, String> valuesMap = new HashMap<String, String>(); valuesMap.put("animal", "quick brown fox"); valuesMap.put("target", "lazy dog"); String templateString = "The ${animal} jumped over the ${target}."; StrSubstitutor sub = new StrSubstitutor(valuesMap); String resolvedString = sub.replace(templateString); Commented Mar 3, 2018 at 9:10

3 Answers

Using @Josh Withee's answer:-

import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * @param input         String from which values need to be extracted
 * @param templates     Valid regex patterns with named capturing groups
 * @param variableNames Names of the named capturing groups to extract
 * @return Map with variableNames as the keys and the extracted strings as the values,
 * or an empty, non-null map if the input matches no template or no group has any of the given variableNames
 */
public static Map<String, String> extractVariablesFromString(String input, List<String> templates, String... variableNames) {
    Map<String, String> resultMap = new HashMap<>();
    // Pick the first template that matches the whole input.
    Optional<String> matchedTemplate = templates.stream().filter(input::matches).findFirst();
    matchedTemplate.ifPresent(t -> {
        Matcher m = Pattern.compile(t).matcher(input);
        // matches() must succeed before group(...) may be called.
        if (!m.matches()) {
            return;
        }
        Arrays.stream(variableNames)
                .forEach(v -> {
                    try {
                        resultMap.put(v, m.group(v));
                    } catch (IllegalArgumentException e) {
                        // The matched template has no group named v, so skip it.
                    }
                });
    });
    return resultMap;
}

Tests:-

    @Test
    public void shouldExtractVariablesFromString() {
        String input = "The quick brown fox jumped over the lazy dog.";
        // The trailing literal dot is escaped so that it only matches a full stop.
        String types[] = {"The quick brown (?<animal>.*) jumped over the lazy (?<target>.*)\\.",
                "(?<target>.*) loves (?<animal>.*)\\.",
                "(?<animal>.*) became friends with (?<target>.*)"};
        Map<String, String> resultMap = StringUtils.extractVariablesFromString(input, Arrays.asList(types), "animal", "target1", "target");
        Assert.assertEquals("fox", resultMap.get("animal"));
        Assert.assertEquals("dog", resultMap.get("target"));
        Assert.assertFalse(resultMap.containsKey("target1"));
    }

    @Test
    public void shouldReturnEmptyMapIfInputDoesntMatchAnyPatternForVariableExtraction() {
        String input = "The brown fox passed under the lazy dog.";
        String types[] = {"The quick brown (?<animal>.*) jumped over the lazy (?<target>.*)\\.",
                "(?<animal>.*) became friends with (?<target>.*)"};
        Map<String, String> resultMap = StringUtils.extractVariablesFromString(input, Arrays.asList(types), "animal", "target1", "target");
        Assert.assertTrue(resultMap.isEmpty());
    }

You can do this with named capture groups:

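String userInput = "dog loves fox.";

// The trailing literal dot is escaped so that it only matches a full stop.
String[] types = {"The quick brown (?<animal>.*?) jumped over the lazy (?<target>.*?)\\.",
                  "(?<target>.*?) loves (?<animal>.*?)\\.",
                  "(?<animal>.*?) became friends with (?<target>.*?)"};

Matcher m = null;

for (String type : types) {
    Matcher candidate = Pattern.compile(type).matcher(userInput);
    // matches() requires the whole input to match, so the lazy groups cannot stop short.
    if (candidate.matches()) {
        m = candidate;
        break;
    }
}

if (m == null) {
    throw new IllegalArgumentException("Input matches none of the accepted formats");
}

String animal = m.group("animal");
String target = m.group("target");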

3 Comments

Thanks Marathon55. I had to remove ? from end of named groups to make it work. Like (?<target>.*?) was changed to (?<target>.*). I have added an answer, trying to convert it to Java 8. Also, is there any way where I don't have to provide the names like "animal" and "target", and still get a map having all the named groups it found? i.e. in my answer, I don't want to supply the variableNames in the parameter.
You can name the named capture groups anything you want. They don't have to be the same name as the variable
I know, that's obvious. I am saying instead of asking for groups by name as in m.group("animal"), is there a way where we don't specify the name of the group and get all groups? Something like m.getAllGroups(), which would return a map with the group name as the key and the actual value as the map values? It's ok if there isn't a way, this will also work. Thanks.
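
On the question in the last comment (getting every named group without listing the names): one possible workaround, sketched here rather than taken from any answer, is to scan the regex text itself for `(?<name>` openings and look each name up on the matcher. The class `NamedGroupExtractor` and the method `extractAll` are hypothetical names. The sketch assumes the regex contains no escaped literal such as `\(?<x>`, which the scan would mistake for a group. Newer JDKs (20 and later, as far as I know) also offer `Matcher.namedGroups()`, which would make the scan unnecessary.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupExtractor {

    // Matches the opening of a named group such as "(?<animal>".
    // Group names start with a letter, so lookbehinds like "(?<=" are not picked up.
    private static final Pattern GROUP_NAME =
            Pattern.compile("\\(\\?<([a-zA-Z][a-zA-Z0-9]*)>");

    /** Returns all named groups of regex as matched against input, or an empty map if there is no match. */
    static Map<String, String> extractAll(String regex, String input) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return result;
        }
        // Collect the group names from the regex source and read each one from the match.
        Matcher names = GROUP_NAME.matcher(regex);
        while (names.find()) {
            String name = names.group(1);
            result.put(name, m.group(name));
        }
        return result;
    }
}
```

With this, `NamedGroupExtractor.extractAll("(?<target>.*) loves (?<animal>.*)\\.", "dog loves fox.")` needs no variableNames argument.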
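/**
 * Extracts the placeholder values of a template from a concrete input string.
 *
 * @param input    e.g. /Volumes/data/tmp/send/20999999/sx/0000000110-0000000051-007-20211207-01.txt
 * @param template e.g. /{baseDir}/send/{yyyyMMdd}/{organization}/{sendOrganization}-{receiveOrganization}-{fileType}-{date}-{batch}.txt
 * @param prefix   regex matching the opening delimiter of a placeholder
 * @param suffix   regex matching the closing delimiter of a placeholder
 * @return map from placeholder name to extracted value, or an empty map if the input does not match the template
 */
public static Map<String, String> extractVariables(final String input, final String template, final String prefix, final String suffix) {
    // Collect the placeholder names; they must be valid regex group names (letters and digits only).
    final HashSet<String> variableNames = new HashSet<>();
    String variableNamesRegex = "(" + prefix + "([^" + prefix + suffix + "]+?)" + suffix + ")";
    Pattern variableNamesPattern = Pattern.compile(variableNamesRegex);
    Matcher variableNamesMatcher = variableNamesPattern.matcher(template);
    while (variableNamesMatcher.find()) {
        variableNames.add(variableNamesMatcher.group(2));
    }
    // Turn each placeholder into a named capturing group, e.g. {date} becomes (?<date>.*).
    final String regexTemplate = template.replaceAll(prefix, "(?<").replaceAll(suffix, ">.*)");
    Map<String, String> resultMap = new HashMap<>();
    Matcher matcher = Pattern.compile(regexTemplate).matcher(input);
    // group(...) throws IllegalStateException unless a match was found first.
    if (!matcher.find()) {
        return resultMap;
    }
    variableNames.forEach(v -> resultMap.put(v, matcher.group(v)));
    return resultMap;
}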

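Usage (the delimiters are passed as regexes, so the braces are escaped with a double backslash in Java source):

extractVariables(input, template, "\\{", "\\}")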
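
The same idea can be applied to the `${animal}` style templates from the question. The following is a minimal sketch, not the code of any answer: `TemplateExtractor` and `extract` are hypothetical names, the literal text between placeholders is passed through `Pattern.quote` so that characters like `.` are matched literally, and it assumes each placeholder name is a valid group name that appears only once per template (a repeated name would make `Pattern.compile` throw).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateExtractor {

    // Finds ${name} placeholders; names are restricted to letters and digits.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([a-zA-Z][a-zA-Z0-9]*)\\}");

    /** Returns placeholder values found in input, or an empty map if input does not fit the template. */
    static Map<String, String> extract(String template, String input) {
        StringBuilder regex = new StringBuilder();
        List<String> names = new ArrayList<>();
        Matcher p = PLACEHOLDER.matcher(template);
        int last = 0;
        while (p.find()) {
            // Quote the literal text so that regex metacharacters in it are matched literally.
            regex.append(Pattern.quote(template.substring(last, p.start())));
            // Replace the placeholder with a lazy named group.
            regex.append("(?<").append(p.group(1)).append(">.+?)");
            names.add(p.group(1));
            last = p.end();
        }
        regex.append(Pattern.quote(template.substring(last)));

        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = Pattern.compile(regex.toString()).matcher(input);
        // matches() anchors the regex to the whole input.
        if (m.matches()) {
            for (String name : names) {
                result.put(name, m.group(name));
            }
        }
        return result;
    }
}
```

To support several accepted formats, call `extract` for each template in turn and report an error if every call returns an empty map.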
