0

I am trying to get all Values from Keys that contain the same substring. For example:

If a Key's string is "AAABBB' and another Key's string is 'XXXBBB' I want to get the Value from both of those Keys. (Since BBB matches)


The relationship of the substring match should be 3 characters in length. The prefix is from index 0-3 and the suffix index is from 3-6.

For example: AAABBB (AAA is the suffix and BBB is the prefix.)

(The relationship AABBBA is ignored because AAB and BBA do not match.)

I'm trying to avoid using nested for loops because my algorithm will run very slow at O(N^2). I'm working on finding a solution with 1 HashMap and 1 for loop.

HashMap a = new HashMap();

    map.put("AAABBB", 1);
    map.put("CCCPPP", 2);
    map.put("XXXBBB", 3);
    map.put("AAAOOO",4);



    for (Entry<String, String> entry : a.entrySet()) {
         String prefix = entry.getKey().substring(0,3);
         String suffix = entry.getKey().substring(3,6);

            if(map.contains("ANY_SUBSTRING" + suffix){
                System.out.println(map.get("ANY_SUBSTRING" + suffix);
                }

             }

Output: (1,3)

AAABBB => 1 XXXBBB => 3

6
  • Can you tell us more about the substring requirement? I mean, if two keys even share a single letter, do you still consider them to have a relationship? My gut feeling here is that a hashmap may not even be the optimal data structure for this problem. Commented Sep 3, 2019 at 16:04
  • Thanks for the quick response. Yes as long as the 2 keys share the same prefix or suffix. They share a relationship at substring of 3 characters in length. I'm trying to identify all key's which start or end with a substring of 3 characters. As long as there is a match. Commented Sep 3, 2019 at 16:10
  • It's still not clear. Suppose you have the keys 'AAABBB' and 'TTTUUB'. Do you consider them as related or not? They share a B. What about 'AAABBB' and 'TTAAALL'? They share the AAA. Do you consider them related? Can the strings have different lengths? Your relation is not well defined. Commented Sep 3, 2019 at 16:13
  • 1
    Make sure to add the required additional information to the question, using the edit button. Comments are not a good place for such explanations. Commented Sep 3, 2019 at 16:14
  • AAABBB and TTTUUB wouldn't match because they would need the same 3 characters. AAABBB and TTAAALL wouldn't match because the substring prefix starts from index (0,3) and suffix index stars from (3,6).No strings of different lengths. Only 6 characters, with a prefix and suffix of 3 characters. Thank you I will edit and update this question for further clearification. Commented Sep 3, 2019 at 16:18

1 Answer 1

1

I have following approach with streams.

  • Define a function to extract the suffix or prefix of each key of your map
  • Stream your maps entryset and group by prefix/suffix
  • filter those out which have no prefix/suffix incommon

Using your example map and assuming each key length is = 6

Map<String,Integer> map = new HashMap<>();
map.put("AAABBB", 1);
map.put("CCCPPP", 2);
map.put("XXXBBB", 3);
map.put("AAAOOO",4);

Function<Entry<String, Integer>,String> prefix = e -> e.getKey().substring(0,3);
Function<Entry<String, Integer>,String> suffix = e -> e.getKey().substring(3);

Map<String,List<Integer>> resultBySuffix = 
        map.entrySet().stream()
             .collect(Collectors.groupingBy( suffix , 
                        Collectors.mapping(Entry::getValue, Collectors.toList())
            )).entrySet().stream()
                    .filter(e -> e.getValue().size() > 1)
                    .collect(Collectors.toMap(Entry::getKey, Entry::getValue));

System.out.println(resultBySuffix);

Map<String,List<Integer>> resultByPrefix = 
       map.entrySet().stream()
            .collect(Collectors.groupingBy( prefix , 
                            Collectors.mapping(Entry::getValue, Collectors.toList())
            )).entrySet().stream()
                    .filter(e -> e.getValue().size() > 1)
                    .collect(Collectors.toMap(Entry::getKey, Entry::getValue));

System.out.println(resultByPrefix); 

No idea what the time complexity of the above example is like. But I think you can see what is going on (in terms of readability)

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.