
We have two externally maintained NoSQL databases, a legacy one and a new one. The records on these databases need to be compared for equality, and the new one must be updated to match the legacy one. The decision to carry this out using a Java application was not mine, but one which I must now implement.

Essentially, the objects are returned with various nested lists. I need to perform the Java equivalent of joins on these lists, finding out at each nesting level whether an object is present only in the legacy system (left outer join where the right join key is null), only in the new system (right outer join where the left join key is null), or in both (inner join). If it is present in both, I then need to carry out the same logic on its nested list.

Example below:

// Objects below

    class Qualification {
        String qualificationName;
        String qualificationValue;
    }

    class Person {
        String personId;
        List<Qualification> qualifications;
    }

    class Group {
        String groupId;
        List<Person> people;
    }
    
    class DatabaseResponse {
        String recordId;
        List<Group> groups;
    }

So I have two lists of DatabaseResponses. I want to check records where the recordId matches from each database.

  • Records found only in the old system need to be added to the new one
  • Records found only in the new system need to be removed from it
  • Records which match need to be compared further.

We then compare the Groups within each pair of matching records, joining on groupId.

  • Groups found only in the old system need to be added to the new one
  • Groups found only in the new system need to be removed from it
  • Groups which match need to be compared further.

Next we need to compare the People in the matching Groups, joining on personId.

  • People found only in the old system need to be added to the new one
  • People found only in the new system need to be removed from it
  • People which match need to be compared further.

Finally, we need to compare each qualification, checking that the name and value are correct. Based on this, we will need to add, update or delete qualifications in the new system so that it matches the old one. Some records are to be ignored, so some of the joins will involve flags that mark records to skip.

I have tried doing this with lists, using the .equals method with the join key set in there. But I feel streams may be the right way to do this, possibly with a flatMap, as I will need to reference parent objects when updating the NoSQL database. Any help is appreciated.

    for (DatabaseResponse legacyResponse : legacyList) {
        for (DatabaseResponse newResponse : newList) {
            if (legacyResponse.equals(newResponse)) {
                // cycle through each response's groups to find the ones that match

                // cycle through each matching group to compare their people etc...
            }
        }
    }
  • Are you sure that you can handle both databases in memory at the same time? How big are they? Commented Feb 23, 2022 at 17:00
  • They are being sent record by record via an API. It's not an ideal solution, but this is basically the only way this can be done, through APIs, which is why this Java application is required. Commented Feb 23, 2022 at 17:07
  • I wonder if I understand this problem correctly. As a result of each of the steps you've described, you need the following: a list of groups to insert and a list of groups to remove from the DB; a list of people to insert and a list of people to remove from the DB; a list of people to update. Commented Feb 23, 2022 at 18:47
  • Can you provide your imperative solution, so that it'll be clearer how to translate it into streams? Commented Feb 23, 2022 at 18:51
  • shown something there.. Commented Feb 24, 2022 at 1:07

1 Answer

If I understood your goal correctly, you need the following as a result:

  • a collection of Group objects that have to be inserted into the new DB and a collection of Group objects that have to be removed from the new DB;
  • the same for the Person objects;
  • the same for the Qualification objects;

My approach is to generate two maps of type Map<Group, Set<Person>>, one for the old DB and one for the new DB, and then use them to compute intersections and differences (in the set-theory sense) for groups, people and qualifications.

I assume that the equality of Group and Person objects is based solely on their id.
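
That assumption only holds if equals() and hashCode() are overridden accordingly; the classes as posted in the question inherit identity-based equality from Object. A minimal sketch of what this could look like for Person follows. The wrapper class IdEqualitySketch and the constructor are illustrative additions, not part of the question's code; Group would follow the same pattern with groupId, and Qualification would presumably compare both its name and its value.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Illustrative wrapper so that the sketch compiles on its own.
class IdEqualitySketch {

    static class Qualification {
        String qualificationName;
        String qualificationValue;
    }

    static class Person {
        String personId;
        List<Qualification> qualifications;

        // Constructor added for the example only.
        Person(String personId, List<Qualification> qualifications) {
            this.personId = personId;
            this.qualifications = qualifications;
        }

        // Two Person objects are equal when their ids are equal,
        // regardless of their qualifications.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            return Objects.equals(personId, ((Person) o).personId);
        }

        // Must be consistent with equals() for HashSet and HashMap to work.
        @Override
        public int hashCode() {
            return Objects.hashCode(personId);
        }
    }

    public static void main(String[] args) {
        Person legacy = new Person("p1", List.of(new Qualification()));
        Person fresh = new Person("p1", List.of());
        Set<Person> people = new HashSet<>(List.of(legacy));
        System.out.println(legacy.equals(fresh));   // true
        System.out.println(people.contains(fresh)); // true
    }
}
```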

Note: multi-line lambdas (especially those in getQualifByPersToRemove() and getQualifByPersToAdd()) should be avoided; normally each one would be extracted into a separate method.
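
For instance, the set difference that is computed inline inside flatMap could be pulled out into a small helper. This is only a sketch: difference() and valuesToRemove() are hypothetical names, and the method is written generically so that it runs without the domain classes.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical example class, not part of the answer's code.
class ExtractedLambdaSketch {

    // Elements of 'left' that are not in 'right'; a missing (null) right side counts as empty.
    static <T> Set<T> difference(Set<T> left, Set<T> right) {
        Set<T> result = new HashSet<>(left);
        if (right != null) {
            result.removeAll(right);
        }
        return result;
    }

    // Same shape as getPersToRemove, with the multi-line lambda replaced by a method call.
    static <G, P> Set<P> valuesToRemove(Map<G, Set<P>> oldMap, Map<G, Set<P>> newMap) {
        return newMap.keySet().stream()
                .filter(oldMap::containsKey) // key exists in both maps
                .flatMap(key -> difference(newMap.get(key), oldMap.get(key)).stream())
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        Map<String, Set<String>> oldMap = Map.of("g1", Set.of("p1", "p2"));
        Map<String, Set<String>> newMap = Map.of("g1", Set.of("p2", "p3"), "g2", Set.of("p9"));
        System.out.println(valuesToRemove(oldMap, newMap)); // prints [p3]
    }
}
```

The stream pipeline then reads as one expression per step, and the helper can be reused for the "to add" direction by swapping the arguments.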

    public static void main(String[] args) {
        List<DatabaseResponse> oldResp = getResp(); // fetching responses
        List<DatabaseResponse> newResp = getResp(); // fetching responses

        Map<Group, Set<Person>> persByGroupIdOld = getPersByGroupId(oldResp);
        Map<Group, Set<Person>> persByGroupIdNew = getPersByGroupId(newResp);

        List<Group> groupsToRemove = getGroupsToRemove(persByGroupIdOld, persByGroupIdNew);
        List<Group> groupsToAdd = getGroupsToAdd(persByGroupIdOld, persByGroupIdNew);

        Set<Person> persToRemove = getPersToRemove(persByGroupIdOld, persByGroupIdNew);
        Set<Person> persToAdd = getPersToAdd(persByGroupIdOld, persByGroupIdNew);

        Map<Person, Set<Qualification>> qualifByPersToRemove = getQualifByPersToRemove(persByGroupIdOld, persByGroupIdNew);
        Map<Person, Set<Qualification>> qualifByPersToAdd = getQualifByPersToAdd(persByGroupIdOld, persByGroupIdNew);
    }
    public static Map<Group, Set<Person>> getPersByGroupId(List<DatabaseResponse> responses) {
        return responses.stream()
                .flatMap(resp -> resp.getGroups().stream())
                .collect(Collectors.groupingBy(Function.identity(),
                                               Collectors.flatMapping(group -> group.getPeople().stream(),
                                                                  Collectors.toSet())));
    }
    public static List<Group> getGroupsToRemove(Map<Group, Set<Person>> persByGroupIdOld,
                                                Map<Group, Set<Person>> persByGroupIdNew) {

        return persByGroupIdNew.keySet().stream()
                .filter(group -> !persByGroupIdOld.containsKey(group))
                .collect(Collectors.toList());
    }
    public static List<Group> getGroupsToAdd(Map<Group, Set<Person>> persByGroupIdOld,
                                             Map<Group, Set<Person>> persByGroupIdNew) {

        return persByGroupIdOld.keySet().stream()
                .filter(group -> !persByGroupIdNew.containsKey(group))
                .collect(Collectors.toList());
    }
    public static Set<Person> getPersToRemove(Map<Group, Set<Person>> persByGroupIdOld,
                                              Map<Group, Set<Person>> persByGroupIdNew) {

        return persByGroupIdNew.keySet().stream()
                .filter(persByGroupIdOld::containsKey) // exist in both DB
                .flatMap(group -> {
                    List<Person> persDiff = new ArrayList<>(persByGroupIdNew.get(group));
                    persDiff.removeAll(persByGroupIdOld.get(group));
                    return persDiff.stream();
                })
                .collect(Collectors.toSet()); // set is used because person could potentially belong to many groups
    }
    public static Set<Person> getPersToAdd(Map<Group, Set<Person>> persByGroupIdOld,
                                           Map<Group, Set<Person>> persByGroupIdNew) {

        return persByGroupIdOld.keySet().stream()
                .filter(persByGroupIdNew::containsKey) // exist in both DB
                .flatMap(group -> {
                    List<Person> persDiff = new ArrayList<>(persByGroupIdOld.get(group));
                    persDiff.removeAll(persByGroupIdNew.get(group));
                    return persDiff.stream(); // assumption that equality of the Person class is based on the personId only
                })
                .collect(Collectors.toSet()); // set is used because person could potentially belong to many groups
    }
    public static Map<Person, Set<Qualification>> getQualifByPersToRemove(Map<Group, Set<Person>> persByGroupIdOld,
                                                                          Map<Group, Set<Person>> persByGroupIdNew) {

        return persByGroupIdOld.keySet().stream()
                .filter(persByGroupIdNew::containsKey) // exist in both DB
                .flatMap(group -> {
                    Map<Person, Set<Qualification>> qualifByCommonPersOld = getQualifByPers(persByGroupIdOld.get(group));
                    Map<Person, Set<Qualification>> qualifByCommonPersNew = getQualifByPers(persByGroupIdNew.get(group));
                    return qualifByCommonPersNew.entrySet().stream()
                            .filter(entry -> qualifByCommonPersOld.containsKey(entry.getKey())) // intersection of people
                            .filter(entry -> !entry.getValue()
                                    .equals(qualifByCommonPersOld.get(entry.getKey()))) // qualifications sets for a particular person don't match
                            .map(entry -> { // qualification that aren't present in the old DB
                                Set<Qualification> newQualif = new HashSet<>(entry.getValue());
                                newQualif.removeAll(qualifByCommonPersOld.get(entry.getKey()));
                                return Map.entry(entry.getKey(), newQualif); // Person object and qualifications are present only in new DB
                            });
                })
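                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (left, right) -> { left.addAll(right); return left; })); // merge sets if a person appears in several groups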
    }
    public static Map<Person, Set<Qualification>> getQualifByPersToAdd(Map<Group, Set<Person>> persByGroupIdOld,
                                                                       Map<Group, Set<Person>> persByGroupIdNew) {

        return persByGroupIdOld.keySet().stream()
                .filter(persByGroupIdNew::containsKey) // exist in both DB
                .flatMap(group -> {
                    Map<Person, Set<Qualification>> qualifByCommonPersOld = getQualifByPers(persByGroupIdOld.get(group));
                    Map<Person, Set<Qualification>> qualifByCommonPersNew = getQualifByPers(persByGroupIdNew.get(group));
                    return qualifByCommonPersOld.entrySet().stream()
                            .filter(entry -> qualifByCommonPersNew.containsKey(entry.getKey())) // intersection of people
                            .filter(entry -> !entry.getValue()
                                    .equals(qualifByCommonPersNew.get(entry.getKey()))) // qualifications sets for a particular person don't match
                            .map(entry -> {
                                Set<Qualification> oldQualif = new HashSet<>(entry.getValue());
                                oldQualif.removeAll(qualifByCommonPersNew.get(entry.getKey()));
                                return Map.entry(entry.getKey(), oldQualif); // Person object and qualifications are not present or differ
                            });
                })
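                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (left, right) -> { left.addAll(right); return left; })); // merge sets if a person appears in several groups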
    }
    public static Map<Person, Set<Qualification>> getQualifByPers(Set<Person> people) {
        return people.stream()
                .collect(Collectors.groupingBy(Function.identity(),
                        Collectors.flatMapping(pers -> pers.getQualifications().stream(), Collectors.toSet())));
    }
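
The code above starts at the group level. For the recordId join described in the question, the three cases (only in legacy, only in new, in both) could be sketched as one generic helper that is reused at each nesting level. This is a minimal sketch under assumptions, not a definitive implementation: JoinSketch, JoinResult and joinByKey are hypothetical names, items are assumed to be non-null, and keys are assumed to be unique within each list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of the code above.
class JoinSketch {

    /** Result of joining two lists on a key: the three cases from the question. */
    static final class JoinResult<T> {
        final List<T> onlyInLegacy = new ArrayList<>();  // to be added to the new system
        final List<T> onlyInNew = new ArrayList<>();     // to be removed from the new system
        final List<List<T>> inBoth = new ArrayList<>();  // [legacy, new] pairs to compare further
    }

    static <T, K> JoinResult<T> joinByKey(List<T> legacy, List<T> fresh, Function<T, K> key) {
        // Index the new-system items by key (assumes unique keys per list).
        Map<K, T> newByKey = new LinkedHashMap<>();
        for (T item : fresh) {
            newByKey.put(key.apply(item), item);
        }
        JoinResult<T> result = new JoinResult<>();
        for (T item : legacy) {
            // remove() both looks up the match and marks it as consumed.
            T match = newByKey.remove(key.apply(item));
            if (match == null) {
                result.onlyInLegacy.add(item);
            } else {
                result.inBoth.add(List.of(item, match));
            }
        }
        // Whatever was never matched exists only in the new system.
        result.onlyInNew.addAll(newByKey.values());
        return result;
    }

    public static void main(String[] args) {
        JoinResult<String> r = joinByKey(List.of("a", "b"), List.of("b", "c"), s -> s);
        System.out.println(r.onlyInLegacy + " " + r.onlyInNew + " " + r.inBoth);
        // prints [a] [c] [[b, b]]
    }
}
```

Each pair in inBoth could then be fed into the same helper with the nested lists and the next join key (groupId, then personId), which keeps the parent objects at hand when the update is written back.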