2

I am iterating a complex list in python with this format:

[
  {
    <parent id>: [
      <child id>,
      <child id>
    ],
    <parent id>: [
      <child id>
    ]
  },
  {
    <parent id>: [
      <child id>,
      <child id>,
      <child id>
    ],
    <parent id>: [
      <child id>
    ]
  }
]

The list will have dict as elements. These dict have keys of <parent id> and values of a list of <child id>

There can be the same <parent id> in different dict, but <child id> can only belong to one <parent id>. An example is this:

[
  {
    2: [1, 5],
    3: [3, 7],
    4: [6]
  },
  {
    1: [2, 4, 8],
    4: [6]
  }
]

Parent id 4 is in both dict elements but all child ids are unique to a parent id.

Now I am iterating this data structure as input because I want to make sure that the condition of all childs being unique to a parent id is met. This is my code:

def check_format(self, mapping):
    # mapping is the data structure
    unique_parent_list = []
    child_list = []

    for x in range(0, 2):
        for parent in mapping[x].keys():
            if parent not in unique_parent_list:
                unique_parent_list.append(parent)
                for child in mapping[x][parent]:
                    child_list.append(child)
    if len(child_list) > len(set(child_list)):
        return 'Condition not met'
    else:
        return 'Condition met'

This works, but I do not like that it is O^4 complexity or something like that. Is there a way to simplify or code this for better performance?

1
  • Please put more descriptive names on your variables. key -> parent and y -> child for example. Commented Aug 8, 2018 at 5:59

2 Answers 2

3

You clearly have a mapping relationship from child to parent. The easiest thing I can think of is just to make a dict with children as keys. If you encounter a child that's already in, check the parent value.

The lookup and insertion happen in constant time (dict keys are effectively a hash set). You can also use short circuiting more effectively since you can stop the moment you find a child with multiple parents:

def check_format(map_list):
    check = {}
    for parent, children in (i  for d in map_list for i in d.items()):
        for child in children:
            if check.setdefault(child, parent) != parent:
                return False
    return True

This will iterate exactly once per child and perform a constant-time (ideally) operation on each using dict.setdefault.

Sign up to request clarification or add additional context in comments.

1 Comment

Thank you so much. You are a rockstar
0

Are you sure this is O(3) complexity? In regards to what?

Is this code too slow for you? If you want to look through all childs there really isn't anything else than just iterating through them like this.

However. Consider having the unique_parent_list and child_list be sets instead of list. That will probably make the in checks faster (O(log(n) compared to O(1)). But if you care you should profile to see that so is the case.

You can also make an early exit (if the format is bad) as soon as you find a duplicate child if you check the condition when you put children in the child_list.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.