Avoid nested loops when checking data in Python

Question

I have two lists of dictionaries:

dict_list1 = [{'k1':1, 'k2':2}, {'k1':3, 'k2':4}]
dict_list2 = [{'k1':1, 'k2':2, 'k3':10}, {'k1':3, 'k2':4, 'k3':10}]

Now, for each dict_x in dict_list1, I want to know whether there is a dict_y in dict_list2 that contains every key/value pair from dict_x.

I cannot think of another way of doing this other than:

for dict_x in dict_list1:
    for dict_y in dict_list2:
        count = len(dict_x)
        for key, val in dict_x.items():
            if key in dict_y and dict_y[key] == val:
                count -= 1
        if count == 0:
            print('YAY')
            break

ShadowRanger · Accepted Answer · 2018-08-08 20:41:17Z

dict views can perform quick "is subset" testing via the inequality operators. So:

if dict_x.items() <= dict_y.items():  # Use .viewitems() instead of .items() on Python 2.7

will only return true if every key/value pair in dict_x also appears in dict_y.
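As a minimal illustration of that subset test (the dictionaries and variable names here are made up for the example, not taken from the question):

```python
# Illustrative data only; small, big, and other are hypothetical names.
small = {'k1': 1, 'k2': 2}
big = {'k1': 1, 'k2': 2, 'k3': 10}
other = {'k1': 1, 'k2': 99, 'k3': 10}

# Every (key, value) pair of small appears in big.
is_sub = small.items() <= big.items()

# 'k2' maps to a different value in other, so the pair ('k2', 2) is missing.
is_sub_other = small.items() <= other.items()

# The empty dict's items are a subset of any dict's items.
empty_is_sub = {}.items() <= big.items()

print(is_sub, is_sub_other, empty_is_sub)  # True False True
```

Note that the comparison is on key/value pairs, so a shared key with a different value is enough to make the test fail.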

This won't change anything in terms of big-O performance, but it does make the code somewhat cleaner:

for dict_x in dict_list1:
    for dict_y in dict_list2:
        if dict_x.items() <= dict_y.items():
            print('YAY')
            break

Note that creating the views costs something (it's just a fixed cost, not dependent on dict size), so if performance matters, it may be worth caching the views; doing so for dict_list1 is free:

for dict_x in dict_list1:
    dict_x_view = dict_x.items()
    for dict_y in dict_list2:
        if dict_x_view <= dict_y.items():
            print('YAY')
            break

but some eager conversions would be needed to cache both:

# Convert all of dict_list2 to views up front; costs a little if
# not all views end up being tested (i.e. when we break before finishing)
# but usually saves some work at the cost of a tiny amount of memory
dict_list2_views = [x.items() for x in dict_list2]
for dict_x in dict_list1:
    dict_x_view = dict_x.items()
    for dict_y_view in dict_list2_views:
        if dict_x_view <= dict_y_view:
            print('YAY')
            break

You could also collapse the loop using any (which removes the need to break since any short-circuits), so the first (simplest) check could become:

for dict_x in dict_list1:
    if any(dict_x.items() <= dict_y.items() for dict_y in dict_list2):
        print('YAY')

This could be further collapsed to a single generator expression that yields the various matches, but at that point the code is going to be pretty cramped/ugly:

for _ in (dict_x for dict_x in dict_list1 if any(dict_x.items() <= dict_y.items() for dict_y in dict_list2)):
    print('YAY')

though without knowing what you'd really do (as opposed to just printing YAY) that's getting a little pointless.
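If the goal is to act on the matches rather than print a fixed string, one option is to collect them. The sketch below assumes you want, for each dictionary in dict_list1, the first dictionary in dict_list2 that contains it; the helper name first_superdict is made up for this example, and the second entry of dict_list1 is altered from the question so that a non-matching case shows up.

```python
def first_superdict(dict_x, candidates):
    """Return the first dict in candidates containing every item of dict_x, or None."""
    dict_x_view = dict_x.items()
    # next() stops at the first match, much like break in the explicit loop.
    return next((d for d in candidates if dict_x_view <= d.items()), None)

# Second dict changed from the question ('k2': 5) to produce a miss.
dict_list1 = [{'k1': 1, 'k2': 2}, {'k1': 3, 'k2': 5}]
dict_list2 = [{'k1': 1, 'k2': 2, 'k3': 10}, {'k1': 3, 'k2': 4, 'k3': 10}]

matches = [first_superdict(d, dict_list2) for d in dict_list1]
print(matches)  # [{'k1': 1, 'k2': 2, 'k3': 10}, None]
```

This is still a nested scan over both lists; it only changes what is done with each match.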

Patrick Haugh · Accepted Answer · 2018-08-08 20:41:13Z

Below, I use the fact that the dict.items view implements set operations to check for each d1.items() if there exists a d2.items(), such that d1.items() is a subset of d2.items()

[any(d1.items() <= d2.items() for d2 in dict_list2) for d1 in dict_list1]

edited Aug 8, 2018 at 20:41

answered Aug 8, 2018 at 20:34


1 Comment

Patrick Haugh Over a year ago

My current version returns a list the length of dict_list1 representing whether or not some element of dict_list2 is a superdict of the dict at that index in dict_list1.

Ajax1234 · Accepted Answer · 2018-08-08 20:38:11Z

You can use any and all:

dict_list1 = [{'k1':1, 'k2':2}, {'k1':3, 'k2':4}]
dict_list2 = [{'k1':1, 'k2':2, 'k3':10}, {'k1':3, 'k2':4, 'k3':10}]
v = [any(all(c in i and i[c] == k for c, k in b.items()) for i in dict_list2)
     for b in dict_list1]

Output:

[True, True]

edited Aug 8, 2018 at 20:38

answered Aug 8, 2018 at 20:30


4 Comments

Patrick Haugh Over a year ago

This only checks keys, not values.

roganjosh Over a year ago

Also, doesn't it have multiple loops just put in a 1-liner?

GuiFGDeo Over a year ago

Yes, I suppose there are 3 fors as well, but in one single line

Patrick Haugh Over a year ago

Try dict_list1 = [{'k1':1, 'k2': None}] and dict_list2 = [{'k1':1}].
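The counterexample above points at a pitfall with missing keys. As an illustration (not necessarily what the earlier revision of the answer did), a check written with dict.get treats a missing key as None, so it cannot tell "key absent" from "key present with value None". Testing membership first, as the code in the answer now does, avoids that:

```python
dict_list1 = [{'k1': 1, 'k2': None}]
dict_list2 = [{'k1': 1}]

# Hypothetical buggy check: d2.get(k) returns None for the missing 'k2',
# which compares equal to the expected value None.
with_get = [any(all(d2.get(k) == v for k, v in d1.items()) for d2 in dict_list2)
            for d1 in dict_list1]

# Membership test first: 'k2' is not in d2, so the pair does not match.
with_in = [any(all(k in d2 and d2[k] == v for k, v in d1.items()) for d2 in dict_list2)
           for d1 in dict_list1]

# The items-view comparison agrees with the membership version.
with_views = [any(d1.items() <= d2.items() for d2 in dict_list2)
              for d1 in dict_list1]

print(with_get, with_in, with_views)  # [True] [False] [False]
```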
