11

Is there any HashSet implementation in Python? I know HashTable can be represented using dictionaries, but how do we represent HashSet implementation.

I am NOT looking for a data structure with the same methods as HashSets but rather someone with a CONSTANT lookup time, or the order of O(1);

Also, I want to know if the lookup time in a Python Dictionary is constant aka O(1)

5
  • Are you looking for set()? And for complexity, read this: wiki.python.org/moin/TimeComplexity Commented Jan 3, 2018 at 8:02
  • set is HashSets and for your "also", yes. Commented Jan 3, 2018 at 8:02
  • does set() have a has method to check whether a given value exists in the set? I saw x in sin the docs, but I'm not sure if it can be used to in if or only in a for. I'm new to python so haven't really used set() before Commented Jan 3, 2018 at 8:06
  • Both are constants, and yeah, set ~ HashSet and dict is HashTable Commented Jan 3, 2018 at 8:06
  • I'm not sure if it can be used to in if. Yes, and it's actually if x in set() is the Pythonic (the "recommended") way of testing belonging (if x in dict() checks if x is among the dictionary keys). Take a look to docs.python.org/3/reference/datamodel.html#object.__hash__ (there's a lot of interesting information and links to other juicy parts of the documentation in that page) Commented Jan 3, 2018 at 8:21

3 Answers 3

6

I think the HashSet implementation you are looking is set(). This answer may help you: What's the difference between HashSet and Set?

And yes, the average time complexity for python dictionary O(1). You may read on why we use the term: "Average time complexity": Time complexity of accessing a Python dict

Sign up to request clarification or add additional context in comments.

1 Comment

This answer could be improved by adding the key message from the linked items to it in case the links die.
3

I guess this is what you want. You may define a hash function yourself, like what I did in HashSet or just use the built-in hash() function in Python.

class HashSet:
    CONST = 2 ** 61 - 1

    def __init__(self, size = 10_000):
        self.size = size * 2
        self.contents = [None] * self.size

    def hash(self, x):
        return x % CONST

    def put(self, key):
        idx = self.hash(key) % self.size
        arr = self.contents[idx]
        if arr is None:
            self.contents[idx] = [key]
        elif key not in arr:
            arr.append(key)
        return None

    def get(self, key):
        idx = self.hash(key) % self.size
        arr = self.contents[idx]
        if arr is None or key not in arr:
            return False
        return True


myset = HashSet()
myset.put(123)
myset.put(145)
myset.put(138)
res = myset.get(145)
print(res)
res = myset.get(10)
print(res)


class HashMap:
    def __init__(self, size = 10_000):
        self.size = size * 2
        self.contents = [None] * self.size

    class __Pair:
        def __init__(self, key, value):
            self.key = key
            self.value = value

    def find(self, arr, key):
        for pair in arr:
            if pair.key == key:
                return pair
        return None

    def put(self, key, value):
        idx = hash(key) % self.size
        pair = self.__Pair(key, value)
        arr = self.contents[idx]
        if arr is None:
            self.contents[idx] = [pair,]
            return None

        t = self.find(arr, key)
        if t != None:
            t.value = value
        else:
            arr.append(pair)

    def get(self, key):
        idx = hash(key) % self.size
        arr = self.contents[idx]
        if arr == None:
            raise KeyError(f'{key} is not a valid key')
        t = self.find(arr, key)
        if t == None:
            raise KeyError(f'{key} is not a valid key')
        return t.value


mymap = HashMap()
mymap.put('abc', [123,456])
mymap.put('def', [456,789])
res = mymap.get('abc')
print(res)
res = mymap.get('def')
print(res)
res = mymap.get('defx')
print(res)

Comments

-4

@Ayush Gupta, I have implemented the HashSet. Please do have a look at it. Comment for any feedback.

class MyHashSet:
def __init__(self):
    self.l = []

def add(self, key: int) -> None:
    if key not in self.l:
        self.l.append(key)

def remove(self, key: int) -> None:
    if key in self.l:
        self.l.remove(key)

def contains(self, key: int) -> bool:
    return key in self.l
    
# Your MyHashSet object will be instantiated and called as such:
obj = MyHashSet()
obj.add(key)
obj.remove(key)
param_3 = obj.contains(key)

3 Comments

This doesn't provide O(1) lookup: item in list is O(n).
@snakecharmerb can you suggest some changes to it. I am also a beginner.
I think this will be O(1) as internally hashset is backed by hashmap. stackoverflow.com/questions/25247854/…

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.