I am developing AI to perform MDP, I am getting states(just integers in this case) and assigning it a value, and I am going to be doing this a lot. So I am looking for a data structure that can hold(no need for delete) that information and will have a very fast get/update function. Is there something faster than the regular dictionary? I am looking for anything really so native python, open sourced, I just need fast getting.
-
It's really hard to beat a well implemented hash table like the one in Python, however if your integer keys are dense, then a list may be slightly faster than a dictionary (even though both are O(1) anyway). However it's quite unlikely that any performance bottleneck you're having is due to using dictionary.Lie Ryan– Lie Ryan2013-04-23 19:39:36 +00:00Commented Apr 23, 2013 at 19:39
-
2Yes, this sounds like premature optimization.Kyle Strand– Kyle Strand2013-04-23 19:44:39 +00:00Commented Apr 23, 2013 at 19:44
-
1@Easily: Switching out the underlying data structure will be very easy if the need arises, why do you think this would "cement" anything?Niklas B.– Niklas B.2013-04-23 19:52:14 +00:00Commented Apr 23, 2013 at 19:52
-
1Yeah, and if you're really worried about the refactoring this might cause, just create some kind of wrapper class that hides the implementation; that way, you can change that class quite easily without changing any of the other code.Kyle Strand– Kyle Strand2013-04-23 19:52:58 +00:00Commented Apr 23, 2013 at 19:52
-
1As a side note, if you do find a solution that's much faster than a python dictionary I suggest you tell the python people about it so they can incorporate it into python.jleahy– jleahy2013-04-23 20:13:59 +00:00Commented Apr 23, 2013 at 20:13
3 Answers
Using a Python dictionary is the way to go.
4 Comments
You're saying that all your keys are integers? In that case, it might be faster to use a list and just treat the list indices as the key values. However, you'd have to make sure that you never delete or add list items; just start with as many as you think you'll need, setting them all equal to None, as shown:
mylist = [None for i in xrange(totalitems)]
Then, when you need to "add" an item, just set the corresponding value.
Note that this probably won't actually gain you much in terms of actual efficiency, and it might be more confusing than just using a dictionary.
For 10,000 items, it turns out (on my machine, with my particular test case) that accessing each one and assigning it to a variable takes about 334.8 seconds with a list and 565 seconds with a dictionary.
5 Comments
[None]*totalitems[None] * n is a widely-used Python idiom for creating a list, and as such slightly more readable than the list comprehension. (Also, the list comprehension has extra square brackets that make it invalid syntax.)If you want a rapid prototype, use python. And don't worry about speed.
If you want to write fast scientific code (and you can't build on fast native libraries, like LAPACK for linear algebra stuff) write it in C, C++ (maybe only to call from Python). If fast instead of ultra-fast is enough, you can also use Java or Scala.