This is a follow up from this question. The goal is to make some function that can return the number of possible permutations with repetitions. As Gareth pointed out there is a nice formula for this problem:
$$ \frac{m!}{n_1!\,n_2! \ldots n_k!} $$
Where \$m\$ is the length of our permutations and \$n_i\$ is the number of times the \$i\$th letter appears in \$m\$. A simple example can be shown with the word Mississippi.
$$ P(\text{Mississippi}) = \frac{11!}{4!\,4!\,2!} $$
A simple implementation of this can be done in Python as follows:
from numpy import prod
def count_unique_perms(obj):
obj = str(obj)
f = [factorial(i) for i in Counter(obj).values()]
return factorial(len(obj))/prod(f)
I include this code as a basetest for the "improved" version below. Questions were raised whether this approach works for large factorials, and it was said that the running time was at worst \$O(n^3(log n)^2)\$
I used the ideas from my previous question. To try to improve the answer. First I find the number of factors in \$m!\$ then I subtract the factors from \$n_1!\,n_2! \ldots n_k!\$. A small optimization I did was avoiding calculating duplicates.
My question if this is a decent improvement. It seems much slower until really big string. For example, it is slightly faster for str(Mississippi)*500. Any other tips or algorithms are also welcome =)
from collections import Counter
from math import factorial, log
from numpy import prod
from primesieve import generate_primes
def count_unique_perms(obj):
obj = str(obj)
f = [factorial(i) for i in Counter(obj).values()]
return factorial(len(obj))/prod(f)
def fast_unique_perms(obj):
obj = str(obj)
factors = factors_factorial(len(obj))
obj_count = Counter([i for i in Counter(obj).values() if i > 1])
for p in obj_count:
count_p = factors_factorial(p, obj_count[p])
for prime in count_p:
factors[prime] -= count_p[prime]
num_perms = 1
for p in factors:
num_perms *= p**factors[p]
return num_perms
def factors_factorial(n, multiplicity=1):
"Calculates all the facors of n!"
count = Counter()
for p in generate_primes(n+1):
mul = 0
for k in range(1, int(1+log(n)/log(p))):
mul += int(n/p**k)
count[p] = multiplicity*mul
return count
if __name__ == '__main__':
import timeit
lst = 1123
string = 'mississippi'
print count_unique_perms(string)
print fast_unique_perms(string)
times = 100
t1 = timeit.timeit("count_unique_perms('mississippi'*1000)",
setup="from __main__ import count_unique_perms", number=times)/float(times)
t2 = timeit.timeit("fast_unique_perms('mississippi'*1000)",
setup="from __main__ import fast_unique_perms", number=times)/float(times)
print "fast_unique_perms: ", 1000*t2, "ms"
print "count_unique_perms: ", 1000*t1, "ms"
print "The count_unique_perms solution was: ", t2/float(t1), "times faster than fast_unique_perms."