Given an alphabet $A$ and a sequence $S$ over $A$, determine the optimal dictionary $D$, with maximum word length $m$, whose words can be concatenated to compose $S$.
Optimality can be defined by a two-part cost: $L$ = dictionary cost $L(D)$ + encoding cost $L(S|D)$, e.g.
$L(D) = \log_2(|A|) \cdot \sum_{w \in D} |w|$
$L(S|D)$ is the length in bits of $S$ when it is written as a sequence of words from $D$ and Huffman coded, using word probabilities learned from the sequence.
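
To make the cost concrete, here is a minimal sketch of how it could be evaluated for one candidate segmentation of $S$. It is not a search algorithm. It assumes that $D$ is exactly the set of distinct words used in the segmentation, that the cost of transmitting the Huffman table itself is ignored, and that a single-word dictionary costs 1 bit per occurrence. The names `huffman_code_lengths` and `two_part_cost` are my own, not from any library.

```python
import heapq
import math
from collections import Counter


def huffman_code_lengths(freqs):
    """Return {word: Huffman code length in bits} for the given word counts."""
    if len(freqs) == 1:
        # Assumption: a lone word still costs 1 bit per occurrence.
        return {next(iter(freqs)): 1}
    # Heap entries: (count, unique tie-breaker, words in this subtree).
    heap = [(count, i, [word]) for i, (word, count) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, words1 = heapq.heappop(heap)
        c2, _, words2 = heapq.heappop(heap)
        # Merging two subtrees pushes every word in them one level deeper.
        for word in words1 + words2:
            lengths[word] += 1
        heapq.heappush(heap, (c1 + c2, tie, words1 + words2))
        tie += 1
    return lengths


def two_part_cost(parse, alphabet_size):
    """Return (L(D), L(S|D), L) for a segmentation `parse` of S into words."""
    freqs = Counter(parse)
    # L(D) = log2(|A|) * sum of word lengths over the distinct words used.
    dict_cost = math.log2(alphabet_size) * sum(len(w) for w in freqs)
    # L(S|D) = sum over words of (occurrences * Huffman code length).
    lengths = huffman_code_lengths(freqs)
    enc_cost = sum(freqs[w] * lengths[w] for w in freqs)
    return dict_cost, enc_cost, dict_cost + enc_cost
```

For example, `two_part_cost(["ab", "ab", "c", "ab"], alphabet_size=4)` scores the segmentation `ab|ab|c|ab` of the sequence `ababcab`. Comparing this value across candidate segmentations is what I would like the algorithm to optimise.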
Typical values in my case: $|A| = 10$, $|S| = 180$, $m = 21$.
The goal is to find structure in the sequence, captured by the words $w \in D$. In some cases, it is known in advance that a word of length $m^*$ exists. Please point me to any references or libraries (Python preferred) for such an algorithm.