If we are given a binary file of length n, where each bit independently is one with probability 1/3 and zero else. We want to construct a method that the expected length of the compressed sequence is less than 10 percent more than Shannon's lower bound (for all n large enough). I've got the lower bound is 0.918. I tried to use tuples of size 2, but it gives me an expected length of 1.88 by Huffman coding. Am I going in the right direction?
- What if we want to get a 3% margin ?