0

I got a challenge on CodeEval, which is:

Sequence 011212201220200112 ... constructed as follows: first is 0, then repeated the following action: already written part is attributed to the right with replacement 0 to 1, 1 to 2, 2 to 0. E.g.

0 -> 01 -> 0112 -> 01121220 -> ...

Create an algorithm which determines what number is on the N-th position in the sequence.

INPUT SAMPLE:

0
5
101
25684

OUTPUT SAMPLE:

0
2
1
0

CONSTRAINTS:

0 <= N <= 3000000000

I already wrote the code, but from the constraint, N can be a very large number, and my code got MemoryError on numstring += temp

Here's my code:

from string import maketrans

def predict(n):
    n = int(n)
    instring = '012'
    outstring = '120'
    tab = maketrans(instring, outstring)
    numstring = '0'
    temp = numstring
    while len(numstring) <= n:
        temp = temp.translate(tab)
        numstring += temp
        temp = numstring
    print numstring[n]

Is there any way to optimize this?

Note: The Input and Output is in the type of string, not integer

2
  • Yeah, I'm so stupid I even store the entire string in memory xD Commented Jun 15, 2014 at 10:51
  • The whole string are 30 GB. You need to reconsider. Commented Jun 15, 2014 at 10:55

3 Answers 3

1

Think about it. Is it actually necessary to store everything?

You start with one character; this generates two, which generates four, etc. Each character in a given substring is a function of one, and only one, character in the preceding substring. You don't need to keep track of any of the other characters; only the ones that affect your answer.

Sign up to request clarification or add additional context in comments.

Comments

1

This kind of problems is easily solved thinking recursively. First of all you have to get rid of the thought that you are dealing with strings and string operations. The translations that you have to do are 0 -> 1, 1 -> 2 and 2 -> 0. This means that you are simply adding 1 to all digits modulo 3 (i.e. add 1, divide by 3 and take the remainder).

Recursively how would you solve this problem? Well, if the N requested is 0 you are in the base case of the beginning of the string, so the answer is 0. If the request is for a bigger index what should we do? As Tom pointed out if you split the sequence in two halves every character in the second half is a function of one character in the first half. If we can compute the index of such a character we can solve the problem recursively.

The index of such a character is easily computed. The sequence itself is of infinite length, but, given the input N you can always consider the prefix of length 2^(ceil(log2(N))) and you'll always have a situation like:

a b c ... z | A B C ... Z
         middle   ^
                  N 

i.e. where N is on the second half.

If from N we substract the length of the first half we obtain the index of the wanted character. The length of half the string would be 2^(floor(log2(N))):

   a b c ... z A B C ... Z      ~~>      A B C ... Z
              ^    ^                         ^
             2^x   N                        N - 2^x

So we have to recursively solve the problem for N - 2^x (with x = floor(log2(N))) and then we have to apply the transformation to get the result relative to the fact that we were in the second half of the string.

In other words, the solution is:

def find_digit(n):
    if n == 0:
        return 0
    # bit_length() gives floor(log2(N))
    length = n.bit_length()
    return (1 + find_digit(n - 2 ** (length-1))) % 3

And in fact:

In [22]: find_digit(0)
Out[22]: 0

In [23]: find_digit(5)
Out[23]: 2

In [24]: find_digit(101)
Out[24]: 1

In [25]: find_digit(25684)
Out[25]: 0

In [26]: find_digit(3000000000)
Out[26]: 0

In [27]: find_digit(3000000001)
Out[27]: 1

In [28]: find_digit(3000000002)
Out[28]: 1

In [29]: find_digit(30000000021265672555626541643155343185826435812641)
Out[29]: 2


In [30]: find_digit(30000000021265672555626541643155343185826435812641**10)
Out[30]: 1

Note that this solution is both memory and computationally efficient, doing only log(n) operations. However it is still limited to 1000 recursive calls.

From the recursive solution it's pretty easy to obtain a non-recursive solution which will work for huge inputs:

def find_digit(n):
    start = 0
    while n:
        start += 1
        length = n.bit_length()
        n -= 2 ** (length - 1)
    return start % 3

If you want to output a string just change the return line to return str(start % 3). If you want to receive a string input just add n = int(n) at the top of the function.

Comments

0

If you consider how large the string can get, I would assume the point of this exercise is exactly to not store the whole string in memory.

Also, you might want to try replacing

numstring += temp
temp = numstring

by

temp = "%s%s" % (numstring, temp)

This may be slightly more memory-efficient in the short run, even though it won't really solve the underlying algorithmic issue.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.