3

I am making a function that takes length:int and distance:int as inputs, and output the largest integer that satisfies the following properties in its binary representation:

  1. It starts with 1, ends with 1

  2. There are distance - 1 many 0 between each 1

  3. Its length is strictly less than length

It could be implemented as the following:

def repeatdigit(length:int, distance:int) -> int:
   result = 0
   pointer = 1
   for _ in range(1, length, distance):
      result |= pointer
      pointer <<= distance
   return result

For example: repeatdigit(10,3)==0b1001001 , repeatdigit(15,4)=0b1000100010001. I’m only caring about binary because I need to use it as a tool of bit manipulation. In practice, length will be very big, distance ** 2 < length. Assuming the input is valid.

As its name suggested, it is repeating a same pattern over and over.

An explicit formula:

result = (1 << (length+distance-2 - ((length-2) % distance)))//((1 << distance) - 1)

But both algorithm are slow, their time complexity is O(n²), where n = length.

Doing similar things with list only takes O(n). Namely ([1]+[0]*(distance-1)) * ((length-2)//distance) + [1]

(Context: I want to make such an integer so that I don’t need to store 0 and 1 in a very long list as it is space consuming)

How do I make such an integer in O(n)?

9
  • n is not the value of length; it's the logarithm of length (or rather, the logarithm of length + distance). Commented Dec 19, 2024 at 21:34
  • 2
    You would probably be better off using the bitarray package. Commented Dec 19, 2024 at 21:41
  • @chepner It's their question, they get to define it, not us. If their definition causes an issue, then point out that issue. (And I btw don't see an issue with that.) Commented Dec 19, 2024 at 22:28
  • Sorry, I didn’t test my code thoroughly. Now the bugs should be fixed Commented Dec 20, 2024 at 0:43
  • If you need a random-access mutable sequence data type, ints make a terrible substitute. This isn't the only operation where trying to use ints like a list will be terrible for performance. Commented Dec 20, 2024 at 1:03

1 Answer 1

0

Doubling the number of 1-bits instead of adding them just one at a time:

def repeatdigit(length:int, distance:int) -> int:
   def f(want):
      if want == 1:
         return 1
      have = (want + 1) // 2
      x = f(have)
      add = min(have, want - have)
      return x | (x << (add * distance))
   ones = 1 + (length - 2) // distance
   return f(ones)

My ones, want, have and add refer to numbers of 1-bits.

The add = min(have, want - have) could be "simplified", as I either add have or have - 1 1-bits. It comes from this earlier, slightly less optimal version:

def repeatdigit(length:int, distance:int) -> int:
   result = 1
   want = 1 + (length-2) // distance
   have = 1
   while have < want:
      add = min(have, want - have)
      result |= result << (add * distance)
      have += add
   return result
Sign up to request clarification or add additional context in comments.

2 Comments

Why the one that uses recursive function call will be faster than the while loop version? I remembered that function calls are expensive. I’m just curious.
@Mr.W With very big lengths, the few function calls are insignificant. Almost all time will be spent by the int operations. Now let's say we want 17 ones. The recursive version will build values with 17, 9, 5, 3, 2 and 1 ones, 37 total. The while-loop one will build values with 1, 2, 4, 8, 16, and 17 ones, 48 total. It's cheaper to build 17 from combining 9 and 9, wasting just 1 overlapped. Compared to combining 16 and 16, wasting 15 overlapped.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.