2

I'm pretty new to python, so I thought I'd give it a shot for this quick script...

Given a set of input words: i.e. "dead", "beef", how would one programmatically generate all possible strings of a given length and padding character?

The result would look something like this (using a pad of 0 and length of 12):

deadbeef0000
dead0beef000
dead00beef00
dead000beef0
dead0000beef
0deadbeef000
0dead0beef00
0dead00beef0
0dead000beef
00deadbeef00
00dead0beef0
00dead00beef
000deadbeef0
000dead0beef
0000deadbeef

The naive method to generate this list might be:

for x in range(0, 5):
  pre = '0' * x
  for y in range(0, 5):
    mid = '0' * y
    for z in range (0, 5):
      post = '0' * z
      val = pre + 'dead' + mid + 'beef' + post
      if len(val) == 12:
        print val

Is there a more straightforward approach? I've looked into itertools for combinations, but can't get it to produce the desired result.

5 Answers 5

6

maybe something like:

>>> from itertools import permutations
>>> lst = ['dead', 'beef', '0', '0', '0', '0']
>>> for i in set(permutations(lst)):
...     ''.join(i)
... 
'beefdead0000'
'dead0beef000'
'dead000beef0'
'00dead0beef0'
'0beef000dead'
...

EDIT: following @TimPeters comment

>>> [''.join(i) for i in set(permutations(lst)) 
                                        if i.index('dead') < i.index('beef')]

or in a more general way:

>>> real_words = ['dead', 'beef']
>>> padding = 4
>>> [''.join(i) for i in set(permutations(real_words + ['0'] * padding)) 
                      if reduce(lambda x,y: i.index(x) < i.index(y), real_words)]
Sign up to request clarification or add additional context in comments.

5 Comments

Hmm. In the OP's example, "dead" always appeared before "beef". So is that not what the OP wanted?
I did have the words ordered in my example, but that was just an old habit to illustrate the problem. Would there be a succinct way to preserve order if that was a requirement?
@Guy The solution in your EDIT doesn't work as written. You want i.index, not lst.index.
@jheddings, the answer I posted preserves the order.
@Guy thanks for your alternatives... I've gone with your first method, although I had to give Tim the answer tick for reproducing the provided example.
4

Assuming that, as your example showed, you want the "real words" to appear in their original order, this can be done directly (each desired output string generated exactly once, with no duplicates and no need to "weed out" extraneous strings). You need one "slot" for each word, and one slot for each pad character, where the total number of pad characters is the specified final length minus the sum of the lengths of the words. In your example, you have two words of length 4, for a total word length of 8, so you need 12-8 = 4 pad characters in each output string. This gives a total of 6 slots (2 for words and 4 for pad characters). The total number of output strings is thus 6-choose-2 = 6-choose-4 = 6*5/2 = 15. That also explains how to use itertools.combinations to get the results: you pick the indices for the 2 words from the set of all slot indices, or - equivalently - pick the indices for the 4 pad characters from the set of all slot indices. The code here does the former:

def get_strings(words, pad, length):
    from itertools import combinations
    nwords = len(words)
    npad = length - sum(len(word) for word in words)
    nslots = nwords + npad
    for ix in combinations(range(nslots), nwords):
        result = [pad] * nslots
        i = 0
        for j in ix:
            result[j] = words[i]
            i += 1
        yield "".join(result)

Then:

for s in get_strings(["dead", "beef"], "0", 12):
    print s

displays the 15 strings you wanted.

7 Comments

I was just writing up a version of this (which wasn't as clever about the pad length), but used zip instead of incrementing i. Plus my arguments were in a different order, pfill(seq, size, padding). :-P
Since the OP said they're pretty new to Python, I wanted to show some mercy ;-) For myself, I would have written the inner loop as for i, j in enumerate(ix): result[j] = words[i].
This correctly answers the question based on the example. And the mercy is appreciated for an old dog learning a new trick :)
@TimPeters Wouldn't simply replacing combinations with permutations in your inner loop handle that?
... Actually... replacing combinations with permutations works if the input words are distinct. If they aren't, then you want a nested loop, but then to avoid duplicates you use combinations again instead of permutations ;)
|
1
for x in range(0, 5):
    pre = '0' * x
    for y in range(0, 5-x):
        mid = '0' * y
        z = 5 - x - y
        post = '0' * z
        val = pre + 'dead' + mid + 'beef' + post
        print val

Comments

1

I'm assuming that in your example, you want dead to appear before beef always. If so, I believe the below code works.

Basically, what it does is it generates all partitions of the number of padding characters necessary that have a number of compartments equal to one more than the number of strings you pass in. This solution uses generators, so that strings are generated on the fly (i.e. lazily). Note that this will not work correctly unless len(padding_character) == 1.

import itertools

def partitions(size, number):
    '''
Yields all partitions of `size` containing `number` compartments, where
a compartment may be 0, and where all permutations are returned
'''
    if number == 1:
        yield (size,)
    else:
        for i in range(0, size+1):
            for partition in partitions(size-i, number-1):
                yield (i,) + partition

def padding_generator(strings, padding_character, total_length):
    strings_length = sum(map(len, strings))
    padding_length = total_length - strings_length
    num_padding_locations = len(strings)+1
    for partition in partitions(padding_length, num_padding_locations):
        result = ''
        for padding_count, string in itertools.zip_longest(partition, strings, fillvalue=''):
            result += padding_character*padding_count
            result += string
        yield result

for string in padding_generator(['dead', 'beef'], '0', 12):
    print(string)

Results:

deadbeef0000
dead0beef000
dead00beef00
dead000beef0
dead0000beef
0deadbeef000
0dead0beef00
0dead00beef0
0dead000beef
00deadbeef00
00dead0beef0
00dead00beef
000deadbeef0
000dead0beef
0000deadbeef

2 Comments

Thanks for the explanation and thorough implementation. Helps to see other features of the language. If order of the words was not important, would you change your implementation?
@jheddings If word order is not important, Guy's original solution will work (and will be much simpler), because all you're doing mathematically is permuting the padding characters and the words. All this partition stuff is only necessary if the padding characters and the words need to be treated separately (which they don't if word order doesn't matter).
1

Just for fun, a direct recursive approach:

def padded(to_pad, pad_with, pad_amount):
    if not to_pad:
        yield pad_with * pad_amount
        return
    for i in range(pad_amount + 1):
        for suffix in padded(to_pad[1:], pad_with, pad_amount - i):
            yield pad_with * i + to_pad[0] + suffix

Testing:

>>> list(padded(['dead', 'beef'], '0', 4))
['deadbeef0000', 'dead0beef000', 'dead00beef00', 'dead000beef0', 'dead0000beef',
 '0deadbeef000', '0dead0beef00', '0dead00beef0', '0dead000beef', '00deadbeef00',
 '00dead0beef0', '00dead00beef', '000deadbeef0', '000dead0beef', '0000deadbeef']

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.