How to insert a character after every 2 characters in a string

Question

Is there a pythonic way to insert a character after every second character in a string?

I have a string: 'aabbccdd' and I want the end result to be 'aa-bb-cc-dd'.

I am not sure how I would go about doing that.

In more recent Python versions, you can do addr[::-1].hex(':') if you are doing this to create a hex string for a MAC address. Thanks to nocomment for pointing this out (in a comment) below.
– Josiah Yoder, Commented Feb 25 at 15:51

SilentGhost · Answer · 2010-07-15 18:21:08Z

78

>>> s = 'aabbccdd'
>>> '-'.join(s[i:i+2] for i in range(0, len(s), 2))
'aa-bb-cc-dd'
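
For reference, the same slicing idea can be wrapped in a small helper to see how it behaves on odd-length input. This is only a sketch; the name hyphenate_pairs is made up for illustration and does not come from the answer.

```python
def hyphenate_pairs(s, sep='-'):
    # Take the slices s[0:2], s[2:4], ...; the final slice is simply
    # shorter when len(s) is odd, so no character is lost.
    return sep.join(s[i:i + 2] for i in range(0, len(s), 2))

print(hyphenate_pairs('aabbccdd'))  # aa-bb-cc-dd
print(hyphenate_pairs('aabbccd'))   # aa-bb-cc-d
```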

answered Jul 15, 2010 at 18:21

SilentGhost


3 Comments

Hamish Grubijan Over a year ago

What about sequences of odd length?

jsbueno Over a year ago

I consider this more pythonic than the zip voodoo in the approved answer. The fact that you don't need to use range(len(string)) in for loops in Python does not mean one has to invent crazy things just to avoid it.

SilentGhost Over a year ago

@hamish: it keeps the last character and inserts a hyphen in front of it. Isn't that the desired behaviour?

kennytm · Accepted Answer · 2010-07-15 18:30:55Z

60

Assuming the string's length is always an even number:

>>> s = '12345678'
>>> t = iter(s)
>>> '-'.join(a+b for a,b in zip(t, t))
'12-34-56-78'

The t can also be eliminated with

>>> '-'.join(a+b for a,b in zip(s[::2], s[1::2]))
'12-34-56-78'

The algorithm is to group the string into pairs, then join them with the - character.

The code works like this. First, the string is split into the characters at even indices and those at odd indices (here, the odd digits and the even digits).

>>> s[::2], s[1::2]
('1357', '2468')

Then the zip function is used to combine them into an iterable of tuples.

>>> list( zip(s[::2], s[1::2]) )
[('1', '2'), ('3', '4'), ('5', '6'), ('7', '8')]

But tuples aren't what we want; this should be a list of strings. That is the purpose of the list comprehension:

>>> [a+b for a,b in zip(s[::2], s[1::2])]
['12', '34', '56', '78']

Finally we use str.join() to combine the list.

>>> '-'.join(a+b for a,b in zip(s[::2], s[1::2]))
'12-34-56-78'

The first piece of code is the same idea, but consumes less memory if the string is long.
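
To make the zip(t, t) trick less mysterious, here is a small sketch of what happens when both arguments are the same iterator. The variable names pairs and u are only for illustration.

```python
s = '12345678'
t = iter(s)

# zip() asks each argument for its next item in turn. Both arguments are
# the same iterator object, so each tuple takes two consecutive characters.
pairs = list(zip(t, t))
print(pairs)  # [('1', '2'), ('3', '4'), ('5', '6'), ('7', '8')]

# With an odd-length string the final character is fetched for the first
# slot, the second slot finds the iterator exhausted, and zip() stops
# without emitting the incomplete pair.
u = iter('1234567')
print('-'.join(a + b for a, b in zip(u, u)))  # 12-34-56
```

No intermediate slices such as s[::2] are built here, which is why this form needs less memory on long strings.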

edited Jul 15, 2010 at 18:30

answered Jul 15, 2010 at 18:19

kennytm


2 Comments

root Over a year ago

Can you explain the zip part? What is that doing?

kennytm Over a year ago

@Ham: The last character will be gone.

Peter Hansen · Answer · 2020-12-11 02:47:34Z

8

I tend to rely on a regular expression for this, as it seems less verbose and is usually faster than all the alternatives. Aside from having to face down the conventional wisdom regarding regular expressions, I'm not sure there's a drawback.

>>> import re
>>> s = 'aabbccdd'
>>> '-'.join(re.findall('..', s))
'aa-bb-cc-dd'

This version only matches complete pairs, though, so an odd trailing character is silently dropped:

>>> t = s + 'e'
>>> '-'.join(re.findall('..', t)) 
'aa-bb-cc-dd'

... so with a tweak you can be tolerant of odd-length strings:

>>> '-'.join(re.findall('..?', t))
'aa-bb-cc-dd-e'
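
One caveat that may matter for some inputs: by default '.' does not match a newline, so the patterns above skip newline characters. A possible generalization, sketched here with a made-up helper name split_every and an assumed group width parameter n, passes re.DOTALL:

```python
import re

def split_every(src, n=2):
    # '.{1,n}' greedily matches up to n characters, so a short tail is kept.
    # re.DOTALL lets '.' match newlines too; without it a newline is skipped.
    return re.findall('.{1,%d}' % n, src, flags=re.DOTALL)

print('-'.join(split_every('aabbccdde')))  # aa-bb-cc-dd-e
print(split_every('a\nbc'))                # ['a\n', 'bc']
print(re.findall('..?', 'a\nbc'))          # ['a', 'bc'] (newline dropped)
```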

Usually you're doing this more than once, so maybe get a head start by creating a shortcut ahead of time:

PAIRS = re.compile('..').findall

out = '-'.join(PAIRS(src))  # src is the input string ('in' is a reserved word)

Or what I would use in real code:

def rejoined(src, sep='-', _split=re.compile('..').findall):
    return sep.join(_split(src))

>>> rejoined('aabbccdd', sep=':')
'aa:bb:cc:dd'

I use something like this from time to time to create MAC address representations from 6-byte binary input:

>>> addr = b'\xdc\xf7\x09\x11\xa0\x49'
>>> rejoined(addr[::-1].hex(), sep=':')
'49:a0:11:09:f7:dc'
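
For the MAC address case specifically, the comments on this page point out that bytes.hex() accepts a separator on Python 3.8 and later, so the rejoining step is not needed there. A minimal sketch, reusing the same sample bytes:

```python
addr = b'\xdc\xf7\x09\x11\xa0\x49'

# bytes.hex() takes an optional separator argument on Python 3.8+,
# which is placed between every byte of the hex output.
print(addr[::-1].hex(':'))  # 49:a0:11:09:f7:dc
```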

answered Dec 11, 2020 at 2:47

Peter Hansen


4 Comments

Josiah Yoder Over a year ago

I recommend opening with the ..? version to share the best version of the regexp. The fact that it silently drops the last character for an odd-length string makes the .. version a weaker solution to start from, in my opinion.

Josiah Yoder Over a year ago

The MAC address example is sweet.

no comment Feb 25 at 10:14

@JosiahYoder Nah, should be addr[::-1].hex(':') without this rejoining. (Works since Python 3.8 in 2019.)

Josiah Yoder Feb 25 at 15:47

@nocomment Yes, that's nice too.

Dave Kirby · Answer · 2010-07-15 18:42:52Z

5

If you want to preserve the last character when the string has an odd length, you can modify KennyTM's answer to use itertools.izip_longest (renamed itertools.zip_longest in Python 3):

>>> s = "aabbccd"
>>> from itertools import izip_longest
>>> '-'.join(a+b for a,b in izip_longest(s[::2], s[1::2], fillvalue=""))
'aa-bb-cc-d'

or

>>> t = iter(s)
>>> '-'.join(a+b  for a,b in izip_longest(t, t, fillvalue=""))
'aa-bb-cc-d'
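
The snippets above are Python 2. On Python 3 the same approach would look roughly like this, since izip_longest is called zip_longest there:

```python
from itertools import zip_longest

s = 'aabbccd'
t = iter(s)

# fillvalue='' pads the missing partner of the final character, so the
# last group is just 'd' instead of being dropped.
result = '-'.join(a + b for a, b in zip_longest(t, t, fillvalue=''))
print(result)  # aa-bb-cc-d
```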

answered Jul 15, 2010 at 18:42

Dave Kirby


Tony Veijalainen · Answer · 2010-07-15 19:26:26Z

1

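Here is one way using a comprehension with a conditional that depends on the index from enumerate: a hyphen is prefixed to every character at an even index except the first. An odd trailing character ends up in a group of its own:

for s in ['aabbccdd', 'aabbccdde']:
    # Emit the character as-is at index 0 and at odd indices;
    # otherwise put a hyphen in front of it.
    print(''.join(char if not ind or ind % 2 else '-' + char
                  for ind, char in enumerate(s)))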
""" Output:
aa-bb-cc-dd
aa-bb-cc-dd-e
"""

edited Jul 15, 2010 at 19:26

answered Jul 15, 2010 at 19:12

Tony Veijalainen


chryss · Answer · 2010-07-15 18:26:16Z

0

This one-liner does the trick. It will drop the last character if your string has an odd number of characters.

"-".join([''.join(item) for item in zip(mystring1[::2],mystring1[1::2])])

answered Jul 15, 2010 at 18:26

chryss


Noam-N · Answer · 2024-09-30 09:43:38Z

I generalized @SilentGhost's answer to an arbitrary step and added doctests:

def insert_between_every_n_characters(original: str, inserted: str, step: int) -> str:
    """
    Insert a string between every N characters.

    >>> insert_between_every_n_characters('aabbccdd', '--', 1)
    'a--a--b--b--c--c--d--d'

    >>> insert_between_every_n_characters('aabbccdd', '-', 2)
    'aa-bb-cc-dd'

    >>> insert_between_every_n_characters('aabbccd', ':', 3)
    'aab:bcc:d'

    >>> insert_between_every_n_characters('aabbccdda', ':', 3)
    'aab:bcc:dda'

    >>> insert_between_every_n_characters('a', '-', 2)
    'a'

    >>> insert_between_every_n_characters('', '-', 2)
    ''
    """
    if step <= 0:
        raise ValueError(f"step must be greater than zero. Got: {step}")
    return inserted.join(original[i : i + step] for i in range(0, len(original), step))

Guy Gangemi · Answer · 2025-02-26 03:56:43Z

0

Using only generator expressions is possible with current Python (itertools.batched was added in Python 3.12).

Pros of this solution:

'length' is specified once and can be changed without needing to update the formula.

Doesn't use indices (I'm calling it, they're unpythonic)

from itertools import batched

s = 'aabbccdd'
r = '-'.join(''.join(b) for b in batched(s, 2))

batched yields the characters of the string two at a time as tuples. Each tuple is joined with an empty string into a short string, and the resulting strings are then joined with '-' until the input is exhausted.
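
Since itertools.batched only exists on Python 3.12 and later, older interpreters need a substitute. The fallback below is a hypothetical stand-in written for this sketch, not the standard library implementation; the join uses the map form suggested in the comments.

```python
from itertools import islice

try:
    from itertools import batched  # available from Python 3.12
except ImportError:
    def batched(iterable, n):
        # Hypothetical stand-in for older interpreters: yield tuples of
        # up to n items until the iterable is exhausted.
        it = iter(iterable)
        while chunk := tuple(islice(it, n)):
            yield chunk

s = 'aabbccdde'
r = '-'.join(map(''.join, batched(s, 2)))
print(r)  # aa-bb-cc-dd-e
```

An odd trailing character is kept, because the last batch is simply shorter.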

edited Feb 26 at 3:56

answered Feb 25 at 5:59

Guy Gangemi


3 Comments

no comment Feb 25 at 10:23

No need to reinvent map, use '-'.join(map(''.join, batched(s, 2))).

no comment Feb 25 at 11:28

And "minimal memory" isn't right. For example for the million characters s = 'a' * 10**6 it takes over 27 MB and it can be done with much less.

Guy Gangemi Feb 26 at 3:56

join does not operate the way i assumed it did, cheers. Didn't map reinvent the for loop?

Nuno André · Answer · 2020-06-09 16:37:11Z

-1

As PEP8 states:

Do not rely on CPython's efficient implementation of in-place string concatenation for statements in the form a += b or a = a + b. This optimization is fragile even in CPython (it only works for some types) and isn't present at all in implementations that don't use refcounting.

A pythonic way of doing this that avoids this kind of concatenation, and allows you to join iterables other than strings could be:

':'.join(f'{s[i:i+2]}' for i in range(0, len(s), 2))

And another more functional-like way could be:

':'.join(map('{}{}'.format, *(s[::2], s[1::2])))

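This second approach has a particular feature (or bug): it only joins complete pairs of letters, so an odd trailing character is dropped. So:

>>> s = 'abcdefghij'
>>> ':'.join(map('{}{}'.format, *(s[::2], s[1::2])))
'ab:cd:ef:gh:ij'

and:

>>> s = 'abcdefghi'
>>> ':'.join(map('{}{}'.format, *(s[::2], s[1::2])))
'ab:cd:ef:gh'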

edited Jun 9, 2020 at 16:37

answered Jun 9, 2020 at 16:22

Nuno André

