Python list comparing characters and counting them

Question

I have a question about how to compare adjacent characters in a list in Python.

For example, I have the string "cdcdccddd". I made a list from this string to make comparing the characters easier. The needed output is:

c: 1
d: 1
c: 1
d: 1
c: 2
d: 3

So it counts consecutive characters: if a character differs from the next one, its count is 1; if it is the same as the next one, the counter is incremented and the comparison moves on to the following pair, and so on.

So far I have this algorithm:
text = "cdcdccddd"
l = []
l = list(text)
print list(text)

for n in range(0,len(l)):
    le = len(l[n])
    if l[n] == l[n+1]:
        le += 1
        if l[n+1] == l[n+2]:
            le += 1
        print l[n], ':' , le
    else: 
        print l[n], ':', le

But it does not work well, because it compares the first and second elements, but not the second and third. For this input the output is:

c : 1
d : 1
c : 1
d : 1
c : 2
c : 1
d : 3

How can I make this algorithm better?

Thank you!

As you said, this algorithm is fundamentally incorrect, because you cannot count all the occurrences this way (you do not know in advance how many repeated runs there are). One way to overcome this problem is to group your characters and then count the number of characters in each subset.
– Kasravnd, Commented Apr 10, 2016 at 21:31

Padraic Cunningham · Accepted Answer · 2016-04-10 21:42:00Z

You can use itertools.groupby:

from itertools import groupby
s = "cdcdccddd"

print([(k, sum(1 for _ in v)) for k,v in groupby(s)])
[('c', 1), ('d', 1), ('c', 1), ('d', 1), ('c', 2), ('d', 3)]

Consecutive characters are grouped together, so each k is the character of its group. Calling sum(1 for _ in v) gives the length of each group, so we end up with (char, len(group)) pairs.

If we run it in IPython and call list on each v, it becomes clear what is happening:

In [3]: from itertools import groupby

In [4]: s = "cdcdccddd"

In [5]: [(k, list(v)) for k,v in groupby(s)]
Out[5]: 
[('c', ['c']),
 ('d', ['d']),
 ('c', ['c']),
 ('d', ['d']),
 ('c', ['c', 'c']),
 ('d', ['d', 'd', 'd'])]
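
To get the exact "char: count" lines the question asks for, the grouped pairs can be formatted directly. This is a minimal sketch assuming Python 3; the helper name run_lengths is made up for illustration and is not part of itertools:

```python
from itertools import groupby


def run_lengths(s):
    """Return (char, run_length) pairs for each run of equal consecutive items in s."""
    # groupby yields (key, group_iterator); counting the iterator gives the run length
    return [(k, sum(1 for _ in g)) for k, g in groupby(s)]


# Print one "char: count" line per run
for char, count in run_lengths("cdcdccddd"):
    print("{}: {}".format(char, count))
```

Because groupby only groups adjacent equal items, no sorting is needed (or wanted) here.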

We can also roll our own pretty easily:

def my_groupby(s):
    # create an iterator
    it = iter(s)
    # set consec_count to one and pull the first char from s
    consec_count, prev = 1, next(it)
    # iterate over the rest of the string
    for ele in it:
        # if last and current char are different,
        # yield previous char, consec_count and reset
        if prev != ele:
            yield prev, consec_count
            consec_count = 0
        prev = ele
        consec_count += 1
    # yield the final run (prev is defined even for a one-char string)
    yield prev, consec_count

Which gives us the same:

In [8]: list(my_groupby(s))
Out[8]: [('c', 1), ('d', 1), ('c', 1), ('d', 1), ('c', 2), ('d', 3)]
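
One edge case worth noting: a bare next(it) on an empty iterator raises StopIteration, and in Python 3.7 and later a StopIteration escaping a generator body surfaces as RuntimeError (PEP 479). The following is a sketch of a variant that guards against empty input; the name safe_runs is hypothetical, and the comparison with itertools.groupby is only a sanity check:

```python
from itertools import groupby


def safe_runs(s):
    """Yield (item, run_length) for consecutive runs; yields nothing for empty input."""
    it = iter(s)
    try:
        prev = next(it)
    except StopIteration:
        return  # empty input: end the generator without yielding
    count = 1
    for ele in it:
        if ele != prev:
            # the run of prev has ended: report it and start counting ele
            yield prev, count
            prev, count = ele, 0
        count += 1
    # report the last run
    yield prev, count


def groupby_runs(s):
    """Reference implementation built on itertools.groupby."""
    return [(k, sum(1 for _ in g)) for k, g in groupby(s)]


for sample in ["", "c", "cdcdccddd"]:
    print(repr(sample), list(safe_runs(sample)) == groupby_runs(sample))
```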

TigerhawkT3 · Accepted Answer · 2016-04-10 21:34:50Z

That looks like a pattern of repeating characters, so you can use a regex with a backreference to match each run of a repeated character and then find the length of each match:

import re
text = "cdcdccddd"
matches = re.findall(r'(.)(\1*)', text)
result = ['{}: {}'.format(match[0], len(''.join(match))) for match in matches]

Result:

>>> print(*result, sep='\n')
c: 1
d: 1
c: 1
d: 1
c: 2
d: 3
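
An equivalent sketch using re.finditer, which avoids joining the captured groups back together. By default "." does not match a newline, so text containing newlines would have those characters skipped; re.DOTALL is added here on the assumption that every character should be counted. The function name regex_runs is made up for illustration:

```python
import re


def regex_runs(text):
    """Return (char, run_length) pairs using a backreference pattern."""
    # (.) captures one character; \1* matches any further copies of it.
    # re.DOTALL lets '.' match newlines as well.
    pattern = re.compile(r'(.)\1*', re.DOTALL)
    return [(m.group(1), len(m.group(0))) for m in pattern.finditer(text)]


for char, count in regex_runs("cdcdccddd"):
    print('{}: {}'.format(char, count))
```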

tryexceptpass · Accepted Answer · 2016-04-10 21:35:23Z

First, strings are already lists in Python, so you can just write for character in text: to get each of the characters out.

I would try something like this:

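text = "cdcdccddd"

# Start the first run with the first character (assumes text is non-empty)
currentchar = text[0]
currentcount = 1

for c in text[1:]:
    if c == currentchar:
        currentcount += 1
    else:
        # The run ended: report it and start a new one
        print(currentchar + ": " + str(currentcount))
        currentchar = c
        currentcount = 1

# Report the final run
print(currentchar + ": " + str(currentcount))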

Strings are sequences, but not lists. A list is its own thing, separate from strings.
– zondo

Yes, you are correct. I should have said they can be used as lists.
– tryexceptpass
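
The distinction in these comments can be illustrated with a short sketch (Python 3 assumed; the variable names are only for illustration). A str supports indexing, slicing and iteration like any sequence, but it is immutable and is not an instance of list:

```python
text = "cdcdccddd"

# Sequence behaviour that str shares with list
first = text[0]            # indexing
tail = text[1:]            # slicing
chars = [c for c in text]  # iteration

# Unlike a list, a str cannot be modified in place
try:
    text[0] = "x"
    mutated = True
except TypeError:
    mutated = False

print(isinstance(text, list), mutated)
```

This is why the loop in the answer can iterate over text directly without calling list(text) first.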
