How to find the longest repeating sequence using Python

Question

In an interview, I was asked to print the longest run of a repeated character in a string.

I got stuck. Is there any way to do it?

My code only prints the count of each character in the string. Is there an approach that gives the expected output?

import pandas as pd
import collections

a   = 'abcxyzaaaabbbbbbb'
lst = collections.Counter(a)
df  = pd.Series(lst)
df

Expected output :

bbbbbbb

How can I add this logic to the code above?

@Psidom: I was giving it a try, but I don't know how to implement the checking logic. Is it possible?
– user17086826, Commented Oct 24, 2021 at 5:58
res = max(("".join(g) for _, g in groupby(a)), key=len) -> 'bbbbbbb'. Here, groupby is itertools.groupby.
– Ch3steR, Commented Oct 24, 2021 at 6:00
@Ch3steR: Can you explain in detail? I have no idea what is happening.
– user17086826, Commented Oct 24, 2021 at 6:01
A few relevant links: Get the largest string using max, How do I use itertools.groupby
– Ch3steR, Commented Oct 24, 2021 at 6:09
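
To unpack the one-liner from the comment above: itertools.groupby walks the string and yields one group per run of equal adjacent characters, "".join turns each group back into a string, and max with key=len picks the longest one. A minimal sketch with the intermediate list spelled out (the names runs and longest are only for illustration):

```python
from itertools import groupby

a = 'abcxyzaaaabbbbbbb'

# groupby yields (key, group_iterator) for each run of equal adjacent characters.
runs = ["".join(group) for _, group in groupby(a)]
print(runs)     # ['a', 'b', 'c', 'x', 'y', 'z', 'aaaa', 'bbbbbbb']

# max with key=len returns the first run of maximal length.
longest = max(runs, key=len)
print(longest)  # bbbbbbb
```

For an empty input string, max raises ValueError; passing default="" to max is one way to handle that case.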

no comment · Accepted Answer · 2021-10-24 07:39:36Z

3

A regex solution:

import re

max(re.split(r'((.)\2*)', a), key=len)

Or without library help (less efficient, and it needs Python 3.8+ for the := operator):

s = ''
max((s := s * (c in s) + c for c in a), key=len)

Both compute the string 'bbbbbbb'.
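
The second one-liner is dense, so here is the same idea written as an ordinary loop, as a sketch only. The variable s holds the current run: s * (c in s) keeps s when c continues the run (True acts as 1) and resets it to '' otherwise (False acts as 0). The function name longest_run is hypothetical and not part of the answer.

```python
def longest_run(text):
    """Return the longest run of one repeated character (hypothetical helper)."""
    best = ''
    s = ''  # current run; always consists of a single repeated character
    for c in text:
        # bool is an int: s * True == s (run continues), s * False == '' (run resets)
        s = s * (c in s) + c
        if len(s) > len(best):
            best = s
    return best

print(longest_run('abcxyzaaaabbbbbbb'))  # bbbbbbb
```

Like max with key=len, this keeps the first run when several runs share the maximal length.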

edited Oct 24, 2021 at 7:39

answered Oct 24, 2021 at 6:56

no comment


9 Comments

no comment Over a year ago

@RolfofSaxony Hmm, what's the point of repeating that? That's just the expected output already shown in the question. I might show my output if it deviated from the expected output (like yours does), but if it's just what's expected...

Rolf of Saxony Over a year ago

Feel free to reject the edit, I simply added it for clarity

Alain T. Over a year ago

You could improve the efficiency of the second one a little bit by changing (c in s) to s.endswith(c)

no comment Over a year ago

@AlainT. Yes, and I had considered c in s[:1], but neither changes the worst case complexity, so I opted for brevity/beauty :-)

no comment Over a year ago

@AlainT. Hmm, actually... c in s is amortized O(1) here. And quite possibly faster than loading and calling the endswith method.


Alain T. · Accepted Answer · 2021-10-24 08:24:28Z

Without any modules, you could use a comprehension to go backward through the possible lengths and take the first character repetition that is present in the string:

next(c*s for s in range(len(a),0,-1) for c in a if c*s in a)

That's quite inefficient, though.

Another approach is to detect the positions where the letter changes and take the longest subrange between them:

chg = [i for i,(x,y) in enumerate(zip(a,a[1:]),1) if x!=y]
s,e = max(zip([0]+chg,chg+[len(a)]),key=lambda se:se[1]-se[0])
longest = a[s:e]
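
To make the change-position approach easier to follow, here is the same code with the intermediate values printed for the string from the question. The name spans is introduced only for this illustration.

```python
a = 'abcxyzaaaabbbbbbb'

# Indices where a character differs from the one before it.
chg = [i for i, (x, y) in enumerate(zip(a, a[1:]), 1) if x != y]
print(chg)     # [1, 2, 3, 4, 5, 6, 10]

# Pair each run start with the next run start (or the end of the string).
spans = list(zip([0] + chg, chg + [len(a)]))
print(spans)   # [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 10), (10, 17)]

# The widest (start, end) pair is the longest run.
s, e = max(spans, key=lambda se: se[1] - se[0])
print(a[s:e])  # bbbbbbb
```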

Of course a basic for-loop solution will also work:

si,sc = 0,"" # current streak (start, character)
ls,le = 0,0  # longest streak (start, end)
for i,c in enumerate(a+" "):      # extra space to force out last char.
    if i-si > le-ls: ls,le = si,i # new longest
    if sc != c:      si,sc = i,c  # new streak
longest = a[ls:le]

print(longest) # bbbbbbb

Rolf of Saxony · Accepted Answer · 2021-10-24 06:49:29Z

1

A more long-winded solution, taken wholesale from:
maximum-consecutive-repeating-character-string

def maxRepeating(s):

    len_s = len(s)
    count = 0

    # Find the maximum repeating
    # character starting from s[i]
    res = s[0]
    for i in range(len_s):

        cur_count = 1
        for j in range(i + 1, len_s):
            if s[i] != s[j]:
                break
            cur_count += 1

        # Update result if required
        if cur_count > count:
            count = cur_count
            res = s[i]
    return res, count

# Driver code
if __name__ == "__main__":
    a = "abcxyzaaaabbbbbbb"
    print(maxRepeating(a))

Output:

('b', 7)

answered Oct 24, 2021 at 6:49

Rolf of Saxony


2 Comments

no comment Over a year ago

Did you pick the inefficient one intentionally?

Rolf of Saxony Over a year ago

@don'ttalkjustcode Ha! Why yes, as a matter of fact I did. Simply because of the environment which caused this question to be asked, namely an interview, without access to SO. I had a sneaking suspicion that this question might attract more than a few solutions. :)
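
On the efficiency point raised in these comments: the nested loops in the answer restart the scan at every index, which is quadratic for inputs such as one long run of the same character. Below is a sketch of a linear variant that jumps to the end of each run and returns the run itself, which is the output the question asks for. The name max_repeating_run is hypothetical and does not come from the linked article.

```python
def max_repeating_run(text):
    """Return the longest run of one repeated character, visiting each character once."""
    best_start, best_len = 0, 0
    i = 0
    while i < len(text):
        j = i + 1
        # Advance j to the first position where the character changes.
        while j < len(text) and text[j] == text[i]:
            j += 1
        if j - i > best_len:
            best_start, best_len = i, j - i
        i = j  # continue from the start of the next run
    return text[best_start:best_start + best_len]

print(max_repeating_run("abcxyzaaaabbbbbbb"))  # bbbbbbb
```

Unlike the version above, this sketch returns an empty string for empty input instead of raising IndexError.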
