Extracting characters from a Python string using a variable index

Question

I'm trying to split a string of letters and numbers into a list of tuples like this:

[(37, 'M'), (1, 'I'), (5, 'M'), (1, 'D'), (25, 'M'), (33, 'S')]

This is what I have so far. It kind of works, but when I try to print "37" with print(cigar[d:pos]), it does not print the entire number, only "3".

#iterate through cigar sequence
print(cigar)
#count position in cigar sequence
pos=0
#count position of last key
d=0

splitCigar=[]

for char in cigar:
    
    #print(cigar[pos])
    if char.isalpha() == False:
        print("first for-loop")
        print(cigar[d])
        print(cigar[pos])
        print(cigar[d:pos])
        num=(cigar[d:pos])
        pos+=1

    if char.isalpha() == True:
        print("second for-loop")
        splitCigar.append((num,char))
        pos+=1
        d=pos   
    
print(splitCigar)

The output of this code:

37M1I5M1D25M33S
first for-loop
3
3

first for-loop
3
7
3
second for-loop

<and so on...>

second for-loop
[('3', 'M'), ('', 'I'), ('', 'M'), ('', 'D'), ('2', 'M'), ('3', 'S')]

Can you clarify your input and expected output?

– Chase, Commented Nov 4, 2020 at 13:10

Mengard · Accepted Answer · 2020-11-04 13:23:21Z

1

Solution using regexp:

import re
cigar = "37M1I5M1D25M33S"

digits = re.findall('[0-9]+', cigar)
chars = re.findall('[A-Z]+', cigar)

results = list(zip(digits, chars))

Everything printed so you can see what it does:

>>> print(digits)
['37', '1', '5', '1', '25', '33']
>>> print(chars)
['M', 'I', 'M', 'D', 'M', 'S']
>>> print(results)
[('37', 'M'), ('1', 'I'), ('5', 'M'), ('1', 'D'), ('25', 'M'), ('33', 'S')]

I hope this "functional" approach suits you.

answered Nov 4, 2020 at 13:23

Mengard


2 Comments

Sarah Over a year ago

Yes, thank you! Is there any way to convert the digits to integers in the final results?

Mengard Over a year ago

Sure thing! You can do digits = [int(digit) for digit in digits] to convert them before zipping.
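
Putting that comment together with the answer's code, a minimal sketch could look like the following. The second variant, which captures the number and the letter with a single pattern, is an assumption added for illustration (it is not from the answer) and assumes each number is followed by exactly one uppercase letter; the name pairs is hypothetical.

```python
import re

cigar = "37M1I5M1D25M33S"

# Convert each run of digits to an int before zipping, as suggested above
digits = [int(digit) for digit in re.findall('[0-9]+', cigar)]
chars = re.findall('[A-Z]+', cigar)
results = list(zip(digits, chars))
print(results)

# Illustrative alternative: one pattern with two capture groups,
# assuming every number is followed by exactly one uppercase letter
pairs = [(int(n), op) for n, op in re.findall(r'([0-9]+)([A-Z])', cigar)]
print(pairs)
```

Both variants print the same list of (int, str) tuples for this input.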

Crawl Cycle · Accepted Answer · 2020-11-04 13:57:07Z

1

The pyparsing library makes writing parsers more maintainable and readable. If the format of the data changes, you can modify the parser without too much effort.

import pyparsing as pp


def make_grammar():
    # Number consists of several digits
    num = pp.Word(pp.nums).setName("Num")
    # Convert the num to int
    num = num.setParseAction(
        pp.pyparsing_common.convertToInteger)
    # 1 letter
    letter = pp.Word(pp.alphas, exact=1)\
        .setName("Letter")
    # 1 num followed by letter with possibly
    # some spaces in between
    package = pp.Group(num + letter)
    # 1 or more packages
    grammar = pp.OneOrMore(package)
    return grammar


def main():
    x = "37M1I5M1D25M33S"
    g = make_grammar()
    result = g.parseString(x, parseAll=True)
    print(result)
    # [[37, 'M'], [1, 'I'], [5, 'M'], 
    #  [1, 'D'], [25, 'M'], [33, 'S']]
    # If you really want tuples:
    print([tuple(r) for r in result])


main()

edited Nov 4, 2020 at 13:57

answered Nov 4, 2020 at 13:36

Crawl Cycle


1 Comment

PaulMcG Over a year ago

Nice example. Note that pyparsing now has the Char class that you can use in place of Word(exact=1)

Chase · Accepted Answer · 2020-11-04 13:20:26Z

Sounds like a job for itertools.groupby.

import itertools

inp = '37M1I5M1D25M33S'
e = [''.join(g) for k, g in itertools.groupby(inp, key=lambda l: l.isdigit())]
print(e)

This will give you:

['37', 'M', '1', 'I', '5', 'M', '1', 'D', '25', 'M', '33', 'S']

Basically, groupby collects consecutive elements for which the key function (.isdigit here) returns the same value into groups. Each of those groups is then turned into a string using ''.join.
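
To make the grouping visible, here is a small sketch that keeps the key next to each joined group. The name groups is only illustrative, and str.isdigit is used directly as the key instead of the lambda:

```python
import itertools

inp = '37M1I5M1D25M33S'

# Each item pairs the key (True for a run of digits, False otherwise)
# with the characters of that run joined into one string
groups = [(is_digit, ''.join(chars))
          for is_digit, chars in itertools.groupby(inp, key=str.isdigit)]
print(groups[:4])
# [(True, '37'), (False, 'M'), (True, '1'), (False, 'I')]
```

Note that two letters in a row would end up in a single group, so this relies on numbers and single letters strictly alternating.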

Now all you have to do is zip them together:

res = list(zip(e[::2], e[1::2]))
print(res)

That will give you:

[('37', 'M'), ('1', 'I'), ('5', 'M'), ('1', 'D'), ('25', 'M'), ('33', 'S')]

If you want integers instead of string representations of the numbers, that's also simple:

res = list(map(lambda l: (int(l[0]), l[1]), res))

Which yields:

[(37, 'M'), (1, 'I'), (5, 'M'), (1, 'D'), (25, 'M'), (33, 'S')]

I'd say this is a pretty pythonic solution for your problem.

Sarun Dahal · Accepted Answer · 2020-11-04 13:30:18Z

0

You can get the desired output as follows:

cigar = '37M1I5M1D25M33S'

splitCigar = []
num = ''
for char in cigar:
    if not char.isalpha():
        # accumulate digits until a letter is reached
        num += char
    else:
        # a letter closes the current (number, letter) pair
        splitCigar.append((num, char))
        num = ''
print(splitCigar)

Output: [('37', 'M'), ('1', 'I'), ('5', 'M'), ('1', 'D'), ('25', 'M'), ('33', 'S')]
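
The index-based loop from the question can also be made to work. There, the slice cigar[d:pos] is taken before pos is incremented, so it always stops one character short. A minimal sketch, assuming the slice is taken only when a letter is reached (at that point pos is the letter's index, so the slice covers every digit before it); split_cigar and start are illustrative names, not from the original posts:

```python
def split_cigar(cigar):
    """Split e.g. '37M1I' into [(37, 'M'), (1, 'I')] using slice indices."""
    pairs = []
    start = 0  # index where the current run of digits begins
    for pos, char in enumerate(cigar):
        if char.isalpha():
            # cigar[start:pos] holds all digits before this letter;
            # int() raises ValueError if a letter has no number in front
            pairs.append((int(cigar[start:pos]), char))
            start = pos + 1  # the next number starts right after the letter
    return pairs


print(split_cigar('37M1I5M1D25M33S'))
```

Using enumerate avoids maintaining pos by hand, which is where the original off-by-one came from.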

edited Nov 4, 2020 at 13:30

Dharman♦


answered Nov 4, 2020 at 13:25

Sarun Dahal
