Split string multiple times in Python

Question

This should be a simple thing to do but I can't get it to work.

Say I have this string:

I want this string to be splitted into smaller strings.

I want to split it into smaller strings, but keep only what lies between a T and an S.

So the result should be:

this, to be s, to s, trings

So far I've tried splitting on every S, and then up to every T (backwards). However, it will only get the first "this", and stop. How can I make it continue and get all the things that are between T's and S's?

(In this program I export the results to another text file)

with open('string.txt', 'r') as matches, open('test.txt', 'a') as file:
    for line in matches:
        test = line.split("S")
        file.write(test[0].split("T")[-1] + "\n")

Maybe regular expressions would be better, though I don't know how to work with them very well.
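For reference, the split-based idea described above can be carried through the whole line rather than stopping at the first piece. The following is only a sketch under some assumptions: matching is treated as case-insensitive, and the function name `between_t_and_s` is made up for illustration. It finds each S in turn and looks backwards for the nearest T after the previous S.

```python
def between_t_and_s(line):
    """Collect the text from the nearest preceding T up to each S (hypothetical helper)."""
    lowered = line.lower()  # assumption: T/S matching ignores case
    results = []
    start = 0
    while True:
        s_pos = lowered.find("s", start)
        if s_pos == -1:
            break  # no further S, so nothing more can end a match
        # Search backwards for a T, but only in the text after the previous S.
        t_pos = lowered.rfind("t", start, s_pos)
        if t_pos != -1:
            results.append(line[t_pos:s_pos + 1])
        start = s_pos + 1
    return results


sample = 'I want this string to be splitted into smaller strings.'
print(between_t_and_s(sample))
```

The original loop keeps only `test[0]`, the text before the first S on each line, which is why it stops after one result.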

@thefourtheye When finding the first S, it would go backwards looking for a T, when it finds the first T (aka this), it would forget about that part and keep on. After this it finds another S, but since it has already gone through what is before that S, it wouldn't care about it and simply not find a match until it gets to the first S in 'Splitted', goes back to find a T, etc. Mmmh maybe quite messy in my mind. :-)
– Brick Top, Commented Jan 6, 2014 at 14:27
@BrickTop: That makes no sense, really. Because there is a t in tring to be s, for example.
– Martijn Pieters, Commented Jan 6, 2014 at 14:28
He obviously needs a computer to solve the task he cannot solve by hand without errors. I think that's okay.
– Alfe, Commented Jan 6, 2014 at 14:29
Now that's my mind playing games. Indeed, the question was wrong, there was another T. I have edited it to reflect what it should actually show, sorry for the misunderstanding. @Alfe, I'm not sure if your comment is meant to be offensive, but I'm actually trying to do this with a 400K-character string.
– Brick Top, Commented Jan 6, 2014 at 14:36
No, no offense intended. Such "errors" in expected output derived from executing a wanted algorithm manually are quite common, actually. And you explained what your algorithm should do quite elaborately in your comment. (But of course Martijn's complaint on your imperfectness triggered my wording.)
– Alfe, Commented Jan 6, 2014 at 14:40

Martijn Pieters · Accepted Answer · 2014-01-06 14:30:48Z

You want a re.findall() call instead:

re.findall(r't[^s]*s', line, flags=re.I)

Demo:

>>> import re
>>> sample = 'I want this string to be splitted into smaller strings.'
>>> re.findall(r't[^s]*s', sample, flags=re.I)
['t this', 'tring to be s', 'tted into s', 'trings']

Note that this matches 't this' and 'tted into s'; your rules need clarification as to why those first t characters shouldn't match when 'trings' does.
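To see where each of those matches begins, a small sketch with `re.finditer()` can list the start index next to the matched text. The helper name `match_starts` is made up for illustration; the sample string and pattern are the ones from the demo.

```python
import re

sample = 'I want this string to be splitted into smaller strings.'


def match_starts(pattern, text):
    """Return (start_index, matched_text) pairs for every case-insensitive match."""
    return [(m.start(), m.group()) for m in re.finditer(pattern, text, flags=re.I)]


for start, text in match_starts(r't[^s]*s', sample):
    print(start, repr(text))
```

The first match starts at index 5, which is the t at the end of "want", so the match runs on through the space and into "this".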

It sounds as if you want to match only text between t and s without including any other t:

>>> re.findall(r't[^ts]*s', sample, flags=re.I)
['this', 'to be s', 'to s', 'trings']

Here the leading 'tring ' of the earlier second result and the leading 'tted in' of the third are no longer included, because a later t occurs before the s in each of them.
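Applied to the file-based program in the question, this could look roughly like the sketch below. The function names `extract_matches` and `export_matches` are made up for illustration, and the file names in the usage note are the ones from the question; whether one match per output line is the desired format is an assumption.

```python
import re

# Same pattern as above, compiled once because it is reused for every line.
PATTERN = re.compile(r't[^ts]*s', flags=re.IGNORECASE)


def extract_matches(lines):
    """Yield every T...S match found in an iterable of lines."""
    for line in lines:
        yield from PATTERN.findall(line)


def export_matches(source_path, target_path):
    """Append each match from source_path to target_path, one per line."""
    with open(source_path) as source, open(target_path, 'a') as target:
        for match in extract_matches(source):
            target.write(match + '\n')
```

Calling `export_matches('string.txt', 'test.txt')` would then mirror the original loop, but write every match instead of only the first one per line.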

edited Jan 6, 2014 at 14:30

answered Jan 6, 2014 at 14:25


8 Comments

JoshG79 Over a year ago

probably want t[^st]*s as the RE, just based on his example output. But otherwise a very good solution

thefourtheye Over a year ago

Still, he expects this, tring to be s, ted into s, trings :(

Martijn Pieters Over a year ago

@thefourtheye: which, according to his own description, is the wrong output since there are other t characters before and after the ones his samples use as first character.

thefourtheye Over a year ago

Yup... Even I am quite confused

Brick Top Over a year ago

Sorry @thefourtheye, I was confused too - result of spending too many hours around. :-) My example was wrong, this is exactly what I wanted to achieve.
