I need to split a string like:

# string
s = "CS -135IntrotoComputingCS -154IntroToWonderLand..."

into an array like

inputarray[0]= CS -135 Intro to computing
inputarray[1]= CS -154 Intro to WonderLand
...and so on. I am trying something like this:

re.compile("[CS]+\s").split(s)

But it doesn't split the string the way I want, even if I try something like

re.compile("[CS]").split(s)

Can anyone shed some light on this?

  • How are you going to convert 135IntrotoComputingCS to -135 Intro to computing? (Unless they are all in "intro to x" format, which I don't think they are.) Where are you getting this data? Maybe you can get it in a better format. Commented Oct 29, 2018 at 18:47
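On the point raised in this comment: the course code can be separated from the title mechanically, but the word boundaries inside a run-together title such as IntrotoComputing cannot be recovered by a regex alone. A minimal sketch of the first part, assuming every item starts with "CS -" followed by digits (the helper name split_code_and_title is hypothetical):

```python
import re

def split_code_and_title(item):
    """Return (course_code, title) for one item, or None if it does not match.

    Assumes the item starts with 'CS -' followed by one or more digits.
    """
    m = re.match(r'(CS -\d+)(.*)', item)
    if m is None:
        return None
    return m.group(1), m.group(2)

print(split_code_and_title('CS -135IntrotoComputing'))
# ('CS -135', 'IntrotoComputing')
```

Turning 'IntrotoComputing' into 'Intro to computing' would need extra information, such as a word list or a cleaner data source.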

2 Answers

You can use findall with a lookahead regex like this:

>>> import re
>>> s = 'CS -135IntrotoComputingCS -154IntroToWonderLand'
>>> print(re.findall(r'.+?(?=CS|$)', s))
['CS -135IntrotoComputing', 'CS -154IntroToWonderLand']

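The regex .+?(?=CS|$) lazily matches one or more characters and stops at the first position that is followed by CS or by the end of the string. The lookahead does not consume the CS, so it stays at the start of the next match.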
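For comparison, here is a small sketch of what the pattern from the question does on the same sample string. [CS] is a character class that matches a single C or S rather than the literal text CS, and split() throws away whatever the pattern matched, so the CS prefix is lost:

```python
import re

s = 'CS -135IntrotoComputingCS -154IntroToWonderLand'

# [CS]+\s matches a run of 'C'/'S' characters followed by whitespace.
# split() discards the matched text, so 'CS ' disappears from every piece.
consumed = re.split(r'[CS]+\s', s)

# The lookahead is zero-width: it only checks what comes next,
# so the 'CS' marker is kept at the start of each piece.
kept = re.findall(r'.+?(?=CS|$)', s)

print(consumed)  # ['', '-135IntrotoComputing', '-154IntroToWonderLand']
print(kept)      # ['CS -135IntrotoComputing', 'CS -154IntroToWonderLand']
```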

3 Comments

Thanks a lot @anubhava, it worked! The only problem is that it breaks when I have CS -135IntrotoCS: in that case it again treats CS as a separate member of the array.
Can you please update your question and show me the input string with the expected output?
Maybe you can use: re.findall(r'.+?(?=CS -|$)', s)
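The difference between the two lookaheads can be sketched on a made-up input in which the first title itself ends in CS. The sample string is hypothetical, and the extra \d in the stricter pattern is an assumption that every course code is "CS -" followed by a digit:

```python
import re

# Hypothetical input: the first title ends in "CS"
s = 'CS -135IntrotoCSCS -154IntroToWonderLand'

# Any "CS" counts as a boundary, so the title is cut short.
loose = re.findall(r'.+?(?=CS|$)', s)

# Only "CS -<digit>" counts as a boundary (assumed course-code shape).
strict = re.findall(r'.+?(?=CS -\d|$)', s)

print(loose)   # ['CS -135Introto', 'CS', 'CS -154IntroToWonderLand']
print(strict)  # ['CS -135IntrotoCS', 'CS -154IntroToWonderLand']
```

A title that happens to contain "CS -" followed by a digit would still be split, so the stricter pattern narrows the problem rather than removing it.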

Although findall is more straightforward, finditer can also be used here:

import re

s = 'CS -135IntrotoComputingCS -154IntroToWonderLand'
# starting positions of every 'CS ' marker
starts = [m.start() for m in re.finditer('CS ', s)]
# slice from each start to the next one; the last slice runs to the end of s
parts = [s[a:b] for a, b in zip(starts, starts[1:] + [None])]
print(parts)  # ['CS -135IntrotoComputing', 'CS -154IntroToWonderLand']
