0

I have a list of strings, and I want to extract all strings between two strings that contain specific keywords (including those two strings).

example_list = ['test sentence', 'the sky is blue', 'it is raining outside', 'mic check', 'vacation time']
keywords = ['sky', 'check']

The result I want to achieve:

result = ['the sky is blue', 'it is raining outside', 'mic check']

So far, I couldn't figure it out myself. Maybe it is possible with two loops and using regex?

0

4 Answers

1

You can find the indices of the strings that contain a keyword, then slice the list from the first such index to the last:

indices = [i for i, x in enumerate(example_list) if any(k in x for k in keywords)]
result = example_list[indices[0]:indices[-1] + 1]
# ['the sky is blue', 'it is raining outside', 'mic check']
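
Note that if no string contains a keyword, `indices` is empty and `indices[0]` raises an `IndexError`. Below is a minimal sketch of a guarded variant. It assumes an empty list is the desired result when nothing matches, and the function name `slice_between_keywords` is made up for illustration:

```python
def slice_between_keywords(strings, keywords):
    """Return the strings from the first to the last one containing any keyword."""
    indices = [i for i, s in enumerate(strings) if any(k in s for k in keywords)]
    if not indices:
        # Assumption: return an empty list instead of raising when nothing matches.
        return []
    return strings[indices[0]:indices[-1] + 1]


example_list = ['test sentence', 'the sky is blue', 'it is raining outside', 'mic check', 'vacation time']

print(slice_between_keywords(example_list, ['sky', 'check']))
# ['the sky is blue', 'it is raining outside', 'mic check']
print(slice_between_keywords(example_list, ['snow']))
# []
```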

0

This is a slightly longer solution, but here is another way to do it:

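start = 0
stop = 0
found = False
for i, text in enumerate(example_list):
    if not found and keywords[0] in text:
        # First string that contains the start keyword.
        found = True
        start = i
    if found and keywords[1] in text:
        # Updated on every match, so the slice ends at the last string
        # containing the end keyword (possibly the start string itself).
        stop = i + 1
out = example_list[start:stop]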


0

A generator solution that works with any iterable of strings, not just a list:

def included(seq, start_text, end_text):
    do_yield = False
    for text in seq:
        if not do_yield and start_text in text:
            do_yield = True
        if do_yield:
            yield text
            if end_text in text:
                break

You can pass the result to list() if you need a list, of course.
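
A short usage sketch, with the function repeated so the snippet runs on its own. It shows the generator on the question's list and on a plain iterator; the variable name `lines` is only for illustration:

```python
def included(seq, start_text, end_text):
    do_yield = False
    for text in seq:
        if not do_yield and start_text in text:
            do_yield = True
        if do_yield:
            yield text
            if end_text in text:
                break


example_list = ['test sentence', 'the sky is blue', 'it is raining outside', 'mic check', 'vacation time']

# Materialize the generator as a list.
print(list(included(example_list, 'sky', 'check')))
# ['the sky is blue', 'it is raining outside', 'mic check']

# It also works on an iterator, and stops consuming it once the end text is seen.
lines = iter(example_list)
between = list(included(lines, 'sky', 'check'))
print(next(lines))
# vacation time
```

Because the generator breaks right after the string containing the end text, it takes the first occurrence of the end keyword rather than the last one.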


0

For each keyword, you have to check whether it occurs in each sentence, so you need two loops.

The simplest way is to use the positions (indexes) of the sentences in the example list:

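example_list = ['test sentence', 'the sky is blue', 'it is raining outside', 'mic check', 'vacation time']
keywords = ['sky', 'check']

# Collect the position of every sentence that contains a keyword.
# enumerate() gives the actual position, even if a sentence appears twice.
indexes = []
for k in keywords:
    for i, sentence in enumerate(example_list):
        if k in sentence:
            indexes.append(i)

# Slice from the earliest to the latest match; the built-in min/max are
# enough here (they raise ValueError if nothing matched).
result = example_list[min(indexes):max(indexes) + 1]
print(result)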

This prints:

['the sky is blue', 'it is raining outside', 'mic check']

2 Comments

I think you miss the 'it is raining outside' which should also be included, if I am not mistaken
You're right @Vall0n ! I misread the question. I edited to answer it properly.
