I have a list of strings, as shown below.

arrayy = ['top,tree,branch,bla-top,tree,ascb-red/blue', 'tree,leaves,mmn-tree,leaves,mscb-gra/gre', 'leaves,bird,responder,mon-leaves,bird,ascb-yoo/yee','tree,leaves,mount-road,cycle-roo/soo']
  1. Is there a simple way to find the index of each string in a list that contains a given substring?
  2. For example, I want to search for "leaves,bird*-leaves,bird*" and get back the index of the matching string.

I tried the code below:

def find_index_sub_string(needle,haystack):
    return [i for i, x in enumerate(haystack) if needle in x] 
This finds "leaves,bird", but it cannot handle a wildcard search such as "leaves,bird*-leaves,bird*".

Is there a better way to search and get the required string?

UPDATE:

Got it working with the code below.

import re

search_re = re.compile("leaves,bird.*-leaves,bird.*")

for i in range(len(arrayy)):
    if search_re.match(arrayy[i]):  # match() only matches at the start of the string
        print(i)
  • For a leaves,bird*-leaves,bird* search you would need to define search_re = re.compile("leaves,bird.*-leaves,bird.*") outside your loop and then call search_re.search() inside the loop. Commented Dec 22, 2019 at 10:16
  • Hello. If I use search_re.search, will that return all the indices, or do I still need to loop through the list to get each index? Commented Dec 22, 2019 at 10:43
  • You will still need to iterate, in a similar way to @Patrick Artner's solution. Commented Dec 22, 2019 at 10:50
  • "leaves,bird*-leaves,bird*" is not present in any of the strings in arrayy, right? I didn't properly understand the question. Can someone explain it to me, please? Commented Dec 22, 2019 at 10:56
  • @Ch3steR I am trying to search for that string using a * wildcard expression, i.e. without giving the full literal string. Commented Dec 22, 2019 at 10:58

1 Answer

Most of the time, when you want to find (complex) patterns in text, regular expressions can do it:

import re

data = ['top,tree,branch,bla-top,tree,ascb-red/blue', 
        'tree,leaves,mmn-tree,leaves,mscb-gra/gre', 
        'leaves,bird,responder,mon-leaves,bird,ascb-yoo/yee',
        'tree,leaves,mount-road,cycle-roo/soo']

patt1 = r"leaves,bird.*-leaves,bird" 
patt2 = r"tree" 

for patt in (patt1,patt2):
    print (f"'{patt}' in text:") # py 3, for 2 use: print '{} in text:'.format(patt)
    for idx,text in enumerate(data): 
        if re.search(patt,text):   # modified from re.match, which only looks at the start of the text
            print(idx, text)    # py 3, for 2 use: print idx,text

Output:

'leaves,bird.*-leaves,bird' in text:
2 leaves,bird,responder,mon-leaves,bird,ascb-yoo/yee
'tree' in text:
0 top,tree,branch,bla-top,tree,ascb-red/blue
1 tree,leaves,mmn-tree,leaves,mscb-gra/gre
3 tree,leaves,mount-road,cycle-roo/soo
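
If you need the indices as a list rather than printed lines, the same loop can be wrapped in a small helper. This is only a minimal sketch: find_indices is a hypothetical name (it is not part of the re module), and it compiles the pattern once before iterating. Note that because re.search matches anywhere in the string, the 'tree' pattern also hits index 0 ('top,tree,...').

```python
import re

def find_indices(pattern, strings):
    """Return the indices of all strings in which `pattern` matches anywhere."""
    compiled = re.compile(pattern)  # compile once, outside the loop
    return [idx for idx, text in enumerate(strings) if compiled.search(text)]

data = ['top,tree,branch,bla-top,tree,ascb-red/blue',
        'tree,leaves,mmn-tree,leaves,mscb-gra/gre',
        'leaves,bird,responder,mon-leaves,bird,ascb-yoo/yee',
        'tree,leaves,mount-road,cycle-roo/soo']

bird_hits = find_indices(r"leaves,bird.*-leaves,bird", data)
tree_hits = find_indices(r"tree", data)
print(bird_hits)  # [2]
print(tree_hits)  # [0, 1, 3]
```

You still iterate over the list, but the caller gets all matching indices back in one call.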

You can develop your matching patterns online, for example on http://www.regex101.com, and get it to explain them to you as well.

If you want to start with regex, this is a fun way to do so: https://regexcrossword.com/ (just a fan, not affiliated ;o) ). The official documentation is at https://docs.python.org/3/library/re.html

My second pattern does not need regex - a simple if 'tree' in text: would have had the same effect.
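
If you would rather keep the shell-style * wildcard from the question instead of writing .* yourself, the standard library's fnmatch module may be an option. This is a sketch of an alternative, not what the code above does: find_indices_glob is a hypothetical helper name, and fnmatchcase matches the pattern against the whole string, so add a leading * if the text may start with something else. Be aware that ? and [ ] are also special in these patterns.

```python
from fnmatch import fnmatchcase

def find_indices_glob(glob_pattern, strings):
    """Return indices of strings that fully match a shell-style wildcard pattern."""
    return [idx for idx, text in enumerate(strings)
            if fnmatchcase(text, glob_pattern)]

data = ['top,tree,branch,bla-top,tree,ascb-red/blue',
        'tree,leaves,mmn-tree,leaves,mscb-gra/gre',
        'leaves,bird,responder,mon-leaves,bird,ascb-yoo/yee',
        'tree,leaves,mount-road,cycle-roo/soo']

glob_hits = find_indices_glob("leaves,bird*-leaves,bird*", data)
anywhere_hits = find_indices_glob("*tree*", data)  # leading * lets the match start anywhere
print(glob_hits)      # [2]
print(anywhere_hits)  # [0, 1, 3]
```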

2 Comments

Isn't it better to use re.search() instead of re.match()? It will probably be slower, but it will match cases where the entry does not start with "leaves", like: "mount-cycle,tree,leaves,mount-road,cycle-roo/soo"
@urban good point, if it's inside the string, re.match won't get it. Incorporating your advice.
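
The difference discussed in these comments can be checked quickly with the example string from the first comment. This is just an illustrative sketch; the variable names are made up for the demonstration.

```python
import re

text = "mount-cycle,tree,leaves,mount-road,cycle-roo/soo"

# re.match only tries to match at the very start of the string
starts_with_leaves = bool(re.match(r"leaves", text))
# re.search scans the whole string for the first match
contains_leaves = bool(re.search(r"leaves", text))
# re.match can still find it if the pattern itself allows a leading prefix
match_with_prefix = bool(re.match(r".*leaves", text))

print(starts_with_leaves)  # False
print(contains_leaves)     # True
print(match_with_prefix)   # True
```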
