def find_string(header,file_1,counter):
  ab = re.compile(str(header))
  for line in file_1:
    if re.search(ab,line) !=None:
       print line
  counter+=1
  return counter

file_1 = open("text_file_with_headers.txt",'r')
header_array = []
header_array.append("header1")
header_array.append("header2")
# ...

counter = 0
for header in header_array:
  counter = find_string(header,file_1,counter)

Every time I run this it searches for only one of the headers and I cannot figure out why.

2 Answers

When the loop for line in file_1: finishes for the first header, the file's pointer is at the end of the file, so the loops for the remaining headers have nothing left to read. You must move the pointer back to the beginning of the file with the seek() method. Add file_1.seek(0,0) after each call, like this:

counter = 0
for header in header_array:
    counter = find_string(header,file_1,counter)
    file_1.seek(0,0)  # rewind so the next header is searched from the start
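To see the effect in isolation, here is a minimal sketch that uses an in-memory io.StringIO object as a stand-in for the opened file. The sample text is made up for illustration; a real file object opened with open() should behave the same way when iterated.

```python
import io

# In-memory stand-in for open("text_file_with_headers.txt", 'r').
# The contents are hypothetical sample data.
file_1 = io.StringIO(u"header1 alpha\nheader2 beta\n")

first_pass = [line for line in file_1]    # consumes every line
second_pass = [line for line in file_1]   # pointer is at the end: nothing left

file_1.seek(0, 0)                         # rewind to the beginning
third_pass = [line for line in file_1]    # the lines are available again

print(len(first_pass), len(second_pass), len(third_pass))
```

The second pass is empty because nothing moved the pointer back, which is what happens to every header after the first one in the question's loop.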

EDIT

1) ab is a compiled regex, so you can write ab.search(line) instead of re.search(ab,line)

2) bool(None) is False, so you can write if ab.search(line): and there is no need for != None

3)

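def find_string(header,file_1,counter):
    # match every whole line that contains the header
    lwh = re.compile('^.*?'+header+'.*$',re.MULTILINE)
    lines_with_header = lwh.findall(file_1.read())
    # the matches carry no trailing newline, so join them with '\n'
    print('\n'.join(lines_with_header))
    return counter + 1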

or even

def find_string(header,file_1,counter):
    lwh = re.compile('^.*?'+header+'.*$',re.MULTILINE)
    print('\n'.join(matline.group() for matline in lwh.finditer(file_1.read())))
    return counter + 1

4)

def find_string(header,file_1):
    lwh = re.compile('^.*?'+header+'.*$',re.MULTILINE)
    lines_with_header = lwh.findall(file_1.read())
    print('\n'.join(lines_with_header))

file_1 = open("text_file_with_headers.txt",'r')
header_list = ["header1","header2"]  # ...

# enumerate() supplies the counter, so find_string no longer needs it
for counter,header in enumerate(header_list):
    find_string(header,file_1)
    file_1.seek(0,0)

counter += 1 # because counter began at 0

5) You read through file_1 once for every header in header_list.

You should read through it only once, recording each line that contains one of the headers in a list, with each list stored as a value in a dictionary whose keys are the headers. It would be faster.
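A minimal sketch of that single-pass idea follows. The function name collect_lines_by_header and the sample text are hypothetical, and io.StringIO stands in for the real file. As in the question, each header is treated as a regular expression; wrap it in re.escape() if it should be matched literally.

```python
import io
import re
from collections import defaultdict

def collect_lines_by_header(lines, headers):
    """Read `lines` once and map each header to the lines that match it."""
    # Compile each pattern a single time, outside the line loop.
    patterns = [(h, re.compile(h)) for h in headers]
    found = defaultdict(list)
    for line in lines:
        for header, pattern in patterns:
            if pattern.search(line):
                found[header].append(line.rstrip("\n"))
    return found

# Hypothetical sample data standing in for text_file_with_headers.txt.
sample = io.StringIO(u"header1 alpha\nno match here\nheader2 beta\nheader1 gamma\n")
result = collect_lines_by_header(sample, ["header1", "header2"])
for header in ["header1", "header2"]:
    print(header, result[header])
```

No seek() call is needed here, because the file is traversed only once.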

6) What you call an array is a list in Python


The file object keeps track of your position in the file, and after you've gone through the outer loop once, you're at the end of the file and there are no more lines to read.

If I were you, I would reverse the order in which your loops are nested: I would iterate through the file line by line, and for each line, iterate through the list of strings you want to find. That way, I would only have to read each line from the file once.
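A rough sketch of the reversed nesting, assuming a plain substring test is enough (the question uses re.search, so swap that in if the headers are real patterns). The function name print_matching_lines and the sample text are made up for illustration.

```python
import io

def print_matching_lines(file_obj, headers):
    """Outer loop over lines, inner loop over headers; returns the printed lines."""
    matched = []
    for line in file_obj:              # each line is read exactly once
        for header in headers:
            if header in line:         # assumption: substring match, not a regex
                text = line.rstrip("\n")
                print(text)
                matched.append(text)
                break                  # print a line once even if several headers match
    return matched

# Hypothetical sample data standing in for text_file_with_headers.txt.
sample = io.StringIO(u"header1 alpha\nplain line\nheader2 beta\n")
matched = print_matching_lines(sample, ["header1", "header2"])
```

With this ordering the matches come out in file order rather than grouped by header, which may or may not be what you want.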
