I am trying to loop through a text file and apply some logic, but the loop fails partway through. The text file is structured like this:

--- section1 ---
"a","b","c"
"d","e","f"
--- section2 ---
"1","2","3"
"4","5","6"
--- section3 ---
"12","12","12"
"11","11","11"

I am trying to skip the header line that contains '---' and convert the lines below it to JSON until the next '---' line appears in the text document.

However, I get this error: fields1 = next(file).split(',') StopIteration

with open(fileName,'r') as file:
    for line in file:
        if line.startswith('-') and 'section1' in line:
            while '---' not in next(file):
                fields1 = next(file).split(',')
                for x in range(0,len(fields1)):
                    testarr.append({
                    config.get('test','test'): fields1[x]           
                    })

                with open(test_dir,'w') as test_file:
                    json.dump(testarr, test_file)

Any idea why my code is not working, or how I can fix the error?

5 Comments
  • Try writing as two loops in series instead of nesting them. Loop1: skip all lines until --- section1.... Loop2: Dump all lines until another --- is met. Commented Feb 1, 2017 at 4:33
  • Given past experiences with similar titles, I expected this to be a crap question that needed to be closed. Instead, found a well phrased, clear question with basically all the relevant information present. Many kudos OP. Commented Feb 1, 2017 at 4:34
  • Ignore my original (now deleted) comment. It's been a long time since I've seen an else to a while loop. Commented Feb 1, 2017 at 4:38
  • break shouldn't be in an else. Inner with most certainly should not be in the while loop. Commented Feb 1, 2017 at 4:40
  • Please don't edit the code in the question directly like that. Commented Feb 1, 2017 at 4:41

3 Answers

2

The cause of your error is that you are misusing the file iterator by calling next on it twice as often as you think. Each call to next consumes a line and returns it. Therefore, while '---' not in next(file): fields1 = next(file).split(',') reads one line and checks it for ---, then reads another line and parses it without checking it. A line containing --- can therefore slip past the check by landing on the second next, and then the loop runs to the end of the file without ever seeing the line it is looking for. StopIteration is how iterators normally signal that their input has been exhausted.
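To see the double consumption in isolation, here is a minimal sketch that replays the question's while loop against an in-memory file. It assumes sections with an odd number of data lines, which is what lets a header land on the second next; the sample shown in the question has two data lines per section and would happen to stop cleanly at the section2 header. io.StringIO stands in for the real file, and the names SAMPLE, checked, parsed, and hit_end are made up for this illustration.

```python
import io

# Hypothetical input: three data lines in section1, one in section2.
SAMPLE = (
    '--- section1 ---\n'
    '"a","b","c"\n'
    '"d","e","f"\n'
    '"g","h","i"\n'
    '--- section2 ---\n'
    '"1","2","3"\n'
)

checked = []    # lines consumed by the `while` condition
parsed = []     # lines consumed by the loop body
hit_end = False

lines = io.StringIO(SAMPLE)
next(lines)     # the section1 header, which the outer for loop consumed

try:
    while True:
        line = next(lines)      # first next(): only tested for '---'
        checked.append(line.strip())
        if '---' in line:
            break
        line = next(lines)      # second next(): parsed, never tested
        parsed.append(line.strip())
except StopIteration:
    hit_end = True

print(checked)  # ['"a","b","c"', '"g","h","i"', '"1","2","3"']
print(parsed)   # ['"d","e","f"', '--- section2 ---']
print(hit_end)  # True
```

The section2 header ends up in parsed rather than checked, so the stop condition never fires and the final next runs off the end of the input.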

There are a couple of other issues you may want to address in your code:

  1. Calling next on an iterator like a file while you are already inside a for loop over it is error-prone, because both pull lines from the same iterator and every line taken by next is one the for loop never sees. You are getting away with it this time, but it is not good practice in general. The main reason you are getting away with it, by the way, is probably that you never actually return control to the for loop once the while is triggered, not that files are particularly permissive in this regard.
  2. The inner with that dumps your data to a file is inside your while loop. That means the file you open in 'w' mode gets truncated and rewritten on every iteration of the while (i.e., for each line processed). Because the array only grows, the final output will look fine, but you probably want to move the dump out of the inner loop.
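As a small illustration of the first point, the sketch below shows that a for loop and an explicit next draw from the same iterator. The four-line input and the names seen_by_for and seen_by_next are invented for the example, and io.StringIO again stands in for a real file.

```python
import io

lines = io.StringIO("one\ntwo\nthree\nfour\n")

seen_by_for = []
seen_by_next = []

for line in lines:
    seen_by_for.append(line.strip())
    # This pulls the *next* line from the same iterator,
    # so the for loop will never see it.
    stolen = next(lines, None)
    if stolen is not None:
        seen_by_next.append(stolen.strip())

print(seen_by_for)   # ['one', 'three']
print(seen_by_next)  # ['two', 'four']
```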

The simplest solution would be to rewrite the code in two loops: one to find the start of the part you care about, and the other to process it until the end is found.

Something like this:

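import json

testarr = []
with open(fileName, 'r') as file:
    # Loop 1: skip ahead to the header of the section you want.
    for line in file:
        if line.startswith('---') and 'section1' in line:
            break

    # Loop 2: continue on the same iterator; stop at the next header.
    for line in file:
        if '---' in line:
            break
        fields1 = line.split(',')
        for item in fields1:
            testarr.append({config.get('test', 'test'): item})

# Write once, after all the data has been collected.
with open(test_dir, 'w') as test_file:
    json.dump(testarr, test_file)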

EDIT:

Taking @tripleee's advice, I have removed the regex check for the start line. While regex gives great precision and flexibility for finding a specific pattern, it is really overkill for this example. I would like to point out that if you are looking for a section other than section1, or if section1 appears after some other lines with dashes, you will absolutely need this two-loop approach. The one-loop solutions in the other answers will not work in a non-trivial case.
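To make the point about other sections concrete, the two loops can be wrapped in a function that takes the section name as a parameter. This is only a sketch: read_section and SAMPLE are hypothetical names, the input is an in-memory copy of the file from the question, and the substring test means a name like section1 would also match a header such as section10.

```python
import io

SAMPLE = (
    '--- section1 ---\n"a","b","c"\n"d","e","f"\n'
    '--- section2 ---\n"1","2","3"\n"4","5","6"\n'
    '--- section3 ---\n"12","12","12"\n"11","11","11"\n'
)

def read_section(lines, name):
    """Return the data lines of the section whose header mentions `name`."""
    it = iter(lines)
    # Loop 1: skip everything up to and including the wanted header.
    for line in it:
        if line.startswith('---') and name in line:
            break
    # Loop 2: collect lines until the next header or the end of input.
    rows = []
    for line in it:
        if line.startswith('---'):
            break
        rows.append(line.rstrip('\n'))
    return rows

rows = read_section(io.StringIO(SAMPLE), 'section2')
print(rows)  # ['"1","2","3"', '"4","5","6"']
```

If the header is never found, the first loop exhausts the input and the function returns an empty list.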

2 Comments

While going to regex offers better precision for matching exactly the pattern(s) you want, in this case it looks like a complication and overkill to boot.
@tripleee. I don't disagree with you. I did notice that the two-loop solution is necessary though if you look for a section other than section1, or if section1 is not the first section.
2

Looks like you are overcomplicating matters massively. The next inside the inner while loop I imagine is tripping up the outer for loop, but that's just unnecessary anyway. You are already looping over lines; pick the ones you want, then quit when you're done.

with open(fileName,'r') as inputfile:
    for line in inputfile:
        if line.startswith('-') and 'section1' in line:
            continue
        elif line.startswith('-'):
            break
        else:
            testarr.extend({config.get('test', 'test'): x}
                for x in line.split(','))

with open(test_dir,'w') as test_file:
    json.dump(testarr, test_file)

I hope I got the mapping right, as I also wanted to show you how to map the split fields more elegantly, but I'm not sure I completely understand what your original code did. (Actually, I'm guessing you'll want to trim the \n off the end of the line before splitting it, and I imagine you also want to trim the quotes from around each value: x.strip('"') for x in line.rstrip('\n').split(','))
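A quick sketch of that cleanup on one line from the sample file, next to the standard library's csv.reader, which handles the quoting and the line ending by itself. The variable names are made up for the example; whether csv is a better fit than a manual split depends on how regular the real data is.

```python
import csv

line = '"a","b","c"\n'

# A plain split keeps the quotes and the trailing newline.
raw_fields = line.split(',')

# Manual cleanup: drop the newline first, then the quotes around each value.
fields = [x.strip('"') for x in line.rstrip('\n').split(',')]

# csv.reader accepts any iterable of lines and yields one list per row.
csv_fields = next(csv.reader([line]))

print(raw_fields)  # ['"a"', '"b"', '"c"\n']
print(fields)      # ['a', 'b', 'c']
print(csv_fields)  # ['a', 'b', 'c']
```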

I also renamed file to inputfile to avoid shadowing file, which is the name of a built-in type in Python 2 (it is not a reserved keyword, and Python 3 no longer defines it).

If you want to write more files, add more states to the loop and move the write snippet back inside the loop. I don't particularly want to explain how this is equivalent to a state machine, but it should not be hard to understand: with two states, you are either skipping or collecting; to extend this, add one more state for the boundary when flipping back, where you write out the collected data and reset the collected lines to an empty list.
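One possible shape for that state machine is sketched below. The function name dump_sections, the wanted parameter, and SAMPLE are all assumptions for the illustration, and to stay self-contained it returns one JSON string per section instead of writing files; swapping json.dumps for a json.dump into an opened file would give the multi-file version.

```python
import io
import json

SAMPLE = (
    '--- section1 ---\n"a","b","c"\n"d","e","f"\n'
    '--- section2 ---\n"1","2","3"\n"4","5","6"\n'
    '--- section3 ---\n"12","12","12"\n"11","11","11"\n'
)

def dump_sections(lines, wanted):
    """Return {section_name: json_text} for every section named in `wanted`."""
    out = {}
    current = None   # None means "skipping"; otherwise the section being collected
    rows = []
    for line in lines:
        line = line.rstrip('\n')
        if line.startswith('---'):
            # Boundary: write out what was collected, then pick the next state.
            if current is not None:
                out[current] = json.dumps(rows)
            name = line.strip('- ')
            current = name if name in wanted else None
            rows = []
        elif current is not None:
            rows.append([x.strip('"') for x in line.split(',')])
    if current is not None:
        # The last section has no closing header, so flush it here.
        out[current] = json.dumps(rows)
    return out

result = dump_sections(io.StringIO(SAMPLE), {'section1', 'section3'})
print(result['section1'])  # [["a", "b", "c"], ["d", "e", "f"]]
```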

10 Comments

The outer for loop is not what causes the problem. This is clearly stated in the question.
Also, your code will add all lines that are not in section1 as well.
Huh? The elif skips to the end when you reach section 2.
The next in the inner loop in the OP's attempt manipulates the file iterator of the outer loop.
I agree that the next is a bad idea. It's the fact that there's two of them that messes things up though, not the fact that it's in a for loop.
0

next() raises a StopIteration exception when the iterator is exhausted. In other words, your code gets to the end of the file, and you call next() again, and there's nothing more for it to return, so it raises that exception.
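A minimal sketch of that behaviour, using a one-item list in place of a file. It also shows the two-argument form next(iterator, default), which returns the default instead of raising; the variable names are invented for the example.

```python
it = iter(['only line\n'])

first = next(it)          # 'only line\n'
second = next(it, None)   # iterator is exhausted, so the default comes back

try:
    next(it)              # no default: raises StopIteration
    raised = False
except StopIteration:
    raised = True

print(repr(first), second, raised)  # 'only line\n' None True
```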

As for how to solve your problem, I think this might be what you want:

with open(fileName, 'r') as file:
    for line in file:
        if line.startswith('---'):
            if 'section1' in line:
                continue
            else:
                break
        fields1 = line.split(',')
        for x in range(len(fields1)):
            testarr.append({
                config.get('test', 'test'): fields1[x]
            })

with open(test_dir, 'w') as test_file:
    json.dump(testarr, test_file)

4 Comments

Given the input file, you do not really explain how it is possible to reach the end of the file when the stop condition is a line containing ---.
Also, that write needs to be outside the loop.
@MadPhysicist So it does. Thanks.
Also, same problem as with @tripleee's answer. It won't work for any section but the first. I'll bet OP will want to reuse this code for other sections too.
