Python, nested for loop

Question

I’m new to Python and I’m having difficulty implementing a nested ‘for’ loop. This might be simple, but the sample code below doesn’t give me the intended result. My actual task is to read the records from an attribute table (ArcGIS feature data) and compare each one with every record in a CSV file. For now I’m trying to do the same with two CSV files, and will then apply similar logic to my original problem. I’m trying to figure out how the loop works; I can add the comparison conditions later. Any help is greatly appreciated. Thanks.

The idea is that the first row in file 1 (CSV) is compared with every row in file 2 (CSV), row by row; then the second row in file 1 does the same, and so on until each row of file 1 has been compared with all the rows in file 2. In the anticipated outcome, I’m trying to check that every row in file 2 is visited for each row in file 1.

Example:

**File 1   File 2**
ALPHA      All
BETA       Bell
GAMMA      Cell
DELTA      Dell
ITA

Sample code:

import csv, sys, os, string 
table1 = os.path.join(path, 'table1.csv')
table2 = os.path.join(path, 'table2.csv')
file1 = csv.reader(open(table1, 'r'))
file2 = csv.reader(open(table2, 'r'))
for row in file1:
    print row
    for prow in file2:
        print prow

Anticipated outcome:

    ALPHA
        All
        Bell
        Cell
        Dell

    BETA
        All
        …..

    ITA
        All
        ..
        Dell

@Marcin: I was trying to see how the loop should be formatted to achieve the desired result. The solution suggested by @Jonas Wielicki worked.
– hydi, Commented Jun 26, 2012 at 16:50
@aglassman: When I tried, it would display the contents of file2 only once and then print the contents of file1.
– hydi, Commented Jun 26, 2012 at 16:52

Jonas Schäfer · Accepted Answer · 2012-06-26 15:53:31Z

3

The problem here is that file2 is a one-shot iterator. After iterating over file2 once (during the first iteration over file1), its data is completely depleted.

Instead, you have to store the contents of file2 in a list:

file2 = list(csv.reader(open(table2, 'r')))
for row in file1:
    print row
    for prow in file2:
        print prow

This prints a series of lists, each containing only one element: the first (and only) cell of the respective row. That is a consequence of parsing the file as CSV: each iteration yields a list of the cells in that row.
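A minimal sketch of the exhaustion behaviour and the list fix. It is written for Python 3 (so print is a function) and uses io.StringIO in place of the real files so that it runs on its own. The sample data and the names file1_text, file2_text, and pairs are illustrative assumptions, not part of the original question.

```python
import csv
import io

# In-memory stand-ins for table1.csv and table2.csv (assumed sample data).
file1_text = "ALPHA\nBETA\n"
file2_text = "All\nBell\n"

# A csv.reader is a one-shot iterator: the second pass yields nothing.
reader = csv.reader(io.StringIO(file2_text))
first_pass = list(reader)
second_pass = list(reader)
print(first_pass)   # [['All'], ['Bell']]
print(second_pass)  # []

# Materialising file 2 as a list lets the inner loop restart every time.
file2_rows = list(csv.reader(io.StringIO(file2_text)))
pairs = []
for row in csv.reader(io.StringIO(file1_text)):
    for prow in file2_rows:
        pairs.append((row[0], prow[0]))
print(pairs)
```

Each row of file 1 is paired with every row of file 2, which is the behaviour the question was after.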

edited Jun 26, 2012 at 15:53

answered Jun 26, 2012 at 15:22

Jonas Schäfer


2 Comments

Marcin Over a year ago

See my answer for an example. Also, I strongly suggest giving your examples as valid python.

hydi Over a year ago

@Jonas Wielicki: Thank you. This gives me the intended result. I wasn't aware of one-shot iterators. I thought it would go over the data again each time the loop runs.

Ashwini Chaudhary · Accepted Answer · 2013-09-22 07:29:52Z

3

It's because a csv.reader object is consumed as you iterate over it: once the inner loop has run through it completely, it is empty.

That's why the file2 iterator behaves this way.

To get around this, save the values from file2 in a list first.

file1 = csv.reader(open(table1, 'r'))
file2 = list(csv.reader(open(table2, 'r')))  # edited this
for row in file1:
    print row
    for prow in file2:
        print prow

edited Sep 22, 2013 at 7:29

answered Jun 26, 2012 at 15:23

Ashwini Chaudhary


2 Comments

Marcin Over a year ago

-1 A string would be much better, as it would eliminate the nested loop.

Piotr Kalinowski Over a year ago

I think that list(csv.reader(open(table2, 'r'))) is more readable than a list comprehension that essentially only copies the iterator's contents.

Piotr Kalinowski · Accepted Answer · 2012-06-26 15:33:45Z

The problem is that after you iterate over all the rows of file2, its stream is consumed. There's simply nothing more to read. The next for loop will not reset the csv.reader object; it recognises that everything has already been read and parsed, so there's nothing more to do.

Solutions might include:

file2_stream = open(table2, 'r')
for row in file1:
  print row
  file2_stream.seek(0)  # Reset file stream position
  file2 = csv.reader(file2_stream)  # Init CSV parsing
  for prow in file2:
    print prow

Or you could even reopen the file each time:

for row in file1:
  print row
  file2 = csv.reader(open(table2, 'r'))
  for prow in file2:
    print prow

This will, of course, parse the second file on every outer iteration. If the file is small relative to available memory, you might want to parse it once and store the result as a list in memory:

file2_rows = list(file2)
for row in file1:
  print row
  for prow in file2_rows:
    print prow
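The rewind approach can be tried without any files on disk. The sketch below assumes Python 3 and uses io.StringIO as a stand-in for the open file object; the sample data and the name seen are hypothetical.

```python
import csv
import io

# In-memory stand-in for the open file object of table2.csv (assumed data).
file2_stream = io.StringIO("All\nBell\n")

seen = []
for row in [["ALPHA"], ["BETA"]]:           # stand-in for the rows of file 1
    file2_stream.seek(0)                    # rewind the underlying stream
    for prow in csv.reader(file2_stream):   # fresh reader over the rewound stream
        seen.append((row[0], prow[0]))

print(seen)
```

Rewinding keeps memory use flat at the cost of re-parsing file 2 on every outer iteration, which is the trade-off described above.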

Marcin · Accepted Answer · 2012-06-26 16:03:10Z

1

Don't do this.

Read the first file into an appropriate datastructure (e.g. a set), then when reading the second file, test against the collected rows in the datastructure.
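A sketch of that idea, assuming Python 3 and assuming the comparison is plain equality on the first column (the question does not say what the real condition is). The sample data and the names known and matches are made up for illustration.

```python
import csv
import io

# Assumed sample data; in practice these would be the two open CSV files.
file1_text = "ALPHA\nBETA\nGAMMA\n"
file2_text = "BETA\nDELTA\nALPHA\n"

# Collect the first column of file 1 into a set for fast membership tests.
known = {row[0] for row in csv.reader(io.StringIO(file1_text)) if row}

# Stream file 2 once and test each row against the set.
matches = [row[0] for row in csv.reader(io.StringIO(file2_text))
           if row and row[0] in known]
print(matches)  # ['BETA', 'ALPHA']
```

Each file is read exactly once, so no nested loop is needed. Note that the set itself does not remember the order of file 1; the order of matches here comes from file 2.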

For this exercise, it might be best to create a single string, as you seem to print the whole of the data read from the first file each time.

file2 = '\n'.join(l[0] for l in csv.reader(open(table2, 'r')))
for row in file1:
    print row
    print file2

If you need indenting, you can use textwrap to add indents to file2.
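For instance, textwrap.indent (available from Python 3.3) prefixes every non-blank line of a string. The sketch below assumes Python 3 and uses a hard-coded string and list in place of the data read from the two files.

```python
import textwrap

file2 = "All\nBell\nCell\nDell"             # stand-in for the joined string from file 2
indented = textwrap.indent(file2, "    ")   # prefix each line with four spaces

for name in ["ALPHA", "BETA"]:              # stand-in for the rows of file 1
    print(name)
    print(indented)
```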

edited Jun 26, 2012 at 16:03

answered Jun 26, 2012 at 15:23

Marcin


7 Comments

Jonas Schäfer Over a year ago

A set is not appropriate, as it does not preserve ordering (which might be necessary). I.e. it removes information from the initial set of data.

Marcin Over a year ago

@JonasWielicki For real work involving comparisons (which is what OP refers to), a set will almost certainly be what is wanted. For the exercise described, a string would be much better (it's the same thing every time).

Jonas Schäfer Over a year ago

Okay, agreed. Still, it's not the solution to the problem of a depleted iterator. I think you should at least mention the loss of order in your answer, as it's really an unexpected result in that specific loop if elements are mixed up.

Marcin Over a year ago

@JonasWielicki It's exactly the solution to a depleted iterator. That's why iterators are used - to build up something else, if the data need to be accessed more than once.

Jonas Schäfer Over a year ago

No, at least not for the general case, because you're dropping information which was previously available (ordering). I don't dispute that for doing comparisons later (which is what the OP seems to be aiming for) a set might be suitable. But this is not the general case.


Jonas Schäfer · Accepted Answer · 2012-06-26 19:23:30Z

1

The CSV module is going to return iterators for both of those files that will be "spent" after they are looped through. This is the typical Python behavior for files.

In order to use the values from one file in a loop for the other, you can load them into memory. Going off the best reading of your intention, I assume you want to associate the data in one file with every line in the other. I'll give an expository example:

greek = csv.reader(open('file1.csv'))
dells = csv.reader(open('file2.csv'))

second_file_data = list(dells)

# From here, dells is "spent". If we want to reuse it, we have to reopen it.

for line in greek:
    print line
    for other in second_file_data:
        print other
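A self-contained variant of the same pattern, assuming Python 3. It swaps in with blocks so the file handles are closed promptly, and it writes two small sample files into a temporary directory so the sketch can run anywhere. The file contents and the name output are assumptions made for the example.

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create two small sample CSV files (assumed contents).
    path1 = os.path.join(tmp, "file1.csv")
    path2 = os.path.join(tmp, "file2.csv")
    with open(path1, "w", newline="") as f:
        f.write("ALPHA\nBETA\n")
    with open(path2, "w", newline="") as f:
        f.write("All\nBell\n")

    # Load the second file fully while it is open; the list outlives the handle.
    with open(path2, newline="") as f2:
        second_file_data = list(csv.reader(f2))

    # Stream the first file and replay the stored rows for each of its lines.
    output = []
    with open(path1, newline="") as f1:
        for line in csv.reader(f1):
            output.append(line)
            for other in second_file_data:
                output.append(other)

print(output)
```

Only the second file is held in memory; the first is still read row by row.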

edited Jun 26, 2012 at 19:23

Jonas Schäfer


answered Jun 26, 2012 at 15:22

Mark Grey
