Skip the last row of CSV file when iterating in Python

Question

I am working on a data analysis using a CSV file that I got from a data warehouse (Cognos). The last row of the CSV file sums up all the rows above it, but I do not need this line for my analysis, so I would like to skip it.

I was thinking about adding an "if" statement that checks a column name within my "for" loop, like below.

import csv

with open('COGNOS.csv', "rb") as f, open('New_COGNOS.csv', "wb") as w:
    #Open 2 CSV files. One to read and the other to save.
    CSV_raw = csv.reader(f)
    CSV_new = csv.writer(w)
    for row in CSV_raw:
        item_num = row[3].split(" ")[0]
        row.append(item_num)
        if row[0] == "All Materials (By Collection)": break
        CSV_new.writerow(row)

However, this looks like it wastes a lot of resources, since the check runs for every row. Is there a Pythonic way to skip the last row when iterating through a CSV file?

If you're on *nix, you can use head -n -1 yourfile.csv to print the file without the last line.
– dm03514, Commented May 30, 2013 at 21:54
Do you mean a Unix-like OS? Unfortunately, I am using my corporate PC. Thank you though, it will come in handy when I get my hands dirty at home.
– Yong Jun Kim, Commented May 30, 2013 at 22:30

Martijn Pieters · Accepted Answer · 2013-05-30 22:21:03Z

18

You can write a generator that'll return everything but the last entry in an input iterator:

def skip_last(iterator):
    prev = next(iterator)
    for item in iterator:
        yield prev
        prev = item

then wrap your CSV_raw reader object in that:

for row in skip_last(CSV_raw):

The generator takes the first entry, then starts looping, and on each iteration yields the previous entry. When the input iterator is exhausted, one entry is still held in prev, and it is never yielded.
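To see the one-item delay in action, here is a minimal sketch, assuming Python 3 and an in-memory stand-in for the CSV file. The sample rows, the `io.StringIO` source, and the `_MISSING` marker are illustrative and not from the question. This variant also calls `iter()` and gives `next()` a default, so plain lists and empty inputs work too.

```python
import csv
import io

_MISSING = object()  # hypothetical marker meaning "the input was empty"

def skip_last(iterable):
    it = iter(iterable)          # accept any iterable, not only iterators
    prev = next(it, _MISSING)    # hold back the first item
    if prev is _MISSING:
        return                   # empty input: nothing to yield
    for item in it:
        yield prev               # emit the item held back one step ago
        prev = item              # the final item stays here and is dropped

# Illustrative data: three real rows followed by a summary row.
sample = io.StringIO("id,qty\nA,1\nB,2\nTotal,3\n")
rows = list(skip_last(csv.reader(sample)))
print(rows)  # [['id', 'qty'], ['A', '1'], ['B', '2']]
```

Only one row is buffered at a time, so memory use does not grow with the size of the file.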

A generic version, letting you skip the last n elements, would be:

from collections import deque
from itertools import islice

def skip_last_n(iterator, n=1):
    it = iter(iterator)
    prev = deque(islice(it, n), n)
    for item in it:
        yield prev.popleft()
        prev.append(item)
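A quick check of how the deque behaves as an n-item delay buffer, assuming Python 3. The function is repeated so the snippet runs on its own. The guard for n < 1 is an addition in this sketch, not part of the answer: without it, n=0 would try to pop from an empty deque.

```python
from collections import deque
from itertools import islice

def skip_last_n(iterable, n=1):
    it = iter(iterable)
    if n < 1:
        # Assumption of this sketch: n < 1 means "skip nothing".
        yield from it
        return
    prev = deque(islice(it, n), n)  # pre-fill the buffer with the first n items
    for item in it:
        yield prev.popleft()        # oldest buffered item is now safe to emit
        prev.append(item)           # newest item waits in the buffer

result_one = list(skip_last_n(range(1, 9)))       # drop the last item
result_three = list(skip_last_n(range(1, 9), 3))  # drop the last three
result_short = list(skip_last_n([1, 2], 5))       # fewer than n items
print(result_one)    # [1, 2, 3, 4, 5, 6, 7]
print(result_three)  # [1, 2, 3, 4, 5]
print(result_short)  # []
```

The buffer never holds more than n items, whatever the length of the input.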

edited May 30, 2013 at 22:21

answered May 30, 2013 at 21:56

Martijn Pieters


4 Comments

alecxe Over a year ago

Martijn, it seems like there is a team of Python devs behind your account :) Producing such fast and exact answers looks just amazing!

Yong Jun Kim Over a year ago

Thank you Martijn. That was amazingly fast. The code works like a charm too, except that the ":" at the end of "prev = next(iterator):" has to be deleted.

Yong Jun Kim Over a year ago

There we go! Thank you very much.

kindall Over a year ago

This is exactly how I'd do it too. In general, when you want to "look ahead," it is usually easier to change the problem to "look behind."

iruvar · Answer · 2013-05-30 22:09:38Z

1

A generalized "skip-n" generator

from __future__ import print_function
from StringIO import StringIO
from itertools import tee
s = '''\
1
2
3
4
5
6
7
8
'''
def skip_last_n(iterator, n=1):
    a, b = tee(iterator)
    for x in xrange(n):
        next(a)
    for line in a:
        yield next(b)

i = StringIO(s)
for x in skip_last_n(i, 1):
    print(x, end='')
1
2
3
4
5
6
7

i = StringIO(s)
for x in skip_last_n(i, 3):
    print(x, end='')
1
2
3
4
5

answered May 30, 2013 at 22:09

iruvar


3 Comments

Martijn Pieters Over a year ago

Using tee as an n-sized buffer is a nice idea too. Use itertools.islice() to skip n items fast instead of a for x in xrange(n) loop: next(islice(a, n, n), None) consumes n items in C code, which will beat the for loop any time.

iruvar Over a year ago

@MartijnPieters, good point. I am leaning towards leaving the for loop in place for readability reasons. Your comment should be able to point everyone to the more efficient islice option!

Martijn Pieters Over a year ago

It is part of the consume recipe in the itertools documentation if you are interested.
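For reference, the islice suggestion from these comments could be combined with the tee approach roughly as follows. This is a sketch assuming Python 3, not code posted by either commenter, and the sample lines are illustrative.

```python
from itertools import islice, tee

def skip_last_n(iterable, n=1):
    a, b = tee(iterable)
    # Advance `a` by n items without a Python-level loop; this is the
    # core of the "consume" recipe from the itertools documentation.
    next(islice(a, n, n), None)
    # `a` now runs n items ahead of `b`, so `b` stops n items early.
    for _ in a:
        yield next(b)

lines = ["1\n", "2\n", "3\n", "4\n", "5\n", "6\n", "7\n", "8\n"]
kept = "".join(skip_last_n(lines, 3))
print(kept, end="")  # prints the lines 1 through 5
```

tee buffers the items that one branch has seen and the other has not, so at most n items are held at any time here.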
