Python: Simulating CSV.DictReader with OpenPyXL

Question

I have an Excel (.xlsx) file that I'm trying to parse, row by row. I have a header (first row) that has a bunch of column titles like School, First Name, Last Name, Email, etc.

When I loop through each row, I want to be able to say something like:

row['School']

and get back the value of the cell in the current row and the column with 'School' as its title.

I've looked through the OpenPyXL docs but can't seem to find anything terribly helpful.

Any suggestions?

I would also like a convenience function like this. So far I've been using OrderedDict to work around the problem. If you find a more convenient way, please share it.
– ramwin, Commented May 10, 2017 at 3:21

Sherpa · Accepted Answer · 2018-12-19 21:08:22Z

2

I'm not incredibly familiar with OpenPyXL, but as far as I can tell it doesn't have any kind of dict reader/iterator helper. However, it's fairly easy to iterate over the worksheet rows, as well as to create a dict from two lists of values.

def iter_worksheet(worksheet):
    # It's necessary to get a reference to the generator, as 
    # `worksheet.rows` returns a new iterator on each access.
    rows = worksheet.rows

    # Get the header values as keys and move the iterator to the next item
    keys = [c.value for c in next(rows)]
    for row in rows:
        values = [c.value for c in row]
        yield dict(zip(keys, values))
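
The same header/zip idea can also be written against plain values. openpyxl's `iter_rows(values_only=True)` yields tuples of cell values rather than cell objects, so the helper below accepts any iterable of tuples. This is only a sketch: the name `dict_rows` and the sample data are hypothetical, and skipping fully empty rows is an added assumption that the function above does not make.

```python
def dict_rows(rows):
    """Yield one dict per data row, keyed by the values of the first row.

    `rows` is any iterable of value tuples, for example
    worksheet.iter_rows(values_only=True) in openpyxl.
    """
    rows = iter(rows)
    try:
        keys = next(rows)  # header row
    except StopIteration:
        return  # no rows at all: nothing to yield
    for row in rows:
        # Assumption: rows whose cells are all empty (None) are not wanted.
        if all(value is None for value in row):
            continue
        yield dict(zip(keys, row))


# Hypothetical sample data standing in for a worksheet's rows.
sample = [
    ("School", "First Name", "Last Name"),
    ("MIT", "Ada", "Lovelace"),
    (None, None, None),
    ("Caltech", "Alan", "Turing"),
]
records = list(dict_rows(sample))
print(records[0]["School"])
```

Because the helper only needs an iterable of tuples, it can be tried out without a workbook at hand.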

answered Dec 19, 2018 at 21:08

Sherpa


Hussain · Accepted Answer · 2017-12-08 04:02:20Z

0

Excel sheets are far more flexible than CSV files so it makes little sense to have something like DictReader.

Just create an auxiliary dictionary from the relevant column titles.

If you have columns like "School", "First Name", "Last Name", "EMail" you can create the dictionary like this.

rows = ws.iter_rows()
# Map each column title in the header row to its column index
keys = dict((cell.value, idx) for (idx, cell) in enumerate(next(rows)))
for row in rows:
    school = row[keys['School']].value
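
To elaborate on the auxiliary dictionary: it maps each column title to its position in the row, so a lookup by title becomes a lookup by index. A minimal sketch on plain tuples follows. The data and the name `title_to_index` are made up for illustration; with openpyxl the header values would come from the cells of the first row.

```python
header = ("School", "First Name", "Last Name", "EMail")
data_rows = [
    ("MIT", "Ada", "Lovelace", "ada@example.com"),
    ("Caltech", "Alan", "Turing", "alan@example.com"),
]

# Map each column title to its 0-based position in a row.
title_to_index = {title: idx for idx, title in enumerate(header)}

# Look up columns by title through the index mapping.
schools = [row[title_to_index["School"]] for row in data_rows]
emails = [row[title_to_index["EMail"]] for row in data_rows]
print(schools)
```

The mapping is built once and reused for every row, which may be preferable to creating a new dict per row on large sheets.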

edited Dec 8, 2017 at 4:02

Hussain


answered Jun 25, 2016 at 12:16

Charlie Clark


2 Comments

anon_swe Over a year ago

Thanks for the reply; I'm a Python newbie so could you elaborate a bit more on your answer?

Charlie Clark Over a year ago

What is unclear? You can update the question in the light of the answer.

Pedram · Accepted Answer · 2019-11-25 00:38:15Z

I wrote a DictReader based on openpyxl. Save the second listing as 'excel.py' and use it like csv.DictReader; the first listing shows a usage example.

from excel import DictReader

with open('example01.xlsx', 'rb') as source_data:
    for row in DictReader(source_data, sheet_index=0):
        print(row)

excel.py:

__all__ = ['DictReader']

from openpyxl import load_workbook
from openpyxl.cell import Cell

# Change the default value for the Cell from None to '',
# the same way as in csv.DictReader.
Cell.__init__.__defaults__ = (None, None, '', None)


class DictReader(object):
    def __init__(self, f, sheet_index,
                 fieldnames=None, restkey=None, restval=None):
        self._fieldnames = fieldnames   # list of keys for the dict
        self.restkey  = restkey         # key to catch long rows
        self.restval  = restval         # default value for short rows
        self.reader   = load_workbook(f, data_only=True).worksheets[sheet_index].iter_rows(values_only=True)
        self.line_num = 0

    def __iter__(self):
        return self

    @property
    def fieldnames(self):
        if self._fieldnames is None:
            try:
                self._fieldnames = next(self.reader)
                self.line_num += 1
            except StopIteration:
                pass

        return self._fieldnames

    @fieldnames.setter
    def fieldnames(self, value):
        self._fieldnames = value

    def __next__(self):
        if self.line_num == 0:
            # Used only for its side effect.
            self.fieldnames

        row = next(self.reader)
        self.line_num += 1

        # unlike the basic reader, we prefer not to return blanks,
        # because we will typically wind up with a dict full of None
        # values
        while row == ():
            row = next(self.reader)

        d = dict(zip(self.fieldnames, row))
        lf = len(self.fieldnames)
        lr = len(row)

        if lf < lr:
            d[self.restkey] = row[lf:]
        elif lf > lr:
            for key in self.fieldnames[lr:]:
                d[key] = self.restval

        return d
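
For comparison, the `restkey` and `restval` handling that this class mirrors can be seen in the standard library's `csv.DictReader` itself. The sketch below uses made-up in-memory data, and the key name `extra` is arbitrary. Values beyond the header are collected in a list under `restkey`, and missing values are filled with `restval`.

```python
import csv
import io

# Header has two columns; the first data row is too long, the second too short.
source = io.StringIO("a,b\n1,2,3\n4\n")

reader = csv.DictReader(source, restkey="extra", restval="")
long_row, short_row = list(reader)

print(long_row)   # the surplus value "3" is stored under "extra"
print(short_row)  # the missing "b" value is filled with ""
```

One difference to keep in mind: in the class above the rows are tuples, so the surplus values stored under `restkey` would be a tuple slice rather than a list.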

Russell McDonell · Accepted Answer · 2020-10-15 05:34:56Z

0

The following seems to work for me.

    header = True
    headings = []
    for row in ws.rows:
        if header:
            for cell in row:
                headings.append(cell.value)
            header = False
            continue
        rowData = dict(zip(headings, row))
        wantedValue = rowData['myHeading'].value

answered Oct 15, 2020 at 5:34

Russell McDonell


Frank · Accepted Answer · 2022-01-26 06:52:23Z

0

I was running into the same issue as described above. Therefore I created a simple extension called openpyxl-dictreader that can be installed through pip. It is very similar to the suggestion made by @viktor earlier in this thread.

The package is largely based on source code of Python's native csv.DictReader class. It allows you to select items based on column names using openpyxl. For example:

import openpyxl_dictreader

reader = openpyxl_dictreader.DictReader("names.xlsx", "Sheet1")
for row in reader:
    print(row["First Name"], row["Last Name"])

Putting this here for reference.

answered Jan 26, 2022 at 6:52

Frank
