Excel using win32com and python

Question

I want to know how to read an entire column from an Excel sheet without iterating, using the win32com client for Python.

Sonicsmooth · Accepted Answer · 2024-09-19 01:39:49Z

14

You can read an entire column from a sheet without iterating by using the Range collection. You should never use Cells if performance is any concern. Python uses the win32com module to interact with the Excel COM library. Whenever you use Python and COM (Excel, PowerPoint, Access, ADODB, etc.), one of your biggest performance constraints will be IO between COM and Python. With the Range method you make only one COM method call, while with Cells you make one for each row. Range would also be faster if you were doing the same in VBA or .NET.

In the following test I created a worksheet with 10 random characters in cells A1 through A2000. I then extracted these values into lists using both Range and Cells.

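import time
import win32com.client

# Attach to Excel and take the first sheet of the active workbook.
app = win32com.client.Dispatch("Excel.Application")
s = app.ActiveWorkbook.Sheets(1)

def GetValuesByCells():
    startTime = time.time()
    # A separate COM round trip for every cell in A1:A2000.
    vals = [s.Cells(r,1).Value for r in range(1,2001)]
    return time.time() - startTime

def GetValuesByRange():
    startTime = time.time()
    # A single COM call for the whole range; each returned row holds one cell.
    vals = [v[0] for v in s.Range('A1:A2000').Value]
    return time.time() - startTime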

>>> GetValuesByRange()
0.03600001335144043

>>> GetValuesByCells()
5.27400016784668

In this case Range is two orders of magnitude (146x) faster than Cells. Note that Range(...).Value returns a 2D tuple in which each inner tuple is a row. The list comprehension takes the first element of each row, which gives a flat list of the column's values.
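One detail worth guarding against when reading a column this way: for a single-cell range, Value comes back as a bare value rather than a tuple of rows. The sketch below is one possible way to normalize both cases; flatten_column and read_column are hypothetical helper names, not part of win32com, and the usage in the trailing comment assumes a running Excel instance with an open workbook.

```python
def flatten_column(raw):
    """Normalize the result of Range(...).Value for a one-column range."""
    # A single-cell range yields a bare value (or None), not a tuple of rows.
    if not isinstance(raw, tuple):
        return [raw]
    # Otherwise each inner tuple is one row; keep its only cell.
    return [row[0] for row in raw]


def read_column(sheet, address):
    """Fetch a column such as 'A1:A2000' from a worksheet with one COM call."""
    return flatten_column(sheet.Range(address).Value)


# Hypothetical usage against a running Excel instance (not executed here):
#   import win32com.client
#   app = win32com.client.Dispatch("Excel.Application")
#   vals = read_column(app.ActiveWorkbook.Sheets(1), "A1:A2000")
```

Keeping the normalization separate from the COM call means it can be checked without Excel installed.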

edited Sep 19, 2024 at 1:39

Sonicsmooth


answered Sep 3, 2013 at 14:46

Michael David Watson


4 Comments

John Y Over a year ago

OK, I'm upvoting this even though, in its current form, I don't really consider it an answer. To me, this is a very long comment. Why I think it still deserves an upvote is that it is ultimately more useful and more helpful than either of the other answers presented so far (despite yuvi's being accepted already). The code snippet shown here clearly comes the closest to illustrating how to "read an entire column without iterating ... using win32com".

Michael David Watson Over a year ago

I just looked back on this answer and modified it to answer the original question

yuvi Over a year ago

I wonder how well it does against xlrd. If the differences are minuscule, then xlrd would be a clear winner.

John Y Over a year ago

@yuvi: I haven't tested extensively myself, but I think it depends to a large extent on the nature of the data and the nature of what you're trying to do. The bigger and more complicated the workbook, the bigger the advantage Excel will have, in sheer loading time alone. The more you can rely on Excel itself to do the heavy lifting (via its ranges and its calculation engine), the bigger the advantage Excel will have. Just make sure you make as few and as productive COM calls as possible.

PattimusPrime · Accepted Answer · 2013-09-02 15:06:24Z

2

Have you looked into the openpyxl library? From the documentation:

from openpyxl import load_workbook
wb = load_workbook(filename='file.xlsx')
ws = wb['Sheet1']
# ws.columns is a property that yields one tuple of cells per column
columns = list(ws.columns)

There's also support for iterators and other goodies.
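To pull the plain values of a single column with openpyxl, indexing the worksheet by column letter is one option. This is a minimal sketch: column_values is a hypothetical helper name, and the workbook is built in memory purely so the example runs without a file on disk.

```python
from openpyxl import Workbook


def column_values(ws, letter):
    """Return the values of one worksheet column, e.g. letter='A'."""
    # ws['A'] gives the cells of column A for the rows that hold data.
    return [cell.value for cell in ws[letter]]


# Build a small in-memory workbook so the sketch needs no file on disk.
wb = Workbook()
ws = wb.active
for row in (("name", 1), ("other", 2), ("last", 3)):
    ws.append(row)

first_column = column_values(ws, "A")
second_column = column_values(ws, "B")
```

With a real file, the worksheet would come from load_workbook instead of Workbook().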

answered Sep 2, 2013 at 15:06

PattimusPrime


yuvi · Accepted Answer · 2014-06-22 06:18:08Z

1

The fastest way would be to use the built-in Range functionality through the win32com.client API. However, I'm not a big fan of it. I think the API is confusing and badly documented, and using it isn't very Pythonic (but that's just me).

If efficiency is not an issue for you, you can use the excellent xlrd library. Like so:

import xlrd
book = xlrd.open_workbook('Book1')
sheet = book.sheet_by_name('Sheet1')
# column indices are 0-based, so col(1) is column B
sheet.col(1)
sheet.col(2)
# and so on...

That gives you the cell objects. To get pure values, use sheet.col_values (and there are a few other methods that are real nice to work with).
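As a rough illustration of col_values, the sketch below wraps it in a small helper that can skip a header row. The helper name column_values and its skip_header flag are made up for this example; the file and sheet names in the trailing comment are assumptions carried over from the snippet above.

```python
def column_values(sheet, colx, skip_header=False):
    """Return the plain values of column `colx` (0-based) from an xlrd sheet."""
    # start_rowx is the first row to include; 1 skips a header in row 0.
    start = 1 if skip_header else 0
    return sheet.col_values(colx, start_rowx=start)


# Hypothetical usage with a workbook on disk (not executed here):
#   import xlrd
#   book = xlrd.open_workbook('Book1')
#   values = column_values(book.sheet_by_name('Sheet1'), 0, skip_header=True)
```

The result is a plain list of Python values, so no per-cell unwrapping is needed afterwards.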

Just remember that xlrd stands for "Excel read", so if you want to write to an Excel file you need a different library called "xlwt" (which is also pretty good, though less so than xlrd in my opinion).

edited Jun 22, 2014 at 6:18

answered Sep 2, 2013 at 14:49

yuvi


14 Comments

Nischal Hp Over a year ago

Yeah, I did try writing this piece of code. I was thinking that since Python lets you write as little as possible, there might be something that would just return a list with the required column values without my having to write the iteration part.

yuvi Over a year ago

You can use the xlrd library, wait a second I'll add an example

yuvi Over a year ago

There. hope that helps!

John Y Over a year ago

Conceptually, I think OP is looking for a way to retrieve an entire range "in one go", like retrieving a result set using SELECT from a database. That is, any "iterating" that must be done during retrieval is handled before it gets to Python. In the database case, the SQL engine may be iterating under the covers, but all you see is a single "return value", that happens to contain multiple values within it. So for Excel, OP is looking to specify a range, then grab all the values "at once" into, say, a tuple. This may or may not be possible; I don't know COM well enough.

John Y Over a year ago

All that being said, I don't see why iteration needs to be avoided in the first place. Perhaps it's because accessing Excel via COM cell by cell is notoriously slow. Operating on an Excel range is vastly faster than operating on one cell at a time. But reading the file directly via xlrd (and not involving COM at all) is usually fast enough.
