0

I have an Excel worksheet and I want to create a dict which has the cell values as a list with cell column being the key. Let say that spreadsheet has data that looks like,

A B C (columns)
1 2 
3 4
5 f

I want a dict that looks like,

cbyc = {'A': [1,3,5]; 'B':[2,4,f]; 'C';[None, None, None]}

I am doing this with the following code

import openpyxl as oxl
wb = oxl.load_workbook('myxlworkbook.xlsx')
sheet = wb.get_sheet_by_name('Sheet1')
allcells = sheet.get_cell_collection()

cbyc = {}
for c in allcells:
    if c.value is not None:
        if c.column not in cbyc.keys():
            cbyc[c.column] = [c.value]
        else:
            cbyc[c.column].append(c.value)

And this work,.. but I am sure there is a more efficient way to create this dict with the if.. else logic

is there a better way? Maybe there is something in the openpyxl that can give such a list

3
  • looks like you need cbyc = collections.defaultdict(list) Commented Oct 12, 2017 at 19:43
  • 1
    if c.column not in cbyc.keys() is just horrible. Commented Oct 12, 2017 at 19:44
  • I'm reopening because Rob answer doesn't use any defaultdict or setdefault. Commented Oct 12, 2017 at 21:25

3 Answers 3

8

Whenever I see an unsightly loop filling a dict or a list, I try to replace it with a dict comprehension or a list comprehension. In this case, I'd use both.

This program might do what you want:

import openpyxl as oxl
wb = oxl.load_workbook('myxlworkbook.xlsx')
sheet = wb.get_sheet_by_name('Sheet1')

cybc = {
    col[0].value: [cell.value for cell in col[1:]]
    for col in sheet.columns
}

print(cybc)

However, we can avoid all of the .value code if we invoke sheet.values to start with:

cybc = { col[0]: col[1:] for col in zip(*sheet.values) }
Sign up to request clarification or add additional context in comments.

Comments

1

You can replace:

if c.column not in cbyc.keys():
    cbyc[c.column] = [c.value]
else:
    cbyc[c.column].append(c.value)

with:

cby.setdefault(c.column, []).append(c.value)
setdefault(key[, default])

If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None.

See docs.

Comments

1
from collections import defaultdict

cbyc = defaultdict(list)

for cell in all_cells:
    if cell.value is None:
        continue
    cbyc[cell.column].append(cell.value)

See documentation on defaultdict.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.