Python - Iterating through rows of excel files with Pandas

Question

I have a problem iterating through the rows of an Excel file.

import os
import pandas as pd
import json

for file in os.listdir("./python_files"):
    if file.endswith(".xlsx"):
        df = pd.read_excel(os.path.join("./python_files", file)) 
        CRD_Array = df.iloc[:,1].values
        for single_CRD in CRD_Array:           
            with open("{}.json".format(single_CRD), 'w') as json_file:
                row_iterator = 0
                data = {}
                data['header']=[]
                data['header'].append({'Organization CRD#':  '{}'.format(df.iloc[row_iterator,1])})
                json.dump(data, json_file)
                row_iterator = row_iterator + 1

As you can see, my script:

Reads the files from the python_files folder
Reads the second column, which holds the CRD numbers, and gets back an array of CRDs
Loops over the CRD array
In that loop, tries to save a .json file with a "header" field

What I get as output now

File name 172081.json

{"header": [{"Organization CRD#": "172081"}]}

File name 534123.json

{"header": [{"Organization CRD#": "172081"}]}

File name 184521.json

{"header": [{"Organization CRD#": "172081"}]}

It looks like df.iloc[row_iterator, 1] isn't moving to the next row, even though I add 1 to row_iterator on each pass through the loop.

Can somebody help?

Edit: Excel file example-

What I want to achieve

File name 172081.json

{"header": [{"Organization CRD#": "172081"}]}

File name 534123.json

{"header": [{"Organization CRD#": "534123"}]}

File name 184521.json

{"header": [{"Organization CRD#": "184521"}]}

Give a sample of the data, raw and your target.

– Johnny, Commented Feb 7, 2021 at 8:02

Agnes Kis · Accepted Answer · 2021-02-07 08:07:44Z

2

Inside the for loop you increment row_iterator, but the first line after open() sets it back to 0 on every iteration, so df.iloc[row_iterator, 1] always reads the first row. You need to move that assignment out of the loop. Like this:

# Initialise the counter once, before the loop starts
row_iterator = 0

for single_CRD in CRD_Array:
    with open("{}.json".format(single_CRD), 'w') as json_file:
        data = {}
        data['header'] = []
        data['header'].append({'Organization CRD#': '{}'.format(df.iloc[row_iterator, 1])})
        json.dump(data, json_file)
        # Advance to the next row for the next file
        row_iterator = row_iterator + 1
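
As a side note, the counter may not be needed at all, because single_CRD already holds the value from the current row. Below is a minimal sketch of that idea, assuming the CRD values are passed in as any iterable (for example df.iloc[:, 1].values). The function name write_crd_files and the out_dir parameter are hypothetical and not part of the original script.

```python
import json
import os


def write_crd_files(crd_values, out_dir="."):
    """Write one JSON file per CRD value and return the paths written.

    Hypothetical helper: crd_values can be any iterable of CRD numbers,
    e.g. df.iloc[:, 1].values from the question.
    """
    paths = []
    for single_crd in crd_values:
        # The loop variable is already the current row's value,
        # so no separate row counter is required.
        data = {"header": [{"Organization CRD#": str(single_crd)}]}
        path = os.path.join(out_dir, "{}.json".format(single_crd))
        with open(path, "w") as json_file:
            json.dump(data, json_file)
        paths.append(path)
    return paths
```

In the question's script this would be called once per Excel file, e.g. write_crd_files(df.iloc[:, 1].values).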


2 Comments

Adam Szymański Over a year ago

Omg, I spent 30 minutes trying to solve this problem and didn't see that I was assigning 0 to row_iterator every time the loop executed. Ty so much :D

Agnes Kis Over a year ago

You are welcome :) I also suggest avoiding this much nesting in your code, because it gets hard to understand. E.g. you could write: if not file.endswith(...): continue, and then df = ...
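
The guard-clause suggestion in the comment above might look roughly like this. It is only a sketch: the generator name xlsx_paths is hypothetical, and it assumes the Excel files sit directly inside the given folder, as in the question.

```python
import os


def xlsx_paths(folder):
    """Yield the path of every .xlsx file directly inside folder."""
    for file in os.listdir(folder):
        if not file.endswith(".xlsx"):
            continue  # guard clause: skip anything that is not an Excel file
        # The real work now sits one indentation level shallower.
        yield os.path.join(folder, file)
```

The outer loop of the script could then read: for path in xlsx_paths("./python_files"): df = pd.read_excel(path), with the rest of the body unchanged.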
