Read table contents and loop through code using python

Question

I have a table of 10,000 rows loaded into a DataFrame.

The code below pushes these rows to another system using the PATCH method. I do not want to push all 10,000 rows in a single call. Instead, I want to push the first 100 rows, then the next 100, and so on until the end of the table. My table has no row-number column. How can I do this as a loop in Python?

batch = clientlink.create_batch()
changeset = clientlink.create_changeset()
for row in dfpatch.rdd.collect():
    changeset.add_request(clientlink.entity_sets.cc.update_entity(obj=row.obj, method='PATCH').set(seg=row.segment))
    print(row.obj, row.segment)
batch.add_request(changeset)
response = batch.execute()

Desty · Accepted Answer · 2022-03-08 08:21:19Z

I don't know what clientlink does, but if you only look at the part that processes 100 rows at a time in the for loop, you can implement it by adding a counter, as shown below. Since I don't know how a batch is meant to be reset, I marked that step with a comment.

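batch = clientlink.create_batch()
changeset = clientlink.create_changeset()
count = 0
for row in dfpatch.rdd.collect():
    changeset.add_request(clientlink.entity_sets.CorporateAccountCollection.update_entity(ObjectID=row.ObjectID, method='PATCH').set(CLMSegment_KUT=row.segment))
    count += 1
    print(row.ObjectID, row.segment)
    if count == 100:
        batch.add_request(changeset)
        response = batch.execute()
        # Start a fresh batch and changeset so rows already sent are not sent
        # again. This assumes that calling create_batch() and
        # create_changeset() again is the right way to reset them.
        batch = clientlink.create_batch()
        changeset = clientlink.create_changeset()
        count = 0
# Send the remaining rows, if any (fewer than 100).
if count > 0:
    batch.add_request(changeset)
    response = batch.execute()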
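
The counter approach above can also be written as a small generator that hands out the rows in groups of 100. This is only a sketch: `chunked`, `push_in_chunks` and `send_chunk` are hypothetical names, not part of clientlink, and `send_chunk` stands in for whatever builds and executes one batch.

```python
from itertools import islice

def chunked(rows, size=100):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk

def push_in_chunks(rows, send_chunk, size=100):
    """Call send_chunk once per chunk and return the number of calls made."""
    calls = 0
    for chunk in chunked(rows, size):
        send_chunk(chunk)  # hypothetical: build and execute one batch here
        calls += 1
    return calls
```

Because the generator works on any iterable, the last group is simply shorter when the row count is not a multiple of 100, and no separate "send the rest" step is needed.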

Amiga500 · Accepted Answer · 2022-03-08 08:28:37Z

Try:

import math
import numpy as np

# Number of splits needed so that each split has at most 100 rows.
# Rounding up guarantees nSplits * 100 >= number of rows in the frame.
# Note: .shape only exists on a pandas DataFrame.
nSplits = math.ceil(dfpatch.shape[0] / 100)

listOfFrames = np.array_split(dfpatch, nSplits)

After that, it's pretty easy:

for subFrame in listOfFrames:
    # Do something with the frame of (at most) 100 rows
    pass
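
One thing to keep in mind is that np.array_split divides the rows as evenly as possible rather than into fixed blocks of 100, so the way the split count is rounded decides how large the pieces get. The sketch below reproduces only the documented sizing rule in plain Python; `array_split_sizes` is a hypothetical helper and 250 is just an example row count.

```python
import math

def array_split_sizes(n_rows, n_splits):
    """Piece sizes np.array_split produces: n_rows % n_splits pieces of
    size n_rows // n_splits + 1, and the rest of size n_rows // n_splits."""
    base, extra = divmod(n_rows, n_splits)
    return [base + 1] * extra + [base] * (n_splits - extra)

n_rows = 250
truncated = array_split_sizes(n_rows, int(n_rows / 100))         # 2 splits
rounded_up = array_split_sizes(n_rows, math.ceil(n_rows / 100))  # 3 splits

print(truncated)   # [125, 125]: pieces exceed 100 rows
print(rounded_up)  # [84, 83, 83]: pieces stay at or below 100 rows
```

For a row count that is an exact multiple of 100, such as 10,000, both roundings give pieces of exactly 100 rows.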

7 Comments

Roho Over a year ago

I get the error below when trying to use shape: AttributeError: 'DataFrame' object has no attribute 'shape'

Amiga500 Over a year ago

Eh? Pandas dataframe? Try len(dfpatch.index)

Roho Over a year ago

AttributeError: 'DataFrame' object has no attribute 'index'

Amiga500 Over a year ago

What kind of DataFrame is this? It's obviously not a pandas DataFrame. If it's an RDD, use dfpatch.count()

Roho Over a year ago

dfpatch: pyspark.sql.dataframe.DataFrame
    AccountID: string
    Name: string
    ObjectID: string
    segment: string
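
Since dfpatch turns out to be a PySpark DataFrame, the pandas attributes do not apply, but the rows can still be grouped by position once they are on the driver. The sketch below is an assumption-laden illustration: it uses a namedtuple as a stand-in for Spark's Row objects so that it runs without Spark, and the 230 sample rows are made up. With the real DataFrame the list would come from dfpatch.collect().

```python
from collections import namedtuple

# Stand-in for pyspark.sql.Row, using the column names from the schema above.
Row = namedtuple("Row", ["AccountID", "Name", "ObjectID", "segment"])
rows = [Row(f"A{i}", f"Name{i}", f"O{i}", "S1") for i in range(230)]

BATCH_SIZE = 100
batches = []
for start in range(0, len(rows), BATCH_SIZE):
    # Slice by list position, so no row-number column is required.
    batch_rows = rows[start:start + BATCH_SIZE]
    # Here one batch/changeset would be built from batch_rows and executed.
    batches.append([(r.ObjectID, r.segment) for r in batch_rows])

print([len(b) for b in batches])  # [100, 100, 30]
```

Collecting 10,000 small rows to the driver is usually fine; for much larger tables, DataFrame.toLocalIterator() may be worth a look, as it avoids holding every row in memory at once.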
