
I have a dataframe with six columns and around 27000 rows.
I'm trying to load this dataframe into my SQL Server (not localhost), but it takes forever.

Does anyone know of a faster way to load it than this?
27,000 rows shouldn't take long. There's no problem when reading from the database. :-)

for index, row in predict.iterrows():
    params = [(row.account_no, row.group_company, row.customer_type, row.invoice_date, row.lower, row.upper)]
    cursor.fast_executemany = True
    cursor.executemany("INSERT INTO ML.predictions (account_no, group_company, customer_type, invoice_date, lower, upper) VALUES (?,?,?,?,?,?)",
                       params)
    bachelor.commit()

  • What is "forever" here? Commented Nov 21, 2021 at 16:30
  • Your indentation is messed up, so it's not clear whether you have commit() inside of the loop or whether that's a typo. If it's in the loop then your question is contradictory - you're using a bulk insertion method to insert single rows. Commented Nov 21, 2021 at 16:31
  • You can pass bulk data to SQL Server using JSON. See stackoverflow.com/questions/60745932/… Commented Nov 21, 2021 at 16:35
  • Does this answer your question? How to speed up bulk insert to MS SQL Server using pyodbc Commented Nov 21, 2021 at 17:08
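
For reference, the approach the comments point to - setting fast_executemany once and handing executemany the full batch of parameter tuples - would look roughly like this. This is only a sketch, reusing the question's predict dataframe, cursor and bachelor connection:

# Enable pyodbc's bulk parameter binding once, before the insert.
cursor.fast_executemany = True

# Build one parameter tuple per row for the whole dataframe.
params = list(predict[["account_no", "group_company", "customer_type",
                       "invoice_date", "lower", "upper"]]
              .itertuples(index=False, name=None))

# Single executemany call for all ~27,000 rows, then one commit.
cursor.executemany(
    "INSERT INTO ML.predictions "
    "(account_no, group_company, customer_type, invoice_date, lower, upper) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    params,
)
bachelor.commit()

With fast_executemany enabled, pyodbc binds the parameters as an array and sends them in large batches instead of one round trip per row, which is usually where the time goes in a loop like the one above.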

1 Answer


Thanks for your answers - I tried them all except the JSON approach (that one might just be me getting it wrong).

This was my solution:

# Render each dataframe row as a SQL tuple literal so it can be spliced into a VALUES list.
records = [str(tuple(x)) for x in predict.values]

insert_ = """
INSERT INTO ml.predictions(account_no, group_company, customer_type, invoice_date, lower, upper) VALUES
"""

# Yield the sequence in slices of the given size.
def chunker(seq, size):
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))

# Insert 1000 row literals per statement and commit each batch.
for batch in chunker(records, 1000):
    rows = ','.join(batch)
    insert_rows = insert_ + rows
    cursor.execute(insert_rows)
    bachelor.commit()
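
One of the comments also suggests shipping the whole batch as JSON and unpacking it server-side with OPENJSON (available from SQL Server 2016). A rough sketch of that idea, again reusing the question's objects - the column types and JSON paths in the WITH clause are assumptions and would need to match the real ML.predictions schema:

# Serialise the dataframe to one JSON array of records.
json_payload = predict.to_json(orient="records", date_format="iso")

insert_json = """
INSERT INTO ml.predictions (account_no, group_company, customer_type, invoice_date, lower, upper)
SELECT account_no, group_company, customer_type, invoice_date, [lower], [upper]
FROM OPENJSON(?)
WITH (
    account_no    nvarchar(50) '$.account_no',
    group_company nvarchar(50) '$.group_company',
    customer_type nvarchar(50) '$.customer_type',
    invoice_date  date         '$.invoice_date',
    [lower]       float        '$.lower',
    [upper]       float        '$.upper'
);
"""

# One round trip for the whole batch, one commit.
cursor.execute(insert_json, json_payload)
bachelor.commit()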