
The data I want to insert into the database looks like this:

datalist =[['2012', '1', '3', '1', '832.0', '261.0', '100.00'],
            ['2012', '1', '5', '1', '507.0', '193.0', '92.50'],
            ['2012', '2', '3', '1', '412.0', '200.0', '95.00'],
            ['2012', '2', '5', '1', '560.0', '335.0', '90.00'],
            ['2012', '3', '3', '1', '584.0', '205.0', '100.00'],
            ['2012', '3', '5', '1', '595.0', '162.0', '92.50'],
            ['2012', '4', '3', '1', '504.0', '227.0', '100.00'],
            ['2012', '4', '5', '1', '591.0', '264.0', '92.50']]

But in fact there are 500,000 rows in datalist, so I have only listed part of it.

The code I use to insert into the database looks like this:

import pymssql

server = '127.0.0.1'
user = "test"
password = "test"
database='SQLTest'
datalist = [['2012', '1', '3', '1', '832.0', '261.0', '100.00'],
            ['2012', '1', '5', '1', '507.0', '193.0', '92.50'],
            ['2012', '2', '3', '1', '412.0', '200.0', '95.00'],
            ['2012', '2', '5', '1', '560.0', '335.0', '90.00'],
            ['2012', '3', '3', '1', '584.0', '205.0', '100.00'],
            ['2012', '3', '5', '1', '595.0', '162.0', '92.50'],
            ['2012', '4', '3', '1', '504.0', '227.0', '100.00'],
            ['2012', '4', '5', '1', '591.0', '264.0', '92.50']]

#But in fact, there are 500,000 rows in datalist

try:
    conn = pymssql.connect(server, user, password, database)
    cursor = conn.cursor()
    for one_row in datalist:
        val1 = one_row[4]
        val2 = one_row[5]
        val3 = one_row[6]
        # use parameter placeholders so the driver handles quoting and escaping
        sql = "insert into table_for_test (col1, col2, col3) values (%s, %s, %s)"
        cursor.execute(sql, (val1, val2, val3))
        conn.commit()
except Exception as ex:
    conn.rollback()
    raise ex
finally:
    conn.close()

Because the amount of data is so large, I want to insert it in batches. How should I modify the code?

  • Are the 500,000 rows in the data set in a file, for example a CSV or something similar? If so, you can use the BULK INSERT command. Commented Mar 14, 2018 at 6:45
  • Related question here. Commented Mar 14, 2018 at 12:57

3 Answers


Now I know how to do it: use executemany(). Each element in the list must be a tuple.

import pymssql

server = '127.0.0.1'
user = "test"
password = "test"
database='SQLTest'
datalist = [('2012', '1', '3', '1', '832.0', '261.0', '100.00'),
            ('2012', '1', '5', '1', '507.0', '193.0', '92.50'),
            ('2012', '2', '3', '1', '412.0', '200.0', '95.00'),
            ('2012', '2', '5', '1', '560.0', '335.0', '90.00'),
            ('2012', '3', '3', '1', '584.0', '205.0', '100.00'),
            ('2012', '3', '5', '1', '595.0', '162.0', '92.50'),
            ('2012', '4', '3', '1', '504.0', '227.0', '100.00'),
            ('2012', '4', '5', '1', '591.0', '264.0', '92.50')]

try:
    conn = pymssql.connect(server, user, password, database)
    cursor = conn.cursor()
    sql = "insert into table_for_test (col1, col2, col3, col4, col5, col6, col7) values (%s, %s, %s, %s, %s, %s, %s)"
    cursor.executemany(sql, datalist)
    conn.commit()
except Exception as ex:
    conn.rollback()
    raise ex
finally:
    conn.close()
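With 500,000 rows, a single executemany() call also means one huge transaction. A minimal sketch (the helper name and the 1,000-row batch size are my own choices, not from the answer) that splits the list into chunks so each commit stays small:

```python
def chunks(rows, size):
    """Yield successive slices of `rows` with at most `size` elements each."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]

# Usage with the connection and sql from the answer above:
# for batch in chunks(datalist, 1000):
#     cursor.executemany(sql, batch)
#     conn.commit()
```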


One way to do this is to use the BULK INSERT statement. https://learn.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql

The input must come from a file (e.g. a CSV).

For example, if the data is in a CSV file:

BULK INSERT table_for_test
    FROM 'C:\user\admin\downloads\mycsv.csv'
    WITH (
        FIRSTROW = 1
      , FIELDTERMINATOR = ','
      , ROWTERMINATOR = '\n'
    )
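If the 500,000 rows start out in Python rather than in a file, a hedged sketch of the export step (the helper name and path are assumptions; note the file must be readable by the SQL Server process, not just the client):

```python
import csv

def write_rows_to_csv(rows, path):
    """Write a list of row sequences to a CSV file for BULK INSERT to read."""
    # newline='' stops the csv module from inserting extra blank lines on Windows
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)

# write_rows_to_csv(datalist, r"C:\user\admin\downloads\mycsv.csv")
# then run the BULK INSERT statement above, e.g. via cursor.execute(...)
```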


executemany() can be slow because internally it loops over the rows and calls execute() for each one.

The easiest way I found is to call execute() directly in batches of 1,000 rows. For example, I have data like this: data = [(1,"test1","123"),(2,"test2","543"),(3,"test3","876"),(4,"test4","098")]

You can insert it this way:

def bulk_batch_insertion(self, data, tablename):
    print("Total rows to insert:", len(data))
    start = 0
    end = 1000
    while data[start:end]:
        # render each tuple as a SQL values group, e.g. (1, 'test1', '123');
        # NOTE: str(row) does not escape values, so only use this with trusted data
        new_data = ','.join(str(row) for row in data[start:end])
        query = f"INSERT INTO {tablename} VALUES {new_data}"
        self.cursor.execute(query)
        self.conn.commit()
        start = end
        end = start + 1000
    print("All batches inserted successfully")

Note: I'm not using BULK INSERT because I have some encoding fixes and manipulations to do before passing the data to the database.
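A sketch of the same batching idea, but with driver-side parameter placeholders instead of str(row), so values are escaped properly (the helper name is my own; the table name is still interpolated into the SQL, so it must come from trusted code):

```python
def build_multirow_insert(tablename, rows):
    """Build (sql, flat_params) for one multi-row INSERT using %s placeholders."""
    # one "(%s, %s, ...)" group per row, sized to the row width
    row_ph = "(" + ", ".join(["%s"] * len(rows[0])) + ")"
    sql = f"INSERT INTO {tablename} VALUES " + ", ".join([row_ph] * len(rows))
    flat_params = [value for row in rows for value in row]
    return sql, flat_params

# Usage inside the batching loop above:
# sql, params = build_multirow_insert(tablename, data[start:end])
# self.cursor.execute(sql, tuple(params))
```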
