2

I'm trying to make bulk insert operation in SQLAlchemy with Oracle DB, which inserts 60k rows with blob data.

This is how my table and code looks like:

CREATE TABLE datatables (
    id INTEGER NOT NULL, 
    table_name VARCHAR2(50 CHAR), 
    row_id VARCHAR2(50 CHAR), 
    row_data BLOB, 
    PRIMARY KEY (id)
)
with Session() as session:
    session.execute(
        DataTables.__table__.insert(),
        datas
    )

Where DataTables is SQLAclhemy class mapping for table and datas is list of dicts like this {'id': 1, 'table_name': 'app', 'row_id': 'version', 'row_data': '....'}

with such code i got this sql statement

sqlalchemy.engine.Engine INSERT INTO datatables (id, table_name, row_id, row_data) VALUES (:id, :table_name, :row_id, :row_data)
sqlalchemy.engine.Engine [generated in 0.14718s] [{'id': 1, 'table_name': 'app', 'row_id': 'version', 'row_data': b'some_data'}, ...]

and it runs forever, even after 30 minutes it is not finished. When I enable DPI trace there are a lot of such rows:

ODPI [12912] 2021-12-18 00:07:36.019: ref 0000019C3D15C5B0 (dpiConn) -> 8
ODPI [12912] 2021-12-18 00:07:36.020: ref 0000019C413F0C20 (dpiLob) -> 1 [NEW]
ODPI [12912] 2021-12-18 00:07:36.020: ref 0000019C3D15C5B0 (dpiConn) -> 9
ODPI [12912] 2021-12-18 00:07:36.020: ref 0000019C413F1990 (dpiLob) -> 1 [NEW]

However if I use raw sql:

session.execute('insert into DATATABLES (id, table_name, row_id, row_data) values (:id, :table_name, :row_id, :row_data)', datas)

DPI trace changed to this:

ODPI [00796] 2021-12-18 00:14:55.246: ref 000002486741EAF0 (dpiVar) -> 0
ODPI [00796] 2021-12-18 00:14:55.246: ref 00000248617D2DF0 (dpiConn) -> 6
ODPI [00796] 2021-12-18 00:14:55.246: fn end dpiVar_release(000002486741EAF0) -> 0
ODPI [00796] 2021-12-18 00:14:55.246: fn start dpiVar_setFromBytes(000002486741EBB0)
ODPI [00796] 2021-12-18 00:14:55.247: fn end dpiVar_setFromBytes(000002486741EBB0) -> 0
ODPI [00796] 2021-12-18 00:14:55.247: fn start dpiVar_setFromBytes(0000024864E5FDE0)
ODPI [00796] 2021-12-18 00:14:55.247: fn end dpiVar_setFromBytes(0000024864E5FDE0) -> 0
ODPI [00796] 2021-12-18 00:14:55.247: fn start dpiVar_setFromBytes(0000024864E601A0)
ODPI [00796] 2021-12-18 00:14:55.247: fn end dpiVar_setFromBytes(0000024864E601A0) -> 0

and bulk insert is done in 15 seconds.

SQL statement is same in both cases. Why does it work diffent? Of course I can use raw sql, but I want to use DataTables.__table__.insert(), because if table or columns names will change - I don't need to fix sql for this every time.

6
  • Most probable explanation with Alchemy you are doing row by row insert. Activate a 10046 trace to prove this. Commented Dec 17, 2021 at 21:52
  • 10046 trace? Sorry, i'm not very familiar with oracle debugging Commented Dec 17, 2021 at 21:59
  • see e.g. here Commented Dec 17, 2021 at 22:17
  • i checked tracing - it's executemany, not row by row. the only difference is when i use DataTables.__table__.insert() it uses dpiLob_setFromBytes, but with raw sql it's dpiVar_setFromBytes. Don't know why it changes from lob to var and why lob is slower in this case. Commented Dec 18, 2021 at 19:29
  • 1
    In the case where dpiLob_setFromBytes() is used you are creating a temporary LOB, writing to it, and then transferring from the temporary LOB to the permanent LOB. In the other case where dpiVar_setFromBytes() is used the data is being transferred directly -- it is considerably more efficient, as you discovered! Commented Dec 19, 2021 at 2:43

1 Answer 1

1

issues like these should preferably be reported as bugs in SQLAlchemy, which you can do at : https://github.com/sqlalchemy/sqlalchemy/issues/ otherwise we were completely not aware of this problem.

in this case we've been alerted to this issue and will seek to update our LOB settings per cx_oracle developers for SQLAlchemy 2.0, which is tracked at https://github.com/sqlalchemy/sqlalchemy/issues/7494

Sign up to request clarification or add additional context in comments.

1 Comment

To be honest, I thought that it was me who was mistaken in something, and not the erroneous behavior of SQLAlchemy. Thanks for reporting issue

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.