Read and write CLOB data using Python and cx_Oracle

Question

I am trying to read CLOB data from a test database and insert it into a dev database. It works, but the performance is very poor: 100K rows take 8 to 12 hours when run from my local machine. Is my approach correct, or is there a better way to do it? Below is my code after the connections are set up:

x = []
for row in rows.fetchall():
    # Each row is a tuple; the CLOB arrives as a LOB locator that must be read
    data = row[0].read()
    json_data = json.loads(data)
    x.append(json_data)

This is how I am doing it. I just wanted to know whether there is a better way. Stack: Python, Oracle Database, cx_Oracle, json. Thanks.

For a start, print is really expensive. You could try for x, row in enumerate(rows.fetchall()): and then hide the print under if x % 10000 == 0: print row, but I don't think that alone is enough to explain the slowness.
– roganjosh, Commented Mar 14, 2018 at 22:39
Also, I think you could probably just iterate through the cursor without using .fetchall(). Beyond that, I am not familiar enough with CLOBs to say whether you could avoid calling json.loads() on each row.
– roganjosh, Commented Mar 14, 2018 at 22:43
Try fetching the LOBs as shown in github.com/oracle/python-cx_Oracle/issues/… Also see section 7 (LOBs) in the tutorial github.com/oracle/python-cx_Oracle/blob/master/samples/tutorial/…
– Christopher Jones, Commented Mar 15, 2018 at 6:52
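The two suggestions in the comments above (iterate over the cursor directly instead of calling .fetchall(), and report progress only occasionally) could be sketched roughly as follows. The function name iter_json_rows and the report_every parameter are made up for illustration, and the sketch assumes a single-column query whose value is either a LOB locator or an already-fetched string:

```python
import json


def iter_json_rows(cursor, report_every=10000):
    """Yield one parsed JSON document per row of a single-column query.

    `cursor` is any iterable of one-element rows, such as an executed
    DB-API cursor. The name and parameters here are illustrative only.
    """
    for count, (value,) in enumerate(cursor, start=1):
        # LOB locators expose read(); plain strings are used as they are.
        text = value.read() if hasattr(value, "read") else value
        if count % report_every == 0:
            # Print only every `report_every` rows to keep output cheap.
            print(f"processed {count} rows")
        yield json.loads(text)
```

Because this is a generator, rows are parsed as they are fetched rather than after the whole result set has been materialised in a list.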

Anthony Tuininga · Accepted Answer · 2018-03-15 14:52:17Z


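The following code, taken from the cx_Oracle samples, is what you want to use. It fetches CLOB and BLOB values directly as strings and bytes instead of LOB locators, which avoids a separate round trip to the database for every LOB that is read. This should dramatically improve performance!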

import json

import cx_Oracle

def OutputTypeHandler(cursor, name, defaultType, size, precision, scale):
    # Fetch LOB columns as plain str/bytes instead of LOB locators
    if defaultType == cx_Oracle.CLOB:
        return cursor.var(cx_Oracle.LONG_STRING, arraysize=cursor.arraysize)
    elif defaultType == cx_Oracle.BLOB:
        return cursor.var(cx_Oracle.LONG_BINARY, arraysize=cursor.arraysize)

conn = cx_Oracle.Connection("user/pw@dsn")
conn.outputtypehandler = OutputTypeHandler
cursor = conn.cursor()
cursor.execute("""
        select CLOBColumn
        from SomeTable""")
# Each row is a one-element tuple, hence the "s," unpacking
json_data = [json.loads(s) for s, in cursor]
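The question also asks about inserting the data into the dev database, which the code above does not cover. One possible sketch, not taken from the cx_Oracle samples, is to move rows in batches using the standard DB-API calls fetchmany() and executemany(). The function name copy_rows_in_batches, the batch size, and the insert statement mentioned afterwards are all assumptions made for illustration:

```python
def copy_rows_in_batches(src_cursor, dst_cursor, insert_sql, batch_size=1000):
    """Copy all remaining rows from src_cursor into dst_cursor.

    src_cursor must already have executed its query. Returns the number
    of rows copied. Committing is left to the caller.
    """
    copied = 0
    while True:
        # Pull a bounded chunk so the full result set never sits in memory.
        batch = src_cursor.fetchmany(batch_size)
        if not batch:
            break
        # One executemany() call sends the whole batch to the target.
        dst_cursor.executemany(insert_sql, batch)
        copied += len(batch)
    return copied
```

With the output type handler installed on the source connection, a call might look like copy_rows_in_batches(cursor, dev_cursor, "insert into SomeDevTable (CLOBColumn) values (:1)"), followed by a commit on the dev connection. Here dev_cursor and SomeDevTable are hypothetical names, and whether very large values can be bound this way without extra setup should be checked against the cx_Oracle documentation.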


2 Comments

Carlos.V Over a year ago

OutputTypeHandler is not called when you use cursor.callproc(). Is that normal? How should that scenario be handled?

Anthony Tuininga Over a year ago

cursor.callproc() uses PL/SQL, and that requires the use of CLOBs, unfortunately!
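For the callproc() case raised in the comments, the value comes back as a LOB locator and has to be read explicitly. A minimal sketch, assuming a procedure whose last parameter is a CLOB OUT parameter: the helper name read_clob_from_proc is hypothetical, and clob_type is expected to be cx_Oracle.CLOB, passed in by the caller.

```python
def read_clob_from_proc(cursor, proc_name, clob_type, in_args=()):
    """Call a procedure whose final parameter is a CLOB OUT parameter.

    Returns the CLOB content as a string, or None if the procedure left
    the OUT parameter unset. Pass cx_Oracle.CLOB as `clob_type`.
    """
    # Bind variable that receives the OUT parameter.
    out_var = cursor.var(clob_type)
    cursor.callproc(proc_name, [*in_args, out_var])
    lob = out_var.getvalue()
    # The output type handler does not apply here, so read the LOB directly.
    return lob.read() if lob is not None else None
```

A call might look like read_clob_from_proc(cursor, "get_document", cx_Oracle.CLOB, [42]), where get_document is a made-up procedure name.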
