I am trying to read CLOB data from a test database and insert it into a dev database. It works, but the performance is very poor: 100K rows take 8 to 12 hours, running from my local machine. I am wondering whether my approach is correct or whether there is a better way to do it. Below is my code, after the connections are set up:

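import json

x = []  # created once, outside the loop, so results accumulate
for row in rows.fetchall():
    data = row[0].read()  # assumes the CLOB is the first column of each row
    json_data = json.loads(data)
    x.append(json_data)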

This is how I am doing it. I just wanted to know if there is a better way. Stack: Python, Oracle DB, cx_Oracle, json. Thanks.

  • For a start, print is really expensive. You could try for x, row in enumerate(rows.fetchall()): and then hide the print under if x % 10000 == 0: print row, but I don't think that alone is enough to explain the slowness. Commented Mar 14, 2018 at 22:39
  • Also, I think you could probably just iterate through the cursor without using .fetchall(). After that I have no familiarity with Clob to say whether you could get around using json.loads() on each row. Commented Mar 14, 2018 at 22:43
  • Try fetching the LOBs as shown in github.com/oracle/python-cx_Oracle/issues/… Also see section 7 (LOBS) in the tutorial github.com/oracle/python-cx_Oracle/blob/master/samples/tutorial/… Commented Mar 15, 2018 at 6:52
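The suggestion above to iterate without .fetchall() could look roughly like the sketch below. It is only an illustration: iter_json_rows and batch_size are hypothetical names, not part of cx_Oracle. It assumes the query selects a single, non-null CLOB column and relies only on fetchmany() from the Python DB-API and on LOB objects exposing read().

```python
import json


def iter_json_rows(cursor, batch_size=1000):
    """Yield one parsed JSON document per row, fetching rows in batches.

    Hypothetical helper: assumes each row holds exactly one non-null
    column containing JSON text, either as a str or as a LOB locator.
    """
    while True:
        # fetchmany() returns an empty sequence once the result set is exhausted.
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for (value,) in rows:
            # A LOB locator exposes read(); a plain string does not.
            text = value.read() if hasattr(value, "read") else value
            yield json.loads(text)
```

Because this is a generator, rows are parsed as they arrive instead of the whole result set being held in memory first. Calling read() on a LOB locator still costs a trip to the database per row, so this on its own is unlikely to fix the slowness.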

1 Answer

The following code, taken from the cx_Oracle samples, is what you want to use. It fetches CLOBs as strings and BLOBs as bytes directly instead of as LOB locators, which should dramatically improve performance!

import json

import cx_Oracle


def OutputTypeHandler(cursor, name, defaultType, size, precision, scale):
    # Fetch CLOBs as str and BLOBs as bytes instead of LOB locators,
    # so no separate read() call is needed for each row.
    if defaultType == cx_Oracle.CLOB:
        return cursor.var(cx_Oracle.LONG_STRING, arraysize=cursor.arraysize)
    elif defaultType == cx_Oracle.BLOB:
        return cursor.var(cx_Oracle.LONG_BINARY, arraysize=cursor.arraysize)


conn = cx_Oracle.Connection("user/pw@dsn")
conn.outputtypehandler = OutputTypeHandler
cursor = conn.cursor()
cursor.execute("""
        select CLOBColumn
        from SomeTable""")
# Each row is a 1-tuple; unpack it and parse the JSON text.
json_data = [json.loads(s) for s, in cursor]
2 Comments

OutputTypeHandler is not called when you use cursor.callproc(). Is that normal? How should that scenario be handled?
cursor.callproc() uses PL/SQL and that requires the use of CLOBs, unfortunately!
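
For the callproc() case, one possible approach is to bind a CLOB variable and read the locator afterwards. This is a minimal sketch, assuming the procedure has a single OUT parameter of type CLOB. read_clob_out_param is a hypothetical helper, and the CLOB type is passed in as an argument (for example cx_Oracle.CLOB) so the function itself does not need to import the driver.

```python
def read_clob_out_param(cursor, proc_name, clob_type):
    """Call a procedure with one OUT CLOB parameter and return its text.

    Hypothetical helper: assumes the procedure takes exactly one
    parameter, an OUT CLOB. clob_type is expected to be cx_Oracle.CLOB.
    """
    out_var = cursor.var(clob_type)        # bind variable that receives the CLOB
    cursor.callproc(proc_name, [out_var])
    lob = out_var.getvalue()               # LOB locator, or None if the value is NULL
    return None if lob is None else lob.read()
```

It might be called as read_clob_out_param(cursor, "my_proc", cx_Oracle.CLOB), where my_proc is a placeholder procedure name.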