I'm having a heckuva time dealing with slow MySQL queries in Python. In one area of my application, "load data infile" goes quick. In an another area, the select queries are VERY slow.
Executing the same query in PhpMyAdmin AND Navicat (as a second test) yields a response ~5x faster than in Python.
A few notes...
- I switched to MySQLdb as the connector and am also using SSCursor. No performance increase.
- The database is optimized, indexed etc. I'm porting this application to Python from PHP/Codeigniter where it ran fine (I foolishly thought getting out of PHP would help speed it up)
- PHP/Codeigniter executes the select queries swiftly. For example, one key aspect of the application takes ~2 seconds in PHP/Codeigniter, but is taking 10 seconds in Python BEFORE any of the analysis of the data is done.
My link to the database is fairly standard...
dbconn=MySQLdb.connect(host="127.0.0.1",user="*",passwd="*",db="*", cursorclass = MySQLdb.cursors.SSCursor)
Any insights/help/advice would be greatly appreciated!
UPDATE
In terms of fetching/handling the results, I've tried it a few ways. The initial query is fairly standard...
# Run Query
cursor.execute(query)
I removed all of the code within this loop just to make sure it wasn't the case bottlekneck, and it's not. I put dummy code in its place. The entire process did not speed up at all.
db_results = "test"
# Loop Results
for row in cursor:
a = 0 (this was the dummy code I put in to test)
return db_results
The query result itself is only 501 rows (large amount of columns)... took 0.029 seconds outside of Python. Taking significantly longer than that within Python.
The project is related to horse racing. The query is done within this function. The query itself is long, however, it runs well outside of Python. I commented out the code within the loop on purpose for testing... also the print(query) in hopes of figuring this out.
# Get PPs
def get_pps(race_ids):
# Comma Race List
race_list = ','.join(map(str, race_ids))
# PPs Query
query = ("SELECT raceindex.race_id, entries.entry_id, entries.prognum, runlines.line_id, runlines.track_code, runlines.race_date, runlines.race_number, runlines.horse_name, runlines.line_date, runlines.line_track, runlines.line_race, runlines.surface, runlines.distance, runlines.starters, runlines.race_grade, runlines.post_position, runlines.c1pos, runlines.c1posn, runlines.c1len, runlines.c2pos, runlines.c2posn, runlines.c2len, runlines.c3pos, runlines.c3posn, runlines.c3len, runlines.c4pos, runlines.c4posn, runlines.c4len, runlines.c5pos, runlines.c5posn, runlines.c5len, runlines.finpos, runlines.finposn, runlines.finlen, runlines.dq, runlines.dh, runlines.dqplace, runlines.beyer, runlines.weight, runlines.comment, runlines.long_comment, runlines.odds, runlines.odds_position, runlines.entries, runlines.track_variant, runlines.speed_rating, runlines.sealed_track, runlines.frac1, runlines.frac2, runlines.frac3, runlines.frac4, runlines.frac5, runlines.frac6, runlines.final_time, charts.raceshape "
"FROM hrdb_raceindex raceindex "
"INNER JOIN hrdb_runlines runlines ON runlines.race_date = raceindex.race_date AND runlines.track_code = raceindex.track_code AND runlines.race_number = raceindex.race_number "
"INNER JOIN hrdb_entries entries ON entries.race_date=runlines.race_date AND entries.track_code=runlines.track_code AND entries.race_number=runlines.race_number AND entries.horse_name=runlines.horse_name "
"LEFT JOIN hrdb_charts charts ON runlines.line_date = charts.race_date AND runlines.line_track = charts.track_code AND runlines.line_race = charts.race_number "
"WHERE raceindex.race_id IN (" + race_list + ") "
"ORDER BY runlines.line_date DESC;")
print(query)
# Run Query
cursor.execute(query)
# Query Fields
fields = [i[0] for i in cursor.description]
# PPs List
pps = []
# Loop Results
for row in cursor:
a = 0
#this_pp = {}
#for i, value in enumerate(row):
# this_pp[fields[i]] = value
#pps.append(this_pp)
return pps
One final note... I haven't considered the ideal way to handle the result. I believe one cursor allows the result to come back as a set of dictionaries. I haven't even made it to that point yet as the query and return itself is so slow.