Python has slow DB queries, but Perl does not

Question

I use Python (Django) for my web shop.

When I tested it under high load (database access), I got interesting results:

Python, 10 processes = 200 sec / 100% CPU utilisation
Perl, 10 processes   = 65 sec / 35% CPU utilisation

CentOS 6, Python 2.6, MySQL 5.5, standard libraries, with the MySQL server on a separate machine. The table product_cars has 70,000,000 records.

Why is the Python program so slow?

Python program:

#!/usr/bin/python
import MySQLdb
import re
from MySQLdb import cursors
import shutil
import datetime
import random

db0 = MySQLdb.connect(user="X", passwd="X", db="parts")
cursor0 = db0.cursor()
cursor0.execute('SET NAMES utf8')

now = datetime.datetime.now()
for x in xrange(1, 100000):
    id = random.randint(10, 50000)
    cursor0.execute("SELECT * FROM product_cars WHERE car_id=%s LIMIT 500", [id])
    cursor0.fetchone()

Perl program:

#!/usr/bin/perl
use DBI;
my $INSTANCE=$ARGV[0];
my $user = "x";
my $pw = "x";
my $db = DBI->connect( "dbi:mysql:parts", "x", "x");
my $sql= "SELECT * FROM product_cars WHERE car_id=? LIMIT 500";
foreach $_ ( 1 .. 100000 )
{
 $random = int(rand(50000));
 $cursor = $db->prepare($sql);
 $cursor->execute($random) || die $cursor->errstr;
 @Data= $cursor->fetchrow_array();
}

$cursor->finish;
$db->disconnect;

update 1

An interesting thing:

Always select the row with id=1.

Clearly MySQL serves this from its cache, so the query should be very fast, yet the Python version is still slow and uses 100% CPU. The same code in Perl or Ruby runs quickly.

Replace the loop in the Python code with:

# Removing the "SET NAMES utf8" statement has no impact.
# python-mysql uses "%s", not "?", as the parameter marker.
id = 1
for x in xrange(1, 100000):
    id = 1
    cursor0.execute("SELECT * FROM product_cars WHERE car_id=%s LIMIT 500", [id])
    cursor0.fetchone()

Same code in perl:

foreach $_ ( 1 .. 20000 )
{
 $cursor = $db->prepare("SELECT * FROM product_cars WHERE car_id=? LIMIT 500");
 $cursor->execute(1);
#    while (my @Data= $cursor->fetchrow_array())
 if ($_ % 1000 == 0) { print "$_\n" };
 @Data= $cursor->fetchrow_array();
# print "$_\n";
}

Code in ruby:

pk=2
20000.times do |i|
    if i % 1000 == 0
        print i, "\n"
    end
    res = my.query("SELECT * FROM product_cars WHERE car_id='#{pk}' LIMIT 500")
    res.fetch_row
end

update 2

Executing the SQL "SELECT * FROM product WHERE id=1" (a plain string without parameters) 100000 times:
Python: ~15 sec, CPU 100%
Perl:   ~9 sec, CPU 70-90%
Ruby:   ~6 sec, CPU 60-80%

The MySQL server is on another machine.

update 3

I also tried oursql and pymysql; the results were worse.

I also notice you haven't executed 'SET NAMES utf8' in the Perl version.
– Derek Litz, Commented Dec 4, 2011 at 18:59
@Derek: I don't use MySQLdb but AFAIK it uses %s as the parameter marker, and supplying the actual args as a list instead of a tuple is unconventional rather than incorrect.
– John Machin, Commented Dec 4, 2011 at 19:21
Why are you preparing the same SQL each time you go around the loop? Move $cursor = $db->prepare($sql); to outside the loop.
– Quentin, Commented Dec 4, 2011 at 20:34
Your various language benchmarks are not the same. Perl and Python are executing with a parameter while Ruby is not. Comparing performance of hard-coded parameters isn't very interesting; you'll rarely be doing that. Perl is the only one actually storing the result; Python and Ruby are not, giving them an opportunity to optimize the fetch away. Perl is fetching as an array rather than an array ref (fetchrow_arrayref), which costs a bit more. Finally, leaving the prepare inside the loop keeps Perl from taking advantage of prepared statements.
– Schwern, Commented Dec 5, 2011 at 1:52

Schwern · Accepted Answer · 2011-12-04 22:57:03Z

10

As people have pointed out, the way you're preparing and executing statements differs between the two programs, and neither follows the recommended practice. Both should be taking advantage of prepared statements, and both should be preparing outside the loop.

However, it looks like the Python MySQL driver does not take advantage of server-side prepared statements at all. This probably accounts for the poor performance.

Server-side prepared statements were added in MySQL 4.1, but some drivers have been very slow to adapt. The MySQLdb user's guide makes no mention of prepared statements and claims "there are no cursors in MySQL, and no parameter substitution", which hasn't been true since MySQL 4.1. It also says "MySQLdb's Connection and Cursor objects are written in Python" rather than taking advantage of the MySQL API.
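
To make the client-side behaviour concrete, here is a minimal sketch of what a driver without server-side prepared statements has to do on every execute call: escape each parameter, splice it into the SQL text, and send the complete string for the server to parse again. The helper names (escape_literal, interpolate) are made up for illustration. This is not MySQLdb's actual code, and the escaping is deliberately simplistic.

```python
# Illustrative only: a toy version of client-side parameter substitution.
# escape_literal and interpolate are hypothetical helpers, not MySQLdb APIs.


def escape_literal(value):
    """Render a Python value as an SQL literal (simplified escaping)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    # Naive string escaping; a real driver delegates this to the client library.
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + text + "'"


def interpolate(query, params):
    """Build the final SQL text by substituting escaped params for %s."""
    return query % tuple(escape_literal(p) for p in params)


sql = "SELECT * FROM product_cars WHERE car_id=%s LIMIT 500"

# Every iteration produces a brand-new SQL string that the server has to
# parse from scratch; nothing is prepared once and reused.
statements = [interpolate(sql, [car_id]) for car_id in (1, 2, 3)]
for statement in statements:
    print(statement)
```

With a server-side prepared statement, by contrast, the SQL text would be sent and parsed once, and only the parameter values would travel on each execution.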

You may want to look at the oursql driver. It looks like it was written to take advantage of the "new" MySQL API and let the database optimize itself.

DBD::mysql (the Perl MySQL driver) can take advantage of prepared statements, but it does not by default according to the documentation. You have to turn it on by adding mysql_server_prepare=1 to your DSN. That should make the Perl example run even faster. Or the documentation is lying and they're on by default.

As an aside, one thing that will throw off benchmarks, though not account for anything like a 2-minute difference, is generating random numbers. They have significant cost.

Python code

#!/usr/bin/python
import random

for x in xrange(1, 100000):
    id = random.randint(0, 50000)

Perl code

#!/usr/bin/perl
foreach $_ ( 1 .. 100000 )
{
 $random = int(rand(50000));
}

Python time

real    0m0.194s
user    0m0.184s
sys     0m0.008s

Perl time

real    0m0.019s
user    0m0.015s
sys     0m0.003s

To keep this from becoming an issue in more sensitive benchmarks, increment a counter instead.
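
One way to follow that advice in Python is sketched below, written for Python 3. It assumes the query is replaced by a stand-in function (fake_query is hypothetical and does no database work) and derives each id from the loop counter, so no random-number call sits inside the timed region.

```python
import time

ID_MIN, ID_MAX = 10, 50000  # same id range as the question's Python script


def fake_query(car_id):
    """Hypothetical stand-in for cursor.execute(...) plus cursor.fetchone()."""
    return car_id


def counter_ids(count, low=ID_MIN, high=ID_MAX):
    """Yield ids low, low+1, ..., high, then wrap around to low again."""
    span = high - low + 1
    for i in range(count):
        yield low + (i % span)


def run_benchmark(count):
    """Time the loop; id generation is plain integer arithmetic, no RNG."""
    start = time.time()
    for car_id in counter_ids(count):
        fake_query(car_id)
    return time.time() - start


elapsed = run_benchmark(100000)
print("100000 iterations took %.3f s" % elapsed)
```

Sequential ids may interact with database caching differently from random ones, so if random access matters for the test, an alternative is to build the list of random ids before starting the timer.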

edited Dec 4, 2011 at 22:57

answered Dec 4, 2011 at 22:50

Schwern

2 Comments

Andrew G Over a year ago

Perl version on my machine: real 0m0.195s, user 0m0.195s, sys 0m0.000s. Python with for x in xrange(1, 100000): int(random.random() * 50000) gives real 0m0.060s, user 0m0.058s, sys 0m0.002s.

Schwern Over a year ago

@AndrewG It's probably a quirk of OS X (mine) vs Linux (yours) support for random numbers in the two languages. It's not so important what numbers you get, but that it can throw off a benchmark.

daxim · Accepted Answer · 2011-12-04 22:54:09Z

9

In theory, your Perl code should speed up significantly if you execute $cursor = $db->prepare($sql); before the loop and simply reexecute the same prepared query repeatedly. I suspect either DBI or MySQL has simply cached and ignored your repeated identical query preparations.

Your Python code, on the other hand, demands that different queries be recompiled each time because you aren't using a prepared query. I'd expect the speed difference to evaporate if you prepare both queries properly before their loop. There are security benefits to using prepared queries as well, by the way.
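
The security point can be shown without a MySQL server. The sketch below swaps in Python's built-in sqlite3 module purely so that it runs anywhere. sqlite3 uses ? as its parameter marker, whereas MySQLdb uses %s, and the table contents are invented. It contrasts splicing a value into the SQL text with binding it as a parameter.

```python
import sqlite3

# In-memory database with invented rows, standing in for product_cars.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product_cars (car_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO product_cars VALUES (?, ?)",
    [(1, "alpha"), (2, "beta"), (3, "gamma")],
)

malicious = "1 OR 1=1"  # attacker-controlled input

# Unsafe: the input becomes part of the SQL text, so the WHERE clause
# turns into "car_id=1 OR 1=1" and matches every row.
unsafe_sql = "SELECT * FROM product_cars WHERE car_id=%s" % malicious
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is bound as a single value and never parsed as SQL,
# so it is compared with car_id as data and matches no row here.
safe_rows = conn.execute(
    "SELECT * FROM product_cars WHERE car_id=?", (malicious,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
conn.close()
```

Bound parameters keep the statement text constant across calls, which is also what lets a driver or server reuse a prepared statement.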

edited Dec 4, 2011 at 22:54

daxim

answered Dec 4, 2011 at 20:28

Jeff Burdges

2 Comments

John Machin Over a year ago

If that's the case with the Python script then it's not his fault. The Python DB API docs say: "A reference to the operation will be retained by the cursor. If the same operation object is passed in again, then the cursor can optimize its behavior. This is most effective for algorithms where the same operation is used, but different parameters are bound to it (many times)."

Jeff Burdges Over a year ago

Interesting, but the point remains that his benchmarks are corrupted by side optimizations unless he prepares the query before the loop in each language.
