I'm using siege to test the speed of a new site, and I've found it can only handle around 30 concurrent connections per second using a small AWS RDS instance with a small database. (I tried a larger database and got more connections, but the number was still strangely low.)

I've done a lot of testing to find the weak link (e.g. tested nginx/php-fpm with a standard HTML page, with PHP included, and with memcached sessions) and this all works fine... it's the database that is the issue.

I have 2 queries below. The first is just a test and it works fine/fast: I get 3500 hits if I run 100 concurrent connections over 20 seconds:

  $database_users = new database('dbname');
  $sql = 'SELECT COUNT(userid) AS yes FROM login;';
  $pds = $database_users->pdo->prepare($sql);
  $pds->execute(array());
  $row = $pds->fetch();
  echo $row['yes'];

The query below, however, is slow and I only get around 70 hits. It's the query I use:

      $database_users = new database('dbname');
      $sql = 'SELECT a.countryCode FROM geoCountry AS a LEFT JOIN geoIPv4 AS b ON a.pid=b.geoCountry_pid WHERE \'2091528364\' BETWEEN startipNum AND endipNum;';
      $pds = $database_users->pdo->prepare($sql);
      $pds->execute(array());
      $row = $pds->fetch();
      echo $row['countryCode'];

The first query runs in 0.1 seconds and the second runs in 0.3 seconds when I use a remote query tool.

I'm trying to understand why I would get such bad performance with the second. Wouldn't PHP/the database just wait for the query to complete and then respond? It's only 0.2 of a second longer.

I can send other details if needed, like the php-fpm config.

Any advice would be greatly appreciated. Thank you.


CREATE TABLE `geoCountry` (
  `pid` tinyint(3) unsigned NOT NULL AUTO_INCREMENT COMMENT 'Primary Key',
  `countryCode` char(2) NOT NULL COMMENT 'Country Code',
  `zipEnabled` tinyint(1) NOT NULL DEFAULT '0' COMMENT '1=Has Zip Codes, 0=No Zip Codes',
  `english` varchar(75) NOT NULL COMMENT 'Language',
  `indonesian` varchar(75) NOT NULL COMMENT 'Language',
  `japanese` varchar(75) NOT NULL COMMENT 'Language',
  PRIMARY KEY (`pid`),
  UNIQUE KEY `countryCode` (`countryCode`),
  KEY `zipEnabled` (`zipEnabled`),
  CONSTRAINT `geoCountry_zipEnabled` FOREIGN KEY (`zipEnabled`) REFERENCES `xfk_generic_binary` (`binary`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=249 DEFAULT CHARSET=utf8 COMMENT='Country Codes linked to Country Names';


CREATE TABLE `geoIPv4` (
  `pid` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'Primary Key',
  `geoCountry_pid` tinyint(3) unsigned NOT NULL COMMENT 'geoCountry Pid',
  `startipNum` int(10) unsigned NOT NULL COMMENT 'Start IP Address',
  `endipNum` int(10) unsigned NOT NULL COMMENT 'End IP Address',
  PRIMARY KEY (`pid`),
  KEY `geoCountry_pid` (`geoCountry_pid`),
  CONSTRAINT `geoIPv4_geoCountry_pid` FOREIGN KEY (`geoCountry_pid`) REFERENCES `geoCountry` (`pid`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=148890 DEFAULT CHARSET=utf8 COMMENT='IPv4 Ranges linked to Country Codes';

* Is it possible that php-fpm is not waiting for the reply to come back, or is it something to do with how siege works? Note: siege seems to work fine if the number of concurrent connections is low.

  • What indexes do you have on your tables? What does an EXPLAIN show? Commented Jan 1, 2014 at 2:24
  • Make sure that geoIPv4.geoCountry_pid is indexed. Also, since you're not actually binding any variables, don't use PDO::prepare() - use PDO::query() instead - it'll save you a call to the database on every query. Commented Jan 1, 2014 at 2:33
  • Yes, geoIPv4.geoCountry_pid is indexed. Commented Jan 1, 2014 at 2:45
  • Please show exact table schemas as the result of SHOW CREATE TABLE <table_name> statements along with SHOW INDEXES FROM <table_name>. Commented Jan 1, 2014 at 2:45
  • OK, will do now... thanks. Commented Jan 1, 2014 at 2:46

2 Answers

The fundamental problem with the query, even with an index on (startipNum, endipNum), is that a B-Tree index is not the optimal structure for finding a value BETWEEN two columns. Every row with `startipNum` <= the value you're searching for is a candidate match, and the fact that `endipNum` is indexed doesn't really help, since the `endipNum` of every candidate row still has to be compared. That holds even though (at least with the MaxMind database, which is presumably what you're using) there's only ever going to be one matching row.

Since you know there's only ever going to be one matching row, you can optimize the query substantially by adding LIMIT 1 to the end. The server will stop looking as soon as it has found the matching row. I have also found that adding the opposite index (endipNum, startipNum) as well lets the optimizer choose whichever of the two seems more effective for any given query.

A (potentially) better approach that I have discussed previously (though it apparently blows some people's minds, since it's somewhat "outside the box") is to build an R-Tree index using the spatial extensions in MySQL.
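
A minimal sketch of both suggestions, assuming the geoIPv4 schema from the question and ranges that do not overlap. The index name idx_endipNum_startipNum is made up for illustration. Writing the IP number without quotes and using a plain JOIN are tidy-ups, not part of the fix (the WHERE clause on geoIPv4 columns already discards the unmatched rows a LEFT JOIN would add):

```sql
-- Reverse-order companion to an index on (startipNum, endipNum).
-- The index name is hypothetical.
CREATE INDEX idx_endipNum_startipNum ON geoIPv4 (endipNum, startipNum);

-- Same lookup as in the question, but the server may stop at the first match.
-- Assumes non-overlapping ranges, so at most one row can match.
SELECT a.countryCode
  FROM geoCountry AS a
  JOIN geoIPv4 AS b ON a.pid = b.geoCountry_pid
 WHERE 2091528364 BETWEEN b.startipNum AND b.endipNum
 LIMIT 1;
```

Whether the optimizer actually picks one of the two indexes depends on the data and server version, so it is worth confirming with EXPLAIN.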
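
To give an idea of the shape of the R-Tree approach, here is a rough, untested sketch rather than a drop-in solution. It assumes a MySQL 5.x-era server: before MySQL 5.7 spatial indexes require MyISAM, and newer versions spell GeomFromText as ST_GeomFromText. The table geoIPv4_spatial and the column ip_poly are hypothetical names. Each IP range becomes a thin rectangle along the X axis, and the lookup asks which rectangle contains the point (ip, 0):

```sql
-- Hypothetical side table holding one rectangle per IP range.
CREATE TABLE geoIPv4_spatial (
  geoCountry_pid TINYINT UNSIGNED NOT NULL,
  ip_poly        POLYGON NOT NULL,
  SPATIAL INDEX (ip_poly)
) ENGINE=MyISAM;

-- Build a closed rectangle from (startipNum, -1) to (endipNum, 1) for each range.
INSERT INTO geoIPv4_spatial (geoCountry_pid, ip_poly)
SELECT geoCountry_pid,
       GeomFromText(CONCAT('POLYGON((',
         startipNum, ' -1,', endipNum, ' -1,',
         endipNum, ' 1,', startipNum, ' 1,',
         startipNum, ' -1))'))
  FROM geoIPv4;

-- Find the rectangle whose bounding box contains the point (ip, 0).
-- Behaviour for an IP exactly equal to startipNum or endipNum, and for
-- single-address ranges, should be verified on your server version.
SELECT a.countryCode
  FROM geoIPv4_spatial AS s
  JOIN geoCountry AS a ON a.pid = s.geoCountry_pid
 WHERE MBRContains(s.ip_poly, Point(2091528364, 0));
```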

See also:

http://blog.jcole.us/2007/11/24/on-efficiently-geo-referencing-ips-with-maxmind-geoip-and-mysql-gis/

1 Comment

LIMIT 1 is a fantastic idea... I never knew that it would stop at one result, and I'll never forget it... thanks so much... researching your indexes now... thank you very much!!

Maybe it's me, but the use of LEFT JOIN in this case makes no sense to me.

IMHO your query should've looked like this:

SELECT a.countryCode 
  FROM geoCountry a JOIN geoIPv4 b 
    ON a.pid = b.geoCountry_pid 
 WHERE 2091528364 BETWEEN startipNum AND endipNum

Make sure that you have a composite index on (startipNum, endipNum):

CREATE INDEX idx_startipNum_endipNum ON geoIPv4 (startipNum, endipNum);
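
To check whether the new index is actually used, one option is to prefix the query with EXPLAIN. This is only a sketch of how to look, not a claim about what your server will report:

```sql
-- Ask the optimizer how it plans to run the lookup.
-- In the output row for geoIPv4, the `key` column names the index chosen
-- (ideally idx_startipNum_endipNum). A NULL key with type ALL and a large
-- `rows` estimate points to a full table scan.
EXPLAIN
SELECT a.countryCode
  FROM geoCountry a JOIN geoIPv4 b
    ON a.pid = b.geoCountry_pid
 WHERE 2091528364 BETWEEN startipNum AND endipNum;
```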

3 Comments

Thanks for your comments. I dropped the LEFT and added the 2 indexes, and the speed was the same...
Rather than 2 indexes, you might need a single compound index across both columns. ADD INDEX idx_startip_endip (startipNum, endipNum)
It improves the query speed a touch. I'm going to tick your answer as I appreciate the help. I'm thinking this could be more to do with nginx and PHP at the minute... thanks so much :)
