4

I would like to know why this query takes is slow (about 10 to 20 seconds), the three tables used have 500,000 records, this is the query:

SELECT  *, 'rg_egresos' AS nombre_tabla
    FROM  rg_detallexml DE
    INNER JOIN  rg_egresos EG
    INNER JOIN  rg_emisor EM  ON DE.idContador = EG.id
      AND  DE.idDetalleXml = EG.idDetalleXml
      AND  DE.idContador = EM.idContador
      AND  DE.idDetalleXml = EM.idDetalleXml
    WHERE  DE.idContador = '14894'
      AND  DATE_FORMAT(dateFechaHora, '%Y-%m-%d') BETWEEN '2017-10-01'
                                                      AND '2017-10-31'
      AND  strTipodeComprobante = 'egreso'
      AND  version_xml = '3.2'
      AND  estado_factura = 0
      AND  modificado = 0;

And this is what it shows when I use EXPLAIN

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: EG
         type: index_merge
possible_keys: idx_idDetallexml,idx_estado_factura,idx_modificado,idx_idContador
          key: idx_idContador,idx_estado_factura,idx_modificado
      key_len: 4,4,4
          ref: NULL
         rows: 2111
        Extra: Using intersect(idx_idContador,idx_estado_factura,idx_modificado); Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: DE
         type: eq_ref
possible_keys: PRIMARY,idx_strTipodeComprobante,idx_idContador,idx_version_xml
          key: PRIMARY
      key_len: 4
          ref: db_pwf.EG.idDetalleXml
         rows: 1
        Extra: Using where
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: EM
         type: ref
possible_keys: idx_idContador,idx_idDetallexml
          key: idx_idDetallexml
      key_len: 4
          ref: db_pwf.DE.idDetalleXml
         rows: 1
        Extra: Using where

Can you see a way to improve the query?, I have other queries working with bigger tables and they are faster, all the required fields have its index, thanks.

Table rg_detallexml:

+---------------------------------+--------------+------+-----+---------+----------------+
| Field                           | Type         | Null | Key | Default | Extra          |
+---------------------------------+--------------+------+-----+---------+----------------+
| idDetalleXml                    | int(10)      | NO   | PRI | NULL    | auto_increment |
| UUID                            | varchar(50)  | NO   | MUL | NULL    |                |
| dateFechaSubida                 | varchar(7)   | YES  |     | NULL    |                |
| idContador                      | int(10)      | NO   | MUL | NULL    |                |
| dateFechaHora                   | datetime     | YES  | MUL | NULL    |                |
| dateFechaHoraCertificacion      | datetime     | YES  |     | NULL    |                |
| dateFechaPago                   | datetime     | YES  |     | NULL    |                |
| intFolio                        | int(10)      | YES  |     | NULL    |                |
| strSerie                        | varchar(2)   | YES  |     | A       |                |
| doubleDescuento                 | double       | YES  |     | NULL    |                |
| doubleTotal                     | double       | YES  |     | NULL    |                |
| doubleSubtotal                  | double       | YES  |     | NULL    |                |
| duobleTotalImpuestosTrasladados | double       | YES  |     | NULL    |                |
| doubleTotalImpuestosRetenidos   | double       | YES  |     | NULL    |                |
| doubleTotalRetencionesLocales   | double       | YES  |     | NULL    |                |
| doubleTotalTrasladosLocales     | double       | YES  |     | NULL    |                |
| strTipodeComprobante            | varchar(15)  | YES  | MUL | NULL    |                |
| strMetodoDePago                 | varchar(150) | YES  |     | NULL    |                |
| strFormaDePago                  | varchar(150) | YES  |     | NULL    |                |
| strMoneda                       | varchar(10)  | YES  |     | NULL    |                |
| tipoCambio                      | double       | NO   |     | NULL    |                |
| strLugarExpedicion              | varchar(150) | YES  |     | NULL    |                |
| DIOT                            | int(1)       | YES  |     | 0       |                |
| version_xml                     | varchar(10)  | NO   | MUL | NULL    |                |
+---------------------------------+--------------+------+-----+---------+----------------+

Table rg_egresos:

+---------------------------+--------------+------+-----+---------+----------------+
| Field                     | Type         | Null | Key | Default | Extra          |
+---------------------------+--------------+------+-----+---------+----------------+
| id_egreso                 | int(11)      | NO   | PRI | NULL    | auto_increment |
| id                        | int(11)      | NO   | MUL | NULL    |                |
| idDetalleXml              | int(10)      | NO   | MUL | NULL    |                |
| idCatalogo                | int(19)      | NO   | MUL | NULL    |                |
| tipoCuenta                | int(11)      | NO   | MUL | NULL    |                |
| intRubro                  | int(1)       | NO   |     | NULL    |                |
| RFC                       | varchar(20)  | NO   | MUL | NULL    |                |
| compra_gastos_0_porciento | float        | NO   | MUL | NULL    |                |
| deducible                 | int(1)       | NO   |     | NULL    |                |
| compra_gastos_exentos     | float        | NO   |     | NULL    |                |
| no_deducibles             | float        | NO   |     | NULL    |                |
| estado_factura            | int(11)      | NO   | MUL | NULL    |                |
| fecha                     | date         | NO   | MUL | NULL    |                |
| total_xml                 | double       | NO   |     | NULL    |                |
| subtotal_xml              | double       | NO   |     | NULL    |                |
| iva_xml                   | double       | NO   |     | NULL    |                |
| total_impuestos           | double       | NO   |     | NULL    |                |
| abonado                   | double       | NO   |     | NULL    |                |
| subtotal                  | double       | NO   |     | NULL    |                |
| iva                       | double       | NO   |     | NULL    |                |
| pendiente                 | double       | NO   |     | NULL    |                |
| subtotal_sin_iva          | double       | NO   |     | NULL    |                |
| acreditable               | int(1)       | NO   | MUL | 0       |                |
| fecha_operacion           | datetime     | NO   | MUL | NULL    |                |
| modificado                | int(1)       | NO   | MUL | NULL    |                |
| UUID                      | varchar(50)  | NO   | MUL | NULL    |                |
| IEPS                      | double       | NO   |     | NULL    |                |
| retencion_iva             | double       | NO   |     | NULL    |                |
| retencion_isr             | double       | NO   |     | NULL    |                |
| imp_local                 | double       | NO   |     | 0       |                |
| enviado_a                 | int(11)      | NO   | MUL | NULL    |                |
| enviado_al_iva            | int(1)       | NO   |     | NULL    |                |
| EsNomina                  | int(1)       | NO   | MUL | 0       |                |
| dateFechaPago             | date         | NO   | MUL | NULL    |                |
| nota_credito              | int(1)       | NO   | MUL | NULL    |                |
| extranjero                | int(1)       | NO   | MUL | NULL    |                |
| pago_banco                | int(1)       | NO   | MUL | NULL    |                |
| idBanco_Pago              | int(20)      | NO   | MUL | NULL    |                |
| movimientoPago            | int(10)      | NO   |     | NULL    |                |
| saldo_banco               | varchar(50)  | NO   |     | NULL    |                |
| tipo_pago                 | int(1)       | NO   |     | 0       |                |
| responsable               | varchar(100) | NO   |     | NULL    |                |
+---------------------------+--------------+------+-----+---------+----------------+

Table rg_emisor:

+-----------------+--------------+------+-----+---------+----------------+
| Field           | Type         | Null | Key | Default | Extra          |
+-----------------+--------------+------+-----+---------+----------------+
| idEmisor        | int(10)      | NO   | PRI | NULL    | auto_increment |
| idDetalleXml    | int(10)      | NO   | MUL | NULL    |                |
| idContador      | int(10)      | NO   | MUL | NULL    |                |
| strRFC          | varchar(13)  | NO   |     | NULL    |                |
| strNombreEmisor | varchar(200) | YES  |     | NULL    |                |
| strRegimen      | varchar(250) | YES  |     | NULL    |                |
| strPais         | varchar(40)  | YES  |     | MX      |                |
| strEstado       | varchar(50)  | YES  |     | NULL    |                |
| intCP           | int(5)       | YES  |     | NULL    |                |
| strMunicipio    | varchar(250) | YES  |     | NULL    |                |
| strLocalidad    | varchar(250) | YES  |     | NULL    |                |
| strColonia      | varchar(250) | YES  |     | NULL    |                |
| intNumExt       | int(10)      | YES  |     | NULL    |                |
| intNumInt       | int(10)      | YES  |     | NULL    |                |
| strCalle        | varchar(250) | YES  |     | NULL    |                |
| regimenFiscal   | varchar(20)  | YES  |     | NULL    |                |
+-----------------+--------------+------+-----+---------+----------------+
13
  • first off, if the variable should be a certain type then cast it.... example the date variable.. cast('2018-01-01' as date) Commented Mar 14, 2018 at 21:01
  • @NatiGetch I'm working with PHP, batch? why? Commented Mar 14, 2018 at 21:02
  • Should I use CAST with both dates? Commented Mar 14, 2018 at 21:04
  • Is this query behaving how you intend? There is no 'on' clause here and is likely causing a massive crossjoin : FROM rg_detallexml DE INNER JOIN rg_egresos EG. er...or read scaisEdge's answer Commented Mar 14, 2018 at 21:24
  • Yeah, now i'm using the query by scalsEdge but still there is no difference. Commented Mar 14, 2018 at 21:26

3 Answers 3

3

Now that you've shown the tables, we see that rg_egresos.id is not the table's ID. There can hence be multiple records for one contador in the table. Let's look at the tables and the query more closely:

All tables contain a contador ID and a DetalleXml ID. You want to join them all on these two fields. So you start with the rg_detallexml and get all records for the contador. With the idDetalleXml thus found, you search for rg_egresos and rg_emisors.

This is a bit strange. First of all an rg_detallexml is obviously linked to one contador, but in the other tables the rg_detallexml can be linked to another contador. Well, that may be possible (some kind of from/to relation maybe). But with five rg_egresos records and four rg_emisors records for an rg_detallexml/contador, you'd select thirty records, because you are combining rg_egresos records with rg_emisors records that are not really related.

Anyway: you want to find rg_detallexml quickly.

create index idx_de on rg_detallexml(idcontador, strtipodecomprobante, version_xml,
                                     datefechahora, iddetallexml);

Then you look for rg_egresos:

create index idx_eg on rg_egresos(id, iddetallexml, estado_factura, modificad);

At last you look for rg_emisor:

create index idx_em on rg_emisor(idcontador, iddetallexml);

As the columns are present in all tables, we could of course go through them in any order. Starting with rg_detallexml seems most natural and most restrictive, too, but that is not necessarily best. So you may want to offer the DBMS yet another index:

create index idx_eg2 on rg_egresos(id, estado_factura, modificad, iddetallexml);

which would allow the DBMS to look up the contador's records in this table first and with the added criteria find related iddetallexml here.

Sign up to request clarification or add additional context in comments.

8 Comments

I will try this, one question is necessary delete the previous index of these tables? or it doesnt matter? thank you.
Indexes are merely offers to the DBMS. You can have as many as you want. When executing a query the DBMS will make its pick. More indexes don't do any harm (except for inserts and deletes slowing slightly down due to the additional work to update the index and for more memory being consumed; for these reasons it's a good idea to get rid of indexes that are never used).
I added these new indexes and the EXPLAIN output changed but right now I cant tell you if it was improved because no one is using the system right now I will check it in few hours and I will tell you if this works better, thank you.
I have another question i added my old indexes individually, I didnt know that you can add multiple columns with the same name of index, how do you decide which columns could be together in the same index? what is the criteria? thank you.
Indexes help queries find the records. Very often tables get accessed via their ID, so there is an index anyway. But then there is additional criteria, as in your example where you look for a contador's records. To help the query you think about the order in which it seems most natural to browse through the tables. I tried that first in my other answer and did this again here. Then you look at the WHERE clause and ON clauses to see what's needed to find records.
|
1

The biggest problem I see is on this part:

DATE_FORMAT(dateFechaHora, '%Y-%m-%d') BETWEEN '2017-10-01' AND '2017-10-31'

is dateFechaHora a datetime field? Why are you converting a datetime field to a string (DATE_FORMAT)? even if you have an index on the dateFechaHora field, it won't be used.

I would suggest you to use this code instead:

and DateFechaHora >= '2017-10-01' and DateFechaHora < '2017-11-01'
                                                       ^^^^^^^^^^

yes it's the following day and it won't be included.

So your query might look like this:

select
  *,
  'rg_egresos' AS nombre_tabla 
from
  rg_detallexml DE inner join rg_egresos EG
  on DE.idContador = EG.id and DE.idDetalleXml = EG.idDetalleXml 
  inner join rg_emisor EM on DE.idContador = EM.idContador 
  and DE.idDetalleXml = EM.idDetalleXml 
where
  DE.idContador = '14894' 
  and dateFechaHora >= '2017-10-01' and dateFechaHora < '2017-11-01' 
  and strTipodeComprobante = 'egreso'
  and version_xml = '3.2' 
  and estado_factura = 0 
  and modificado = 0
;

5 Comments

I try using CAST as maSTAShuFu said but still was the same, I will try with your answer and I will tell you what happened.
I tried using this query but still is the same, and yes dateFechaHora is a datetime field
@ashe no improvement at all? which indexes are defined?
no, I could show you the output of the explain of this query, all the fields in the WHERE clause have index and the fields idDetalleXml, 'idContador` of the tables
I added the structure of the tables.
1

I see two partial Answers in the other replies. Let's tie them together.

Change

  AND  DATE_FORMAT(dateFechaHora, '%Y-%m-%d') BETWEEN '2017-10-01'
                                                  AND '2017-10-31'

to

  AND DE.dateFechaHora >= '2017-10-01'
  AND DE.dateFechaHora  < '2017-10-01' + INTERVAL 1 MONTH

and

If DE is a good starting table:

DE:  INDEX(idContador, strTipodeComprobante, version_xml, dateFechaHora)
       -- date last; others in any order

If EG is a better starting table:

EG:  INDEX(estado_factura, modificado, id)   -- in any order
DE:  INDEX(idContador, idDetalleXml,
           strTipodeComprobante, version_xml, dateFechaHora)

Also have

EM:  INDEX(idContador, idDetalleXml)   -- in either order

"Using intersect" almost always is a clue that you should have a composite index instead of separate indexes. (The separate indexes may be useful for other queries.)

(That is, add all those indexes, then let the Optimizer decide.)

Please use SHOW CREATE TABLE, not the less-descriptive DESCRIBE.

Do you really need SELECT *?

The query, after my suggestions:

SELECT  DE.*,
        EG.*,
        EM.*,
        'rg_egresos' AS nombre_tabla
    FROM  rg_detallexml DE
    INNER JOIN  rg_egresos EG
       ON  DE.idContador = EG.id
      AND  DE.idDetalleXml = EG.idDetalleXml
    INNER JOIN  rg_emisor EM
       ON  DE.idContador = EM.idContador
      AND  DE.idDetalleXml = EM.idDetalleXml
    WHERE  DE.idContador = '14894'
      AND  DE.dateFechaHora >= '2017-10-01'
      AND  DE.dateFechaHora  < '2017-10-01' + INTERVAL 1 MONTH
      AND  DE.strTipodeComprobante = 'egreso'
      AND  DE.version_xml = '3.2'
      AND  EG.estado_factura = 0
      AND  EG.modificado = 0;

4 Comments

I will check this as well, so when Using intersect appears I need to consider to use a composite index?
Yes. Perhaps always it would be better. More
I tried this and is really faster, is better to use the range of dates like that? why is better than between?
@Ashe - The reformulation of the date range has no impact on performance. It improves clarity (obvious that you want 1 month), simplicity (don't need to compute last day of month), correctness (avoiding midnight of both ends), etc. The speed comes from better ("composite") indexes.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.