Very slow MySQL query performance


























I have a query that takes about 18 seconds to finish:



THE QUERY:



SELECT YEAR(c.date), MONTH(c.date), p.district_id, COUNT(p.owner_id)
FROM commission c
INNER JOIN partner p ON c.customer_id = p.id
WHERE (c.date BETWEEN '2018-01-01' AND '2018-12-31')
AND (c.company_id = 90)
AND (c.source = 'ACTUAL')
AND (p.id IN (3062, 3063, 3064, 3065, 3066, 3067, 3068, 3069, 3070, 3071,
3072, 3073, 3074, 3075, 3076, 3077, 3078, 3079, 3081, 3082, 3083, 3084,
3085, 3086, 3087, 3088, 3089, 3090, 3091, 3092, 3093, 3094, 3095, 3096,
3097, 3098, 3099, 3448, 3449, 3450, 3451, 3452, 3453, 3454, 3455, 3456,
3457, 3458, 3459, 3460, 3461, 3471, 3490, 3491, 6307, 6368, 6421))
GROUP BY YEAR(c.date), MONTH(c.date), p.district_id


The commission table has around 2.8 million records, of which 860,000+ belong to the current year, 2018. The partner table currently has 8,600+ records.



RESULT



| `YEAR(c.date)` | `MONTH(c.date)` | district_id | `COUNT(p.owner_id)` |
|----------------|-----------------|-------------|---------------|
| 2018 | 1 | 1 | 19154 |
| 2018 | 1 | 5 | 9184 |
| 2018 | 1 | 6 | 2706 |
| 2018 | 1 | 12 | 36296 |
| 2018 | 1 | 15 | 13085 |
| 2018 | 2 | 1 | 21231 |
| 2018 | 2 | 5 | 10242 |
| ... | ... | ... | ... |

55 rows retrieved starting from 1 in 18 s 374 ms
(execution: 18 s 368 ms, fetching: 6 ms)


EXPLAIN:



| id | select_type | table | partitions | type  | possible_keys                                                                                        | key                  | key_len | ref             | rows | filtered | extra                                        | 
|----|-------------|-------|------------|-------|------------------------------------------------------------------------------------------------------|----------------------|---------|-----------------|------|----------|----------------------------------------------|
| 1 | SIMPLE | p | null | range | PRIMARY | PRIMARY | 4 | | 57 | 100 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | c | null | ref | UNIQ_6F7146F0979B1AD62FC0CB0F5F8A7F73,IDX_6F7146F09395C3F3,IDX_6F7146F0979B1AD6,IDX_6F7146F0AA9E377A | IDX_6F7146F09395C3F3 | 5 | p.id | 6716 | 8.33 | Using where |


DDL:



create table if not exists commission (
id int auto_increment
primary key,
date date not null,
source enum('ACTUAL', 'EXPECTED') not null,
customer_id int null,
transaction_id varchar(255) not null,
company_id int null,
constraint UNIQ_6F7146F0979B1AD62FC0CB0F5F8A7F73 unique (company_id, transaction_id, source),
constraint FK_6F7146F09395C3F3 foreign key (customer_id) references partner (id),
constraint FK_6F7146F0979B1AD6 foreign key (company_id) references companies (id)
) collate=utf8_unicode_ci;
create index IDX_6F7146F09395C3F3 on commission (customer_id);
create index IDX_6F7146F0979B1AD6 on commission (company_id);
create index IDX_6F7146F0AA9E377A on commission (date);


I noticed that by removing the partner IN condition, MySQL takes only 3 s. I tried to replace it with something crazy like this:



AND (',3062,3063,3064,3065,3066,3067,3068,3069,3070,3071,3072,3073,3074,3075,3076,3077,3078,3079,3081,3082,3083,3084,3085,3086,3087,3088,3089,3090,3091,3092,3093,3094,3095,3096,3097,3098,3099,3448,3449,3450,3451,3452,3453,3454,3455,3456,3457,3458,3459,3460,3461,3471,3490,3491,6307,6368,6421,'
LIKE CONCAT('%,', p.id, ',%'))


and the result was about 5 s... great! But it's a hack.



Why does this query take such a long execution time when I use the IN clause? Workarounds, tips, links, etc. are welcome. Thanks!










mysql query-performance mysql-5.7

asked Nov 20 at 23:11 by yceruto; edited Nov 21 at 19:16 by Rick James




















  • Where does the p.id IN (....) list come from?
    – danblack
    Nov 20 at 23:28










  • As far as I understand, MySQL can only use one of those indexes on commission in its WHERE; that part would be better served by a composite index on (date, company_id, source).
    – Uueerdo
    Nov 20 at 23:30










  • Note that if your date range is within 2018, the GROUP BY YEAR(c.date) is constant. It's not the largest contributor to query time, by far.
    – danblack
    Nov 20 at 23:57










  • The LIKE hack won't work well because it cannot use PRIMARY KEY(id) on p. You do have that??
    – Rick James
    Nov 21 at 19:21






  • @danblack - The time taken for a function call is usually so insignificant (compared to fetching rows) as to be ignorable.
    – Rick James
    Nov 21 at 23:50
















3 Answers

MySQL can use only one index at a time per table in a query. For this query you need a compound index covering the aspects of the search. Constant aspects of the WHERE clause should come before range aspects, like:



ALTER TABLE commission
DROP INDEX IDX_6F7146F0979B1AD6,
ADD INDEX IDX_6F7146F0979B1AD6 (company_id, source, date)
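
As a quick check (my addition, not part of the original answer), re-running EXPLAIN after the ALTER should show the rebuilt index in the key column of the commission row, with a much smaller rows estimate:

EXPLAIN
SELECT YEAR(c.date), MONTH(c.date), p.district_id, COUNT(p.owner_id)
FROM commission c
INNER JOIN partner p ON c.customer_id = p.id
WHERE c.date BETWEEN '2018-01-01' AND '2018-12-31'
AND c.company_id = 90
AND c.source = 'ACTUAL'
AND p.id IN (3062, 3063, 3064) -- abbreviated list; keep the full one in practice
GROUP BY YEAR(c.date), MONTH(c.date), p.district_id;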





answered Nov 20 at 23:30 by danblack





















  • query optimization
    – danblack
    Nov 20 at 23:39










  • How MySQL Uses indexes
    – danblack
    Nov 20 at 23:59










  • Adding a composite index on (customer_id, company_id, date, source) solves the performance issue (it now takes less than 1 s). Thanks for your answer and links.
    – yceruto
    Nov 21 at 16:29






  • And an index cookbook
    – Rick James
    Nov 21 at 19:18






  • @yceruto - Your 4-column index works well if the Optimizer starts with p, but you need to move date to the end. Dan's 3-column index works well if it starts with c. Add both.
    – Rick James
    Nov 21 at 19:19

































Here's what the Optimizer sees in your query.



Checking whether to use an index for the GROUP BY:




  • Functions (YEAR()) in the GROUP BY, so no.

  • Multiple tables (c and p) mentioned, so no.


For a JOIN, the Optimizer will (almost always) start with one table, then reach into the other. So, let's look at the two options:



If starting with p:



Assuming you have PRIMARY KEY(id), there is not much to think about. It will simply use that index.



For each row selected from p, it will then look into c, and any variation of this INDEX would be optimal.



c: INDEX(company_id, source, customer_id,  -- in any order (all are tested "=")
         date)                             -- last, since it is tested as a range


If starting with c:



c: INDEX(company_id, source,  -- in any order (all are tested "=")
         date)                -- last, since it is tested as a range
-- slightly better:
c: INDEX(company_id, source,  -- in any order (all are tested "=")
         date,                -- last, since it is tested as a range
         customer_id)         -- really last -- added only to make it "covering".


The Optimizer will look at "statistics" to crudely decide which table to start with. So, add all the indexes I suggested.
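
Concretely (a sketch of that advice with made-up index names, not DDL taken from this answer), adding both orderings could look like:

ALTER TABLE commission
ADD INDEX idx_comm_eq_in_range (company_id, source, customer_id, date), -- good when starting with p
ADD INDEX idx_comm_eq_range_cover (company_id, source, date, customer_id); -- good when starting with c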



A "covering" index is one that contains all the columns needed anywhere in the query. It is sometimes wise to extend a 'good' index with more columns to make it "covering".



But there is a monkey wrench in here. c.customer_id = p.id means that customer_id IN (...) effectively exists. But now there are two "range-like" constraints -- one is an IN, the other is a 'range'. In some newer versions, the Optimizer will happily jump around due to the IN and still be able to do "range" scans. So, I recommend this ordering:




  1. Test(s) of column = constant

  2. Test(s) with IN


  3. One 'range' test (BETWEEN, >=, LIKE with trailing wildcard, etc)

  4. Perhaps add more columns to make it "covering" -- but don't do this step if you end up with more than, say, 5 columns in the index.


Hence, for c, the following is optimal for the WHERE, and happens to be "covering".



INDEX(company_id, source,  -- first, but in any order (all "=")
      customer_id,         -- "IN"
      date)                -- last, since it is tested as a range

p: (same as above)


Since there was an IN or "range", there is no use seeing if the index can also handle the GROUP BY.



A note on COUNT(x) -- it checks that x is NOT NULL. It is usually just as correct to say COUNT(*), which counts the number of rows without any extra checking.
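
Applied to this query (my illustration; the counts match as long as p.owner_id is never NULL in the joined rows), the SELECT list would become:

SELECT YEAR(c.date), MONTH(c.date), p.district_id, COUNT(*) -- instead of COUNT(p.owner_id)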



This is a non-starter since it hides the indexed column (id) in a function:



AND (',3062,3063,3064,3065,3066,...6368,6421,'
LIKE CONCAT('%,', p.id, ',%'))





answered Nov 21 at 20:04 by Rick James





















  • Yes, I tried with (company_id, date, customer_id, source); it is slightly better. Thanks!
    – yceruto
    Nov 21 at 20:53










  • The order of the columns in the index is vital to performance. Your attempt has date too early.
    – Rick James
    Nov 21 at 21:40










  • Sorry (typo) I meant (company_id, source, customer_id, date) obviously :)
    – yceruto
    Nov 21 at 21:43










  • @yceruto - Interesting. How many rows in the resultset?
    – Rick James
    Nov 21 at 21:59










  • After the composite index > EXPLAIN > rows: 4548, filtered: 50 (for c table)
    – yceruto
    Nov 21 at 23:45



































With your LIKE hack you are tricking the optimizer into using a different plan (most probably using the IDX_6F7146F0AA9E377A index in the first place).
You should be able to see this in EXPLAIN.



I think the real issue in your case is the second line of the EXPLAIN: the server is executing multiple functions (MONTH, YEAR) for 6,716 rows and then trying to group all these rows. During this time all these 6,716 rows have to be stored (in memory or on disk, depending on your server configuration).



SELECT COUNT(*) FROM commission WHERE (date BETWEEN '2018-01-01' AND '2018-12-31') AND company_id = 90 AND source = 'ACTUAL';


=> How many rows are we talking about?



If the number from the above query is much lower than 6,716, I'd try to add a covering index on the columns customer_id, company_id, source and date. I'm not sure about the best order, as it depends on the data you have (check the cardinality of these columns); I'd start with the index (date, company_id, source, customer_id). Also, I'd add a unique index (id, district_id, owner_id) on partner.
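
In DDL form (my sketch of that suggestion; the index names are made up):

CREATE INDEX idx_comm_date_co_src_cust ON commission (date, company_id, source, customer_id);
CREATE UNIQUE INDEX idx_partner_cover ON partner (id, district_id, owner_id);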



It is also possible to add extra generated stored columns _year and _month (if your server is a bit old, you can add normal columns and fill them in with a trigger) to get rid of the repeated function executions.
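
On MySQL 5.7 the generated-column variant might look like this (a sketch with hypothetical column names):

ALTER TABLE commission
ADD COLUMN _year SMALLINT GENERATED ALWAYS AS (YEAR(date)) STORED,
ADD COLUMN _month TINYINT GENERATED ALWAYS AS (MONTH(date)) STORED;

-- the query can then GROUP BY c._year, c._month, p.district_id without per-row function calls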






answered Nov 21 at 1:20 by fifonik; edited Nov 21 at 2:51






















