Very slow MySQL query performance
I've a query that takes about 18 seconds to finish:
THE QUERY:
SELECT YEAR(c.date), MONTH(c.date), p.district_id, COUNT(p.owner_id)
FROM commission c
  INNER JOIN partner p ON c.customer_id = p.id
WHERE (c.date BETWEEN '2018-01-01' AND '2018-12-31')
  AND (c.company_id = 90)
  AND (c.source = 'ACTUAL')
  AND (p.id IN (3062, 3063, 3064, 3065, 3066, 3067, 3068, 3069, 3070, 3071,
    3072, 3073, 3074, 3075, 3076, 3077, 3078, 3079, 3081, 3082, 3083, 3084,
    3085, 3086, 3087, 3088, 3089, 3090, 3091, 3092, 3093, 3094, 3095, 3096,
    3097, 3098, 3099, 3448, 3449, 3450, 3451, 3452, 3453, 3454, 3455, 3456,
    3457, 3458, 3459, 3460, 3461, 3471, 3490, 3491, 6307, 6368, 6421))
  GROUP BY YEAR(c.date), MONTH(c.date), p.district_id
The commission table has around 2,8 millions of records, of which 860 000+ belong to the current year 2018. The partner table has at this moment 8600+ records.
RESULT
| `YEAR(c.date)` | `MONTH(c.date)` | district_id | `COUNT(c.id)` | 
|----------------|-----------------|-------------|---------------| 
| 2018           | 1               | 1           | 19154         | 
| 2018           | 1               | 5           | 9184          | 
| 2018           | 1               | 6           | 2706          | 
| 2018           | 1               | 12          | 36296         | 
| 2018           | 1               | 15          | 13085         | 
| 2018           | 2               | 1           | 21231         | 
| 2018           | 2               | 5           | 10242         | 
| ...            | ...             | ...         | ...           | 
55 rows retrieved starting from 1 in 18 s 374 ms 
(execution: 18 s 368 ms, fetching: 6 ms)
EXPLAIN:
| id | select_type | table | partitions | type  | possible_keys                                                                                        | key                  | key_len | ref             | rows | filtered | extra                                        | 
|----|-------------|-------|------------|-------|------------------------------------------------------------------------------------------------------|----------------------|---------|-----------------|------|----------|----------------------------------------------| 
| 1  | SIMPLE      | p     | null       | range | PRIMARY                                                                                              | PRIMARY              | 4       |                 | 57   | 100      | Using where; Using temporary; Using filesort | 
| 1  | SIMPLE      | c     | null       | ref   | UNIQ_6F7146F0979B1AD62FC0CB0F5F8A7F73,IDX_6F7146F09395C3F3,IDX_6F7146F0979B1AD6,IDX_6F7146F0AA9E377A | IDX_6F7146F09395C3F3 | 5       | p.id            | 6716 | 8.33     | Using where                                  | 
DDL:
create table if not exists commission (
    id int auto_increment
        primary key,
    date date not null,
    source enum('ACTUAL', 'EXPECTED') not null,
    customer_id int null,
    transaction_id varchar(255) not null,
    company_id int null,
    constraint UNIQ_6F7146F0979B1AD62FC0CB0F5F8A7F73 unique (company_id, transaction_id, source),
    constraint FK_6F7146F09395C3F3 foreign key (customer_id) references partner (id),
    constraint FK_6F7146F0979B1AD6 foreign key (company_id) references companies (id)
) collate=utf8_unicode_ci;
create index IDX_6F7146F09395C3F3 on commission (customer_id);
create index IDX_6F7146F0979B1AD6 on commission (company_id);
create index IDX_6F7146F0AA9E377A on commission (date);
I noted that by removing the partner IN condition MySQL takes only 3s. I tried to replace it doing something crazy like this:
AND (',3062,3063,3064,3065,3066,3067,3068,3069,3070,3071,3072,3073,3074,3075,3076,3077,3078,3079,3081,3082,3083,3084,3085,3086,3087,3088,3089,3090,3091,3092,3093,3094,3095,3096,3097,3098,3099,3448,3449,3450,3451,3452,3453,3454,3455,3456,3457,3458,3459,3460,3461,3471,3490,3491,6307,6368,6421,'
     LIKE CONCAT('%,', p.id, ',%')) 
and the result was about 5s... great! but it's a hack.
WHY this query is taking a very long execution time when I uses IN statement? workaround, tips, links, etc. Thanks!
mysql query-performance mysql-5.7
|
show 3 more comments
I've a query that takes about 18 seconds to finish:
THE QUERY:
SELECT YEAR(c.date), MONTH(c.date), p.district_id, COUNT(p.owner_id)
FROM commission c
  INNER JOIN partner p ON c.customer_id = p.id
WHERE (c.date BETWEEN '2018-01-01' AND '2018-12-31')
  AND (c.company_id = 90)
  AND (c.source = 'ACTUAL')
  AND (p.id IN (3062, 3063, 3064, 3065, 3066, 3067, 3068, 3069, 3070, 3071,
    3072, 3073, 3074, 3075, 3076, 3077, 3078, 3079, 3081, 3082, 3083, 3084,
    3085, 3086, 3087, 3088, 3089, 3090, 3091, 3092, 3093, 3094, 3095, 3096,
    3097, 3098, 3099, 3448, 3449, 3450, 3451, 3452, 3453, 3454, 3455, 3456,
    3457, 3458, 3459, 3460, 3461, 3471, 3490, 3491, 6307, 6368, 6421))
  GROUP BY YEAR(c.date), MONTH(c.date), p.district_id
The commission table has around 2,8 millions of records, of which 860 000+ belong to the current year 2018. The partner table has at this moment 8600+ records.
RESULT
| `YEAR(c.date)` | `MONTH(c.date)` | district_id | `COUNT(c.id)` | 
|----------------|-----------------|-------------|---------------| 
| 2018           | 1               | 1           | 19154         | 
| 2018           | 1               | 5           | 9184          | 
| 2018           | 1               | 6           | 2706          | 
| 2018           | 1               | 12          | 36296         | 
| 2018           | 1               | 15          | 13085         | 
| 2018           | 2               | 1           | 21231         | 
| 2018           | 2               | 5           | 10242         | 
| ...            | ...             | ...         | ...           | 
55 rows retrieved starting from 1 in 18 s 374 ms 
(execution: 18 s 368 ms, fetching: 6 ms)
EXPLAIN:
| id | select_type | table | partitions | type  | possible_keys                                                                                        | key                  | key_len | ref             | rows | filtered | extra                                        | 
|----|-------------|-------|------------|-------|------------------------------------------------------------------------------------------------------|----------------------|---------|-----------------|------|----------|----------------------------------------------| 
| 1  | SIMPLE      | p     | null       | range | PRIMARY                                                                                              | PRIMARY              | 4       |                 | 57   | 100      | Using where; Using temporary; Using filesort | 
| 1  | SIMPLE      | c     | null       | ref   | UNIQ_6F7146F0979B1AD62FC0CB0F5F8A7F73,IDX_6F7146F09395C3F3,IDX_6F7146F0979B1AD6,IDX_6F7146F0AA9E377A | IDX_6F7146F09395C3F3 | 5       | p.id            | 6716 | 8.33     | Using where                                  | 
DDL:
create table if not exists commission (
    id int auto_increment
        primary key,
    date date not null,
    source enum('ACTUAL', 'EXPECTED') not null,
    customer_id int null,
    transaction_id varchar(255) not null,
    company_id int null,
    constraint UNIQ_6F7146F0979B1AD62FC0CB0F5F8A7F73 unique (company_id, transaction_id, source),
    constraint FK_6F7146F09395C3F3 foreign key (customer_id) references partner (id),
    constraint FK_6F7146F0979B1AD6 foreign key (company_id) references companies (id)
) collate=utf8_unicode_ci;
create index IDX_6F7146F09395C3F3 on commission (customer_id);
create index IDX_6F7146F0979B1AD6 on commission (company_id);
create index IDX_6F7146F0AA9E377A on commission (date);
I noted that by removing the partner IN condition MySQL takes only 3s. I tried to replace it doing something crazy like this:
AND (',3062,3063,3064,3065,3066,3067,3068,3069,3070,3071,3072,3073,3074,3075,3076,3077,3078,3079,3081,3082,3083,3084,3085,3086,3087,3088,3089,3090,3091,3092,3093,3094,3095,3096,3097,3098,3099,3448,3449,3450,3451,3452,3453,3454,3455,3456,3457,3458,3459,3460,3461,3471,3490,3491,6307,6368,6421,'
     LIKE CONCAT('%,', p.id, ',%')) 
and the result was about 5s... great! but it's a hack.
WHY this query is taking a very long execution time when I uses IN statement? workaround, tips, links, etc. Thanks!
mysql query-performance mysql-5.7
 
 
 1
 
 
 
 
 Where does the- p.id IN (....)list come from?
 – danblack
 Nov 20 at 23:28
 
 
 
 
 
 
 
 
 
 As far as I understand, MySQL can only use one of those indexes on commission in it's WHERE; that part would better be server by a composite index on- (date, company_id, source).
 – Uueerdo
 Nov 20 at 23:30
 
 
 
 
 
 
 
 
 
 Note if your data range is 2018, the- GROUP BY YEAR(c.date)is constant. Its not the largest contributor to query time by far.
 – danblack
 Nov 20 at 23:57
 
 
 
 
 
 
 
 
 
 The- LIKEhack won't work well because it cannot use- PRIMARY KEY(id)on- p. You do have that??
 – Rick James
 Nov 21 at 19:21
 
 
 
 
 
 1
 
 
 
 
 @danblack - The time taken for a function call is usually so insignificant (compared to fetching rows) as to be ignorable.
 – Rick James
 Nov 21 at 23:50
 
 
 
|
show 3 more comments
I've a query that takes about 18 seconds to finish:
THE QUERY:
SELECT YEAR(c.date), MONTH(c.date), p.district_id, COUNT(p.owner_id)
FROM commission c
  INNER JOIN partner p ON c.customer_id = p.id
WHERE (c.date BETWEEN '2018-01-01' AND '2018-12-31')
  AND (c.company_id = 90)
  AND (c.source = 'ACTUAL')
  AND (p.id IN (3062, 3063, 3064, 3065, 3066, 3067, 3068, 3069, 3070, 3071,
    3072, 3073, 3074, 3075, 3076, 3077, 3078, 3079, 3081, 3082, 3083, 3084,
    3085, 3086, 3087, 3088, 3089, 3090, 3091, 3092, 3093, 3094, 3095, 3096,
    3097, 3098, 3099, 3448, 3449, 3450, 3451, 3452, 3453, 3454, 3455, 3456,
    3457, 3458, 3459, 3460, 3461, 3471, 3490, 3491, 6307, 6368, 6421))
  GROUP BY YEAR(c.date), MONTH(c.date), p.district_id
The commission table has around 2,8 millions of records, of which 860 000+ belong to the current year 2018. The partner table has at this moment 8600+ records.
RESULT
| `YEAR(c.date)` | `MONTH(c.date)` | district_id | `COUNT(c.id)` | 
|----------------|-----------------|-------------|---------------| 
| 2018           | 1               | 1           | 19154         | 
| 2018           | 1               | 5           | 9184          | 
| 2018           | 1               | 6           | 2706          | 
| 2018           | 1               | 12          | 36296         | 
| 2018           | 1               | 15          | 13085         | 
| 2018           | 2               | 1           | 21231         | 
| 2018           | 2               | 5           | 10242         | 
| ...            | ...             | ...         | ...           | 
55 rows retrieved starting from 1 in 18 s 374 ms 
(execution: 18 s 368 ms, fetching: 6 ms)
EXPLAIN:
| id | select_type | table | partitions | type  | possible_keys                                                                                        | key                  | key_len | ref             | rows | filtered | extra                                        | 
|----|-------------|-------|------------|-------|------------------------------------------------------------------------------------------------------|----------------------|---------|-----------------|------|----------|----------------------------------------------| 
| 1  | SIMPLE      | p     | null       | range | PRIMARY                                                                                              | PRIMARY              | 4       |                 | 57   | 100      | Using where; Using temporary; Using filesort | 
| 1  | SIMPLE      | c     | null       | ref   | UNIQ_6F7146F0979B1AD62FC0CB0F5F8A7F73,IDX_6F7146F09395C3F3,IDX_6F7146F0979B1AD6,IDX_6F7146F0AA9E377A | IDX_6F7146F09395C3F3 | 5       | p.id            | 6716 | 8.33     | Using where                                  | 
DDL:
create table if not exists commission (
    id int auto_increment
        primary key,
    date date not null,
    source enum('ACTUAL', 'EXPECTED') not null,
    customer_id int null,
    transaction_id varchar(255) not null,
    company_id int null,
    constraint UNIQ_6F7146F0979B1AD62FC0CB0F5F8A7F73 unique (company_id, transaction_id, source),
    constraint FK_6F7146F09395C3F3 foreign key (customer_id) references partner (id),
    constraint FK_6F7146F0979B1AD6 foreign key (company_id) references companies (id)
) collate=utf8_unicode_ci;
create index IDX_6F7146F09395C3F3 on commission (customer_id);
create index IDX_6F7146F0979B1AD6 on commission (company_id);
create index IDX_6F7146F0AA9E377A on commission (date);
I noted that by removing the partner IN condition MySQL takes only 3s. I tried to replace it doing something crazy like this:
AND (',3062,3063,3064,3065,3066,3067,3068,3069,3070,3071,3072,3073,3074,3075,3076,3077,3078,3079,3081,3082,3083,3084,3085,3086,3087,3088,3089,3090,3091,3092,3093,3094,3095,3096,3097,3098,3099,3448,3449,3450,3451,3452,3453,3454,3455,3456,3457,3458,3459,3460,3461,3471,3490,3491,6307,6368,6421,'
     LIKE CONCAT('%,', p.id, ',%')) 
and the result was about 5s... great! but it's a hack.
WHY this query is taking a very long execution time when I uses IN statement? workaround, tips, links, etc. Thanks!
mysql query-performance mysql-5.7
I've a query that takes about 18 seconds to finish:
THE QUERY:
SELECT YEAR(c.date), MONTH(c.date), p.district_id, COUNT(p.owner_id)
FROM commission c
  INNER JOIN partner p ON c.customer_id = p.id
WHERE (c.date BETWEEN '2018-01-01' AND '2018-12-31')
  AND (c.company_id = 90)
  AND (c.source = 'ACTUAL')
  AND (p.id IN (3062, 3063, 3064, 3065, 3066, 3067, 3068, 3069, 3070, 3071,
    3072, 3073, 3074, 3075, 3076, 3077, 3078, 3079, 3081, 3082, 3083, 3084,
    3085, 3086, 3087, 3088, 3089, 3090, 3091, 3092, 3093, 3094, 3095, 3096,
    3097, 3098, 3099, 3448, 3449, 3450, 3451, 3452, 3453, 3454, 3455, 3456,
    3457, 3458, 3459, 3460, 3461, 3471, 3490, 3491, 6307, 6368, 6421))
  GROUP BY YEAR(c.date), MONTH(c.date), p.district_id
The commission table has around 2,8 millions of records, of which 860 000+ belong to the current year 2018. The partner table has at this moment 8600+ records.
RESULT
| `YEAR(c.date)` | `MONTH(c.date)` | district_id | `COUNT(c.id)` | 
|----------------|-----------------|-------------|---------------| 
| 2018           | 1               | 1           | 19154         | 
| 2018           | 1               | 5           | 9184          | 
| 2018           | 1               | 6           | 2706          | 
| 2018           | 1               | 12          | 36296         | 
| 2018           | 1               | 15          | 13085         | 
| 2018           | 2               | 1           | 21231         | 
| 2018           | 2               | 5           | 10242         | 
| ...            | ...             | ...         | ...           | 
55 rows retrieved starting from 1 in 18 s 374 ms 
(execution: 18 s 368 ms, fetching: 6 ms)
EXPLAIN:
| id | select_type | table | partitions | type  | possible_keys                                                                                        | key                  | key_len | ref             | rows | filtered | extra                                        | 
|----|-------------|-------|------------|-------|------------------------------------------------------------------------------------------------------|----------------------|---------|-----------------|------|----------|----------------------------------------------| 
| 1  | SIMPLE      | p     | null       | range | PRIMARY                                                                                              | PRIMARY              | 4       |                 | 57   | 100      | Using where; Using temporary; Using filesort | 
| 1  | SIMPLE      | c     | null       | ref   | UNIQ_6F7146F0979B1AD62FC0CB0F5F8A7F73,IDX_6F7146F09395C3F3,IDX_6F7146F0979B1AD6,IDX_6F7146F0AA9E377A | IDX_6F7146F09395C3F3 | 5       | p.id            | 6716 | 8.33     | Using where                                  | 
DDL:
create table if not exists commission (
    id int auto_increment
        primary key,
    date date not null,
    source enum('ACTUAL', 'EXPECTED') not null,
    customer_id int null,
    transaction_id varchar(255) not null,
    company_id int null,
    constraint UNIQ_6F7146F0979B1AD62FC0CB0F5F8A7F73 unique (company_id, transaction_id, source),
    constraint FK_6F7146F09395C3F3 foreign key (customer_id) references partner (id),
    constraint FK_6F7146F0979B1AD6 foreign key (company_id) references companies (id)
) collate=utf8_unicode_ci;
create index IDX_6F7146F09395C3F3 on commission (customer_id);
create index IDX_6F7146F0979B1AD6 on commission (company_id);
create index IDX_6F7146F0AA9E377A on commission (date);
I noted that by removing the partner IN condition MySQL takes only 3s. I tried to replace it doing something crazy like this:
AND (',3062,3063,3064,3065,3066,3067,3068,3069,3070,3071,3072,3073,3074,3075,3076,3077,3078,3079,3081,3082,3083,3084,3085,3086,3087,3088,3089,3090,3091,3092,3093,3094,3095,3096,3097,3098,3099,3448,3449,3450,3451,3452,3453,3454,3455,3456,3457,3458,3459,3460,3461,3471,3490,3491,6307,6368,6421,'
     LIKE CONCAT('%,', p.id, ',%')) 
and the result was about 5s... great! but it's a hack.
WHY this query is taking a very long execution time when I uses IN statement? workaround, tips, links, etc. Thanks!
mysql query-performance mysql-5.7
mysql query-performance mysql-5.7
edited Nov 21 at 19:16
Rick James
65.8k55797
65.8k55797
asked Nov 20 at 23:11


yceruto
7,07521945
7,07521945
 
 
 1
 
 
 
 
 Where does the- p.id IN (....)list come from?
 – danblack
 Nov 20 at 23:28
 
 
 
 
 
 
 
 
 
 As far as I understand, MySQL can only use one of those indexes on commission in it's WHERE; that part would better be server by a composite index on- (date, company_id, source).
 – Uueerdo
 Nov 20 at 23:30
 
 
 
 
 
 
 
 
 
 Note if your data range is 2018, the- GROUP BY YEAR(c.date)is constant. Its not the largest contributor to query time by far.
 – danblack
 Nov 20 at 23:57
 
 
 
 
 
 
 
 
 
 The- LIKEhack won't work well because it cannot use- PRIMARY KEY(id)on- p. You do have that??
 – Rick James
 Nov 21 at 19:21
 
 
 
 
 
 1
 
 
 
 
 @danblack - The time taken for a function call is usually so insignificant (compared to fetching rows) as to be ignorable.
 – Rick James
 Nov 21 at 23:50
 
 
 
|
show 3 more comments
 
 
 1
 
 
 
 
 Where does the- p.id IN (....)list come from?
 – danblack
 Nov 20 at 23:28
 
 
 
 
 
 
 
 
 
 As far as I understand, MySQL can only use one of those indexes on commission in it's WHERE; that part would better be server by a composite index on- (date, company_id, source).
 – Uueerdo
 Nov 20 at 23:30
 
 
 
 
 
 
 
 
 
 Note if your data range is 2018, the- GROUP BY YEAR(c.date)is constant. Its not the largest contributor to query time by far.
 – danblack
 Nov 20 at 23:57
 
 
 
 
 
 
 
 
 
 The- LIKEhack won't work well because it cannot use- PRIMARY KEY(id)on- p. You do have that??
 – Rick James
 Nov 21 at 19:21
 
 
 
 
 
 1
 
 
 
 
 @danblack - The time taken for a function call is usually so insignificant (compared to fetching rows) as to be ignorable.
 – Rick James
 Nov 21 at 23:50
 
 
 
1
1
Where does the
p.id IN (....) list come from?– danblack
Nov 20 at 23:28
Where does the
p.id IN (....) list come from?– danblack
Nov 20 at 23:28
As far as I understand, MySQL can only use one of those indexes on commission in it's WHERE; that part would better be server by a composite index on
(date, company_id, source).– Uueerdo
Nov 20 at 23:30
As far as I understand, MySQL can only use one of those indexes on commission in it's WHERE; that part would better be server by a composite index on
(date, company_id, source).– Uueerdo
Nov 20 at 23:30
Note if your data range is 2018, the
GROUP BY YEAR(c.date) is constant. Its not the largest contributor to query time by far.– danblack
Nov 20 at 23:57
Note if your data range is 2018, the
GROUP BY YEAR(c.date) is constant. Its not the largest contributor to query time by far.– danblack
Nov 20 at 23:57
The
LIKE hack won't work well because it cannot use PRIMARY KEY(id) on p.  You do have that??– Rick James
Nov 21 at 19:21
The
LIKE hack won't work well because it cannot use PRIMARY KEY(id) on p.  You do have that??– Rick James
Nov 21 at 19:21
1
1
@danblack - The time taken for a function call is usually so insignificant (compared to fetching rows) as to be ignorable.
– Rick James
Nov 21 at 23:50
@danblack - The time taken for a function call is usually so insignificant (compared to fetching rows) as to be ignorable.
– Rick James
Nov 21 at 23:50
|
show 3 more comments
                                3 Answers
                                3
                        
active
oldest
votes
MySQL can use one index at a time. For this query you need a compound index covering the aspects of the search. Constant aspects of the WHERE clause should be used before range aspects like:
ALTER TABLE commission
DROP INDEX IDX_6F7146F0979B1AD6,
ADD INDEX IDX_6F7146F0979B1AD6 (company_id, source, date)
 
 
 
 
 
 
 query optimization
 – danblack
 Nov 20 at 23:39
 
 
 
 
 
 
 
 
 
 How MySQL Uses indexes
 – danblack
 Nov 20 at 23:59
 
 
 
 
 
 
 
 
 
 Adding a composite index on- (customer_id, company_id, date, source)solves the performance issue (takes now less than 1s). Thanks for your answer and links.
 – yceruto
 Nov 21 at 16:29
 
 
 
 
 
 1
 
 
 
 
 And an index cookbook
 – Rick James
 Nov 21 at 19:18
 
 
 
 
 
 1
 
 
 
 
 @yceruto - Your 4-column index works well if the Optimizer starts with- p, but you need to move- dateto the end. Dan's 3-column index works well if it starts with- c. Add both.
 – Rick James
 Nov 21 at 19:19
 
 
 
add a comment |
Here's what the Optimizer sees in your query.
Checking whether to use an index for the GROUP BY:
- Functions (YEAR()) in theGROUP BY, so no.
- Multiple tables (candp) mentioned, so no.
For a JOIN, Optimizer will (almost always) start with one, then reach into the other.  So, let's look at the two options:
If starting with p:
Assuming you have PRIMARY KEY(id), there is not much to think about.  It will simply use that index.
For each row selected from p, it will then look into c, and any variation of this INDEX would be optimal.
c: INDEX(company_id, source, customer_id,  -- in any order (all are tested "=")
         date)       -- last, since it is tested as a range
If starting with c:
c: INDEX(company_id, source,  -- in any order (all are tested "=")
         date)       -- last, since it is tested as a range
-- slightly better:
c: INDEX(company_id, source,  -- in any order (all are tested "=")
         date,       -- last, since it is tested as a range
         customer_id)  -- really last -- added only to make it "covering".
The Optimizer will look at "statistics" to crudely decide which table to start with. So, add all the indexes I suggested.
A "covering" index is one that contains all the columns needed anywhere in the query. It is sometimes wise to extend a 'good' index with more columns to make it "covering".
But there is a monkey wrench in here.  c.customer_id = p.id means that customer_id IN (...) effectively exists.  But now there are two "range-like" constraints -- one is an IN, the other is a 'range'.  In some newer versions, the Optimizer will happily jump around due to the IN and still be able to do "range" scans.  So, I recommend this ordering:
- Test(s) of column = constant
 
- Test(s) with IN
 
 One 'range' test (- BETWEEN,- >=,- LIKEwith trailing wildcard, etc)
- Perhaps add more columns to make it "covering" -- but don't do this step if you end up with more than, say, 5 columns in the index.
Hence, for c, the following is optimal for the WHERE, and happens to be "covering".
INDEX(company_id, source,  -- first, but in any order (all "=")
      customer_id,  -- "IN"
      date)       -- last, since it is tested as a range
p: (same as above)
Since there was an IN or "range", there is no use seeing if the index can also handle the GROUP BY.
A note on COUNT(x) -- it checks that x is NOT NULL.  It is usually just as correct to say COUNT(*), which counts the number of rows without any extra checking.
This is a non-starter since it hides the indexed column (id) in a function:
AND (',3062,3063,3064,3065,3066,...6368,6421,'
     LIKE CONCAT('%,', p.id, ',%'))
 
 
 
 
 
 
 Yes, I tried with- (company_id, date, customer_id, source)is slightly better. Thanks!
 – yceruto
 Nov 21 at 20:53
 
 
 
 
 
 
 
 
 
 The order of the columns in the index is vital to performance. Your attempt has- datetoo early.
 – Rick James
 Nov 21 at 21:40
 
 
 
 
 
 
 
 
 
 Sorry (typo) I meant- (company_id, source, customer_id, date)obviously :)
 – yceruto
 Nov 21 at 21:43
 
 
 
 
 
 
 
 
 
 @yceruto - Interesting. How many rows in the resultset?
 – Rick James
 Nov 21 at 21:59
 
 
 
 
 
 
 
 
 
 After the composite index > EXPLAIN > rows: 4548, filtered: 50 (for- ctable)
 – yceruto
 Nov 21 at 23:45
 
 
 
 
 
|
show 3 more comments
With your LIKE-hack you are tricking optimizer so it uses different plan (most probably using IDX_6F7146F0AA9E377A index on the first place).
You should be able to see this in explain.
I think the real issue in your case is the second line of explain: server executing multiple functions (MONTH, YEAR) for 6716 rows and then trying to group all these rows. During this time all these 6716 rows should be stored (in memory or on disk that is based on your server configuration).
SELECT COUNT(*) FROM commission WHERE (date BETWEEN '2018-01-01' AND '2018-12-31') AND company_id = 90 AND source = 'ACTUAL';
=> How many rows are we talking about?
If the number in above query is much lower then 6716 I'd try to add covering index on columns customer_id, company_id, source and date. Not sure about the best order as it depends on data you have (check cardinality for these columns). I'd started with index (date, company_id, source, customer_id). Also, I'd add unique index (id, district_id, owner_id) on partner.
It is also possible to add additional generated stored columns _year and _month (if your server is a bit old you can add normal columns and fill them in with trigger) to rid off the multiple function executions.
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53403008%2fvery-slow-mysql-query-performance%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
                                3 Answers
                                3
                        
active
oldest
votes
                                3 Answers
                                3
                        
active
oldest
votes
active
oldest
votes
active
oldest
votes
MySQL can use one index at a time. For this query you need a compound index covering the aspects of the search. Constant aspects of the WHERE clause should be used before range aspects like:
ALTER TABLE commission
DROP INDEX IDX_6F7146F0979B1AD6,
ADD INDEX IDX_6F7146F0979B1AD6 (company_id, source, date)
 
 
 
 
 
 
 query optimization
 – danblack
 Nov 20 at 23:39
 
 
 
 
 
 
 
 
 
 How MySQL Uses indexes
 – danblack
 Nov 20 at 23:59
 
 
 
 
 
 
 
 
 
 Adding a composite index on- (customer_id, company_id, date, source)solves the performance issue (takes now less than 1s). Thanks for your answer and links.
 – yceruto
 Nov 21 at 16:29
 
 
 
 
 
 1
 
 
 
 
 And an index cookbook
 – Rick James
 Nov 21 at 19:18
 
 
 
 
 
 1
 
 
 
 
 @yceruto - Your 4-column index works well if the Optimizer starts with- p, but you need to move- dateto the end. Dan's 3-column index works well if it starts with- c. Add both.
 – Rick James
 Nov 21 at 19:19
 
 
 
add a comment |
MySQL can use one index at a time. For this query you need a compound index covering the aspects of the search. Constant aspects of the WHERE clause should be used before range aspects like:
ALTER TABLE commission
DROP INDEX IDX_6F7146F0979B1AD6,
ADD INDEX IDX_6F7146F0979B1AD6 (company_id, source, date)
 
 
 
 
 
 
 query optimization
 – danblack
 Nov 20 at 23:39
 
 
 
 
 
 
 
 
 
 How MySQL Uses indexes
 – danblack
 Nov 20 at 23:59
 
 
 
 
 
 
 
 
 
 Adding a composite index on- (customer_id, company_id, date, source)solves the performance issue (takes now less than 1s). Thanks for your answer and links.
 – yceruto
 Nov 21 at 16:29
 
 
 
 
 
 1
 
 
 
 
 And an index cookbook
 – Rick James
 Nov 21 at 19:18
 
 
 
 
 
 1
 
 
 
 
 @yceruto - Your 4-column index works well if the Optimizer starts with- p, but you need to move- dateto the end. Dan's 3-column index works well if it starts with- c. Add both.
 – Rick James
 Nov 21 at 19:19
 
 
 
add a comment |
MySQL can use one index at a time. For this query you need a compound index covering the aspects of the search. Constant aspects of the WHERE clause should be used before range aspects like:
ALTER TABLE commission
DROP INDEX IDX_6F7146F0979B1AD6,
ADD INDEX IDX_6F7146F0979B1AD6 (company_id, source, date)
MySQL can use one index at a time. For this query you need a compound index covering the aspects of the search. Constant aspects of the WHERE clause should be used before range aspects like:
ALTER TABLE commission
DROP INDEX IDX_6F7146F0979B1AD6,
ADD INDEX IDX_6F7146F0979B1AD6 (company_id, source, date)
answered Nov 20 at 23:30


danblack
1,5071214
1,5071214
 
 
 
 
 
 
 query optimization
 – danblack
 Nov 20 at 23:39
 
 
 
 
 
 
 
 
 
 How MySQL Uses indexes
 – danblack
 Nov 20 at 23:59
 
 
 
 
 
 
 
 
 
 Adding a composite index on- (customer_id, company_id, date, source)solves the performance issue (takes now less than 1s). Thanks for your answer and links.
 – yceruto
 Nov 21 at 16:29
 
 
 
 
 
 1
 
 
 
 
 And an index cookbook
 – Rick James
 Nov 21 at 19:18
 
 
 
 
 
 1
 
 
 
 
 @yceruto - Your 4-column index works well if the Optimizer starts with- p, but you need to move- dateto the end. Dan's 3-column index works well if it starts with- c. Add both.
 – Rick James
 Nov 21 at 19:19
 
 
 
add a comment |
 
 
 
 
 
 
 query optimization
 – danblack
 Nov 20 at 23:39
 
 
 
 
 
 
 
 
 
 How MySQL Uses indexes
 – danblack
 Nov 20 at 23:59
 
 
 
 
 
 
 
 
 
 Adding a composite index on- (customer_id, company_id, date, source)solves the performance issue (takes now less than 1s). Thanks for your answer and links.
 – yceruto
 Nov 21 at 16:29
 
 
 
 
 
 1
 
 
 
 
 And an index cookbook
 – Rick James
 Nov 21 at 19:18
 
 
 
 
 
 1
 
 
 
 
 @yceruto - Your 4-column index works well if the Optimizer starts with- p, but you need to move- dateto the end. Dan's 3-column index works well if it starts with- c. Add both.
 – Rick James
 Nov 21 at 19:19
 
 
 
query optimization
– danblack
Nov 20 at 23:39
query optimization
– danblack
Nov 20 at 23:39
How MySQL Uses indexes
– danblack
Nov 20 at 23:59
How MySQL Uses indexes
– danblack
Nov 20 at 23:59
Adding a composite index on
(customer_id, company_id, date, source) solves the performance issue (takes now less than 1s). Thanks for your answer and links.– yceruto
Nov 21 at 16:29
Adding a composite index on
(customer_id, company_id, date, source) solves the performance issue (takes now less than 1s). Thanks for your answer and links.– yceruto
Nov 21 at 16:29
1
1
And an index cookbook
– Rick James
Nov 21 at 19:18
And an index cookbook
– Rick James
Nov 21 at 19:18
1
1
@yceruto - Your 4-column index works well if the Optimizer starts with
p, but you need to move date to the end.  Dan's 3-column index works well if it starts with c.  Add both.– Rick James
Nov 21 at 19:19
@yceruto - Your 4-column index works well if the Optimizer starts with
p, but you need to move date to the end.  Dan's 3-column index works well if it starts with c.  Add both.– Rick James
Nov 21 at 19:19
add a comment |
Here's what the Optimizer sees in your query.
Checking whether to use an index for the GROUP BY:
- Functions (YEAR()) in theGROUP BY, so no.
- Multiple tables (candp) mentioned, so no.
For a JOIN, Optimizer will (almost always) start with one, then reach into the other.  So, let's look at the two options:
If starting with p:
Assuming you have PRIMARY KEY(id), there is not much to think about.  It will simply use that index.
For each row selected from p, it will then look into c, and any variation of this INDEX would be optimal.
c: INDEX(company_id, source, customer_id,  -- in any order (all are tested "=")
         date)       -- last, since it is tested as a range
If starting with c:
c: INDEX(company_id, source,  -- in any order (all are tested "=")
         date)       -- last, since it is tested as a range
-- slightly better:
c: INDEX(company_id, source,  -- in any order (all are tested "=")
         date,       -- last, since it is tested as a range
         customer_id)  -- really last -- added only to make it "covering".
The Optimizer will look at "statistics" to crudely decide which table to start with. So, add all the indexes I suggested.
A "covering" index is one that contains all the columns needed anywhere in the query. It is sometimes wise to extend a 'good' index with more columns to make it "covering".
But there is a monkey wrench in here.  c.customer_id = p.id means that customer_id IN (...) effectively exists.  But now there are two "range-like" constraints -- one is an IN, the other is a 'range'.  In some newer versions, the Optimizer will happily jump around due to the IN and still be able to do "range" scans.  So, I recommend this ordering:
- Test(s) of column = constant
 
- Test(s) with IN
 
 One 'range' test (- BETWEEN,- >=,- LIKEwith trailing wildcard, etc)
- Perhaps add more columns to make it "covering" -- but don't do this step if you end up with more than, say, 5 columns in the index.
Hence, for c, the following is optimal for the WHERE, and happens to be "covering".
INDEX(company_id, source,  -- first, but in any order (all "=")
      customer_id,  -- "IN"
      date)       -- last, since it is tested as a range
p: (same as above)
Since there was an IN or "range", there is no use seeing if the index can also handle the GROUP BY.
A note on COUNT(x) -- it checks that x is NOT NULL.  It is usually just as correct to say COUNT(*), which counts the number of rows without any extra checking.
This is a non-starter since it hides the indexed column (id) in a function:
AND (',3062,3063,3064,3065,3066,...6368,6421,'
     LIKE CONCAT('%,', p.id, ',%'))
 
 
 
 
 
 
 Yes, I tried with- (company_id, date, customer_id, source)is slightly better. Thanks!
 – yceruto
 Nov 21 at 20:53
 
 
 
 
 
 
 
 
 
 The order of the columns in the index is vital to performance. Your attempt has- datetoo early.
 – Rick James
 Nov 21 at 21:40
 
 
 
 
 
 
 
 
 
 Sorry (typo) I meant- (company_id, source, customer_id, date)obviously :)
 – yceruto
 Nov 21 at 21:43
 
 
 
 
 
 
 
 
 
 @yceruto - Interesting. How many rows in the resultset?
 – Rick James
 Nov 21 at 21:59
 
 
 
 
 
 
 
 
 
 After the composite index > EXPLAIN > rows: 4548, filtered: 50 (for- ctable)
 – yceruto
 Nov 21 at 23:45
 
 
 
 
 
|
show 3 more comments
Here's what the Optimizer sees in your query.
Checking whether to use an index for the GROUP BY:
- Functions (YEAR()) in theGROUP BY, so no.
- Multiple tables (candp) mentioned, so no.
For a JOIN, Optimizer will (almost always) start with one, then reach into the other.  So, let's look at the two options:
If starting with p:
Assuming you have PRIMARY KEY(id), there is not much to think about.  It will simply use that index.
For each row selected from p, it will then look into c, and any variation of this INDEX would be optimal.
c: INDEX(company_id, source, customer_id,  -- in any order (all are tested "=")
         date)       -- last, since it is tested as a range
If starting with c:
c: INDEX(company_id, source,  -- in any order (all are tested "=")
         date)       -- last, since it is tested as a range
-- slightly better:
c: INDEX(company_id, source,  -- in any order (all are tested "=")
         date,       -- last, since it is tested as a range
         customer_id)  -- really last -- added only to make it "covering".
The Optimizer will look at "statistics" to crudely decide which table to start with. So, add all the indexes I suggested.
A "covering" index is one that contains all the columns needed anywhere in the query. It is sometimes wise to extend a 'good' index with more columns to make it "covering".
But there is a monkey wrench in here.  c.customer_id = p.id means that customer_id IN (...) effectively exists.  But now there are two "range-like" constraints -- one is an IN, the other is a 'range'.  In some newer versions, the Optimizer will happily jump around due to the IN and still be able to do "range" scans.  So, I recommend this ordering:
- Test(s) of column = constant
 
- Test(s) with IN
 
 One 'range' test (- BETWEEN,- >=,- LIKEwith trailing wildcard, etc)
- Perhaps add more columns to make it "covering" -- but don't do this step if you end up with more than, say, 5 columns in the index.
Hence, for c, the following is optimal for the WHERE, and happens to be "covering".
INDEX(company_id, source,  -- first, but in any order (all "=")
      customer_id,  -- "IN"
      date)       -- last, since it is tested as a range
p: (same as above)
Since there was an IN or "range", there is no use seeing if the index can also handle the GROUP BY.
A note on COUNT(x) -- it checks that x is NOT NULL.  It is usually just as correct to say COUNT(*), which counts the number of rows without any extra checking.
This is a non-starter since it hides the indexed column (id) in a function:
AND (',3062,3063,3064,3065,3066,...6368,6421,'
     LIKE CONCAT('%,', p.id, ',%'))
 
 
 
 
 
 
 Yes, I tried with- (company_id, date, customer_id, source)is slightly better. Thanks!
 – yceruto
 Nov 21 at 20:53
 
 
 
 
 
 
 
 
 
 The order of the columns in the index is vital to performance. Your attempt has- datetoo early.
 – Rick James
 Nov 21 at 21:40
 
 
 
 
 
 
 
 
 
 Sorry (typo) I meant- (company_id, source, customer_id, date)obviously :)
 – yceruto
 Nov 21 at 21:43
 
 
 
 
 
 
 
 
 
 @yceruto - Interesting. How many rows in the resultset?
 – Rick James
 Nov 21 at 21:59
 
 
 
 
 
 
 
 
 
 After the composite index > EXPLAIN > rows: 4548, filtered: 50 (for- ctable)
 – yceruto
 Nov 21 at 23:45
 
 
 
 
 
|
show 3 more comments
Here's what the Optimizer sees in your query.
Checking whether to use an index for the GROUP BY:
- Functions (YEAR()) in theGROUP BY, so no.
- Multiple tables (candp) mentioned, so no.
For a JOIN, Optimizer will (almost always) start with one, then reach into the other.  So, let's look at the two options:
If starting with p:
Assuming you have PRIMARY KEY(id), there is not much to think about.  It will simply use that index.
For each row selected from p, it will then look into c, and any variation of this INDEX would be optimal.
c: INDEX(company_id, source, customer_id,  -- in any order (all are tested "=")
         date)       -- last, since it is tested as a range
If starting with c:
c: INDEX(company_id, source,  -- in any order (all are tested "=")
         date)       -- last, since it is tested as a range
-- slightly better:
c: INDEX(company_id, source,  -- in any order (all are tested "=")
         date,       -- last, since it is tested as a range
         customer_id)  -- really last -- added only to make it "covering".
The Optimizer will look at "statistics" to crudely decide which table to start with. So, add all the indexes I suggested.
A "covering" index is one that contains all the columns needed anywhere in the query. It is sometimes wise to extend a 'good' index with more columns to make it "covering".
But there is a monkey wrench in here.  c.customer_id = p.id means that customer_id IN (...) effectively exists.  But now there are two "range-like" constraints -- one is an IN, the other is a 'range'.  In some newer versions, the Optimizer will happily jump around due to the IN and still be able to do "range" scans.  So, I recommend this ordering:
- Test(s) of column = constant
 
- Test(s) with IN
 
 One 'range' test (- BETWEEN,- >=,- LIKEwith trailing wildcard, etc)
- Perhaps add more columns to make it "covering" -- but don't do this step if you end up with more than, say, 5 columns in the index.
Hence, for c, the following is optimal for the WHERE, and happens to be "covering".
INDEX(company_id, source,  -- first, but in any order (all "=")
      customer_id,  -- "IN"
      date)       -- last, since it is tested as a range
p: (same as above)
Since there was an IN or "range", there is no use seeing if the index can also handle the GROUP BY.
A note on COUNT(x) -- it checks that x is NOT NULL.  It is usually just as correct to say COUNT(*), which counts the number of rows without any extra checking.
This is a non-starter since it hides the indexed column (id) in a function:
AND (',3062,3063,3064,3065,3066,...6368,6421,'
     LIKE CONCAT('%,', p.id, ',%'))
Here's what the Optimizer sees in your query.
Checking whether to use an index for the GROUP BY:
- Functions (YEAR()) in theGROUP BY, so no.
- Multiple tables (candp) mentioned, so no.
For a JOIN, Optimizer will (almost always) start with one, then reach into the other.  So, let's look at the two options:
If starting with p:
Assuming you have PRIMARY KEY(id), there is not much to think about.  It will simply use that index.
For each row selected from p, it will then look into c, and any variation of this INDEX would be optimal.
c: INDEX(company_id, source, customer_id,  -- in any order (all are tested "=")
         date)       -- last, since it is tested as a range
If starting with c:
c: INDEX(company_id, source,  -- in any order (all are tested "=")
         date)       -- last, since it is tested as a range
-- slightly better:
c: INDEX(company_id, source,  -- in any order (all are tested "=")
         date,       -- last, since it is tested as a range
         customer_id)  -- really last -- added only to make it "covering".
The Optimizer will look at "statistics" to crudely decide which table to start with. So, add all the indexes I suggested.
A "covering" index is one that contains all the columns needed anywhere in the query. It is sometimes wise to extend a 'good' index with more columns to make it "covering".
But there is a monkey wrench in here.  c.customer_id = p.id means that customer_id IN (...) effectively exists.  But now there are two "range-like" constraints -- one is an IN, the other is a 'range'.  In some newer versions, the Optimizer will happily jump around due to the IN and still be able to do "range" scans.  So, I recommend this ordering:
- Test(s) of column = constant
 
- Test(s) with IN
 
 One 'range' test (- BETWEEN,- >=,- LIKEwith trailing wildcard, etc)
- Perhaps add more columns to make it "covering" -- but don't do this step if you end up with more than, say, 5 columns in the index.
Hence, for c, the following is optimal for the WHERE, and happens to be "covering".
INDEX(company_id, source,  -- first, but in any order (all "=")
      customer_id,  -- "IN"
      date)       -- last, since it is tested as a range
p: (same as above)
Since there was an IN or "range", there is no use seeing if the index can also handle the GROUP BY.
A note on COUNT(x) -- it checks that x is NOT NULL.  It is usually just as correct to say COUNT(*), which counts the number of rows without any extra checking.
This is a non-starter since it hides the indexed column (id) in a function:
AND (',3062,3063,3064,3065,3066,...6368,6421,'
     LIKE CONCAT('%,', p.id, ',%'))
answered Nov 21 at 20:04
Rick James
65.8k55797
65.8k55797
 
 
 
 
 
 
 Yes, I tried with- (company_id, date, customer_id, source)is slightly better. Thanks!
 – yceruto
 Nov 21 at 20:53
 
 
 
 
 
 
 
 
 
 The order of the columns in the index is vital to performance. Your attempt has- datetoo early.
 – Rick James
 Nov 21 at 21:40
 
 
 
 
 
 
 
 
 
 Sorry (typo) I meant- (company_id, source, customer_id, date)obviously :)
 – yceruto
 Nov 21 at 21:43
 
 
 
 
 
 
 
 
 
 @yceruto - Interesting. How many rows in the resultset?
 – Rick James
 Nov 21 at 21:59
 
 
 
 
 
 
 
 
 
 After the composite index > EXPLAIN > rows: 4548, filtered: 50 (for- ctable)
 – yceruto
 Nov 21 at 23:45
 
 
 
 
 
|
show 3 more comments
 
 
 
 
 
 
 Yes, I tried with- (company_id, date, customer_id, source)is slightly better. Thanks!
 – yceruto
 Nov 21 at 20:53
 
 
 
 
 
 
 
 
 
 The order of the columns in the index is vital to performance. Your attempt has- datetoo early.
 – Rick James
 Nov 21 at 21:40
 
 
 
 
 
 
 
 
 
 Sorry (typo) I meant- (company_id, source, customer_id, date)obviously :)
 – yceruto
 Nov 21 at 21:43
 
 
 
 
 
 
 
 
 
 @yceruto - Interesting. How many rows in the resultset?
 – Rick James
 Nov 21 at 21:59
 
 
 
 
 
 
 
 
 
 After the composite index > EXPLAIN > rows: 4548, filtered: 50 (for- ctable)
 – yceruto
 Nov 21 at 23:45
 
 
 
 
 
Yes, I tried with
(company_id, date, customer_id, source) is slightly better. Thanks!– yceruto
Nov 21 at 20:53
Yes, I tried with
(company_id, date, customer_id, source) is slightly better. Thanks!– yceruto
Nov 21 at 20:53
The order of the columns in the index is vital to performance. Your attempt has
date too early.– Rick James
Nov 21 at 21:40
The order of the columns in the index is vital to performance. Your attempt has
date too early.– Rick James
Nov 21 at 21:40
Sorry (typo) I meant
(company_id, source, customer_id, date) obviously :)– yceruto
Nov 21 at 21:43
Sorry (typo) I meant
(company_id, source, customer_id, date) obviously :)– yceruto
Nov 21 at 21:43
@yceruto - Interesting. How many rows in the resultset?
– Rick James
Nov 21 at 21:59
@yceruto - Interesting. How many rows in the resultset?
– Rick James
Nov 21 at 21:59
After the composite index > EXPLAIN > rows: 4548, filtered: 50 (for
c table)– yceruto
Nov 21 at 23:45
After the composite index > EXPLAIN > rows: 4548, filtered: 50 (for
c table)– yceruto
Nov 21 at 23:45
|
show 3 more comments
With your LIKE-hack you are tricking optimizer so it uses different plan (most probably using IDX_6F7146F0AA9E377A index on the first place).
You should be able to see this in explain.
I think the real issue in your case is the second line of explain: server executing multiple functions (MONTH, YEAR) for 6716 rows and then trying to group all these rows. During this time all these 6716 rows should be stored (in memory or on disk that is based on your server configuration).
SELECT COUNT(*) FROM commission WHERE (date BETWEEN '2018-01-01' AND '2018-12-31') AND company_id = 90 AND source = 'ACTUAL';
=> How many rows are we talking about?
If the number in above query is much lower then 6716 I'd try to add covering index on columns customer_id, company_id, source and date. Not sure about the best order as it depends on data you have (check cardinality for these columns). I'd started with index (date, company_id, source, customer_id). Also, I'd add unique index (id, district_id, owner_id) on partner.
It is also possible to add additional generated stored columns _year and _month (if your server is a bit old you can add normal columns and fill them in with trigger) to rid off the multiple function executions.
add a comment |
With your LIKE-hack you are tricking optimizer so it uses different plan (most probably using IDX_6F7146F0AA9E377A index on the first place).
You should be able to see this in explain.
I think the real issue in your case is the second line of explain: server executing multiple functions (MONTH, YEAR) for 6716 rows and then trying to group all these rows. During this time all these 6716 rows should be stored (in memory or on disk that is based on your server configuration).
SELECT COUNT(*) FROM commission WHERE (date BETWEEN '2018-01-01' AND '2018-12-31') AND company_id = 90 AND source = 'ACTUAL';
=> How many rows are we talking about?
If the number in above query is much lower then 6716 I'd try to add covering index on columns customer_id, company_id, source and date. Not sure about the best order as it depends on data you have (check cardinality for these columns). I'd started with index (date, company_id, source, customer_id). Also, I'd add unique index (id, district_id, owner_id) on partner.
It is also possible to add additional generated stored columns _year and _month (if your server is a bit old you can add normal columns and fill them in with trigger) to rid off the multiple function executions.
add a comment |
With your LIKE-hack you are tricking optimizer so it uses different plan (most probably using IDX_6F7146F0AA9E377A index on the first place).
You should be able to see this in explain.
I think the real issue in your case is the second line of explain: server executing multiple functions (MONTH, YEAR) for 6716 rows and then trying to group all these rows. During this time all these 6716 rows should be stored (in memory or on disk that is based on your server configuration).
SELECT COUNT(*) FROM commission WHERE (date BETWEEN '2018-01-01' AND '2018-12-31') AND company_id = 90 AND source = 'ACTUAL';
=> How many rows are we talking about?
If the number in above query is much lower then 6716 I'd try to add covering index on columns customer_id, company_id, source and date. Not sure about the best order as it depends on data you have (check cardinality for these columns). I'd started with index (date, company_id, source, customer_id). Also, I'd add unique index (id, district_id, owner_id) on partner.
It is also possible to add additional generated stored columns _year and _month (if your server is a bit old you can add normal columns and fill them in with trigger) to rid off the multiple function executions.
With your LIKE-hack you are tricking optimizer so it uses different plan (most probably using IDX_6F7146F0AA9E377A index on the first place).
You should be able to see this in explain.
I think the real issue in your case is the second line of explain: server executing multiple functions (MONTH, YEAR) for 6716 rows and then trying to group all these rows. During this time all these 6716 rows should be stored (in memory or on disk that is based on your server configuration).
SELECT COUNT(*) FROM commission WHERE (date BETWEEN '2018-01-01' AND '2018-12-31') AND company_id = 90 AND source = 'ACTUAL';
=> How many rows are we talking about?
If the number in above query is much lower then 6716 I'd try to add covering index on columns customer_id, company_id, source and date. Not sure about the best order as it depends on data you have (check cardinality for these columns). I'd started with index (date, company_id, source, customer_id). Also, I'd add unique index (id, district_id, owner_id) on partner.
It is also possible to add additional generated stored columns _year and _month (if your server is a bit old you can add normal columns and fill them in with trigger) to rid off the multiple function executions.
edited Nov 21 at 2:51
answered Nov 21 at 1:20
fifonik
38518
38518
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53403008%2fvery-slow-mysql-query-performance%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
Where does the
p.id IN (....)list come from?– danblack
Nov 20 at 23:28
As far as I understand, MySQL can only use one of those indexes on commission in it's WHERE; that part would better be server by a composite index on
(date, company_id, source).– Uueerdo
Nov 20 at 23:30
Note if your data range is 2018, the
GROUP BY YEAR(c.date)is constant. Its not the largest contributor to query time by far.– danblack
Nov 20 at 23:57
The
LIKEhack won't work well because it cannot usePRIMARY KEY(id)onp. You do have that??– Rick James
Nov 21 at 19:21
1
@danblack - The time taken for a function call is usually so insignificant (compared to fetching rows) as to be ignorable.
– Rick James
Nov 21 at 23:50