Why optimiser choose Clustered Index + Sort instead of Non-Clustered Index





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty{ margin-bottom:0;
}






up vote
4
down vote

favorite












Given the next example:



IF OBJECT_ID('dbo.my_table') IS NOT NULL
DROP TABLE [dbo].[my_table];
GO

CREATE TABLE [dbo].[my_table]
(
[id] int IDENTITY (1,1) NOT NULL PRIMARY KEY,
[foo] int NULL,
[bar] int NULL,
[nki] int NOT NULL
);
GO

/* Insert some random data */
INSERT INTO [dbo].[my_table] (foo, bar, nki)
SELECT TOP (100000)
ABS(CHECKSUM(NewId())) % 14,
ABS(CHECKSUM(NewId())) % 20,
n = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id]))
FROM
sys.all_objects AS s1
CROSS JOIN
sys.all_objects AS s2
GO

CREATE UNIQUE NONCLUSTERED INDEX [IX_my_table]
ON [dbo].[my_table] ([nki] ASC);
GO


If I fetch all records ordered by [nki] (Non-clustered index):



SET STATISTICS TIME ON;
SELECT id, foo, bar, nki FROM my_table ORDER BY nki;
SET STATISTICS TIME OFF;

SQL Server Execution Times: CPU time = 266 ms, elapsed time = 493 ms


Optimiser choose the clustered index and then applies a Sort algorithm.



enter image description here



Execution plan



But if I force it to use the non-clustered index:



SET STATISTICS TIME ON;
SELECT id, foo, bar, nki FROM my_table WITH(INDEX(IX_my_TABLE));
SET STATISTICS TIME OFF;

SQL Server Execution Times: CPU time = 311 ms, elapsed time = 188 ms


Then it uses non-clustered index with a Key Lookup:



enter image description here



Execution plan



Obviously if the non-clustered index is transformed into a covering index:



CREATE UNIQUE NONCLUSTERED INDEX [IX_my_table]
ON [dbo].[my_table] ([nki] ASC)
INCLUDE (id, foo, bar);
GO


Then it uses only this index:



SET STATISTICS TIME ON;
SELECT id, foo, bar, nki FROM my_table ORDER BY nki;
SET STATISTICS TIME OFF;

SQL Server Execution Times: CPU time = 32 ms, elapsed time = 106 ms


enter image description here



Execution plan





Question




  • Why SQL Server use the clustered index plus a sort algorithm instead of non-clustered index even if execution time is 38% faster?










share|improve this question


















  • 1




    Sounds like you should review Kimberly Tripp's article about the tipping point: sqlskills.com/blogs/kimberly/the-tipping-point-query-answers More specifically the section titled "When does the tipping point occur?" I think she does a good job of covering when/why this can happen.
    – John Eisbrener
    13 hours ago








  • 1




    The optimizer draws up the execution plan based on the information it has at compile time - it's not possible at this point to make decisions based on execution time (SQL Server is adding features that learn, like adaptive joins and adaptive memory grant feedback, but choosing between a seek and a scan based on duration is probably a long way off).
    – Aaron Bertrand
    9 hours ago






  • 1




    Did you mean to leave off the ORDER BY in your forced-index query?
    – Forrest
    8 hours ago

















up vote
4
down vote

favorite












Given the next example:



IF OBJECT_ID('dbo.my_table') IS NOT NULL
DROP TABLE [dbo].[my_table];
GO

CREATE TABLE [dbo].[my_table]
(
[id] int IDENTITY (1,1) NOT NULL PRIMARY KEY,
[foo] int NULL,
[bar] int NULL,
[nki] int NOT NULL
);
GO

/* Insert some random data */
INSERT INTO [dbo].[my_table] (foo, bar, nki)
SELECT TOP (100000)
ABS(CHECKSUM(NewId())) % 14,
ABS(CHECKSUM(NewId())) % 20,
n = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id]))
FROM
sys.all_objects AS s1
CROSS JOIN
sys.all_objects AS s2
GO

CREATE UNIQUE NONCLUSTERED INDEX [IX_my_table]
ON [dbo].[my_table] ([nki] ASC);
GO


If I fetch all records ordered by [nki] (Non-clustered index):



SET STATISTICS TIME ON;
SELECT id, foo, bar, nki FROM my_table ORDER BY nki;
SET STATISTICS TIME OFF;

SQL Server Execution Times: CPU time = 266 ms, elapsed time = 493 ms


Optimiser choose the clustered index and then applies a Sort algorithm.



enter image description here



Execution plan



But if I force it to use the non-clustered index:



SET STATISTICS TIME ON;
SELECT id, foo, bar, nki FROM my_table WITH(INDEX(IX_my_TABLE));
SET STATISTICS TIME OFF;

SQL Server Execution Times: CPU time = 311 ms, elapsed time = 188 ms


Then it uses non-clustered index with a Key Lookup:



enter image description here



Execution plan



Obviously if the non-clustered index is transformed into a covering index:



CREATE UNIQUE NONCLUSTERED INDEX [IX_my_table]
ON [dbo].[my_table] ([nki] ASC)
INCLUDE (id, foo, bar);
GO


Then it uses only this index:



SET STATISTICS TIME ON;
SELECT id, foo, bar, nki FROM my_table ORDER BY nki;
SET STATISTICS TIME OFF;

SQL Server Execution Times: CPU time = 32 ms, elapsed time = 106 ms


enter image description here



Execution plan





Question




  • Why SQL Server use the clustered index plus a sort algorithm instead of non-clustered index even if execution time is 38% faster?










share|improve this question


















  • 1




    Sounds like you should review Kimberly Tripp's article about the tipping point: sqlskills.com/blogs/kimberly/the-tipping-point-query-answers More specifically the section titled "When does the tipping point occur?" I think she does a good job of covering when/why this can happen.
    – John Eisbrener
    13 hours ago








  • 1




    The optimizer draws up the execution plan based on the information it has at compile time - it's not possible at this point to make decisions based on execution time (SQL Server is adding features that learn, like adaptive joins and adaptive memory grant feedback, but choosing between a seek and a scan based on duration is probably a long way off).
    – Aaron Bertrand
    9 hours ago






  • 1




    Did you mean to leave off the ORDER BY in your forced-index query?
    – Forrest
    8 hours ago













up vote
4
down vote

favorite









up vote
4
down vote

favorite











Given the next example:



IF OBJECT_ID('dbo.my_table') IS NOT NULL
DROP TABLE [dbo].[my_table];
GO

CREATE TABLE [dbo].[my_table]
(
[id] int IDENTITY (1,1) NOT NULL PRIMARY KEY,
[foo] int NULL,
[bar] int NULL,
[nki] int NOT NULL
);
GO

/* Insert some random data */
INSERT INTO [dbo].[my_table] (foo, bar, nki)
SELECT TOP (100000)
ABS(CHECKSUM(NewId())) % 14,
ABS(CHECKSUM(NewId())) % 20,
n = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id]))
FROM
sys.all_objects AS s1
CROSS JOIN
sys.all_objects AS s2
GO

CREATE UNIQUE NONCLUSTERED INDEX [IX_my_table]
ON [dbo].[my_table] ([nki] ASC);
GO


If I fetch all records ordered by [nki] (Non-clustered index):



SET STATISTICS TIME ON;
SELECT id, foo, bar, nki FROM my_table ORDER BY nki;
SET STATISTICS TIME OFF;

SQL Server Execution Times: CPU time = 266 ms, elapsed time = 493 ms


Optimiser choose the clustered index and then applies a Sort algorithm.



enter image description here



Execution plan



But if I force it to use the non-clustered index:



SET STATISTICS TIME ON;
SELECT id, foo, bar, nki FROM my_table WITH(INDEX(IX_my_TABLE));
SET STATISTICS TIME OFF;

SQL Server Execution Times: CPU time = 311 ms, elapsed time = 188 ms


Then it uses non-clustered index with a Key Lookup:



enter image description here



Execution plan



Obviously if the non-clustered index is transformed into a covering index:



CREATE UNIQUE NONCLUSTERED INDEX [IX_my_table]
ON [dbo].[my_table] ([nki] ASC)
INCLUDE (id, foo, bar);
GO


Then it uses only this index:



SET STATISTICS TIME ON;
SELECT id, foo, bar, nki FROM my_table ORDER BY nki;
SET STATISTICS TIME OFF;

SQL Server Execution Times: CPU time = 32 ms, elapsed time = 106 ms


enter image description here



Execution plan





Question




  • Why SQL Server use the clustered index plus a sort algorithm instead of non-clustered index even if execution time is 38% faster?










share|improve this question













Given the next example:



IF OBJECT_ID('dbo.my_table') IS NOT NULL
DROP TABLE [dbo].[my_table];
GO

CREATE TABLE [dbo].[my_table]
(
[id] int IDENTITY (1,1) NOT NULL PRIMARY KEY,
[foo] int NULL,
[bar] int NULL,
[nki] int NOT NULL
);
GO

/* Insert some random data */
INSERT INTO [dbo].[my_table] (foo, bar, nki)
SELECT TOP (100000)
ABS(CHECKSUM(NewId())) % 14,
ABS(CHECKSUM(NewId())) % 20,
n = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id]))
FROM
sys.all_objects AS s1
CROSS JOIN
sys.all_objects AS s2
GO

CREATE UNIQUE NONCLUSTERED INDEX [IX_my_table]
ON [dbo].[my_table] ([nki] ASC);
GO


If I fetch all records ordered by [nki] (Non-clustered index):



SET STATISTICS TIME ON;
SELECT id, foo, bar, nki FROM my_table ORDER BY nki;
SET STATISTICS TIME OFF;

SQL Server Execution Times: CPU time = 266 ms, elapsed time = 493 ms


Optimiser choose the clustered index and then applies a Sort algorithm.



enter image description here



Execution plan



But if I force it to use the non-clustered index:



SET STATISTICS TIME ON;
SELECT id, foo, bar, nki FROM my_table WITH(INDEX(IX_my_TABLE));
SET STATISTICS TIME OFF;

SQL Server Execution Times: CPU time = 311 ms, elapsed time = 188 ms


Then it uses non-clustered index with a Key Lookup:



enter image description here



Execution plan



Obviously if the non-clustered index is transformed into a covering index:



CREATE UNIQUE NONCLUSTERED INDEX [IX_my_table]
ON [dbo].[my_table] ([nki] ASC)
INCLUDE (id, foo, bar);
GO


Then it uses only this index:



SET STATISTICS TIME ON;
SELECT id, foo, bar, nki FROM my_table ORDER BY nki;
SET STATISTICS TIME OFF;

SQL Server Execution Times: CPU time = 32 ms, elapsed time = 106 ms


enter image description here



Execution plan





Question




  • Why SQL Server use the clustered index plus a sort algorithm instead of non-clustered index even if execution time is 38% faster?







sql-server sql-server-2012 nonclustered-index






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked 13 hours ago









McNets

13.6k41753




13.6k41753








  • 1




    Sounds like you should review Kimberly Tripp's article about the tipping point: sqlskills.com/blogs/kimberly/the-tipping-point-query-answers More specifically the section titled "When does the tipping point occur?" I think she does a good job of covering when/why this can happen.
    – John Eisbrener
    13 hours ago








  • 1




    The optimizer draws up the execution plan based on the information it has at compile time - it's not possible at this point to make decisions based on execution time (SQL Server is adding features that learn, like adaptive joins and adaptive memory grant feedback, but choosing between a seek and a scan based on duration is probably a long way off).
    – Aaron Bertrand
    9 hours ago






  • 1




    Did you mean to leave off the ORDER BY in your forced-index query?
    – Forrest
    8 hours ago














  • 1




    Sounds like you should review Kimberly Tripp's article about the tipping point: sqlskills.com/blogs/kimberly/the-tipping-point-query-answers More specifically the section titled "When does the tipping point occur?" I think she does a good job of covering when/why this can happen.
    – John Eisbrener
    13 hours ago








  • 1




    The optimizer draws up the execution plan based on the information it has at compile time - it's not possible at this point to make decisions based on execution time (SQL Server is adding features that learn, like adaptive joins and adaptive memory grant feedback, but choosing between a seek and a scan based on duration is probably a long way off).
    – Aaron Bertrand
    9 hours ago






  • 1




    Did you mean to leave off the ORDER BY in your forced-index query?
    – Forrest
    8 hours ago








1




1




Sounds like you should review Kimberly Tripp's article about the tipping point: sqlskills.com/blogs/kimberly/the-tipping-point-query-answers More specifically the section titled "When does the tipping point occur?" I think she does a good job of covering when/why this can happen.
– John Eisbrener
13 hours ago






Sounds like you should review Kimberly Tripp's article about the tipping point: sqlskills.com/blogs/kimberly/the-tipping-point-query-answers More specifically the section titled "When does the tipping point occur?" I think she does a good job of covering when/why this can happen.
– John Eisbrener
13 hours ago






1




1




The optimizer draws up the execution plan based on the information it has at compile time - it's not possible at this point to make decisions based on execution time (SQL Server is adding features that learn, like adaptive joins and adaptive memory grant feedback, but choosing between a seek and a scan based on duration is probably a long way off).
– Aaron Bertrand
9 hours ago




The optimizer draws up the execution plan based on the information it has at compile time - it's not possible at this point to make decisions based on execution time (SQL Server is adding features that learn, like adaptive joins and adaptive memory grant feedback, but choosing between a seek and a scan based on duration is probably a long way off).
– Aaron Bertrand
9 hours ago




1




1




Did you mean to leave off the ORDER BY in your forced-index query?
– Forrest
8 hours ago




Did you mean to leave off the ORDER BY in your forced-index query?
– Forrest
8 hours ago










2 Answers
2






active

oldest

votes

















up vote
4
down vote













If you were to compare the number of reads required in 100,000 lookups with what’s involved in doing a sort, you might quickly get an idea about why the Query Optimizer figures that the CIX+Sort would be the best choice.



The Lookup execution ends up being quicker because the pages being read are in memory (even if you clear the cache, you have a lot of rows per page, so you’re reading the same pages over and over, but with different amounts of fragmentation or different memory pressure from other activity, this might not be the case). It really wouldn’t take all that much to have CIX+Sort go quicker, but what you’re seeing is because the cost of a read doesn’t take into consideration the relative cheapness of hitting the same pages repeatedly.






share|improve this answer




























    up vote
    3
    down vote














    Why SQL Server use the clustered index plus a sort algorithm instead
    of non-clustered index even if execution time is 38% faster?




    Because SQL Server uses a cost-based optimizer based on statistics, not runtime info.



    During the cost estimation process for this query, it does actually evaluate the lookup plan, but estimates it will take more effort. (Note the "Estimated Subtree Cost" when hovering over SELECT in the execution plan). That's not necessarily a bad assumption either - on my test machine, the lookup plan takes 6X the CPU of the sort/scan.



    Look to Rob Farley's answer as to why SQL Server might cost the lookup plan higher.






    share|improve this answer





















      Your Answer








      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "182"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      convertImagesToLinks: false,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: null,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });














       

      draft saved


      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f222987%2fwhy-optimiser-choose-clustered-index-sort-instead-of-non-clustered-index%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      2 Answers
      2






      active

      oldest

      votes








      2 Answers
      2






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes








      up vote
      4
      down vote













      If you were to compare the number of reads required in 100,000 lookups with what’s involved in doing a sort, you might quickly get an idea about why the Query Optimizer figures that the CIX+Sort would be the best choice.



      The Lookup execution ends up being quicker because the pages being read are in memory (even if you clear the cache, you have a lot of rows per page, so you’re reading the same pages over and over, but with different amounts of fragmentation or different memory pressure from other activity, this might not be the case). It really wouldn’t take all that much to have CIX+Sort go quicker, but what you’re seeing is because the cost of a read doesn’t take into consideration the relative cheapness of hitting the same pages repeatedly.






      share|improve this answer

























        up vote
        4
        down vote













        If you were to compare the number of reads required in 100,000 lookups with what’s involved in doing a sort, you might quickly get an idea about why the Query Optimizer figures that the CIX+Sort would be the best choice.



        The Lookup execution ends up being quicker because the pages being read are in memory (even if you clear the cache, you have a lot of rows per page, so you’re reading the same pages over and over, but with different amounts of fragmentation or different memory pressure from other activity, this might not be the case). It really wouldn’t take all that much to have CIX+Sort go quicker, but what you’re seeing is because the cost of a read doesn’t take into consideration the relative cheapness of hitting the same pages repeatedly.






        share|improve this answer























          up vote
          4
          down vote










          up vote
          4
          down vote









          If you were to compare the number of reads required in 100,000 lookups with what’s involved in doing a sort, you might quickly get an idea about why the Query Optimizer figures that the CIX+Sort would be the best choice.



          The Lookup execution ends up being quicker because the pages being read are in memory (even if you clear the cache, you have a lot of rows per page, so you’re reading the same pages over and over, but with different amounts of fragmentation or different memory pressure from other activity, this might not be the case). It really wouldn’t take all that much to have CIX+Sort go quicker, but what you’re seeing is because the cost of a read doesn’t take into consideration the relative cheapness of hitting the same pages repeatedly.






          share|improve this answer












          If you were to compare the number of reads required in 100,000 lookups with what’s involved in doing a sort, you might quickly get an idea about why the Query Optimizer figures that the CIX+Sort would be the best choice.



          The Lookup execution ends up being quicker because the pages being read are in memory (even if you clear the cache, you have a lot of rows per page, so you’re reading the same pages over and over, but with different amounts of fragmentation or different memory pressure from other activity, this might not be the case). It really wouldn’t take all that much to have CIX+Sort go quicker, but what you’re seeing is because the cost of a read doesn’t take into consideration the relative cheapness of hitting the same pages repeatedly.







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered 9 hours ago









          Rob Farley

          13.3k12446




          13.3k12446
























              up vote
              3
              down vote














              Why SQL Server use the clustered index plus a sort algorithm instead
              of non-clustered index even if execution time is 38% faster?




              Because SQL Server uses a cost-based optimizer based on statistics, not runtime info.



              During the cost estimation process for this query, it does actually evaluate the lookup plan, but estimates it will take more effort. (Note the "Estimated Subtree Cost" when hovering over SELECT in the execution plan). That's not necessarily a bad assumption either - on my test machine, the lookup plan takes 6X the CPU of the sort/scan.



              Look to Rob Farley's answer as to why SQL Server might cost the lookup plan higher.






              share|improve this answer

























                up vote
                3
                down vote














                Why SQL Server use the clustered index plus a sort algorithm instead
                of non-clustered index even if execution time is 38% faster?




                Because SQL Server uses a cost-based optimizer based on statistics, not runtime info.



                During the cost estimation process for this query, it does actually evaluate the lookup plan, but estimates it will take more effort. (Note the "Estimated Subtree Cost" when hovering over SELECT in the execution plan). That's not necessarily a bad assumption either - on my test machine, the lookup plan takes 6X the CPU of the sort/scan.



                Look to Rob Farley's answer as to why SQL Server might cost the lookup plan higher.






                share|improve this answer























                  up vote
                  3
                  down vote










                  up vote
                  3
                  down vote










                  Why SQL Server use the clustered index plus a sort algorithm instead
                  of non-clustered index even if execution time is 38% faster?




                  Because SQL Server uses a cost-based optimizer based on statistics, not runtime info.



                  During the cost estimation process for this query, it does actually evaluate the lookup plan, but estimates it will take more effort. (Note the "Estimated Subtree Cost" when hovering over SELECT in the execution plan). That's not necessarily a bad assumption either - on my test machine, the lookup plan takes 6X the CPU of the sort/scan.



                  Look to Rob Farley's answer as to why SQL Server might cost the lookup plan higher.






                  share|improve this answer













                  Why SQL Server use the clustered index plus a sort algorithm instead
                  of non-clustered index even if execution time is 38% faster?




                  Because SQL Server uses a cost-based optimizer based on statistics, not runtime info.



                  During the cost estimation process for this query, it does actually evaluate the lookup plan, but estimates it will take more effort. (Note the "Estimated Subtree Cost" when hovering over SELECT in the execution plan). That's not necessarily a bad assumption either - on my test machine, the lookup plan takes 6X the CPU of the sort/scan.



                  Look to Rob Farley's answer as to why SQL Server might cost the lookup plan higher.







                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered 8 hours ago









                  Forrest

                  1,655516




                  1,655516






























                       

                      draft saved


                      draft discarded



















































                       


                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function () {
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f222987%2fwhy-optimiser-choose-clustered-index-sort-instead-of-non-clustered-index%23new-answer', 'question_page');
                      }
                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      Popular posts from this blog

                      404 Error Contact Form 7 ajax form submitting

                      How to know if a Active Directory user can login interactively

                      TypeError: fit_transform() missing 1 required positional argument: 'X'