How to design Dimension and Fact table in Azure Data Warehouse?












0















I am working on migrating the SQL Server Databases to Azure Data Warehouse. I had dimension tables in earlier db which had dim ids (basically integer values) and those were referred in fact tables for easy data fetch.
These dim ids were generated using identity columns.



Now, in Azure Data Warehouse, the identity column values are randomly generated and I am not sure whether to use these dim ids here. If these ids are not used then text columns can be used, but that will increase the seek time in turn hitting the performance.



So, can anyone suggest how I should handle these dim ids in Azure Data Warehouse?



Regards,



Pratik










share|improve this question



























    0















    I am working on migrating the SQL Server Databases to Azure Data Warehouse. I had dimension tables in earlier db which had dim ids (basically integer values) and those were referred in fact tables for easy data fetch.
    These dim ids were generated using identity columns.



    Now, in Azure Data Warehouse, the identity column values are randomly generated and I am not sure whether to use these dim ids here. If these ids are not used then text columns can be used, but that will increase the seek time in turn hitting the performance.



    So, can anyone suggest how I should handle these dim ids in Azure Data Warehouse?



    Regards,



    Pratik










    share|improve this question

























      0












      0








      0








      I am working on migrating the SQL Server Databases to Azure Data Warehouse. I had dimension tables in earlier db which had dim ids (basically integer values) and those were referred in fact tables for easy data fetch.
      These dim ids were generated using identity columns.



      Now, in Azure Data Warehouse, the identity column values are randomly generated and I am not sure whether to use these dim ids here. If these ids are not used then text columns can be used, but that will increase the seek time in turn hitting the performance.



      So, can anyone suggest how I should handle these dim ids in Azure Data Warehouse?



      Regards,



      Pratik










      share|improve this question














      I am working on migrating the SQL Server Databases to Azure Data Warehouse. I had dimension tables in earlier db which had dim ids (basically integer values) and those were referred in fact tables for easy data fetch.
      These dim ids were generated using identity columns.



      Now, in Azure Data Warehouse, the identity column values are randomly generated and I am not sure whether to use these dim ids here. If these ids are not used then text columns can be used, but that will increase the seek time in turn hitting the performance.



      So, can anyone suggest how I should handle these dim ids in Azure Data Warehouse?



      Regards,



      Pratik







      azure data-warehouse azure-sqldw sql-data-warehouse






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Nov 23 '18 at 10:27









      Pratik SomaiyaPratik Somaiya

      1085




      1085
























          1 Answer
          1






          active

          oldest

          votes


















          2














          You're fine using identity columns in Azure SQL Data Warehouse. Those values are not "random", they're just not sequenced the same way that you'd see on an SMP version of SQL Server.



          https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity



          If you'd prefer an alternative, you can use a calculation like this to assign a surrogate key during row insertion:



          cast((select max(dim_sk) from dw.dim_xxx) + row_number() over (order by getdate()) as bigint) as dim_sk





          share|improve this answer
























          • Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.

            – Pratik Somaiya
            Nov 26 '18 at 5:24








          • 1





            Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.

            – Ron Dunn
            Nov 26 '18 at 10:30











          • Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?

            – Pratik Somaiya
            Nov 26 '18 at 10:43






          • 1





            Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.

            – Ron Dunn
            Nov 26 '18 at 13:01











          Your Answer






          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "1"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53444916%2fhow-to-design-dimension-and-fact-table-in-azure-data-warehouse%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          2














          You're fine using identity columns in Azure SQL Data Warehouse. Those values are not "random", they're just not sequenced the same way that you'd see on an SMP version of SQL Server.



          https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity



          If you'd prefer an alternative, you can use a calculation like this to assign a surrogate key during row insertion:



          cast((select max(dim_sk) from dw.dim_xxx) + row_number() over (order by getdate()) as bigint) as dim_sk





          share|improve this answer
























          • Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.

            – Pratik Somaiya
            Nov 26 '18 at 5:24








          • 1





            Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.

            – Ron Dunn
            Nov 26 '18 at 10:30











          • Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?

            – Pratik Somaiya
            Nov 26 '18 at 10:43






          • 1





            Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.

            – Ron Dunn
            Nov 26 '18 at 13:01
















          2














          You're fine using identity columns in Azure SQL Data Warehouse. Those values are not "random", they're just not sequenced the same way that you'd see on an SMP version of SQL Server.



          https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity



          If you'd prefer an alternative, you can use a calculation like this to assign a surrogate key during row insertion:



          cast((select max(dim_sk) from dw.dim_xxx) + row_number() over (order by getdate()) as bigint) as dim_sk





          share|improve this answer
























          • Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.

            – Pratik Somaiya
            Nov 26 '18 at 5:24








          • 1





            Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.

            – Ron Dunn
            Nov 26 '18 at 10:30











          • Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?

            – Pratik Somaiya
            Nov 26 '18 at 10:43






          • 1





            Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.

            – Ron Dunn
            Nov 26 '18 at 13:01














          2












          2








          2







          You're fine using identity columns in Azure SQL Data Warehouse. Those values are not "random", they're just not sequenced the same way that you'd see on an SMP version of SQL Server.



          https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity



          If you'd prefer an alternative, you can use a calculation like this to assign a surrogate key during row insertion:



          cast((select max(dim_sk) from dw.dim_xxx) + row_number() over (order by getdate()) as bigint) as dim_sk





          share|improve this answer













          You're fine using identity columns in Azure SQL Data Warehouse. Those values are not "random", they're just not sequenced the same way that you'd see on an SMP version of SQL Server.



          https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity



          If you'd prefer an alternative, you can use a calculation like this to assign a surrogate key during row insertion:



          cast((select max(dim_sk) from dw.dim_xxx) + row_number() over (order by getdate()) as bigint) as dim_sk






          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Nov 23 '18 at 12:41









          Ron DunnRon Dunn

          1,4211017




          1,4211017













          • Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.

            – Pratik Somaiya
            Nov 26 '18 at 5:24








          • 1





            Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.

            – Ron Dunn
            Nov 26 '18 at 10:30











          • Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?

            – Pratik Somaiya
            Nov 26 '18 at 10:43






          • 1





            Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.

            – Ron Dunn
            Nov 26 '18 at 13:01



















          • Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.

            – Pratik Somaiya
            Nov 26 '18 at 5:24








          • 1





            Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.

            – Ron Dunn
            Nov 26 '18 at 10:30











          • Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?

            – Pratik Somaiya
            Nov 26 '18 at 10:43






          • 1





            Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.

            – Ron Dunn
            Nov 26 '18 at 13:01

















          Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.

          – Pratik Somaiya
          Nov 26 '18 at 5:24







          Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.

          – Pratik Somaiya
          Nov 26 '18 at 5:24






          1




          1





          Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.

          – Ron Dunn
          Nov 26 '18 at 10:30





          Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.

          – Ron Dunn
          Nov 26 '18 at 10:30













          Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?

          – Pratik Somaiya
          Nov 26 '18 at 10:43





          Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?

          – Pratik Somaiya
          Nov 26 '18 at 10:43




          1




          1





          Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.

          – Ron Dunn
          Nov 26 '18 at 13:01





          Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.

          – Ron Dunn
          Nov 26 '18 at 13:01


















          draft saved

          draft discarded




















































          Thanks for contributing an answer to Stack Overflow!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53444916%2fhow-to-design-dimension-and-fact-table-in-azure-data-warehouse%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          404 Error Contact Form 7 ajax form submitting

          How to know if a Active Directory user can login interactively

          TypeError: fit_transform() missing 1 required positional argument: 'X'