How to design Dimension and Fact table in Azure Data Warehouse?
I am working on migrating the SQL Server Databases to Azure Data Warehouse. I had dimension tables in earlier db which had dim ids (basically integer values) and those were referred in fact tables for easy data fetch.
These dim ids were generated using identity columns.
Now, in Azure Data Warehouse, the identity column values are randomly generated and I am not sure whether to use these dim ids here. If these ids are not used then text columns can be used, but that will increase the seek time in turn hitting the performance.
So, can anyone suggest how I should handle these dim ids in Azure Data Warehouse?
Regards,
Pratik
azure data-warehouse azure-sqldw sql-data-warehouse
add a comment |
I am working on migrating the SQL Server Databases to Azure Data Warehouse. I had dimension tables in earlier db which had dim ids (basically integer values) and those were referred in fact tables for easy data fetch.
These dim ids were generated using identity columns.
Now, in Azure Data Warehouse, the identity column values are randomly generated and I am not sure whether to use these dim ids here. If these ids are not used then text columns can be used, but that will increase the seek time in turn hitting the performance.
So, can anyone suggest how I should handle these dim ids in Azure Data Warehouse?
Regards,
Pratik
azure data-warehouse azure-sqldw sql-data-warehouse
add a comment |
I am working on migrating the SQL Server Databases to Azure Data Warehouse. I had dimension tables in earlier db which had dim ids (basically integer values) and those were referred in fact tables for easy data fetch.
These dim ids were generated using identity columns.
Now, in Azure Data Warehouse, the identity column values are randomly generated and I am not sure whether to use these dim ids here. If these ids are not used then text columns can be used, but that will increase the seek time in turn hitting the performance.
So, can anyone suggest how I should handle these dim ids in Azure Data Warehouse?
Regards,
Pratik
azure data-warehouse azure-sqldw sql-data-warehouse
I am working on migrating the SQL Server Databases to Azure Data Warehouse. I had dimension tables in earlier db which had dim ids (basically integer values) and those were referred in fact tables for easy data fetch.
These dim ids were generated using identity columns.
Now, in Azure Data Warehouse, the identity column values are randomly generated and I am not sure whether to use these dim ids here. If these ids are not used then text columns can be used, but that will increase the seek time in turn hitting the performance.
So, can anyone suggest how I should handle these dim ids in Azure Data Warehouse?
Regards,
Pratik
azure data-warehouse azure-sqldw sql-data-warehouse
azure data-warehouse azure-sqldw sql-data-warehouse
asked Nov 23 '18 at 10:27
Pratik SomaiyaPratik Somaiya
1085
1085
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
You're fine using identity columns in Azure SQL Data Warehouse. Those values are not "random", they're just not sequenced the same way that you'd see on an SMP version of SQL Server.
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity
If you'd prefer an alternative, you can use a calculation like this to assign a surrogate key during row insertion:
cast((select max(dim_sk) from dw.dim_xxx) + row_number() over (order by getdate()) as bigint) as dim_sk
Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.
– Pratik Somaiya
Nov 26 '18 at 5:24
1
Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.
– Ron Dunn
Nov 26 '18 at 10:30
Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?
– Pratik Somaiya
Nov 26 '18 at 10:43
1
Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.
– Ron Dunn
Nov 26 '18 at 13:01
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53444916%2fhow-to-design-dimension-and-fact-table-in-azure-data-warehouse%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
You're fine using identity columns in Azure SQL Data Warehouse. Those values are not "random", they're just not sequenced the same way that you'd see on an SMP version of SQL Server.
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity
If you'd prefer an alternative, you can use a calculation like this to assign a surrogate key during row insertion:
cast((select max(dim_sk) from dw.dim_xxx) + row_number() over (order by getdate()) as bigint) as dim_sk
Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.
– Pratik Somaiya
Nov 26 '18 at 5:24
1
Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.
– Ron Dunn
Nov 26 '18 at 10:30
Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?
– Pratik Somaiya
Nov 26 '18 at 10:43
1
Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.
– Ron Dunn
Nov 26 '18 at 13:01
add a comment |
You're fine using identity columns in Azure SQL Data Warehouse. Those values are not "random", they're just not sequenced the same way that you'd see on an SMP version of SQL Server.
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity
If you'd prefer an alternative, you can use a calculation like this to assign a surrogate key during row insertion:
cast((select max(dim_sk) from dw.dim_xxx) + row_number() over (order by getdate()) as bigint) as dim_sk
Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.
– Pratik Somaiya
Nov 26 '18 at 5:24
1
Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.
– Ron Dunn
Nov 26 '18 at 10:30
Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?
– Pratik Somaiya
Nov 26 '18 at 10:43
1
Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.
– Ron Dunn
Nov 26 '18 at 13:01
add a comment |
You're fine using identity columns in Azure SQL Data Warehouse. Those values are not "random", they're just not sequenced the same way that you'd see on an SMP version of SQL Server.
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity
If you'd prefer an alternative, you can use a calculation like this to assign a surrogate key during row insertion:
cast((select max(dim_sk) from dw.dim_xxx) + row_number() over (order by getdate()) as bigint) as dim_sk
You're fine using identity columns in Azure SQL Data Warehouse. Those values are not "random", they're just not sequenced the same way that you'd see on an SMP version of SQL Server.
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity
If you'd prefer an alternative, you can use a calculation like this to assign a surrogate key during row insertion:
cast((select max(dim_sk) from dw.dim_xxx) + row_number() over (order by getdate()) as bigint) as dim_sk
answered Nov 23 '18 at 12:41
Ron DunnRon Dunn
1,4211017
1,4211017
Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.
– Pratik Somaiya
Nov 26 '18 at 5:24
1
Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.
– Ron Dunn
Nov 26 '18 at 10:30
Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?
– Pratik Somaiya
Nov 26 '18 at 10:43
1
Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.
– Ron Dunn
Nov 26 '18 at 13:01
add a comment |
Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.
– Pratik Somaiya
Nov 26 '18 at 5:24
1
Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.
– Ron Dunn
Nov 26 '18 at 10:30
Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?
– Pratik Somaiya
Nov 26 '18 at 10:43
1
Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.
– Ron Dunn
Nov 26 '18 at 13:01
Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.
– Pratik Somaiya
Nov 26 '18 at 5:24
Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.
– Pratik Somaiya
Nov 26 '18 at 5:24
1
1
Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.
– Ron Dunn
Nov 26 '18 at 10:30
Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.
– Ron Dunn
Nov 26 '18 at 10:30
Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?
– Pratik Somaiya
Nov 26 '18 at 10:43
Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?
– Pratik Somaiya
Nov 26 '18 at 10:43
1
1
Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.
– Ron Dunn
Nov 26 '18 at 13:01
Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.
– Ron Dunn
Nov 26 '18 at 13:01
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53444916%2fhow-to-design-dimension-and-fact-table-in-azure-data-warehouse%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown