How to design Dimension and Fact table in Azure Data Warehouse?

I am working on migrating the SQL Server Databases to Azure Data Warehouse. I had dimension tables in earlier db which had dim ids (basically integer values) and those were referred in fact tables for easy data fetch.
These dim ids were generated using identity columns.

Now, in Azure Data Warehouse, the identity column values are randomly generated and I am not sure whether to use these dim ids here. If these ids are not used then text columns can be used, but that will increase the seek time in turn hitting the performance.

So, can anyone suggest how I should handle these dim ids in Azure Data Warehouse?

Regards,

Pratik

asked Nov 23 '18 at 10:27

Pratik Somaiya

1085

add a comment |

So, can anyone suggest how I should handle these dim ids in Azure Data Warehouse?

Regards,

Pratik

asked Nov 23 '18 at 10:27

Pratik Somaiya

1085

add a comment |

So, can anyone suggest how I should handle these dim ids in Azure Data Warehouse?

Regards,

Pratik

asked Nov 23 '18 at 10:27

Pratik Somaiya

1085

So, can anyone suggest how I should handle these dim ids in Azure Data Warehouse?

Regards,

Pratik

azure data-warehouse azure-sqldw sql-data-warehouse

asked Nov 23 '18 at 10:27

Pratik Somaiya

1085

asked Nov 23 '18 at 10:27

Pratik Somaiya

1085

asked Nov 23 '18 at 10:27

Pratik Somaiya

1085

asked Nov 23 '18 at 10:27

Pratik Somaiya

1085

asked Nov 23 '18 at 10:27

Pratik Somaiya

1085

add a comment |

1 Answer
1

active

oldest

votes

You're fine using identity columns in Azure SQL Data Warehouse. Those values are not "random", they're just not sequenced the same way that you'd see on an SMP version of SQL Server.

https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity

If you'd prefer an alternative, you can use a calculation like this to assign a surrogate key during row insertion:

cast((select max(dim_sk) from dw.dim_xxx) + row_number() over (order by getdate()) as bigint) as dim_sk

answered Nov 23 '18 at 12:41

Ron Dunn

1,4211017

Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.

– Pratik Somaiya
Nov 26 '18 at 5:24

1

Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.

– Ron Dunn
Nov 26 '18 at 10:30

Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?

– Pratik Somaiya
Nov 26 '18 at 10:43

1

Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.

– Ron Dunn
Nov 26 '18 at 13:01

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53444916%2fhow-to-design-dimension-and-fact-table-in-azure-data-warehouse%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

You're fine using identity columns in Azure SQL Data Warehouse. Those values are not "random", they're just not sequenced the same way that you'd see on an SMP version of SQL Server.

https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity

If you'd prefer an alternative, you can use a calculation like this to assign a surrogate key during row insertion:

cast((select max(dim_sk) from dw.dim_xxx) + row_number() over (order by getdate()) as bigint) as dim_sk

answered Nov 23 '18 at 12:41

Ron Dunn

1,4211017

Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.

– Pratik Somaiya
Nov 26 '18 at 5:24

1

Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.

– Ron Dunn
Nov 26 '18 at 10:30

Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?

– Pratik Somaiya
Nov 26 '18 at 10:43

1

Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.

– Ron Dunn
Nov 26 '18 at 13:01

add a comment |

You're fine using identity columns in Azure SQL Data Warehouse. Those values are not "random", they're just not sequenced the same way that you'd see on an SMP version of SQL Server.

https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity

If you'd prefer an alternative, you can use a calculation like this to assign a surrogate key during row insertion:

cast((select max(dim_sk) from dw.dim_xxx) + row_number() over (order by getdate()) as bigint) as dim_sk

answered Nov 23 '18 at 12:41

Ron Dunn

1,4211017

Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.

– Pratik Somaiya
Nov 26 '18 at 5:24

1

Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.

– Ron Dunn
Nov 26 '18 at 10:30

Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?

– Pratik Somaiya
Nov 26 '18 at 10:43

1

Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.

– Ron Dunn
Nov 26 '18 at 13:01

add a comment |

You're fine using identity columns in Azure SQL Data Warehouse. Those values are not "random", they're just not sequenced the same way that you'd see on an SMP version of SQL Server.

https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity

If you'd prefer an alternative, you can use a calculation like this to assign a surrogate key during row insertion:

cast((select max(dim_sk) from dw.dim_xxx) + row_number() over (order by getdate()) as bigint) as dim_sk

answered Nov 23 '18 at 12:41

Ron Dunn

1,4211017

You're fine using identity columns in Azure SQL Data Warehouse. Those values are not "random", they're just not sequenced the same way that you'd see on an SMP version of SQL Server.

https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity

If you'd prefer an alternative, you can use a calculation like this to assign a surrogate key during row insertion:

cast((select max(dim_sk) from dw.dim_xxx) + row_number() over (order by getdate()) as bigint) as dim_sk

answered Nov 23 '18 at 12:41

Ron Dunn

1,4211017

answered Nov 23 '18 at 12:41

Ron Dunn

1,4211017

answered Nov 23 '18 at 12:41

Ron Dunn

1,4211017

answered Nov 23 '18 at 12:41

Ron Dunn

1,4211017

Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.

– Pratik Somaiya
Nov 26 '18 at 5:24

1

Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.

– Ron Dunn
Nov 26 '18 at 10:30

Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?

– Pratik Somaiya
Nov 26 '18 at 10:43

1

Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.

– Ron Dunn
Nov 26 '18 at 13:01

add a comment |

Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.

– Pratik Somaiya
Nov 26 '18 at 5:24

1

Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.

– Ron Dunn
Nov 26 '18 at 10:30

Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?

– Pratik Somaiya
Nov 26 '18 at 10:43

1

Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.

– Ron Dunn
Nov 26 '18 at 13:01

Thanks for your reply. Does it follows the star schema method. I want to link the dim and fact tables using identity columns as foreign keys.

– Pratik Somaiya
Nov 26 '18 at 5:24

Yes and no. You can join tables and index columns to support your outcome, but there is no referential integrity ... no direct support for primary key / foreign key concepts such as deleting rows or nulling values. You don't want that overhead in a very large DW.

– Ron Dunn
Nov 26 '18 at 10:30

Yes, DW doesn't have primary and foreign key concepts. What do you recommend us to use to connect dim and fact table. Ids or a text column?

– Pratik Somaiya
Nov 26 '18 at 10:43

Assuming that you are using ASDW because you have a large volume of data, BIGINT is probably your best data type for IDENTITY columns. You'll notice that the alternative algorithm is also cast to a BIGINT. Integer values, in general, are the most compressible and fastest comparable data types, which mean you get best performance from your DW.

– Ron Dunn
Nov 26 '18 at 13:01

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

ns6fTBf,Z5bZXBaZuc52RH3mdqBI8BPqrYgZ8Ol7p9V,ejR5,ogndQ2aj3 Mpgqivy,2dz5IsU6NXqJeEQ7n,ET0 Jv l

搜尋此網誌

Tukukkk