Stacking the dataframes and ranking

I couldn't quite find what I need in the existing questions; please correct me if I'm wrong. I have a number of dfs that are similar in shape and which may contain NaNs. Suppose a df that does not contain NaNs looks like this:

import numpy as np
import pandas as pd

np.random.seed(1)

mat = lambda: np.random.normal(size=10).reshape((5, 2))

df1 = pd.DataFrame(mat())
df2 = pd.DataFrame(mat())
df3 = pd.DataFrame(mat())

I want to stack df1, df2 and df3 on top of each other and then rank each value across df1, df2 and df3 (i.e. across the stack levels).

So the individual dfs in this case will look like:

[images: df1, df2, df3]

In this case, at .iloc[0, 0] we have the values 1.62, 1.46 and -1.1, so the ranked df1 would have value 3, df2 would have value 2 and df3 would have value 1. This ranking is then performed for each value across the dataframe levels. The general case will have about 16 dataframes stacked on top of each other and only 5 ranks; when there are NaNs, the df gets a rank of 0.
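To make the expected result for that one cell concrete, here is a small sketch that ranks just the three .iloc[0, 0] values quoted above. The names cell and cell_ranks are made up for illustration, and the values are the rounded ones from the text rather than the full-precision numbers.

```python
import pandas as pd

# The three values found at .iloc[0, 0], keyed by the frame they come from
cell = pd.Series({"df1": 1.62, "df2": 1.46, "df3": -1.10})

# Ascending rank: the smallest value gets rank 1, the largest gets rank 3
cell_ranks = cell.rank()
print(cell_ranks)
```

The desired output is this same ranking applied to every (row, column) position across the stacked frames.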

asked Nov 23 '18 at 12:50

i squared - Keep it Real

python pandas dataframe


1 Answer
I think you need concat with GroupBy.rank:

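# put a NaN into df1 to test the missing-value case
df1.loc[0, 1] = np.nan

# stack the frames under outer keys, rank within each original row label, turn NaN ranks into 0
df = pd.concat([df1, df2, df3], keys=('df1', 'df2', 'df3')).groupby(level=1).rank().fillna(0)
print(df)

         0    1
df1 0  3.0  0.0
    1  1.0  1.0
    2  1.0  1.0
    3  3.0  3.0
    4  3.0  1.0
df2 0  2.0  1.0
    1  2.0  2.0
    2  3.0  2.0
    3  1.0  2.0
    4  2.0  3.0
df3 0  1.0  2.0
    1  3.0  3.0
    2  2.0  3.0
    3  2.0  1.0
    4  1.0  2.0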
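For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. It rebuilds the question's sample frames and repeats the concat / groupby / rank chain; the names stacked, ranked and ranks_df1 are only illustrative, and selecting with ranked.loc["df1"] is one assumed way of getting a single frame's ranks back out.

```python
import numpy as np
import pandas as pd

np.random.seed(1)

def mat():
    # 5x2 block of standard-normal draws, as in the question
    return np.random.normal(size=10).reshape((5, 2))

df1, df2, df3 = (pd.DataFrame(mat()) for _ in range(3))

# Introduce one missing value to exercise the NaN case
df1.loc[0, 1] = np.nan

# Outer index level = frame name, inner level = original row label
stacked = pd.concat([df1, df2, df3], keys=("df1", "df2", "df3"))

# Group rows sharing the same original row label and rank each column
# within the group; NaN inputs give NaN ranks, which become 0
ranked = stacked.groupby(level=1).rank().fillna(0)

# Ranks belonging to df1 only, indexed by the original row labels
ranks_df1 = ranked.loc["df1"]
print(ranks_df1)
```

Grouping on level=1 is what makes the ranking run "through" the stack: each group holds one row from every frame, and rank is applied column by column inside that group.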

edited Nov 23 '18 at 13:00

answered Nov 23 '18 at 12:55

jezrael


beautiful, a one-liner. But why does df3 have 0 rank in .iloc[0,0]?

– i squared - Keep it Real
Nov 23 '18 at 13:00

@isquared-KeepitReal - I was testing a NaN value; to make it easier to check, the NaN is now assigned with df1.loc[0,1] = np.nan, and that cell returns 0.

– jezrael
Nov 23 '18 at 13:01
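A small sketch of why the NaN cell ends up as 0: as far as I know, rank leaves missing values as NaN by default (na_option="keep"), so it is the trailing fillna(0) that produces the zero. The Series s below is made-up data, not taken from the question.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.5, np.nan, -0.3])

# Default na_option="keep": the NaN entry is not ranked and stays NaN
ranks = s.rank()

# Replace the unranked entry with 0, as the answer does
ranks_filled = ranks.fillna(0)
print(ranks_filled)
```

Note that the remaining values are ranked only among themselves, so a group with one NaN has a top rank one lower than a complete group.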

how can this be extended to x ranks for y dataframes where x < y?

– i squared - Keep it Real
Nov 23 '18 at 13:03

@isquared-KeepitReal - hmm, not sure I understand. E.g. if you need 2 ranks for the 3 sample dfs, what is the expected output?

– jezrael
Nov 23 '18 at 13:06

I think I'm complicating things. I can just rank them from 1 to however many dfs we have, then sum them all up to get a final df, and do the ranking on that final df.

– i squared - Keep it Real
Nov 23 '18 at 14:26
