Stacking the dataframes and ranking
I can't quite find what I need in the existing questions, so please correct me if I'm wrong. I have a number of dfs that are similar in shape and which may contain NaNs. Suppose dfs that do not contain NaNs look like this:
import numpy as np
import pandas as pd

np.random.seed(1)
mat = lambda: np.random.normal(size=10).reshape((5, 2))
df1 = pd.DataFrame(mat())
df2 = pd.DataFrame(mat())
df3 = pd.DataFrame(mat())
I want to stack df1, df2 and df3 on top of each other and then rank each value across df1, df2 and df3 (i.e. across the stack levels).
The individual dfs in this case are df1, df2 and df3 from the snippet above.
In this case, position .iloc[0, 0] holds the values 1.62, 1.46 and -1.1, so the ranked df1 would have the value 3 there, df2 the value 2 and df3 the value 1. This ranking is then performed for every position across the dataframe levels. The general case will have about 16 dataframes stacked on top of each other and only 5 ranks. Where a value is NaN, that df gets a rank of 0 at that position.
python pandas dataframe
asked Nov 23 '18 at 12:50
i squared - Keep it Real
1 Answer
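I think you need concat with GroupBy.rank:
df1.loc[0, 1] = np.nan  # inject a NaN to test the rank-0 case
df = pd.concat([df1, df2, df3], keys=('df1', 'df2', 'df3')).groupby(level=1).rank().fillna(0)
print(df)
         0    1
df1 0  3.0  0.0
    1  1.0  1.0
    2  1.0  1.0
    3  3.0  3.0
    4  3.0  1.0
df2 0  2.0  1.0
    1  2.0  2.0
    2  3.0  2.0
    3  1.0  2.0
    4  2.0  3.0
df3 0  1.0  2.0
    1  3.0  3.0
    2  2.0  3.0
    3  2.0  1.0
    4  1.0  2.0
Here keys adds an outer index level naming each frame, groupby(level=1) groups the rows by their original row label, rank() ranks each column within those groups (i.e. across the frames), and fillna(0) turns the NaN rank of a missing value into 0.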
edited Nov 23 '18 at 13:00
answered Nov 23 '18 at 12:55
jezrael
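For readers who want to run the idea end to end, here is a minimal self-contained sketch of the concat plus GroupBy.rank approach. It is an illustration rather than part of the original answer: the names `frames`, `stacked`, `ranked` and `ranked_frames` are made up for this example, and the final step that splits the result back into one ranked frame per input is an added convenience.

```python
import numpy as np
import pandas as pd

np.random.seed(1)
make = lambda: pd.DataFrame(np.random.normal(size=10).reshape((5, 2)))
frames = {"df1": make(), "df2": make(), "df3": make()}
frames["df1"].loc[0, 1] = np.nan  # inject one missing value, as in the answer

# Outer index level = frame name, inner level = original row label.
stacked = pd.concat(list(frames.values()), keys=list(frames))

# Rank within each original row label, i.e. across the stacked frames.
# Missing values keep a NaN rank, which fillna(0) then turns into 0.
ranked = stacked.groupby(level=1).rank().fillna(0)

# Split the result back into one ranked DataFrame per input frame.
ranked_frames = {name: ranked.xs(name, level=0) for name in frames}
print(ranked_frames["df1"])
```

With the seed from the question, position [0, 0] holds 1.62, 1.46 and -1.1, so the three ranked frames carry 3, 2 and 1 there, and the injected NaN in df1 comes out as 0.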
Beautiful, a one-liner. But why does df3 have 0 rank in .iloc[0,0]?
– i squared - Keep it Real
Nov 23 '18 at 13:00
@isquared-KeepitReal - I was testing a NaN value; for an easier check it is now assigned with df1.loc[0,1] = np.nan, and it returns 0.
– jezrael
Nov 23 '18 at 13:01
How can this be extended to x ranks for y dataframes where x < y?
– i squared - Keep it Real
Nov 23 '18 at 13:03
@isquared-KeepitReal - hmm, not sure I understand. E.g. if you need rank 2 for the sample of 3 dfs, what is the expected output?
– jezrael
Nov 23 '18 at 13:06
I think I'm complicating things. I can just rank them from 1 to however many dfs we have, then sum them all up to get a final df, and do the ranking on that final df.
– i squared - Keep it Real
Nov 23 '18 at 14:26
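The "x ranks for y dataframes" question in the comments was never pinned down, so the following is only one possible reading, not something either commenter proposed: compute the ordinal rank at each position and compress it into x buckets in proportion to the number of non-missing values. The 16 frames and 5 ranks are the figures from the question; the names `n_frames`, `n_ranks`, `by_row` and `buckets` and the bucketing rule itself are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

n_frames, n_ranks = 16, 5  # y dataframes, x ranks (figures from the question)

rng = np.random.RandomState(1)
frames = [pd.DataFrame(rng.normal(size=(5, 2))) for _ in range(n_frames)]
frames[0].loc[0, 1] = np.nan  # one missing value to exercise the rank-0 rule

stacked = pd.concat(frames, keys=list(range(n_frames)))
by_row = stacked.groupby(level=1)

# Ordinal rank 1..k within each cell position; missing values stay NaN.
ranks = by_row.rank(method="first")
# k = number of non-missing values at that position.
counts = by_row.transform("count")

# Map ranks 1..k onto buckets 1..n_ranks, then give missing values bucket 0.
buckets = np.ceil(ranks * n_ranks / counts).fillna(0).astype(int)
print(buckets.xs(0, level=0))
```

Whether ties should share a bucket, or whether the buckets should instead come from quantiles of the values, depends on the intended meaning of "5 ranks"; the sketch above simply splits the ordering into roughly equal-sized groups.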