Python: create a lag (t-1) data structure of multiple elements
I'm having trouble creating a time-lag column for my data. It works fine on a dataframe that contains only one kind of element, but not when the dataframe contains several different elements. For example, my dataset looks something like this:
When using the suggested command:
data1['lag_t'] = data1['total_tax'].shift(1)
I get a result like this:
As you can see, it just shifts every 'total_tax' value down one row. However, I need to compute this lag separately for EACH of the id_inf values (as separate items).
My dataset is really huge, so I need an efficient way to solve this and get a table like this as a result:
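The tables from the original post are not preserved here. As a rough illustration only, the sketch below uses made-up values (the ids and tax figures are assumptions, not the asker's data) to show how a plain shift carries the last value of one id_inf into the first row of the next:

```python
import pandas as pd

# Made-up values for illustration only; not the data from the question.
data1 = pd.DataFrame({
    "id_inf": [9, 9, 9, 54, 54, 54],
    "total_tax": [5, 6, 7, 1, 2, 3],
})

# A plain shift ignores id_inf: the row at position 3 (the first row of id 54)
# receives 7.0, which is the last value belonging to id 9.
data1["lag_t"] = data1["total_tax"].shift(1)
print(data1)
```

The desired result would instead have a missing value in the first row of every id_inf.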
python pandas dataframe shift
asked Nov 22 '18 at 22:06
PAstudilloE
1 Answer
You can group by the index and shift within each group:
import pandas as pd

# An example with made-up data.
data1 = pd.DataFrame({'id': [9, 9, 9, 54, 54, 54], 'total_tax': [5, 6, 7, 1, 2, 3]}).set_index('id')
# Shift within each id; the first row of each group has no previous value, so it is NaN.
data1['lag_t'] = data1.groupby(level=0)['total_tax'].shift()
print(data1)
    total_tax  lag_t
id
9           5    NaN
9           6    5.0
9           7    6.0
54          1    NaN
54          2    1.0
54          3    2.0
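If id_inf is a regular column rather than the index, a similar sketch could group by that column directly. The `year` column and all values below are assumptions for illustration, not taken from the question. Sorting by id and time first matters because shift works on row order, not on any notion of time:

```python
import pandas as pd

# Made-up data: 'year' is a hypothetical time column, not from the question.
data1 = pd.DataFrame({
    "id_inf": [54, 9, 9, 54, 9, 54],
    "year": [2001, 2000, 2002, 2000, 2001, 2002],
    "total_tax": [2, 5, 7, 1, 6, 3],
})

# shift() follows row order, so order the rows by time within each id first.
data1 = data1.sort_values(["id_inf", "year"]).reset_index(drop=True)

# Lag within each id_inf group; the first row of every group becomes NaN.
data1["lag_t"] = data1.groupby("id_inf")["total_tax"].shift(1)
print(data1)
```

Because the grouped shift is a built-in pandas operation rather than a Python-level lambda, it should also scale reasonably to a large dataframe.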
edited Nov 22 '18 at 22:21
answered Nov 22 '18 at 22:15
Abhi