Join 2 Pandas dataframes on specific columns accounting for nan
I have 2 pandas DataFrames dfA
and dfB
with X and Y columns and rows.
I need to append dfB
on dfA
, creating dfC
. I need the appending to be on columns with column names in both dfA
and dfB
.
Moreover, if column name in dfA
does not appear in dfB
, then dfC
should contain Nan values.
To simplify,
dfA
looks like,
c1 c2 c3
0 AX1 AX2 AX3
1 AY1 AY2 AY3
2 AZ1 AZ2 AZ3
while dfB
looks like,
c1 c2 c4
0 BX1 BX2 BX3
1 BY1 BY2 BY3
2 BZ1 BZ2 BZ3
such that dfC
should look like,
c1 c2 c3
0 AX1 AX2 AX3
1 AY1 AY2 AY3
2 AZ1 AZ2 AZ3
3 BX1 BX2 Nan
4 BY1 BY2 Nan
5 BZ1 BZ2 Nan
How do I do this?
python pandas
add a comment |
I have 2 pandas DataFrames dfA
and dfB
with X and Y columns and rows.
I need to append dfB
on dfA
, creating dfC
. I need the appending to be on columns with column names in both dfA
and dfB
.
Moreover, if column name in dfA
does not appear in dfB
, then dfC
should contain Nan values.
To simplify,
dfA
looks like,
c1 c2 c3
0 AX1 AX2 AX3
1 AY1 AY2 AY3
2 AZ1 AZ2 AZ3
while dfB
looks like,
c1 c2 c4
0 BX1 BX2 BX3
1 BY1 BY2 BY3
2 BZ1 BZ2 BZ3
such that dfC
should look like,
c1 c2 c3
0 AX1 AX2 AX3
1 AY1 AY2 AY3
2 AZ1 AZ2 AZ3
3 BX1 BX2 Nan
4 BY1 BY2 Nan
5 BZ1 BZ2 Nan
How do I do this?
python pandas
add a comment |
I have 2 pandas DataFrames dfA
and dfB
with X and Y columns and rows.
I need to append dfB
on dfA
, creating dfC
. I need the appending to be on columns with column names in both dfA
and dfB
.
Moreover, if column name in dfA
does not appear in dfB
, then dfC
should contain Nan values.
To simplify,
dfA
looks like,
c1 c2 c3
0 AX1 AX2 AX3
1 AY1 AY2 AY3
2 AZ1 AZ2 AZ3
while dfB
looks like,
c1 c2 c4
0 BX1 BX2 BX3
1 BY1 BY2 BY3
2 BZ1 BZ2 BZ3
such that dfC
should look like,
c1 c2 c3
0 AX1 AX2 AX3
1 AY1 AY2 AY3
2 AZ1 AZ2 AZ3
3 BX1 BX2 Nan
4 BY1 BY2 Nan
5 BZ1 BZ2 Nan
How do I do this?
python pandas
I have 2 pandas DataFrames dfA
and dfB
with X and Y columns and rows.
I need to append dfB
on dfA
, creating dfC
. I need the appending to be on columns with column names in both dfA
and dfB
.
Moreover, if column name in dfA
does not appear in dfB
, then dfC
should contain Nan values.
To simplify,
dfA
looks like,
c1 c2 c3
0 AX1 AX2 AX3
1 AY1 AY2 AY3
2 AZ1 AZ2 AZ3
while dfB
looks like,
c1 c2 c4
0 BX1 BX2 BX3
1 BY1 BY2 BY3
2 BZ1 BZ2 BZ3
such that dfC
should look like,
c1 c2 c3
0 AX1 AX2 AX3
1 AY1 AY2 AY3
2 AZ1 AZ2 AZ3
3 BX1 BX2 Nan
4 BY1 BY2 Nan
5 BZ1 BZ2 Nan
How do I do this?
python pandas
python pandas
asked Nov 23 '18 at 11:37
LucSpanLucSpan
678519
678519
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
Use concat
and filter columns of dfA
:
df = pd.concat([dfA, dfB])[dfA.columns]
Or filter columns by intersection
:
cols = dfA.columns.intersection(dfB.columns)
df = pd.concat([dfA, dfB[cols]])
add a comment |
try this,
pd.concat([dfA,dfB[['c1','c2']]])
To Avoid Hard coding columns
try this,
inter=list(set(dfA.columns)&set(dfB.columns))
print pd.concat([dfA,dfB[inter]])
1
Thanks, but with @jezrael's answer I don't have to check column names.
– LucSpan
Nov 23 '18 at 11:43
1
@LucSpan - Solution updated
– Mohamed Thasin ah
Nov 23 '18 at 11:46
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53445991%2fjoin-2-pandas-dataframes-on-specific-columns-accounting-for-nan%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
Use concat
and filter columns of dfA
:
df = pd.concat([dfA, dfB])[dfA.columns]
Or filter columns by intersection
:
cols = dfA.columns.intersection(dfB.columns)
df = pd.concat([dfA, dfB[cols]])
add a comment |
Use concat
and filter columns of dfA
:
df = pd.concat([dfA, dfB])[dfA.columns]
Or filter columns by intersection
:
cols = dfA.columns.intersection(dfB.columns)
df = pd.concat([dfA, dfB[cols]])
add a comment |
Use concat
and filter columns of dfA
:
df = pd.concat([dfA, dfB])[dfA.columns]
Or filter columns by intersection
:
cols = dfA.columns.intersection(dfB.columns)
df = pd.concat([dfA, dfB[cols]])
Use concat
and filter columns of dfA
:
df = pd.concat([dfA, dfB])[dfA.columns]
Or filter columns by intersection
:
cols = dfA.columns.intersection(dfB.columns)
df = pd.concat([dfA, dfB[cols]])
edited Nov 23 '18 at 11:42
answered Nov 23 '18 at 11:39
jezraeljezrael
331k24273351
331k24273351
add a comment |
add a comment |
try this,
pd.concat([dfA,dfB[['c1','c2']]])
To Avoid Hard coding columns
try this,
inter=list(set(dfA.columns)&set(dfB.columns))
print pd.concat([dfA,dfB[inter]])
1
Thanks, but with @jezrael's answer I don't have to check column names.
– LucSpan
Nov 23 '18 at 11:43
1
@LucSpan - Solution updated
– Mohamed Thasin ah
Nov 23 '18 at 11:46
add a comment |
try this,
pd.concat([dfA,dfB[['c1','c2']]])
To Avoid Hard coding columns
try this,
inter=list(set(dfA.columns)&set(dfB.columns))
print pd.concat([dfA,dfB[inter]])
1
Thanks, but with @jezrael's answer I don't have to check column names.
– LucSpan
Nov 23 '18 at 11:43
1
@LucSpan - Solution updated
– Mohamed Thasin ah
Nov 23 '18 at 11:46
add a comment |
try this,
pd.concat([dfA,dfB[['c1','c2']]])
To Avoid Hard coding columns
try this,
inter=list(set(dfA.columns)&set(dfB.columns))
print pd.concat([dfA,dfB[inter]])
try this,
pd.concat([dfA,dfB[['c1','c2']]])
To Avoid Hard coding columns
try this,
inter=list(set(dfA.columns)&set(dfB.columns))
print pd.concat([dfA,dfB[inter]])
edited Nov 23 '18 at 11:43
answered Nov 23 '18 at 11:39
Mohamed Thasin ahMohamed Thasin ah
3,55131540
3,55131540
1
Thanks, but with @jezrael's answer I don't have to check column names.
– LucSpan
Nov 23 '18 at 11:43
1
@LucSpan - Solution updated
– Mohamed Thasin ah
Nov 23 '18 at 11:46
add a comment |
1
Thanks, but with @jezrael's answer I don't have to check column names.
– LucSpan
Nov 23 '18 at 11:43
1
@LucSpan - Solution updated
– Mohamed Thasin ah
Nov 23 '18 at 11:46
1
1
Thanks, but with @jezrael's answer I don't have to check column names.
– LucSpan
Nov 23 '18 at 11:43
Thanks, but with @jezrael's answer I don't have to check column names.
– LucSpan
Nov 23 '18 at 11:43
1
1
@LucSpan - Solution updated
– Mohamed Thasin ah
Nov 23 '18 at 11:46
@LucSpan - Solution updated
– Mohamed Thasin ah
Nov 23 '18 at 11:46
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53445991%2fjoin-2-pandas-dataframes-on-specific-columns-accounting-for-nan%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown