Reassigning Entries in a Column of Pandas DataFrame
My goal is to conditionally index a data frame and change the values in a column for these indexes.
I intend on looking through the column 'A' to find entries = 'a' and update their column 'B' with the word 'okay.
group = ['a']
df = pd.DataFrame({"A": [a,b,a,a,c], "B": [NaN,NaN,NaN,NaN,NaN]})
>>>df
A B
0 a NaN
1 b NaN
2 a NaN
3 a NaN
4 c NaN
df[df['A'].apply(lambda x: x in group)]['B'].fillna('okay', inplace=True)
This gives me the following error:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self._update_inplace(new_data)
Following the documentation (what I understood of it) I tried the following instead:
df[df['A'].apply(lambda x: x in group)].loc[:,'B'].fillna('okay', inplace=True)
I can't figure out why the reassignment of 'NaN' to 'okay' is not occurring inplace and how this can be rectified?
Thank you.
python pandas dataframe indexing
add a comment |
My goal is to conditionally index a data frame and change the values in a column for these indexes.
I intend on looking through the column 'A' to find entries = 'a' and update their column 'B' with the word 'okay.
group = ['a']
df = pd.DataFrame({"A": [a,b,a,a,c], "B": [NaN,NaN,NaN,NaN,NaN]})
>>>df
A B
0 a NaN
1 b NaN
2 a NaN
3 a NaN
4 c NaN
df[df['A'].apply(lambda x: x in group)]['B'].fillna('okay', inplace=True)
This gives me the following error:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self._update_inplace(new_data)
Following the documentation (what I understood of it) I tried the following instead:
df[df['A'].apply(lambda x: x in group)].loc[:,'B'].fillna('okay', inplace=True)
I can't figure out why the reassignment of 'NaN' to 'okay' is not occurring inplace and how this can be rectified?
Thank you.
python pandas dataframe indexing
add a comment |
My goal is to conditionally index a data frame and change the values in a column for these indexes.
I intend on looking through the column 'A' to find entries = 'a' and update their column 'B' with the word 'okay.
group = ['a']
df = pd.DataFrame({"A": [a,b,a,a,c], "B": [NaN,NaN,NaN,NaN,NaN]})
>>>df
A B
0 a NaN
1 b NaN
2 a NaN
3 a NaN
4 c NaN
df[df['A'].apply(lambda x: x in group)]['B'].fillna('okay', inplace=True)
This gives me the following error:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self._update_inplace(new_data)
Following the documentation (what I understood of it) I tried the following instead:
df[df['A'].apply(lambda x: x in group)].loc[:,'B'].fillna('okay', inplace=True)
I can't figure out why the reassignment of 'NaN' to 'okay' is not occurring inplace and how this can be rectified?
Thank you.
python pandas dataframe indexing
My goal is to conditionally index a data frame and change the values in a column for these indexes.
I intend on looking through the column 'A' to find entries = 'a' and update their column 'B' with the word 'okay.
group = ['a']
df = pd.DataFrame({"A": [a,b,a,a,c], "B": [NaN,NaN,NaN,NaN,NaN]})
>>>df
A B
0 a NaN
1 b NaN
2 a NaN
3 a NaN
4 c NaN
df[df['A'].apply(lambda x: x in group)]['B'].fillna('okay', inplace=True)
This gives me the following error:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self._update_inplace(new_data)
Following the documentation (what I understood of it) I tried the following instead:
df[df['A'].apply(lambda x: x in group)].loc[:,'B'].fillna('okay', inplace=True)
I can't figure out why the reassignment of 'NaN' to 'okay' is not occurring inplace and how this can be rectified?
Thank you.
python pandas dataframe indexing
python pandas dataframe indexing
asked Nov 24 '18 at 18:30
Scott LucasScott Lucas
132
132
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
Try this with lambda:
Solution First:
>>> df
A B
0 a NaN
1 b NaN
2 a NaN
3 a NaN
4 c NaN
Using lambda
+ map
or apply
..
>>> df["B"] = df["A"].map(lambda x: "okay" if "a" in x else "NaN")
OR# df["B"] = df["A"].map(lambda x: "okay" if "a" in x else np.nan)
OR# df['B'] = df['A'].apply(lambda x: 'okay' if x == 'a' else np.nan)
>>> df
A B
0 a okay
1 b NaN
2 a okay
3 a okay
4 c NaN
Solution second:
>>> df
A B
0 a NaN
1 b NaN
2 a NaN
3 a NaN
4 c NaN
another fancy way to Create Dictionary frame and apply it using map
function across the column:
>>> frame = {'a': "okay"}
>>> df['B'] = df['A'].map(frame)
>>> df
A B
0 a okay
1 b NaN
2 a okay
3 a okay
4 c NaN
Solution Third:
This is already been posted by @d_kennetz but Just want to club together, wher you can also do the assignment to both columns (A & B)in one shot:..
>>> df.loc[df.A == 'a', 'B'] = "okay"
Thank you for your reply. I realised that the question I asked didn't fully capture my problem, however, your first solution still worked. 'group' should actually contain multiple items.
– Scott Lucas
Nov 25 '18 at 18:50
If it helps then you can accept it as a answer and open a new question with new set of queries.
– pygo
Nov 25 '18 at 19:06
add a comment |
If I understand this correctly, you simply want to replace the value for a column on those rows matching a given condition (i.e. where A
column belongs to a certain group, here with a single value 'a'
). The following should do the trick:
import pandas as pd
group = ['a']
df = pd.DataFrame({"A": ['a','b','a','a','c'], "B": [None,None,None,None,None]})
print(df)
df.loc[df['A'].isin(group),'B'] = 'okay'
print(df)
What we're doing here is we're using the .loc
filter, which just returns a view on the existing dataframe.
First argument (df['A'].isin(group)
) filters on those rows matching a given criterion. Notice you can use the equality operator (==
) but not the in
operator and therefore have to use .isin()
instead).
Second argument selects only the 'B' column.
Then you just assign the desired value (which is a constant).
Here's the output:
A B
0 a None
1 b None
2 a None
3 a None
4 c None
A B
0 a okay
1 b None
2 a okay
3 a okay
4 c None
If you wanted to fancier stuff, you might want do the following:
import pandas as pd
group = ['a', 'b']
df = pd.DataFrame({"A": ['a','b','a','a','c'], "B": [None,None,None,None,None]})
df.loc[df['A'].isin(group),'B'] = "okay, it was " + df['A']+df['A']
print(df)
Which gives you:
A B
0 a okay, it was aa
1 b okay, it was bb
2 a okay, it was aa
3 a okay, it was aa
4 c None
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53461190%2freassigning-entries-in-a-column-of-pandas-dataframe%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
Try this with lambda:
Solution First:
>>> df
A B
0 a NaN
1 b NaN
2 a NaN
3 a NaN
4 c NaN
Using lambda
+ map
or apply
..
>>> df["B"] = df["A"].map(lambda x: "okay" if "a" in x else "NaN")
OR# df["B"] = df["A"].map(lambda x: "okay" if "a" in x else np.nan)
OR# df['B'] = df['A'].apply(lambda x: 'okay' if x == 'a' else np.nan)
>>> df
A B
0 a okay
1 b NaN
2 a okay
3 a okay
4 c NaN
Solution second:
>>> df
A B
0 a NaN
1 b NaN
2 a NaN
3 a NaN
4 c NaN
another fancy way to Create Dictionary frame and apply it using map
function across the column:
>>> frame = {'a': "okay"}
>>> df['B'] = df['A'].map(frame)
>>> df
A B
0 a okay
1 b NaN
2 a okay
3 a okay
4 c NaN
Solution Third:
This is already been posted by @d_kennetz but Just want to club together, wher you can also do the assignment to both columns (A & B)in one shot:..
>>> df.loc[df.A == 'a', 'B'] = "okay"
Thank you for your reply. I realised that the question I asked didn't fully capture my problem, however, your first solution still worked. 'group' should actually contain multiple items.
– Scott Lucas
Nov 25 '18 at 18:50
If it helps then you can accept it as a answer and open a new question with new set of queries.
– pygo
Nov 25 '18 at 19:06
add a comment |
Try this with lambda:
Solution First:
>>> df
A B
0 a NaN
1 b NaN
2 a NaN
3 a NaN
4 c NaN
Using lambda
+ map
or apply
..
>>> df["B"] = df["A"].map(lambda x: "okay" if "a" in x else "NaN")
OR# df["B"] = df["A"].map(lambda x: "okay" if "a" in x else np.nan)
OR# df['B'] = df['A'].apply(lambda x: 'okay' if x == 'a' else np.nan)
>>> df
A B
0 a okay
1 b NaN
2 a okay
3 a okay
4 c NaN
Solution second:
>>> df
A B
0 a NaN
1 b NaN
2 a NaN
3 a NaN
4 c NaN
another fancy way to Create Dictionary frame and apply it using map
function across the column:
>>> frame = {'a': "okay"}
>>> df['B'] = df['A'].map(frame)
>>> df
A B
0 a okay
1 b NaN
2 a okay
3 a okay
4 c NaN
Solution Third:
This is already been posted by @d_kennetz but Just want to club together, wher you can also do the assignment to both columns (A & B)in one shot:..
>>> df.loc[df.A == 'a', 'B'] = "okay"
Thank you for your reply. I realised that the question I asked didn't fully capture my problem, however, your first solution still worked. 'group' should actually contain multiple items.
– Scott Lucas
Nov 25 '18 at 18:50
If it helps then you can accept it as a answer and open a new question with new set of queries.
– pygo
Nov 25 '18 at 19:06
add a comment |
Try this with lambda:
Solution First:
>>> df
A B
0 a NaN
1 b NaN
2 a NaN
3 a NaN
4 c NaN
Using lambda
+ map
or apply
..
>>> df["B"] = df["A"].map(lambda x: "okay" if "a" in x else "NaN")
OR# df["B"] = df["A"].map(lambda x: "okay" if "a" in x else np.nan)
OR# df['B'] = df['A'].apply(lambda x: 'okay' if x == 'a' else np.nan)
>>> df
A B
0 a okay
1 b NaN
2 a okay
3 a okay
4 c NaN
Solution second:
>>> df
A B
0 a NaN
1 b NaN
2 a NaN
3 a NaN
4 c NaN
another fancy way to Create Dictionary frame and apply it using map
function across the column:
>>> frame = {'a': "okay"}
>>> df['B'] = df['A'].map(frame)
>>> df
A B
0 a okay
1 b NaN
2 a okay
3 a okay
4 c NaN
Solution Third:
This is already been posted by @d_kennetz but Just want to club together, wher you can also do the assignment to both columns (A & B)in one shot:..
>>> df.loc[df.A == 'a', 'B'] = "okay"
Try this with lambda:
Solution First:
>>> df
A B
0 a NaN
1 b NaN
2 a NaN
3 a NaN
4 c NaN
Using lambda
+ map
or apply
..
>>> df["B"] = df["A"].map(lambda x: "okay" if "a" in x else "NaN")
OR# df["B"] = df["A"].map(lambda x: "okay" if "a" in x else np.nan)
OR# df['B'] = df['A'].apply(lambda x: 'okay' if x == 'a' else np.nan)
>>> df
A B
0 a okay
1 b NaN
2 a okay
3 a okay
4 c NaN
Solution second:
>>> df
A B
0 a NaN
1 b NaN
2 a NaN
3 a NaN
4 c NaN
another fancy way to Create Dictionary frame and apply it using map
function across the column:
>>> frame = {'a': "okay"}
>>> df['B'] = df['A'].map(frame)
>>> df
A B
0 a okay
1 b NaN
2 a okay
3 a okay
4 c NaN
Solution Third:
This is already been posted by @d_kennetz but Just want to club together, wher you can also do the assignment to both columns (A & B)in one shot:..
>>> df.loc[df.A == 'a', 'B'] = "okay"
edited Nov 24 '18 at 19:47
answered Nov 24 '18 at 18:48
pygopygo
3,1381619
3,1381619
Thank you for your reply. I realised that the question I asked didn't fully capture my problem, however, your first solution still worked. 'group' should actually contain multiple items.
– Scott Lucas
Nov 25 '18 at 18:50
If it helps then you can accept it as a answer and open a new question with new set of queries.
– pygo
Nov 25 '18 at 19:06
add a comment |
Thank you for your reply. I realised that the question I asked didn't fully capture my problem, however, your first solution still worked. 'group' should actually contain multiple items.
– Scott Lucas
Nov 25 '18 at 18:50
If it helps then you can accept it as a answer and open a new question with new set of queries.
– pygo
Nov 25 '18 at 19:06
Thank you for your reply. I realised that the question I asked didn't fully capture my problem, however, your first solution still worked. 'group' should actually contain multiple items.
– Scott Lucas
Nov 25 '18 at 18:50
Thank you for your reply. I realised that the question I asked didn't fully capture my problem, however, your first solution still worked. 'group' should actually contain multiple items.
– Scott Lucas
Nov 25 '18 at 18:50
If it helps then you can accept it as a answer and open a new question with new set of queries.
– pygo
Nov 25 '18 at 19:06
If it helps then you can accept it as a answer and open a new question with new set of queries.
– pygo
Nov 25 '18 at 19:06
add a comment |
If I understand this correctly, you simply want to replace the value for a column on those rows matching a given condition (i.e. where A
column belongs to a certain group, here with a single value 'a'
). The following should do the trick:
import pandas as pd
group = ['a']
df = pd.DataFrame({"A": ['a','b','a','a','c'], "B": [None,None,None,None,None]})
print(df)
df.loc[df['A'].isin(group),'B'] = 'okay'
print(df)
What we're doing here is we're using the .loc
filter, which just returns a view on the existing dataframe.
First argument (df['A'].isin(group)
) filters on those rows matching a given criterion. Notice you can use the equality operator (==
) but not the in
operator and therefore have to use .isin()
instead).
Second argument selects only the 'B' column.
Then you just assign the desired value (which is a constant).
Here's the output:
A B
0 a None
1 b None
2 a None
3 a None
4 c None
A B
0 a okay
1 b None
2 a okay
3 a okay
4 c None
If you wanted to fancier stuff, you might want do the following:
import pandas as pd
group = ['a', 'b']
df = pd.DataFrame({"A": ['a','b','a','a','c'], "B": [None,None,None,None,None]})
df.loc[df['A'].isin(group),'B'] = "okay, it was " + df['A']+df['A']
print(df)
Which gives you:
A B
0 a okay, it was aa
1 b okay, it was bb
2 a okay, it was aa
3 a okay, it was aa
4 c None
add a comment |
If I understand this correctly, you simply want to replace the value for a column on those rows matching a given condition (i.e. where A
column belongs to a certain group, here with a single value 'a'
). The following should do the trick:
import pandas as pd
group = ['a']
df = pd.DataFrame({"A": ['a','b','a','a','c'], "B": [None,None,None,None,None]})
print(df)
df.loc[df['A'].isin(group),'B'] = 'okay'
print(df)
What we're doing here is we're using the .loc
filter, which just returns a view on the existing dataframe.
First argument (df['A'].isin(group)
) filters on those rows matching a given criterion. Notice you can use the equality operator (==
) but not the in
operator and therefore have to use .isin()
instead).
Second argument selects only the 'B' column.
Then you just assign the desired value (which is a constant).
Here's the output:
A B
0 a None
1 b None
2 a None
3 a None
4 c None
A B
0 a okay
1 b None
2 a okay
3 a okay
4 c None
If you wanted to fancier stuff, you might want do the following:
import pandas as pd
group = ['a', 'b']
df = pd.DataFrame({"A": ['a','b','a','a','c'], "B": [None,None,None,None,None]})
df.loc[df['A'].isin(group),'B'] = "okay, it was " + df['A']+df['A']
print(df)
Which gives you:
A B
0 a okay, it was aa
1 b okay, it was bb
2 a okay, it was aa
3 a okay, it was aa
4 c None
add a comment |
If I understand this correctly, you simply want to replace the value for a column on those rows matching a given condition (i.e. where A
column belongs to a certain group, here with a single value 'a'
). The following should do the trick:
import pandas as pd
group = ['a']
df = pd.DataFrame({"A": ['a','b','a','a','c'], "B": [None,None,None,None,None]})
print(df)
df.loc[df['A'].isin(group),'B'] = 'okay'
print(df)
What we're doing here is we're using the .loc
filter, which just returns a view on the existing dataframe.
First argument (df['A'].isin(group)
) filters on those rows matching a given criterion. Notice you can use the equality operator (==
) but not the in
operator and therefore have to use .isin()
instead).
Second argument selects only the 'B' column.
Then you just assign the desired value (which is a constant).
Here's the output:
A B
0 a None
1 b None
2 a None
3 a None
4 c None
A B
0 a okay
1 b None
2 a okay
3 a okay
4 c None
If you wanted to fancier stuff, you might want do the following:
import pandas as pd
group = ['a', 'b']
df = pd.DataFrame({"A": ['a','b','a','a','c'], "B": [None,None,None,None,None]})
df.loc[df['A'].isin(group),'B'] = "okay, it was " + df['A']+df['A']
print(df)
Which gives you:
A B
0 a okay, it was aa
1 b okay, it was bb
2 a okay, it was aa
3 a okay, it was aa
4 c None
If I understand this correctly, you simply want to replace the value for a column on those rows matching a given condition (i.e. where A
column belongs to a certain group, here with a single value 'a'
). The following should do the trick:
import pandas as pd
group = ['a']
df = pd.DataFrame({"A": ['a','b','a','a','c'], "B": [None,None,None,None,None]})
print(df)
df.loc[df['A'].isin(group),'B'] = 'okay'
print(df)
What we're doing here is we're using the .loc
filter, which just returns a view on the existing dataframe.
First argument (df['A'].isin(group)
) filters on those rows matching a given criterion. Notice you can use the equality operator (==
) but not the in
operator and therefore have to use .isin()
instead).
Second argument selects only the 'B' column.
Then you just assign the desired value (which is a constant).
Here's the output:
A B
0 a None
1 b None
2 a None
3 a None
4 c None
A B
0 a okay
1 b None
2 a okay
3 a okay
4 c None
If you wanted to fancier stuff, you might want do the following:
import pandas as pd
group = ['a', 'b']
df = pd.DataFrame({"A": ['a','b','a','a','c'], "B": [None,None,None,None,None]})
df.loc[df['A'].isin(group),'B'] = "okay, it was " + df['A']+df['A']
print(df)
Which gives you:
A B
0 a okay, it was aa
1 b okay, it was bb
2 a okay, it was aa
3 a okay, it was aa
4 c None
answered Nov 24 '18 at 19:20
iurlyiurly
593
593
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53461190%2freassigning-entries-in-a-column-of-pandas-dataframe%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown