How can I filter the months with less than 15 entries from a pandas df?
I have a multi-index dataframe organized by year, month and day that goes from 1960 to 2017, and I want to be able to check whether a month contains more than 15 NaN values.
Can someone help me figure out how to do this in an efficient way?
Thank you in advance.
Data frame
A B C D E F G H
Year Month Day
1960 6 1 0.053142 0.632151 NaN -0.740130 NaN -1.273792 NaN -0.287078
2 0.827514 -0.487477 NaN -0.246897 NaN -0.310194 NaN 2.150300
3 -1.403216 0.350322 NaN 2.134335 NaN 0.023102 NaN 0.343759
4 0.305884 0.663174 NaN -2.073908 NaN 0.400311 NaN 0.149292
5 0.720521 -2.081981 NaN 0.672169 NaN -0.172794 NaN -0.549559
6 -0.987216 -1.190550 NaN 0.318706 NaN 0.863885 NaN -0.995961
7 1.781080 0.636422 NaN -0.382552 NaN -0.109566 NaN 0.410586
8 -0.654413 -0.094920 NaN -1.763118 NaN 0.075046 NaN -1.130280
9 -0.634353 -1.514066 NaN -0.003556 NaN -1.560351 NaN 1.001637
10 -1.742696 1.173806 NaN 0.909725 NaN -1.428291 NaN -1.369954
python pandas filter timestamp conditional
asked Nov 24 '18 at 21:38 by M.Cerv (edited Nov 25 '18 at 15:46)
Please put the DF in a code block instead of an image... It makes it really difficult for anyone to help you here... You also need to be a bit more explicit about whether you're after a month having 15 NaNs across all entries for that month, or only in certain columns, etc.
– Jon Clements♦ Nov 24 '18 at 22:01
1 Answer
Something like this might work. Here is an example df:
import pandas as pd
import numpy as np
# create a test dataframe similar to yours: one month full of NaNs in C, E and G
df = pd.DataFrame(np.random.randn(10, 8), columns=list('ABCDEFGH'))
df[['C', 'E', 'G']] = np.nan
df['Year'] = 1960
df['Month'] = 6
df['Day'] = range(1, 11)
# and a second month with far fewer NaNs
df2 = pd.DataFrame(np.random.randn(10, 8), columns=list('ABCDEFGH'))
df2[['B']] = np.nan
df2['Year'] = 1960
df2['Month'] = 7
df2['Day'] = range(1, 11)
new_df = pd.concat([df, df2])
new_df.set_index(['Year', 'Month', 'Day'], inplace=True)
Then you can do something like this:
# find all NaN values, then stack and group to get the total NaN count per group
# this groups on Year and Month; change the level(s) if you want to group differently
stackdf = pd.isna(new_df).stack().groupby(level=[0, 1]).transform('sum')
# filter the original df to the rows whose index appears in the stacked df,
# restricted to rows whose group NaN count is greater than 15
new_df[new_df.index.isin(stackdf[stackdf > 15].unstack().index)]
A B C D E F G H
Year Month Day
1960 6 1 0.053142 0.632151 NaN -0.740130 NaN -1.273792 NaN -0.287078
2 0.827514 -0.487477 NaN -0.246897 NaN -0.310194 NaN 2.150300
3 -1.403216 0.350322 NaN 2.134335 NaN 0.023102 NaN 0.343759
4 0.305884 0.663174 NaN -2.073908 NaN 0.400311 NaN 0.149292
5 0.720521 -2.081981 NaN 0.672169 NaN -0.172794 NaN -0.549559
6 -0.987216 -1.190550 NaN 0.318706 NaN 0.863885 NaN -0.995961
7 1.781080 0.636422 NaN -0.382552 NaN -0.109566 NaN 0.410586
8 -0.654413 -0.094920 NaN -1.763118 NaN 0.075046 NaN -1.130280
9 -0.634353 -1.514066 NaN -0.003556 NaN -1.560351 NaN 1.001637
10 -1.742696 1.173806 NaN 0.909725 NaN -1.428291 NaN -1.369954
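If, as the title suggests, the goal is to drop the months with too many NaNs rather than keep them, the same mask can simply be negated. A minimal sketch, assuming the new_df and stackdf defined above (kept is just an illustrative name):
# keep only the months that do NOT have more than 15 NaNs in total (assumes stackdf from above)
kept = new_df[~new_df.index.isin(stackdf[stackdf > 15].unstack().index)]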
You can also see the months with fewer than 15 NaNs by doing new_df[new_df.index.isin(stackdf[stackdf < 15].unstack().index)]
A B C D E F G H
Year Month Day
1960 7 1 0.994542 NaN 0.488464 0.809915 0.144305 -1.092597 0.555626 0.012135
2 -0.682796 NaN -0.781031 -0.847972 0.238397 0.364584 -0.271764 0.930113
3 0.254320 NaN -0.474764 0.154370 -1.497867 -1.454383 0.191503 0.494441
4 0.994579 NaN 0.362073 -0.537878 -0.512388 -0.501573 0.315398 1.377701
5 0.623287 NaN 1.286725 -0.770290 -0.614005 0.552683 0.225974 -0.564017
6 -0.252969 NaN -1.127418 -0.357725 -1.069318 0.218666 1.296458 -0.319678
7 0.202788 NaN 0.385931 -0.169915 0.167754 0.821923 0.181937 -0.198668
8 -0.272891 NaN 0.963414 0.887208 -1.903742 -2.026687 0.897575 1.148448
9 1.398781 NaN -0.298804 -1.081953 -1.346193 0.926548 0.147855 -1.632059
10 0.489751 NaN 0.433767 0.752071 -0.714030 -1.776365 0.247908 0.919387
Because I am using stack, this counts all NaN values in a group, not the NaNs in one particular column.
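If you only want to count NaNs in certain columns, or would rather avoid the stack/unstack round trip, here is a minimal sketch of an alternative, assuming the same new_df as above; nan_per_month, bad_months and mask are names introduced purely for illustration:
# assumes new_df from the example above, with a (Year, Month, Day) MultiIndex
# NaN count per (Year, Month), summed across all columns
nan_per_month = new_df.isna().groupby(level=['Year', 'Month']).sum().sum(axis=1)
# months with more than 15 NaNs in total
bad_months = nan_per_month[nan_per_month > 15].index
# boolean mask over the daily rows, matched on the (Year, Month) part of the index
mask = new_df.index.droplevel('Day').isin(bad_months)
too_many_nans = new_df[mask]   # the offending months
filtered = new_df[~mask]       # everything else
# to restrict the count to particular columns, e.g. only C and E:
nan_ce = new_df[['C', 'E']].isna().groupby(level=['Year', 'Month']).sum().sum(axis=1)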
answered Nov 25 '18 at 0:44 by Chris (edited Nov 25 '18 at 0:49)