How can I filter the months with fewer than 15 entries from a pandas DataFrame?












I have a MultiIndex DataFrame organized by year, month, and day that runs from 1960 to 2017. I want to check whether a month contains more than 15 NaN values.

Can someone help me figure out how to do this in an efficient way?

Thank you in advance.
Data frame



                       A         B   C         D   E         F   G         H
Year Month Day
1960 6     1    0.053142  0.632151 NaN -0.740130 NaN -1.273792 NaN -0.287078
           2    0.827514 -0.487477 NaN -0.246897 NaN -0.310194 NaN  2.150300
           3   -1.403216  0.350322 NaN  2.134335 NaN  0.023102 NaN  0.343759
           4    0.305884  0.663174 NaN -2.073908 NaN  0.400311 NaN  0.149292
           5    0.720521 -2.081981 NaN  0.672169 NaN -0.172794 NaN -0.549559
           6   -0.987216 -1.190550 NaN  0.318706 NaN  0.863885 NaN -0.995961
           7    1.781080  0.636422 NaN -0.382552 NaN -0.109566 NaN  0.410586
           8   -0.654413 -0.094920 NaN -1.763118 NaN  0.075046 NaN -1.130280
           9   -0.634353 -1.514066 NaN -0.003556 NaN -1.560351 NaN  1.001637
           10  -1.742696  1.173806 NaN  0.909725 NaN -1.428291 NaN -1.369954
10 -1.742696 1.173806 NaN 0.909725 NaN -1.428291 NaN -1.369954









  • 1





    Please put the DF in a code block instead of an image. It makes it really difficult for anyone to help you here. You also need to be a bit more explicit about whether you're after a month having 15 NaNs across all entries for that month or only in certain columns.

    – Jon Clements
    Nov 24 '18 at 22:01


















python pandas filter timestamp conditional






edited Nov 25 '18 at 15:46







M.Cerv

















asked Nov 24 '18 at 21:38









M.Cerv








1 Answer
Something like this might work. Here is an example DataFrame:



# Create a test DataFrame similar to yours.
import numpy as np
import pandas as pd

# June 1960: columns C, E and G are entirely NaN (30 NaNs in total).
df = pd.DataFrame(np.random.randn(10, 8), columns=list('ABCDEFGH'))
df[['C', 'E', 'G']] = np.nan
df['Year'] = 1960
df['Month'] = 6
df['Day'] = range(1, 11)

# July 1960: only column B is NaN (10 NaNs in total).
df2 = pd.DataFrame(np.random.randn(10, 8), columns=list('ABCDEFGH'))
df2[['B']] = np.nan
df2['Year'] = 1960
df2['Month'] = 7
df2['Day'] = range(1, 11)

new_df = pd.concat([df, df2])
new_df.set_index(['Year', 'Month', 'Day'], inplace=True)


Then you can do something like this:



# Flag the NaN values, stack the columns into the index, then count the
# NaNs in each (Year, Month) group; change `level` to group on other levels.
stackdf = pd.isna(new_df).stack().groupby(level=[0, 1]).transform('sum')

# Keep the rows of the original frame whose month has more than 15 NaNs.
new_df[new_df.index.isin(stackdf[stackdf > 15].unstack().index)]

                       A         B   C         D   E         F   G         H
Year Month Day
1960 6     1    0.053142  0.632151 NaN -0.740130 NaN -1.273792 NaN -0.287078
           2    0.827514 -0.487477 NaN -0.246897 NaN -0.310194 NaN  2.150300
           3   -1.403216  0.350322 NaN  2.134335 NaN  0.023102 NaN  0.343759
           4    0.305884  0.663174 NaN -2.073908 NaN  0.400311 NaN  0.149292
           5    0.720521 -2.081981 NaN  0.672169 NaN -0.172794 NaN -0.549559
           6   -0.987216 -1.190550 NaN  0.318706 NaN  0.863885 NaN -0.995961
           7    1.781080  0.636422 NaN -0.382552 NaN -0.109566 NaN  0.410586
           8   -0.654413 -0.094920 NaN -1.763118 NaN  0.075046 NaN -1.130280
           9   -0.634353 -1.514066 NaN -0.003556 NaN -1.560351 NaN  1.001637
           10  -1.742696  1.173806 NaN  0.909725 NaN -1.428291 NaN -1.369954


You can also see the months with fewer than 15 NaN values (use <= 15 to include months with exactly 15) by doing new_df[new_df.index.isin(stackdf[stackdf<15].unstack().index)]:



                       A   B         C         D         E         F         G         H
Year Month Day
1960 7     1    0.994542 NaN  0.488464  0.809915  0.144305 -1.092597  0.555626  0.012135
           2   -0.682796 NaN -0.781031 -0.847972  0.238397  0.364584 -0.271764  0.930113
           3    0.254320 NaN -0.474764  0.154370 -1.497867 -1.454383  0.191503  0.494441
           4    0.994579 NaN  0.362073 -0.537878 -0.512388 -0.501573  0.315398  1.377701
           5    0.623287 NaN  1.286725 -0.770290 -0.614005  0.552683  0.225974 -0.564017
           6   -0.252969 NaN -1.127418 -0.357725 -1.069318  0.218666  1.296458 -0.319678
           7    0.202788 NaN  0.385931 -0.169915  0.167754  0.821923  0.181937 -0.198668
           8   -0.272891 NaN  0.963414  0.887208 -1.903742 -2.026687  0.897575  1.148448
           9    1.398781 NaN -0.298804 -1.081953 -1.346193  0.926548  0.147855 -1.632059
           10   0.489751 NaN  0.433767  0.752071 -0.714030 -1.776365  0.247908  0.919387


Because I am using stack, this counts all NaN values in a group across every column, not those in one particular column.
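
If it helps, here is a rough sketch of the same idea without stack, plus a per-column count for the case where only certain columns matter. It assumes the index levels are named Year, Month and Day as in the example. The variable names (nan_per_row, nan_per_month, month_total, many_nan, nan_per_column) are made up for illustration, and the test frame is rebuilt with a seeded generator so the block runs on its own.

```python
import numpy as np
import pandas as pd

# Rebuild a small frame shaped like the example (values are random).
rng = np.random.default_rng(0)
index = pd.MultiIndex.from_product(
    [[1960], [6, 7], range(1, 11)], names=["Year", "Month", "Day"]
)
new_df = pd.DataFrame(
    rng.standard_normal((20, 8)), index=index, columns=list("ABCDEFGH")
)
month = new_df.index.get_level_values("Month")
new_df.loc[month == 6, ["C", "E", "G"]] = np.nan  # June: 3 columns x 10 days
new_df.loc[month == 7, ["B"]] = np.nan            # July: 1 column x 10 days

# NaN count in each row, then the total for each (Year, Month) group.
nan_per_row = new_df.isna().sum(axis=1)
nan_per_month = nan_per_row.groupby(level=["Year", "Month"]).sum()

# Broadcast each month's total back onto its rows and filter on it.
month_total = nan_per_row.groupby(level=["Year", "Month"]).transform("sum")
many_nan = new_df[month_total > 15]

# Per-column variant: NaN count of every column, month by month.
nan_per_column = new_df.isna().groupby(level=["Year", "Month"]).sum()
```

Here nan_per_month gives one number per month to compare against the threshold, and nan_per_column can be compared column by column (for example nan_per_column["C"] > 15) if the limit is meant to apply to a single column.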






        edited Nov 25 '18 at 0:49

























        answered Nov 25 '18 at 0:44









        Chris































