Replace all NA values for variable with one row equal to 0












9














This is slightly difficult to phrase, and as far as I could see none of the similar questions answered my problem.



I have a data.frame such as:



df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3))

df1

  id val
1  a  NA
2  a  NA
3  a  NA
4  a  NA
5  b   1
6  b   2
7  b   2
8  b   3


I want to get rid of all the NA values (easy enough using e.g. filter()), but make sure that if this removes every row for one id (in this case it removes every instance of "a"), one extra row is inserted for that id with val = 0 (e.g. a = 0),



so that:



  id val
1 a 0
2 b 1
3 b 2
4 b 2
5 b 3


Obviously it's easy enough to do this in a roundabout way, but I was wondering if there's a tidy/elegant way to do it. I thought tidyr::complete() might help, but I'm not entirely sure how to apply it to a case like this.



I don't care about the order of the rows



Cheers!
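
On the tidyr::complete() idea: one way it could be applied is sketched below, as an illustration rather than a definitive answer. The ids are recorded before filtering, the NA rows are dropped, and complete() then re-adds any id that vanished, filling val with 0. The names all_ids and res are illustrative, and id is assumed to be a character column (hence stringsAsFactors = FALSE).

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3),
                  stringsAsFactors = FALSE)

# Remember every id before any rows are dropped (illustrative name).
all_ids <- unique(df1$id)

res <- df1 %>%
  filter(!is.na(val)) %>%                        # drop all NA rows
  complete(id = all_ids, fill = list(val = 0))   # re-add lost ids with val = 0

res
```

Because the NA rows are removed before complete() runs, the fill value should only ever land on ids that lost all of their rows.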










  • So you want to add rows with 0 only if all the values for a particular id are 0?
    – Ronak Shah
    3 hours ago










  • only if they're all NA for a particular id
    – Robert Hickman
    3 hours ago






  • 1




    @RobertHickman There seems to be some confusion about your desired output. Could you update your question with the expected output based on this df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) ? Thanks to @VivekKalyanarangan for the data.
    – markus
    2 hours ago
















r na complete
















asked 3 hours ago by Robert Hickman
























8 Answers


















5














Another idea using dplyr,



library(dplyr)

df1 %>%
  group_by(id) %>%
  mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>%
  na.omit()


which gives,




# A tibble: 5 x 2
# Groups: id [2]
id val
<fct> <dbl>
1 a 0
2 b 1
3 b 2
4 b 2
5 b 3























  • 1




    (+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).
    – Mikko Marttila
    1 hour ago










  • @MikkoMarttila Good suggestion. I usually try and avoid ifelse in general
    – Sotos
    1 hour ago
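
The replace() variant suggested in the comment above could look like the following. This is a minimal sketch, assuming the three-group test data from markus's comment on the question; filter(!is.na(val)) is used here in place of na.omit(), and res is an illustrative name.

```r
library(dplyr)

df1 <- data.frame(id = rep(c("a", "b", "c"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3, NA, 2, NA, 3))

res <- df1 %>%
  group_by(id) %>%
  # all(is.na(val)) * 1 is index 1 when the whole group is NA and index 0
  # otherwise; replacing at index 0 leaves the vector unchanged.
  mutate(val = replace(val, all(is.na(val)) * 1, 0)) %>%
  filter(!is.na(val)) %>%
  ungroup()

res
```

With this data, group "a" collapses to a single row with val = 0, while the NA rows of the mixed group "c" are simply dropped.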





















3














We may do



df1 %>% group_by(id) %>% do(if(all(is.na(.$val))) replace(.[1, ], 2, 0) else na.omit(.))
# A tibble: 5 x 2
# Groups: id [2]
# id val
# <fct> <dbl>
# 1 a 0
# 2 b 1
# 3 b 2
# 4 b 2
# 5 b 3


After grouping by id, if everything in val is NA, then we leave only the first row with the second element replaced by 0, otherwise the same data is returned after applying na.omit.



In a more readable format that would be



df1 %>% group_by(id) %>%
  do(if(all(is.na(.$val))) data.frame(id = .$id[1], val = 0) else na.omit(.))


(Here I presume that you indeed want to get rid of all NA values; otherwise there is no need for na.omit.)
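
The same per-group logic can probably also be written with group_modify() instead of do(). This is only a sketch, assuming a dplyr version that provides group_modify(); it is not part of the original answer. Inside the lambda, .x is the group's data without the id column, so only val needs to be returned.

```r
library(dplyr)

df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3))

res <- df1 %>%
  group_by(id) %>%
  group_modify(~ if (all(is.na(.x$val))) {
    data.frame(val = 0)        # all-NA group: a single placeholder row
  } else {
    filter(.x, !is.na(val))    # otherwise: keep the non-NA rows
  }) %>%
  ungroup()

res
```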

























  • 1




    @markus, right, I had assumed that that's the goal. Thanks!
    – Julius Vainora
    3 hours ago












  • It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky
    – Vivek Kalyanarangan
    2 hours ago






  • 1




    @VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.
    – Julius Vainora
    2 hours ago



















2














df1[is.na(df1)] <- 0
df1[!(duplicated(df1$id) & df1$val == 0), ]

id val
1 a 0
5 b 1
6 b 2
7 b 2
8 b 3






















  • 5




    Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))
    – markus
    3 hours ago












  • I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]
    – Robert Hickman
    2 hours ago
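
The test case from markus's comment can be run directly to see what happens with an id that has both NA and non-NA values. A minimal sketch (mixed, filled and res are illustrative names):

```r
mixed <- data.frame(id = rep(c("a", "b"), each = 2),
                    val = c(NA, 1, 2, 3))

filled <- mixed
filled[is.na(filled)] <- 0

# Drop rows that are both a repeated id and equal to 0.
res <- filled[!(duplicated(filled$id) & filled$val == 0), ]

res
# The first "a" row (originally NA) is kept as 0 because it is not a
# duplicated id, even though "a" also has a real value of 1.
```

Whether that extra 0 row is wanted depends on how the question is read. Note also that the result depends on row order (an NA that is not the first row of its id would be dropped), and that genuine zeros in val are treated the same way as replaced NAs.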





















1














A base R option is to find the groups whose values are all NA, set their val to 0, and keep only unique rows so that there is one row per such group. We then rbind this data frame with the rows from the groups that are not all NA (!all_NA).



all_NA <- with(df1, ave(is.na(val), id, FUN = all))
rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])

# id val
#1 a 0
#5 b 1
#6 b 2
#7 b 2
#8 b 3




The dplyr option looks ugly, but one way is to build two data frames: one from the groups whose values are all NA and one from the groups with no NA values. For each all-NA group we add a row with its id and val = 0, and bind this to the other data frame.



library(dplyr)

bind_rows(df1 %>%
            group_by(id) %>%
            filter(all(!is.na(val))),
          df1 %>%
            group_by(id) %>%
            filter(all(is.na(val))) %>%
            ungroup() %>%
            summarise(id = unique(id),
                      val = 0)) %>%
  arrange(id)


# id val
# <fct> <dbl>
#1 a 0
#2 b 1
#3 b 2
#4 b 2
#5 b 3
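
For data where some ids mix NA and non-NA values, the base R version above keeps those NA rows, since df1[!all_NA, ] selects whole groups. If the goal is to drop every NA, one possible adjustment is sketched below, using the three-group data from the comments; df2, placeholders, kept and res are illustrative names.

```r
df2 <- data.frame(id = rep(c("a", "b", "c"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3, NA, 2, NA, 3))

# TRUE for every row that belongs to a group whose values are all NA.
all_NA <- ave(is.na(df2$val), df2$id, FUN = all)

# One placeholder row (val = 0) per all-NA group.
placeholders <- unique(transform(df2[all_NA, ], val = 0))

# From the remaining groups, keep only the non-NA rows.
kept <- df2[!all_NA & !is.na(df2$val), ]

res <- rbind(placeholders, kept)
res
```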




































    1














    Changed the df to make example more exhaustive -



    df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),
                      val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))
    library(dplyr)
    df1 %>%
      group_by(id) %>%
      mutate(case=sum(is.na(val))==n(), row_num=row_number() ) %>%
      mutate(val=ifelse(is.na(val)&case,0,val)) %>%
      filter( !(case&row_num!=1) ) %>%
      select(id, val)


    Output



      id      val
    <fct> <dbl>
    1 a 0
    2 b 1
    3 b 2
    4 b 2
    5 b 3
    6 c NA
    7 c 2
    8 c NA
    9 c 3


































      1














      Here is an option too:



      df1 %>%
        mutate_if(is.factor,as.character) %>%
        mutate_all(funs(replace(.,is.na(.),0))) %>%
        slice(4:nrow(.))


      This gives:



       id val
      1 a 0
      2 b 1
      3 b 2
      4 b 2
      5 b 3


      Alternative:



      df1 %>%
        mutate_if(is.factor,as.character) %>%
        mutate_all(funs(replace(.,is.na(.),0))) %>%
        unique()
























      • 3




        where did 4 come from?
        – Sotos
        3 hours ago










      • The solution produces four 0s. We're only interested in having 1?
        – NelsonGon
        3 hours ago










      • What if one group has 4 and another 3?
        – Sotos
        3 hours ago










      • Sorry I only answered based on the question. Maybe then we could twist things up, not sure though!
        – NelsonGon
        2 hours ago










      • Consider this example - df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) I think here OP wants to remove NA values for A group only, not the rest
        – Vivek Kalyanarangan
        2 hours ago



















      0














      Here is a base R solution.



      res <- lapply(split(df1, df1$id), function(DF){
        if(anyNA(DF$val)) {
          i <- is.na(DF$val)
          DF$val[i] <- 0
          DF <- rbind(DF[i & !duplicated(DF[i, ]), ], DF[!i, ])
        }
        DF
      })
      res <- do.call(rbind, res)
      row.names(res) <- NULL
      res
      # id val
      #1 a 0
      #2 b 1
      #3 b 2
      #4 b 2
      #5 b 3


      Edit.



      A dplyr solution could be the following.
      It was tested with the original dataset posted by the OP, with the dataset in Vivek Kalyanarangan's answer and with the dataset in markus' comment, renamed df2 and df3, respectively.



      library(dplyr)

      na2zero <- function(DF){
        DF %>%
          group_by(id) %>%
          mutate(val = ifelse(is.na(val), 0, val),
                 crit = val == 0 & duplicated(val)) %>%
          filter(!crit) %>%
          select(-crit)
      }

      na2zero(df1)
      na2zero(df2)
      na2zero(df3)




























      • Rui, try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3)). Unfortunately your solution doesn't return a data frame with only three rows.
        – markus
        3 hours ago










      • @markus No, it doesn't. The NA is replaced by a 0 and the other value of val is not NA so both must be in the output. At least that's how I'm understanding the OP's problem.
        – Rui Barradas
        2 hours ago










      • Fair enough. People are reading the question differently.
        – markus
        1 hour ago



















      0














      Another base approach, one that doesn't maintain the order of the rows and takes advantage of factors remembering lost values:



      df1 <- na.omit(df1)

      df1 <- rbind(
        df1,
        data.frame(
          id = levels(df1$id)[!levels(df1$id) %in% df1$id],
          val = 0)
      )


      I do personally prefer the dplyr approach given by Sotos, as I don't like rbind-ing data.frames back together so it's a matter of taste, but this isn't unbearably complicated by my eye. It's easy enough to adapt to a character id column with a unique(df1$id) variable.
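
The character-column adaptation mentioned above might look like this. It is a minimal sketch, assuming id is stored as character (stringsAsFactors = FALSE) so there are no factor levels to fall back on; all_ids, kept, missing_ids and res are illustrative names. The ids have to be recorded before na.omit() is applied, otherwise the lost ones cannot be recovered.

```r
df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3),
                  stringsAsFactors = FALSE)

# Record every id before rows are removed.
all_ids <- unique(df1$id)

kept <- na.omit(df1)

# ids that no longer appear after dropping the NA rows.
missing_ids <- setdiff(all_ids, kept$id)

res <- rbind(
  kept,
  data.frame(id = missing_ids,
             val = rep(0, length(missing_ids)),
             stringsAsFactors = FALSE)
)
res
```

Using rep(0, length(missing_ids)) keeps the second data frame valid even when no id was lost, in which case it has zero rows and rbind() leaves kept unchanged.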





































The dplyr answer using ifelse() with row_number(): answered 2 hours ago by Sotos
















The dplyr answer using do(): answered 3 hours ago by Julius Vainora, edited 2 hours ago


















The base R answer using duplicated(): answered 3 hours ago by Adamm








        • 5




          Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))
          – markus
          3 hours ago












        • I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]
          – Robert Hickman
          2 hours ago






























        A base R option is to find the groups whose values are all NA, set their val to 0, and keep only the unique rows so that each such group contributes a single row. We then rbind this data frame with the rows from the groups that are not all NA (!all_NA).



        all_NA <- with(df1, ave(is.na(val), id, FUN = all))
        rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])

        # id val
        #1 a 0
        #5 b 1
        #6 b 2
        #7 b 2
        #8 b 3
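
        To see what the ave() flag looks like, here is a small sketch on the three-id data suggested in the comments. The name df_three is an assumption for illustration only.

```r
# Three ids: "a" is all NA, "b" has no NA, "c" is mixed
df_three <- data.frame(id = rep(c("a", "b", "c"), each = 4),
                       val = c(NA, NA, NA, NA, 1, 2, 2, 3, NA, 2, NA, 3))

# ave() applies all() within each id and recycles the result to every row,
# so the flag is TRUE only on rows whose whole group is NA
all_NA <- with(df_three, ave(is.na(val), id, FUN = all))
all_NA

res <- rbind(unique(transform(df_three[all_NA, ], val = 0)),
             df_three[!all_NA, ])
res
```

        Only "a" is collapsed to a single 0 row. The two NA values in the mixed group "c" are left in place, so add a further filter on is.na(val) if those should go as well.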




        The dplyr option looks ugly, but one way is to build two data frames: one with the groups whose values are all NA and one with the groups that have no NA values. For the all-NA groups we add a single row with its id and val = 0, then bind this to the other data frame.



        library(dplyr)

        bind_rows(
          df1 %>%
            group_by(id) %>%
            filter(all(!is.na(val))),
          df1 %>%
            group_by(id) %>%
            filter(all(is.na(val))) %>%
            ungroup() %>%
            summarise(id = unique(id),
                      val = 0)
        ) %>%
          arrange(id)


        # id val
        # <fct> <dbl>
        #1 a 0
        #2 b 1
        #3 b 2
        #4 b 2
        #5 b 3
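
        Since the question mentions tidyr::complete(), here is a possible sketch along those lines. It assumes the reading where every NA row is dropped and only ids that vanish completely get a single 0 row; res is just an illustrative name.

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3))

res <- df1 %>%
  filter(!is.na(val)) %>%                     # drop every NA row
  complete(id = unique(df1$id),               # re-add ids that disappeared
           fill = list(val = 0))              # and give them val = 0
res
```

        Passing unique(df1$id) explicitly means this does not rely on id being a factor with unused levels.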





            edited 3 hours ago
            answered 3 hours ago
            Ronak Shah














                I changed the data frame to make the example more exhaustive:



                df1 <- data.frame(id = rep(c("a", "b", "c"), each = 4),
                                  val = c(NA, NA, NA, NA, 1, 2, 2, 3, NA, 2, NA, 3))
                library(dplyr)
                df1 %>%
                  group_by(id) %>%
                  mutate(case = sum(is.na(val)) == n(), row_num = row_number()) %>%
                  mutate(val = ifelse(is.na(val) & case, 0, val)) %>%
                  filter(!(case & row_num != 1)) %>%
                  select(id, val)


                Output



                  id      val
                  <fct> <dbl>
                1 a         0
                2 b         1
                3 b         2
                4 b         2
                5 b         3
                6 c        NA
                7 c         2
                8 c        NA
                9 c         3





                    answered 3 hours ago
                    Vivek Kalyanarangan














                        Here is another option:



                        df1 %>%
                          mutate_if(is.factor, as.character) %>%
                          mutate_all(~ replace(., is.na(.), 0)) %>%
                          slice(4:nrow(.))


                        This gives:



                          id val
                        1  a   0
                        2  b   1
                        3  b   2
                        4  b   2
                        5  b   3


                        Alternative:



                        df1 %>%
                          mutate_if(is.factor, as.character) %>%
                          mutate_all(~ replace(., is.na(.), 0)) %>%
                          unique()





                        edited 2 hours ago
                        answered 3 hours ago
                        NelsonGon
                        • Where did 4 come from?
                          – Sotos
                          3 hours ago










                        • The solution produces four 0s. We're only interested in having 1?
                          – NelsonGon
                          3 hours ago










                        • What if one group has 4 and another 3?
                          – Sotos
                          3 hours ago










                        • Sorry I only answered based on the question. Maybe then we could twist things up, not sure though!
                          – NelsonGon
                          2 hours ago










                        • Consider this example - df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) I think here OP wants to remove NA values for A group only, not the rest
                          – Vivek Kalyanarangan
                          2 hours ago
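
                        One thing worth checking with the unique() alternative: it avoids the hard-coded 4, but it deduplicates whole rows, not just the zeros. A quick base R sketch on the question's data (filled and res are illustrative names):

```r
df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3))

filled <- replace(df1, is.na(df1), 0)   # every NA becomes 0
res <- unique(filled)                   # drops ALL repeated rows
res
# 4 rows: a 0, b 1, b 2, b 3 (the second "b 2" row is gone)
```

                        That is one row fewer than the output the question asks for, so unique() only fits if repeated non-NA rows are not meaningful.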




























                        Here is a base R solution.



                        res <- lapply(split(df1, df1$id), function(DF){
                          if(anyNA(DF$val)) {
                            i <- is.na(DF$val)
                            DF$val[i] <- 0
                            # keep one zero row per group; which() avoids recycling a shorter logical vector
                            DF <- rbind(DF[which(i)[!duplicated(DF[i, ])], ], DF[!i, ])
                          }
                          DF
                        })
                        res <- do.call(rbind, res)
                        row.names(res) <- NULL
                        res
                        # id val
                        #1 a 0
                        #2 b 1
                        #3 b 2
                        #4 b 2
                        #5 b 3


                        Edit.



                        A dplyr solution could be the following.
                        It was tested with the original dataset posted by the OP, with the dataset in Vivek Kalyanarangan's answer and with the dataset in markus' comment, renamed df2 and df3, respectively.



                        library(dplyr)

                        na2zero <- function(DF){
                          DF %>%
                            group_by(id) %>%
                            mutate(val = ifelse(is.na(val), 0, val),
                                   crit = val == 0 & duplicated(val)) %>%
                            filter(!crit) %>%
                            select(-crit)
                        }

                        na2zero(df1)
                        na2zero(df2)
                        na2zero(df3)





                        edited 2 hours ago
                        answered 3 hours ago
                        Rui Barradas












                        • Rui, try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3)). Unfortunately your solution doesn't return a data frame with only three rows.
                          – markus
                          3 hours ago










                        • @markus No, it doesn't. The NA is replaced by a 0 and the other value of val is not NA so both must be in the output. At least that's how I'm understanding the OP's problem.
                          – Rui Barradas
                          2 hours ago










                        • Fair enough. People are reading the question differently.
                          – markus
                          1 hour ago
































                        Another base approach, one that doesn't preserve the order of the rows and takes advantage of factors remembering the levels of dropped values:



                        df1 <- na.omit(df1)

                        df1 <- rbind(
                          df1,
                          data.frame(
                            id = levels(df1$id)[!levels(df1$id) %in% df1$id],
                            val = 0)
                        )


                        I personally prefer the dplyr approach given by Sotos, since I don't like rbind-ing data.frames back together, so it's a matter of taste, but this isn't unbearably complicated to my eye. It's easy enough to adapt to a character id column by storing unique(df1$id) in a variable before dropping the NAs.
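
                        A rough sketch of that character-column adaptation, in case it helps. The names all_ids, kept and lost are made up here, and stringsAsFactors = FALSE is set explicitly so the id column is character regardless of R version.

```r
df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3),
                  stringsAsFactors = FALSE)

all_ids <- unique(df1$id)            # remember every id before rows are dropped
kept <- na.omit(df1)                 # drop rows containing NA
lost <- setdiff(all_ids, kept$id)    # ids that disappeared entirely

# Only bind when something was lost; data.frame() would fail on a
# zero-length id paired with a length-one val
if (length(lost) > 0) {
  kept <- rbind(kept,
                data.frame(id = lost, val = 0, stringsAsFactors = FALSE))
}
kept
```

                        As with the factor version, the restored rows end up at the bottom, so row order is not preserved.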






                            answered 16 mins ago









                            CriminallyVulgar

                            1

































