Replace all NA values for variable with one row equal to 0

This is slightly difficult to phrase, and as far as I could see none of the similar questions answered my problem.

I have a data.frame such as:

df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3))

df1

  id val
1  a  NA
2  a  NA
3  a  NA
4  a  NA
5  b   1
6  b   2
7  b   2
8  b   3

and I want to get rid of all the NA values (easy enough using e.g. filter()), but make sure that if this removes every row for one id value (in this case it removes every instance of "a"), one extra row is inserted for that id with val = 0

so that:

  id val
1  a   0
2  b   1
3  b   2
4  b   2
5  b   3

Obviously it's easy enough to do this in a roundabout way, but I was wondering if there's a tidy/elegant way to do it. I thought tidyr::complete() might help, but I'm not entirely sure how to apply it to a case like this.
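
For what it's worth, here is a minimal sketch of how tidyr::complete() could be applied to this case. It assumes id is converted to a factor first, because complete() expands a factor to all of its levels, including levels that no longer appear after filtering; with a plain character id the dropped "a" would not be restored. The name res is only for illustration.

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3))

res <- df1 %>%
  mutate(id = factor(id)) %>%          # assumption: a factor id keeps its levels after filtering
  filter(!is.na(val)) %>%              # drop every NA row; "a" disappears from the data
  complete(id, fill = list(val = 0))   # re-add one row per unused level, with val = 0

res
```

The design choice here is to let the factor levels remember which ids existed, so no separate lookup of the original ids is needed.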

I don't care about the order of the rows

Cheers!

asked 3 hours ago

Robert Hickman

15019

So you want to add rows with 0 only if all the values for particular id is 0?
– Ronak Shah
3 hours ago

only if they're all NA for a particular id
– Robert Hickman
3 hours ago

@RobertHickman There seems to be some confusion about your desired output. Could you update your question with the expected output based on this df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) ? Thanks to @VivekKalyanarangan for the data.
– markus
2 hours ago

r na complete


8 Answers

Another idea using dplyr,

library(dplyr)

df1 %>%
  group_by(id) %>%
  mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>%
  na.omit()

which gives,

# A tibble: 5 x 2
# Groups:   id [2]
  id      val
  <fct> <dbl>
1 a         0
2 b         1
3 b         2
4 b         2
5 b         3

answered 2 hours ago

Sotos

28.2k51640

(+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).
– Mikko Marttila
1 hour ago

@MikkoMarttila Good suggestion. I usually try and avoid ifelse in general
– Sotos
1 hour ago
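
The replace() variant suggested in the comment above can be sketched as follows. It relies on all(is.na(val)) * 1 evaluating to index 1 when the whole group is NA (so the first element is overwritten with 0) and to index 0 otherwise (so nothing is replaced). The name res and the final ungroup() are additions for illustration only.

```r
library(dplyr)

df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3))

res <- df1 %>%
  group_by(id) %>%
  # index 1 if the group is all NA, index 0 (no replacement) otherwise
  mutate(val = replace(val, all(is.na(val)) * 1, 0)) %>%
  na.omit() %>%    # remove the remaining NA rows
  ungroup()

res
```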

We may do

df1 %>% group_by(id) %>% do(if(all(is.na(.$val))) replace(.[1, ], 2, 0) else na.omit(.))
# A tibble: 5 x 2
# Groups:   id [2]
#   id      val
#   <fct> <dbl>
# 1 a         0
# 2 b         1
# 3 b         2
# 4 b         2
# 5 b         3

After grouping by id, if everything in val is NA, we keep only the first row, with its second element replaced by 0; otherwise the group's data is returned after applying na.omit.

In a more readable format that would be

df1 %>% group_by(id) %>%
  do(if(all(is.na(.$val))) data.frame(id = .$id[1], val = 0) else na.omit(.))

(Here I presume that you indeed want to get rid of all NA values; otherwise there is no need for na.omit.)
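
Since do() is superseded in newer dplyr releases, the same per-group logic could also be sketched with group_modify(). This is a swapped-in function, not the code from the answer, and it assumes dplyr 0.8.1 or later; inside the formula, .x is the group's rows without the grouping column. The name res is illustrative.

```r
library(dplyr)

df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3))

res <- df1 %>%
  group_by(id) %>%
  group_modify(~ if (all(is.na(.x$val))) {
    data.frame(val = 0)   # all-NA group: a single row with val = 0
  } else {
    na.omit(.x)           # otherwise: just drop the NA rows
  }) %>%
  ungroup()

res
```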

edited 2 hours ago

answered 3 hours ago

Julius Vainora

32.6k75979

@markus, right, I had assumed that that's the goal. Thanks!
– Julius Vainora
3 hours ago

It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky
– Vivek Kalyanarangan
2 hours ago

@VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.
– Julius Vainora
2 hours ago

df1[is.na(df1)] <- 0
df1[!(duplicated(df1$id) & df1$val == 0), ]

  id val
1  a   0
5  b   1
6  b   2
7  b   2
8  b   3

answered 3 hours ago

Adamm

832517

Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))
– markus
3 hours ago

I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]
– Robert Hickman
2 hours ago
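
To make the concern in the first comment concrete, here is a small sketch that runs this approach on the mixed data suggested there. Under the reading "drop every NA, add a 0 only for ids that lose all their rows", id "a" should keep just its non-NA row, but the shortcut leaves a 0 row behind. The names df_mixed, shortcut and intended are illustrative.

```r
# data from the comment: id "a" has one NA and one real value
df_mixed <- data.frame(id = rep(c("a", "b"), each = 2),
                       val = c(NA, 1, 2, 3))

# the approach from this answer, applied to a copy
tmp <- df_mixed
tmp[is.na(tmp)] <- 0
shortcut <- tmp[!(duplicated(tmp$id) & tmp$val == 0), ]

# under the reading above, no id is lost here, so dropping NAs is enough
intended <- na.omit(df_mixed)

nrow(shortcut)   # the first "a" row survives with val = 0
nrow(intended)
```

Note also that the shortcut cannot tell a replaced NA from a genuine 0, so real zeros in later rows of a group would be dropped as well.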

A base R option is to find the groups that are all NA, set their val to 0, and keep only unique rows so that there is one row per such group. We then rbind this data frame with the rows from the groups that are not all NA.

all_NA <- with(df1, ave(is.na(val), id, FUN = all))
rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])

#  id val
#1  a   0
#5  b   1
#6  b   2
#7  b   2
#8  b   3
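
As a small illustration of the first step, the sketch below shows what ave() returns on the question's data: one logical value per row, TRUE for every row whose id group consists only of NA. The data frame is rebuilt here so the snippet is self-contained.

```r
df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3))

# all() is evaluated within each id group and recycled to the group's rows
all_NA <- with(df1, ave(is.na(val), id, FUN = all))

all_NA
df1[all_NA, ]    # the rows that get collapsed to a single 0 row
```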

A dplyr option looks ugly, but one way is to make two data frames: one with the groups whose values are all NA and one with the groups that have no NA values. For each all-NA group we add a row with its id and val as 0, and bind this to the other data frame.

library(dplyr)

bind_rows(df1 %>%
            group_by(id) %>%
            filter(all(!is.na(val))),
          df1 %>%
            group_by(id) %>%
            filter(all(is.na(val))) %>%
            ungroup() %>%
            summarise(id = unique(id),
                      val = 0)) %>%
  arrange(id)

#   id      val
#  <fct> <dbl>
#1  a         0
#2  b         1
#3  b         2
#4  b         2
#5  b         3

edited 3 hours ago

answered 3 hours ago

Ronak Shah

32.6k103753

I changed the df to make the example more exhaustive:

df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))
library(dplyr)
df1 %>%
  group_by(id) %>%
  mutate(case=sum(is.na(val))==n(), row_num=row_number() ) %>%
  mutate(val=ifelse(is.na(val)&case,0,val)) %>%
  filter( !(case&row_num!=1) ) %>%
  select(id, val)

Output

  id      val
  <fct> <dbl>
1 a         0
2 b         1
3 b         2
4 b         2
5 b         3
6 c        NA
7 c         2
8 c        NA
9 c         3

answered 3 hours ago

Vivek Kalyanarangan

4,8911827

Here is an option too:

df1 %>%
  mutate_if(is.factor,as.character) %>%
  mutate_all(funs(replace(.,is.na(.),0))) %>%
  slice(4:nrow(.))

This gives:

Alternative:

df1 %>%
  mutate_if(is.factor,as.character) %>%
  mutate_all(funs(replace(.,is.na(.),0))) %>%
  unique()

edited 2 hours ago

answered 3 hours ago

NelsonGon

815217

where did 4 come from?
– Sotos
3 hours ago

The solution produces four 0s. We're only interested in having 1?
– NelsonGon
3 hours ago

What if one group has 4 and another 3?
– Sotos
3 hours ago

Sorry I only answered based on the question. Maybe then we could twist things up, not sure though!
– NelsonGon
2 hours ago

Consider this example - df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) I think here OP wants to remove NA values for A group only, not the rest
– Vivek Kalyanarangan
2 hours ago

Here is a base R solution.

res <- lapply(split(df1, df1$id), function(DF){
  if(anyNA(DF$val)) {
    i <- is.na(DF$val)
    DF$val[i] <- 0
    DF <- rbind(DF[i & !duplicated(DF[i, ]), ], DF[!i, ])
  }
  DF
})
res <- do.call(rbind, res)
row.names(res) <- NULL
res

#  id val
#1  a   0
#2  b   1
#3  b   2
#4  b   2
#5  b   3

Edit.

A dplyr solution could be the following.
It was tested with the original dataset posted by the OP, with the dataset in Vivek Kalyanarangan's answer and with the dataset in markus' comment, renamed df2 and df3, respectively.

library(dplyr)

na2zero <- function(DF){
  DF %>%
    group_by(id) %>%
    mutate(val = ifelse(is.na(val), 0, val),
           crit = val == 0 & duplicated(val)) %>%
    filter(!crit) %>%
    select(-crit)
}

na2zero(df1)
na2zero(df2)
na2zero(df3)

edited 2 hours ago

answered 3 hours ago

Rui Barradas

16.1k41730

Rui, try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3)). Unfortunately your solution doesn't return a data frame with only three rows.
– markus
3 hours ago

@markus No, it doesn't. The NA is replaced by a 0 and the other value of val is not NA so both must be in the output. At least that's how I'm understanding the OP's problem.
– Rui Barradas
2 hours ago

Fair enough. People are reading the question differently.
– markus
1 hour ago

Another base approach, one that doesn't maintain the order of the rows and takes advantage of factors remembering lost values:

df1 <- na.omit(df1)

df1 <- rbind(
  df1,
  data.frame(
    id  = levels(df1$id)[!levels(df1$id) %in% df1$id],
    val = 0)
  )

I do personally prefer the dplyr approach given by Sotos, as I don't like rbind-ing data.frames back together, so it's a matter of taste, but this isn't unbearably complicated to my eye. It's easy enough to adapt to a character id column by storing unique(df1$id) in a variable before removing the NAs.
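
A minimal sketch of that character-id adaptation might look like the following. The names all_ids, kept, missing_ids and res are illustrative, and stringsAsFactors = FALSE is set explicitly so that id is a character column regardless of R version (from R 4.0.0 data.frame() no longer creates factors by default). Using rep(0, length(missing_ids)) rather than a bare 0 is a deliberate choice so the extra data frame simply has zero rows when no id was lost.

```r
df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3),
                  stringsAsFactors = FALSE)

all_ids <- unique(df1$id)                  # remember every id before dropping rows
kept <- na.omit(df1)                       # remove all NA rows
missing_ids <- setdiff(all_ids, kept$id)   # ids that lost every row

res <- rbind(
  kept,
  data.frame(id  = missing_ids,
             val = rep(0, length(missing_ids)),
             stringsAsFactors = FALSE)
)

res
```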

answered 16 mins ago

CriminallyVulgar

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54022536%2freplace-all-na-values-for-variable-with-one-row-equal-to-0%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

8 Answers
8

active

oldest

votes

8 Answers
8

active

oldest

votes

Another idea using dplyr,

library(dplyr)



df1 %>% 

 group_by(id) %>% 

 mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>% 

 na.omit()

which gives,

# A tibble: 5 x 2

# Groups:   id [2]

  id      val

  <fct> <dbl>

1 a         0

2 b         1

3 b         2

4 b         2

5 b         3

answered 2 hours ago

Sotos

28.2k51640

1

(+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).
– Mikko Marttila
1 hour ago

@MikkoMarttila Good suggestion. I usually try and avoid ifelse in general
– Sotos
1 hour ago

add a comment |

Another idea using dplyr,

library(dplyr)



df1 %>% 

 group_by(id) %>% 

 mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>% 

 na.omit()

which gives,

# A tibble: 5 x 2

# Groups:   id [2]

  id      val

  <fct> <dbl>

1 a         0

2 b         1

3 b         2

4 b         2

5 b         3

answered 2 hours ago

Sotos

28.2k51640

1

(+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).
– Mikko Marttila
1 hour ago

@MikkoMarttila Good suggestion. I usually try and avoid ifelse in general
– Sotos
1 hour ago

add a comment |

Another idea using dplyr,

library(dplyr)



df1 %>% 

 group_by(id) %>% 

 mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>% 

 na.omit()

which gives,

# A tibble: 5 x 2

# Groups:   id [2]

  id      val

  <fct> <dbl>

1 a         0

2 b         1

3 b         2

4 b         2

5 b         3

answered 2 hours ago

Sotos

28.2k51640

Another idea using dplyr,

library(dplyr)



df1 %>% 

 group_by(id) %>% 

 mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>% 

 na.omit()

which gives,

# A tibble: 5 x 2

# Groups:   id [2]

  id      val

  <fct> <dbl>

1 a         0

2 b         1

3 b         2

4 b         2

5 b         3

answered 2 hours ago

Sotos

28.2k51640

answered 2 hours ago

Sotos

28.2k51640

answered 2 hours ago

Sotos

28.2k51640

answered 2 hours ago

Sotos

28.2k51640

1

(+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).
– Mikko Marttila
1 hour ago

@MikkoMarttila Good suggestion. I usually try and avoid ifelse in general
– Sotos
1 hour ago

add a comment |

1

(+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).
– Mikko Marttila
1 hour ago

@MikkoMarttila Good suggestion. I usually try and avoid ifelse in general
– Sotos
1 hour ago

(+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).
– Mikko Marttila
1 hour ago

@MikkoMarttila Good suggestion. I usually try and avoid ifelse in general
– Sotos
1 hour ago

add a comment |

We may do

df1 %>% group_by(id) %>% do(if(all(is.na(.$val))) replace(.[1, ], 2, 0) else na.omit(.))

# A tibble: 5 x 2

# Groups:   id [2]

#   id      val

#   <fct> <dbl>

# 1 a         0

# 2 b         1

# 3 b         2

# 4 b         2

# 5 b         3

After grouping by id, if everything in val is NA, then we leave only the first row with the second element replaced by 0, otherwise the same data is returned after applying na.omit.

In a more readable format that would be

df1 %>% group_by(id) %>% 

  do(if(all(is.na(.$val))) data.frame(id = .$id[1], val = 0) else na.omit(.))

(Here I presume that you indeed want to get rid of all NA values; otherwise there is no need for na.omit.)

edited 2 hours ago

answered 3 hours ago

Julius Vainora

32.6k75979

1

@markus, right, I had assumed that that's the goal. Thanks!
– Julius Vainora
3 hours ago

It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky
– Vivek Kalyanarangan
2 hours ago

1

@VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.
– Julius Vainora
2 hours ago

add a comment |

We may do

df1 %>% group_by(id) %>% do(if(all(is.na(.$val))) replace(.[1, ], 2, 0) else na.omit(.))

# A tibble: 5 x 2

# Groups:   id [2]

#   id      val

#   <fct> <dbl>

# 1 a         0

# 2 b         1

# 3 b         2

# 4 b         2

# 5 b         3

After grouping by id, if everything in val is NA, then we leave only the first row with the second element replaced by 0, otherwise the same data is returned after applying na.omit.

In a more readable format that would be

df1 %>% group_by(id) %>% 

  do(if(all(is.na(.$val))) data.frame(id = .$id[1], val = 0) else na.omit(.))

(Here I presume that you indeed want to get rid of all NA values; otherwise there is no need for na.omit.)

edited 2 hours ago

answered 3 hours ago

Julius Vainora

32.6k75979

1

@markus, right, I had assumed that that's the goal. Thanks!
– Julius Vainora
3 hours ago

It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky
– Vivek Kalyanarangan
2 hours ago

1

@VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.
– Julius Vainora
2 hours ago

add a comment |

We may do

df1 %>% group_by(id) %>% do(if(all(is.na(.$val))) replace(.[1, ], 2, 0) else na.omit(.))

# A tibble: 5 x 2

# Groups:   id [2]

#   id      val

#   <fct> <dbl>

# 1 a         0

# 2 b         1

# 3 b         2

# 4 b         2

# 5 b         3

After grouping by id, if everything in val is NA, then we leave only the first row with the second element replaced by 0, otherwise the same data is returned after applying na.omit.

In a more readable format that would be

df1 %>% group_by(id) %>% 

  do(if(all(is.na(.$val))) data.frame(id = .$id[1], val = 0) else na.omit(.))

(Here I presume that you indeed want to get rid of all NA values; otherwise there is no need for na.omit.)

edited 2 hours ago

answered 3 hours ago

Julius Vainora

32.6k75979

We may do

df1 %>% group_by(id) %>% do(if(all(is.na(.$val))) replace(.[1, ], 2, 0) else na.omit(.))

# A tibble: 5 x 2

# Groups:   id [2]

#   id      val

#   <fct> <dbl>

# 1 a         0

# 2 b         1

# 3 b         2

# 4 b         2

# 5 b         3

After grouping by id, if everything in val is NA, then we leave only the first row with the second element replaced by 0, otherwise the same data is returned after applying na.omit.

In a more readable format that would be

df1 %>% group_by(id) %>% 

  do(if(all(is.na(.$val))) data.frame(id = .$id[1], val = 0) else na.omit(.))

(Here I presume that you indeed want to get rid of all NA values; otherwise there is no need for na.omit.)

edited 2 hours ago

answered 3 hours ago

Julius Vainora

32.6k75979

edited 2 hours ago

answered 3 hours ago

Julius Vainora

32.6k75979

answered 3 hours ago

Julius Vainora

32.6k75979

answered 3 hours ago

Julius Vainora

32.6k75979

1

@markus, right, I had assumed that that's the goal. Thanks!
– Julius Vainora
3 hours ago

It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky
– Vivek Kalyanarangan
2 hours ago

1

@VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.
– Julius Vainora
2 hours ago

add a comment |

1

@markus, right, I had assumed that that's the goal. Thanks!
– Julius Vainora
3 hours ago

It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky
– Vivek Kalyanarangan
2 hours ago

1

@VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.
– Julius Vainora
2 hours ago

@markus, right, I had assumed that that's the goal. Thanks!
– Julius Vainora
3 hours ago

It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky
– Vivek Kalyanarangan
2 hours ago

@VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.
– Julius Vainora
2 hours ago

add a comment |

df1[is.na(df1)] <- 0

df1[!(duplicated(df1$id) & df1$val == 0), ]



  id val

1  a   0

5  b   1

6  b   2

7  b   2

8  b   3

answered 3 hours ago

Adamm

832517

5

Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))
– markus
3 hours ago

I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]
– Robert Hickman
2 hours ago

add a comment |

df1[is.na(df1)] <- 0

df1[!(duplicated(df1$id) & df1$val == 0), ]



  id val

1  a   0

5  b   1

6  b   2

7  b   2

8  b   3

answered 3 hours ago

Adamm

832517

5

Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))
– markus
3 hours ago

I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]
– Robert Hickman
2 hours ago

add a comment |

df1[is.na(df1)] <- 0

df1[!(duplicated(df1$id) & df1$val == 0), ]



  id val

1  a   0

5  b   1

6  b   2

7  b   2

8  b   3

answered 3 hours ago

Adamm

832517

df1[is.na(df1)] <- 0

df1[!(duplicated(df1$id) & df1$val == 0), ]



  id val

1  a   0

5  b   1

6  b   2

7  b   2

8  b   3

answered 3 hours ago

Adamm

832517

answered 3 hours ago

Adamm

832517

answered 3 hours ago

Adamm

832517

answered 3 hours ago

Adamm

832517

5

Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))
– markus
3 hours ago

I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]
– Robert Hickman
2 hours ago

add a comment |

5

Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))
– markus
3 hours ago

I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]
– Robert Hickman
2 hours ago

Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))
– markus
3 hours ago

I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]
– Robert Hickman
2 hours ago

add a comment |

all_NA <- with(df1, ave(is.na(val), id, FUN = all))

rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])



#  id val

#1  a   0

#5  b   1

#6  b   2

#7  b   2

#8  b   3

library(dplyr)



bind_rows(df1 %>%

            group_by(id) %>%

            filter(all(!is.na(val))), 

          df1 %>%

             group_by(id) %>%

             filter(all(is.na(val))) %>%

             ungroup() %>%

             summarise(id = unique(id), 

                       val = 0)) %>%

arrange(id)





#   id      val

#  <fct> <dbl>

#1  a         0

#2  b         1

#3  b         2

#4  b         2

#5  b         3

edited 3 hours ago

answered 3 hours ago

Ronak Shah

32.6k103753

add a comment |

all_NA <- with(df1, ave(is.na(val), id, FUN = all))

rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])



#  id val

#1  a   0

#5  b   1

#6  b   2

#7  b   2

#8  b   3

library(dplyr)



bind_rows(df1 %>%

            group_by(id) %>%

            filter(all(!is.na(val))), 

          df1 %>%

             group_by(id) %>%

             filter(all(is.na(val))) %>%

             ungroup() %>%

             summarise(id = unique(id), 

                       val = 0)) %>%

arrange(id)





#   id      val

#  <fct> <dbl>

#1  a         0

#2  b         1

#3  b         2

#4  b         2

#5  b         3

edited 3 hours ago

answered 3 hours ago

Ronak Shah

32.6k103753

add a comment |

all_NA <- with(df1, ave(is.na(val), id, FUN = all))

rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])



#  id val

#1  a   0

#5  b   1

#6  b   2

#7  b   2

#8  b   3

library(dplyr)



bind_rows(df1 %>%

            group_by(id) %>%

            filter(all(!is.na(val))), 

          df1 %>%

             group_by(id) %>%

             filter(all(is.na(val))) %>%

             ungroup() %>%

             summarise(id = unique(id), 

                       val = 0)) %>%

arrange(id)





#   id      val

#  <fct> <dbl>

#1  a         0

#2  b         1

#3  b         2

#4  b         2

#5  b         3

edited 3 hours ago

answered 3 hours ago

Ronak Shah

32.6k103753

all_NA <- with(df1, ave(is.na(val), id, FUN = all))

rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])



#  id val

#1  a   0

#5  b   1

#6  b   2

#7  b   2

#8  b   3

library(dplyr)



bind_rows(df1 %>%

            group_by(id) %>%

            filter(all(!is.na(val))), 

          df1 %>%

             group_by(id) %>%

             filter(all(is.na(val))) %>%

             ungroup() %>%

             summarise(id = unique(id), 

                       val = 0)) %>%

arrange(id)





#   id      val

#  <fct> <dbl>

#1  a         0

#2  b         1

#3  b         2

#4  b         2

#5  b         3

edited 3 hours ago

answered 3 hours ago

Ronak Shah

32.6k103753

edited 3 hours ago

answered 3 hours ago

Ronak Shah

32.6k103753

answered 3 hours ago

Ronak Shah

32.6k103753

answered 3 hours ago

Ronak Shah

32.6k103753

add a comment |

Changed the df to make example more exhaustive -

df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),

                  val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))

library(dplyr)

df1 %>%

  group_by(id) %>%

  mutate(case=sum(is.na(val))==n(), row_num=row_number() ) %>%

  mutate(val=ifelse(is.na(val)&case,0,val)) %>%

  filter( !(case&row_num!=1) ) %>%

  select(id, val)

Output

  id      val

  <fct> <dbl>

1 a         0

2 b         1

3 b         2

4 b         2

5 b         3

6 c        NA

7 c         2

8 c        NA

9 c         3

answered 3 hours ago

Vivek Kalyanarangan

4,8911827

add a comment |

Changed the df to make example more exhaustive -

df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),

                  val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))

library(dplyr)

df1 %>%

  group_by(id) %>%

  mutate(case=sum(is.na(val))==n(), row_num=row_number() ) %>%

  mutate(val=ifelse(is.na(val)&case,0,val)) %>%

  filter( !(case&row_num!=1) ) %>%

  select(id, val)

Output

  id      val

  <fct> <dbl>

1 a         0

2 b         1

3 b         2

4 b         2

5 b         3

6 c        NA

7 c         2

8 c        NA

9 c         3

answered 3 hours ago

Vivek Kalyanarangan

4,8911827

add a comment |

Changed the df to make example more exhaustive -

df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),

                  val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))

library(dplyr)

df1 %>%

  group_by(id) %>%

  mutate(case=sum(is.na(val))==n(), row_num=row_number() ) %>%

  mutate(val=ifelse(is.na(val)&case,0,val)) %>%

  filter( !(case&row_num!=1) ) %>%

  select(id, val)

Output

  id      val

  <fct> <dbl>

1 a         0

2 b         1

3 b         2

4 b         2

5 b         3

6 c        NA

7 c         2

8 c        NA

9 c         3

answered 3 hours ago

Vivek Kalyanarangan

4,8911827

Changed the df to make example more exhaustive -

df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),

                  val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))

library(dplyr)

df1 %>%

  group_by(id) %>%

  mutate(case=sum(is.na(val))==n(), row_num=row_number() ) %>%

  mutate(val=ifelse(is.na(val)&case,0,val)) %>%

  filter( !(case&row_num!=1) ) %>%

  select(id, val)

Output

  id      val

  <fct> <dbl>

1 a         0

2 b         1

3 b         2

4 b         2

5 b         3

6 c        NA

7 c         2

8 c        NA

9 c         3

answered 3 hours ago

Vivek Kalyanarangan

4,8911827

answered 3 hours ago

Vivek Kalyanarangan

4,8911827

answered 3 hours ago

Vivek Kalyanarangan

4,8911827

answered 3 hours ago

Vivek Kalyanarangan

4,8911827

add a comment |

Here is an option too:

df1 %>% 

  mutate_if(is.factor,as.character) %>% 

 mutate_all(funs(replace(.,is.na(.),0))) %>% 

  slice(4:nrow(.))

This gives:

Alternative:

df1 %>% 

  mutate_if(is.factor,as.character) %>% 

 mutate_all(funs(replace(.,is.na(.),0))) %>% 

  unique()

edited 2 hours ago

answered 3 hours ago

NelsonGon

815217

3

where did 4 come from?
– Sotos
3 hours ago

The solution produces four 0s. We're only interested in having 1?
– NelsonGon
3 hours ago

What if one group has 4 and another 3?
– Sotos
3 hours ago

Sorry I only answered based on the question. Maybe then we could twist things up, not sure though!
– NelsonGon
2 hours ago

Consider this example - df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) I think here OP wants to remove NA values for A group only, not the rest
– Vivek Kalyanarangan
2 hours ago

Here is a base R solution.

res <- lapply(split(df1, df1$id), function(DF){
  if(anyNA(DF$val)) {
    i <- is.na(DF$val)
    DF$val[i] <- 0
    # keep only the first zero-filled row, plus every non-NA row
    DF <- rbind(DF[which(i)[1], ], DF[!i, ])
  }
  DF
})
res <- do.call(rbind, res)
row.names(res) <- NULL
res
#  id val
#1  a   0
#2  b   1
#3  b   2
#4  b   2
#5  b   3

Edit.

library(dplyr)

# Replace NA with 0 within each id, then drop repeated zeros
# so that at most one 0 row per id remains.
na2zero <- function(DF){
  DF %>%
    group_by(id) %>%
    mutate(val = ifelse(is.na(val), 0, val),
           crit = val == 0 & duplicated(val)) %>%
    filter(!crit) %>%
    select(-crit)
}

na2zero(df1)
na2zero(df2)
na2zero(df3)

edited 2 hours ago

answered 3 hours ago

Rui Barradas

16.1k41730

Rui, try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3)). Unfortunately your solution doesn't return a data frame with only three rows.
– markus
3 hours ago

@markus No, it doesn't. The NA is replaced by a 0 and the other value of val is not NA so both must be in the output. At least that's how I'm understanding the OP's problem.
– Rui Barradas
2 hours ago

Fair enough. People are reading the question differently.
– markus
1 hour ago
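To make the two readings discussed in these comments concrete, here is a small base R sketch using the data frame from markus's comment. The names df, kept, lost, reading1 and reading2 are chosen for this illustration only.

```r
df <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))

# Reading 1: drop every NA, add a 0 row only for ids that vanish entirely
kept <- df[!is.na(df$val), ]
lost <- setdiff(unique(df$id), kept$id)
reading1 <- rbind(kept,
                  data.frame(id = lost, val = rep(0, length(lost))))

# Reading 2: turn each NA into 0, then drop repeated 0 rows within an id
reading2 <- df
reading2$val[is.na(reading2$val)] <- 0
reading2 <- reading2[!(reading2$val == 0 & duplicated(reading2)), ]

nrow(reading1)  # 3 rows: the NA row of "a" is simply dropped
nrow(reading2)  # 4 rows: the NA row of "a" becomes a 0 row
```

The two results differ only for ids that mix NA and non-NA values, which is exactly the case the question's original example does not cover.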

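Another base approach. It does not preserve the row order, and it relies on id being a factor, because a factor keeps the levels of values that na.omit() removed (in R 4.0.0 and later, create the data frame with stringsAsFactors = TRUE):

df1 <- na.omit(df1)

# ids whose level is still in the factor but which have no rows left
lost <- setdiff(levels(df1$id), df1$id)

df1 <- rbind(
  df1,
  data.frame(
    id  = lost,
    val = rep(0, length(lost)))  # zero rows when no id was lost
  )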
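
The question mentions tidyr::complete(). Since complete() expands a factor to its full set of levels, the same "factors remember lost values" idea can be sketched with it as follows. This assumes dplyr and tidyr are installed and that id is a factor; it is an illustration, not a tested drop-in for every data set.

```r
library(dplyr)
library(tidyr)

# id must be a factor so that the level "a" survives the filter step
df1 <- data.frame(id  = factor(rep(c("a", "b"), each = 4)),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3))

res <- df1 %>%
  filter(!is.na(val)) %>%              # drop every NA row
  complete(id, fill = list(val = 0))   # re-add missing levels with val = 0

res
```

Here filter() removes all rows of "a", and complete() restores a single row for that level with val filled as 0.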

answered 16 mins ago

CriminallyVulgar
