Remove rows with all or some NAs (missing values) in data.frame

R Programming

I want to remove the rows from this data frame that have NAs across all columns:

                a b  c  d  e  f
1 YASH00000206234 0 NA NA NA NA
2 YASH00000199774 0  2  2  2  2
3 YASH00000221722 0 NA NA NA NA
4 YASH00000207704 0 NA NA  1  2
5 YASH00000207531 0 NA NA NA NA
6 YASH00000221412 0  1  2  3  2

I would like to get the following data frame:

                a b c d e f
2 YASH00000199774 0 2 2 2 2
6 YASH00000221412 0 1 2 3 2

Alternatively, I want to filter on NAs in only some of the columns, which should give this result:

                a b  c  d e f
2 YASH00000199774 0  2  2 2 2
4 YASH00000207704 0 NA NA 1 2
6 YASH00000221412 0  1  2 3 2

 



2 Answers


The drop_na() function from the tidyr package removes every row of a data frame that contains an NA. Here is how you can use it. The example output uses the column names gene, hsap, mmul, mmus, rnor and cfam where the question uses a to f.



library(tidyr)

df %>% drop_na()
#              gene hsap mmul mmus rnor cfam
# 2 ENSG00000199674    0    2    2    2    2
# 6 ENSG00000221312    0    1    2    3    2



You can use the same function to drop rows only when specific columns contain NA, by passing those column names:



df %>% drop_na(rnor, cfam)
#              gene hsap mmul mmus rnor cfam
# 2 ENSG00000199674    0    2    2    2    2
# 4 ENSG00000207604    0   NA   NA    1    2
# 6 ENSG00000221312    0    1    2    3    2
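
For anyone who wants to try this on the data from the question, here is a minimal, self-contained sketch. It assumes the tidyr package is installed and rebuilds the question's data frame by hand with its column names a to f; the names df, all_complete and ef_complete are only illustrative.

```r
library(tidyr)

# Sample data typed in from the question (columns a to f).
df <- data.frame(
  a = c("YASH00000206234", "YASH00000199774", "YASH00000221722",
        "YASH00000207704", "YASH00000207531", "YASH00000221412"),
  b = c(0, 0, 0, 0, 0, 0),
  c = c(NA, 2, NA, NA, NA, 1),
  d = c(NA, 2, NA, NA, NA, 2),
  e = c(NA, 2, NA, 1, NA, 3),
  f = c(NA, 2, NA, 2, NA, 2)
)

# Drop every row that has an NA in any column.
all_complete <- drop_na(df)

# Drop a row only if column e or column f is NA.
ef_complete <- drop_na(df, e, f)
```

Calling drop_na(df) directly is equivalent to df %>% drop_na(); the pipe is just a matter of style.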


 

 


To remove every row that contains an NA, it is simpler to use complete.cases(), which ships with R and needs no extra package. Here is how you can use it.



> final[complete.cases(final), ]
             gene hsap mmul mmus rnor cfam
2 ENSG00000199674    0    2    2    2    2
6 ENSG00000221312    0    1    2    3    2



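If you want to drop rows based on NAs in only some columns, and keep rows that have NAs elsewhere, pass just those columns to complete.cases(). Here columns 5 and 6 are rnor and cfam (e and f in the question):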



> final[complete.cases(final[ , 5:6]), ]
             gene hsap mmul mmus rnor cfam
2 ENSG00000199674    0    2    2    2    2
4 ENSG00000207604    0   NA   NA    1    2
6 ENSG00000221312    0    1    2    3    2
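
The same idea can be tried end to end in base R. The sketch below is only an illustration: it rebuilds the question's data by hand with its column names a to f, selects columns by name instead of by position, and adds a rowSums(is.na()) variant (not part of the answer above) that drops a row only when every measurement column is NA. The names no_na, ef_ok, value_cols and not_all_na are made up for this example.

```r
# Sample data typed in from the question; `final` follows the naming above.
final <- data.frame(
  a = c("YASH00000206234", "YASH00000199774", "YASH00000221722",
        "YASH00000207704", "YASH00000207531", "YASH00000221412"),
  b = c(0, 0, 0, 0, 0, 0),
  c = c(NA, 2, NA, NA, NA, 1),
  d = c(NA, 2, NA, NA, NA, 2),
  e = c(NA, 2, NA, 1, NA, 3),
  f = c(NA, 2, NA, 2, NA, 2)
)

# Keep rows with no NA in any column.
no_na <- final[complete.cases(final), ]

# Keep rows with no NA in columns e and f, chosen by name.
ef_ok <- final[complete.cases(final[, c("e", "f")]), ]

# Drop a row only when all of the measurement columns c to f are NA.
value_cols <- c("c", "d", "e", "f")
not_all_na <- final[rowSums(is.na(final[, value_cols])) < length(value_cols), ]
```

Selecting columns by name keeps the filter correct if the column order changes later, which a positional index such as 5:6 does not guarantee.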


 

 
 
