thr3ads.net - R devel - [R] Why does R replace all row values with NAs [Feb 2015]

Dimitri Liakhovitski

2015-Feb-27 14:04 UTC

[R] Why does R replace all row values with NAs

I know how to get the output I need, but I would benefit from an
explanation why R behaves the way it does.

# I have a data frame x:
x = data.frame(a=1:10,b=2:11,c=c(1,NA,3,NA,5,NA,7,NA,NA,10))
x
# I want to toss rows in x that contain values >=6. But I don't want
# to toss my NAs there.

subset(x,c<6) # Works correctly, but removes NAs in c, understand why
x[which(x$c<6),] # Works correctly, but removes NAs in c, understand why
x[-which(x$c>=6),] # output I need

# Here is my question: why does the following line replace the values
# of all rows that contain an NA in x$c with NAs?

x[x$c<6,]  # Leaves rows with c=NA, but makes the whole row an NA. Why???
x[(x$c<6) | is.na(x$c),] # output I need - I have to be super-explicit

Thank you very much!

-- 
Dimitri Liakhovitski

Duncan Murdoch

2015-Feb-27 14:13 UTC

On 27/02/2015 9:04 AM, Dimitri Liakhovitski wrote:
> I know how to get the output I need, but I would benefit from an
> explanation why R behaves the way it does.
>
> # I have a data frame x:
> x = data.frame(a=1:10,b=2:11,c=c(1,NA,3,NA,5,NA,7,NA,NA,10))
> x
> # I want to toss rows in x that contain values >=6. But I don't want
> # to toss my NAs there.
>
> subset(x,c<6) # Works correctly, but removes NAs in c, understand why
> x[which(x$c<6),] # Works correctly, but removes NAs in c, understand why
> x[-which(x$c>=6),] # output I need
>
> # Here is my question: why does the following line replace the values
> # of all rows that contain an NA in x$c with NAs?
>
> x[x$c<6,]  # Leaves rows with c=NA, but makes the whole row an NA. Why???
> x[(x$c<6) | is.na(x$c),] # output I need - I have to be super-explicit
>
> Thank you very much!

Most of your examples (except the ones using which()) are doing logical
indexing.  In logical indexing, TRUE keeps a line, FALSE drops the line,
and NA returns NA.  Since "x$c < 6" is NA if x$c is NA, you get the
third kind of indexing.
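
A minimal sketch of the three cases, reusing the data frame x from the original post. The names v, idx and res are illustrative only and do not appear in the thread.

```r
# Logical indexing on a plain vector: TRUE keeps the element,
# FALSE drops it, and NA yields an NA in the result.
v <- c(10, 20, 30)
v[c(TRUE, FALSE, NA)]

# The same rule applied to the rows of the data frame from the post.
x <- data.frame(a = 1:10, b = 2:11, c = c(1, NA, 3, NA, 5, NA, 7, NA, NA, 10))
idx <- x$c < 6        # NA wherever x$c is NA
table(idx, useNA = "ifany")

# Each NA in idx produces one row in which every column is NA.
res <- x[idx, ]
res
```

With this data, idx holds three TRUE, two FALSE and five NA values, so res has the three matching rows plus five rows made up entirely of NAs.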

Your last example works because in the cases where x$c is NA, it
evaluates NA | TRUE, and that evaluates to TRUE.  In the cases where x$c
is not NA, you get x$c < 6 | FALSE, and that's the same as x$c < 6,
which will be either TRUE or FALSE.
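
The following sketch illustrates the point, again using the data frame x from the original post. The names keep and kept are illustrative and not taken from the thread.

```r
# Three-valued logic for "or": a TRUE on either side decides the result,
# so NA | TRUE is TRUE, while NA | FALSE remains NA.
NA | TRUE
NA | FALSE

x <- data.frame(a = 1:10, b = 2:11, c = c(1, NA, 3, NA, 5, NA, 7, NA, NA, 10))

# Combining the comparison with is.na() removes every NA from the index.
keep <- (x$c < 6) | is.na(x$c)
anyNA(keep)

# Rows with c >= 6 are dropped; rows with c = NA are kept intact.
kept <- x[keep, ]
kept
```

Because keep contains no NA values, the subset keeps the original contents of the rows where c is NA instead of replacing them with NAs.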

Duncan Murdoch

Dimitri Liakhovitski

2015-Feb-27 14:49 UTC

So, Duncan, do I understand you correctly:

When I use x$c<6, R doesn't know if it's TRUE or FALSE, so it returns
a logical value of NA.
When this logical value is applied to a row, R says: hell, I don't
know if I should keep it or not, so, just in case, I am going to keep
it, but I'll replace all the values in this row with NAs?


-- 
Dimitri Liakhovitski
