Hadley Wickham — written Dec 10, 2012 — source
If you’re working with missing values, you need to know two things:

- What happens when you put missing values in scalars (e.g. `double`)
- How to get and set missing values in vectors (e.g. `NumericVector`)

The following code explores what happens when you take one of R’s missing values, coerce it into a scalar, and then coerce back to an R vector.
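A minimal sketch of such a function, assuming it is compiled with `Rcpp::sourceCpp()`; the name `scalar_missings` is illustrative. Each of R's missing-value constants is assigned to a C++ scalar, and `List::create()` converts the scalars back into R vectors:

```cpp
#include <Rcpp.h>
using namespace Rcpp;

// Illustrative name. Assigns each of R's missing values to a C++ scalar,
// then wraps the scalars back into length-1 R vectors.
// [[Rcpp::export]]
List scalar_missings() {
  int int_s = NA_INTEGER;
  String chr_s = NA_STRING;
  bool lgl_s = NA_LOGICAL;  // NA_LOGICAL is a nonzero int, so this becomes true
  double num_s = NA_REAL;

  return List::create(int_s, chr_s, lgl_s, num_s);
}
```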
Inspecting the result from R with `str()` shows that every value except the logical comes back missing:

    List of 4
     $ : int NA
     $ : chr NA
     $ : logi TRUE
     $ : num NA

So:

- `IntegerVector` -> `int`: stored as the smallest representable integer (`INT_MIN`).
- `CharacterVector` -> `String`: the string “NA”.
- `LogicalVector` -> `bool`: TRUE. To work with missing values in logical vectors, use an `int` instead of a `bool`.
- `NumericVector` -> `double`: stored as an NaN, and preserved. Most numerical operations will behave as you expect, but as discussed below logical comparison will not.

If you’re working with doubles, you may be able to get away with ignoring missing values and working with NaN (not a number). R’s missing values are a special type of the IEEE 754 floating point number NaN. That means if you coerce them to `double` in your C++ code, they will behave like regular NaNs. In particular, any comparison involving them (`==`, `<`, `>` and so on) evaluates to FALSE; the one exception is `!=`, which evaluates to TRUE.
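The comparison behavior of NaN can be checked in plain C++ without Rcpp. The sketch below uses `std::numeric_limits<double>::quiet_NaN()` as a stand-in for a coerced `NA_REAL` (an assumption: R's NA is a NaN with a specific payload, but it compares the same way). The helper names `nan_comparisons` and `is_missing` are hypothetical.

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical helper: evaluates the six comparison operators on (x, 1.0)
// and returns the results in the order ==, !=, <, <=, >, >=.
std::vector<bool> nan_comparisons(double x) {
  const double one = 1.0;
  return {x == one, x != one, x < one, x <= one, x > one, x >= one};
}

// Hypothetical helper: a comparison-free missing check. std::isnan() is true
// for any NaN, which includes R's NA once it has been coerced to double.
bool is_missing(double x) {
  return std::isnan(x);
}
```

Because every ordered comparison is false, a test such as `x < 1.0` cannot tell a missing value apart from a large one; an explicit check like `std::isnan()` is the safer choice when the distinction matters.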
To set a missing value in a vector, you need to use a missing value specific to the type of vector:
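A minimal sketch, assuming compilation with `Rcpp::sourceCpp()`; the function name `missing_sampler` is illustrative. It builds one length-1 vector of each type, each holding that type's own missing-value constant:

```cpp
#include <Rcpp.h>
using namespace Rcpp;

// Illustrative name. Each vector type is paired with its own NA constant:
// NA_REAL, NA_INTEGER, NA_LOGICAL and NA_STRING.
// [[Rcpp::export]]
List missing_sampler() {
  return List::create(
    NumericVector::create(NA_REAL),
    IntegerVector::create(NA_INTEGER),
    LogicalVector::create(NA_LOGICAL),
    CharacterVector::create(NA_STRING)
  );
}
```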
Now let’s confirm that these values do in fact appear missing in R:
    List of 4
     $ : num NA
     $ : int NA
     $ : logi NA
     $ : chr NA

To check if a value in a vector is missing, use the class method `is_na`:
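A minimal sketch of an element-wise check, assuming compilation with `Rcpp::sourceCpp()`. It loops over a numeric vector and applies the static method `NumericVector::is_na()` to each element:

```cpp
#include <Rcpp.h>
using namespace Rcpp;

// Returns a logical vector that is TRUE wherever x holds a missing value.
// [[Rcpp::export]]
LogicalVector isNA(NumericVector x) {
  int n = x.size();
  LogicalVector out(n);

  for (int i = 0; i < n; ++i) {
    // The class method tests a single element of the vector.
    out[i] = NumericVector::is_na(x[i]);
  }
  return out;
}
```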
Here we test with some missing and non-missing values:
[1] TRUE FALSE FALSE TRUE
Equivalent behavior to the `isNA` function can be obtained by calling the `is_na` sugar function, which takes a vector and returns a logical vector.
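A minimal sketch of the sugar version, assuming compilation with `Rcpp::sourceCpp()`; the name `isNA2` is hypothetical. The explicit loop disappears because `is_na()` works on the whole vector at once:

```cpp
#include <Rcpp.h>
using namespace Rcpp;

// Hypothetical name. The is_na() sugar function is vectorized, so the
// result can be returned directly as a LogicalVector.
// [[Rcpp::export]]
LogicalVector isNA2(NumericVector x) {
  return is_na(x);
}
```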
tags: basics