<p>Comments on The trick to understanding NAs (missing values) in R. Source: TypePad feed https://blog.revolutionanalytics.com/2016/07/understanding-na-in-r/comments/atom.xml, Blog Administrator, 2016-07-01T16:25:09Z.</p>
<p>Martin Maechler commented on 'The trick to understanding NAs (missing values) in R' (posted 2016-07-26T13:41:40Z, updated 2016-08-10T19:50:00Z):</p>
<p>@Kevin and NA * 0: before drawing conclusions too quickly, note that Inf * 0 is (most of the time, at least in the double precision standard!) defined to be 'NaN', and basic arithmetic in R does follow that. So, replacing the placeholder x=NA by Inf (or -Inf!), you have cases where x * 0 is not 0... and that was the reason NA * 0 was defined to be NA (and NaN * 0 to be NaN).</p>
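<p>The cases described above can be checked in an R session. This is a minimal sketch using only base R arithmetic; the variable names <code>x</code> and <code>products</code> are illustrative and not from the original comment.</p>

```r
# Candidate values that the placeholder NA could stand for, plus NaN and NA.
x <- c(Inf, -Inf, NaN, NA, 5)

# Multiply each by zero using ordinary double arithmetic.
products <- x * 0

# Inf * 0, -Inf * 0 and NaN * 0 are all NaN.
print(is.nan(products[1:3]))

# NA * 0 stays missing rather than collapsing to 0.
print(is.na(products[4]))

# Only the ordinary finite number gives 0.
print(products[5])
```

<p>Since at least one admissible value of x (Inf) does not give 0, returning 0 for NA * 0 would claim more than is known.</p>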
<p>And yes, it is true, one *could have* adopted the definition that all of these, including 0^NA, should return NaN... which corresponds to typical floating point standards... *BUT*, and here we are back to the original posting by David Smith, in almost all math-stat applications it is very convenient to have 0^0 = 1; this goes for the border cases of binomial, negbinomial and poisson and derived formulas IIRC.</p>
<p>James Howard commented (posted 2016-07-26T01:54:57Z, updated 2016-07-26T01:54:59Z):</p>
<p>1. I think the post David was trying to link to was this one: https://jameshoward.us/2016/07/18/nan-versus-na-r/</p>
<p>2. The defense for NaN^0 = 1 comes from the hardware: https://jameshoward.us/2016/07/25/course-nan0-1/</p>
<p>David Smith commented (posted 2016-07-20T20:07:38Z, updated 2016-07-21T21:21:46Z):</p>
<p>There seems to be a little confusion between NaN (not-a-number) and NA (R's placeholder for a missing number) in the above. R shouldn't return NA for an indeterminate form; it should (and generally does) return NaN in such cases. James Howard has a <a href="https://jameshoward.us/2016/07/18/nan-versus-na-r/" rel="nofollow">recent blog post on this topic</a>.</p>
<p>I suspect R Core adopted the 0^0=1 definition because of the binomial justification, R being a stats package after all.</p>
<p>I can't think of any defense for NaN^0=1 though...</p>
<p>Heitz commented (posted 2016-07-20T12:58:38Z, updated 2016-07-21T21:21:46Z):</p>
<p>Another counterpoint is to realize that in R, NaN^0 also equals 1. Since NaN is by definition 'not a number', it can't be the case that R is using a 'placeholder for an unknown number' logic.</p>
<p>flodel commented (posted 2016-07-20T02:40:47Z, updated 2016-07-21T21:21:46Z):</p>
<p>@R-Stats, you could check http://mathworld.wolfram.com/Indeterminate.html or https://en.wikipedia.org/wiki/Indeterminate_form; both sources describe 0^0, Inf^0, 1^Inf, Inf * 0, Inf-Inf, Inf/Inf, and 0/0 as indeterminate forms.</p>
<p>After R made the choice that 0^0 and Inf^0 are both equal to 1, then it's understandable that it claims NA^0 is 1 as well. However, apply the log() to that result and you get that log(NA^0) is not equal to 0 * log(NA).</p>
<p>Similarly, after R made the choice that 1^Inf be 1, it is understandable that it returns 1 for 1^NA. However, take the log() and you get that log(1^NA) is not equal to NA * log(1).</p>
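<p>The two log() comparisons above can be reproduced as follows. This is a small sketch in base R; the names <code>lhs1</code>, <code>rhs1</code>, <code>lhs2</code> and <code>rhs2</code> are made up for illustration.</p>

```r
# First identity: log(x^0) versus 0 * log(x), with x = NA.
lhs1 <- log(NA^0)     # NA^0 is 1, so this is log(1)
rhs1 <- 0 * log(NA)   # log(NA) is NA, and 0 * NA is NA

# Second identity: log(1^x) versus x * log(1), with x = NA.
lhs2 <- log(1^NA)     # 1^NA is 1, so this is log(1)
rhs2 <- NA * log(1)   # log(1) is 0, and NA * 0 is NA

print(c(lhs1 = lhs1, rhs1 = rhs1, lhs2 = lhs2, rhs2 = rhs2))
```

<p>In both pairs the left-hand side is a definite number while the right-hand side is missing, so the usual rule log(a^b) = b * log(a) does not carry over to these expressions.</p>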
<p>With some work, one could probably come up with more examples of surprising results like the ones above, which exploit the inconsistent way R handles the indeterminate forms I have listed. Makes you wonder why the R authors did not decide to return NA for all these indeterminate forms.</p>
<p>R-Stats commented (posted 2016-07-20T01:42:42Z, updated 2016-07-21T21:21:46Z):</p>
<p>@flodel</p>
<p>That's not exactly an inconsistent treatment of indeterminate forms. That's the mathematical treatment.</p>
<p>0^0, Inf^0 and 1^Inf are all indeed equal to 1, in the mathematical sense. On the other hand, Inf*0, Inf - Inf, Inf/Inf and 0/0 are all indeterminate, again in the mathematical sense, which is exactly what R reports: it actually returns NaN, at least on my machine.</p>
<p>flodel commented (posted 2016-07-20T01:13:15Z, updated 2016-07-21T21:21:46Z):</p>
<p>An interesting follow-up would be to find out why R claims that 0^0, Inf^0, and 1^Inf are all equal to 1, whereas it returns NA for Inf * 0, Inf-Inf, Inf/Inf, and 0/0. It seems that R is not consistent in the treatment of indeterminate forms.</p>
<p>Carl Witthoft commented (posted 2016-07-19T20:09:07Z, updated 2016-07-21T21:21:46Z):</p>
<p>While I'm at it, just for fun:</p>
<p>> NA/NaN<br />
[1] NA<br />
> NaN/NA<br />
[1] NaN<br />
</p>
<p>Carl Witthoft commented (posted 2016-07-19T20:03:39Z, updated 2016-07-21T21:21:46Z):</p>
<p>I agree that, at the very least, the result of prod(NA,na.rm=TRUE) should be documented in the help page.</p>
<p>I did find this nugget at ?prod :</p>
<p>"For historical reasons, NULL is accepted and treated as if it were numeric(0)."</p>
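<p>For reference, here is how the empty sum and empty product show up in base R. This is only a sketch; the variable names are illustrative.</p>

```r
# With nothing to add or multiply, R returns the identity element.
empty_sum  <- sum(numeric(0))    # additive identity
empty_prod <- prod(numeric(0))   # multiplicative identity

# NULL is treated like numeric(0), as the help page quote says.
null_prod <- prod(NULL)

# With na.rm = TRUE the single NA is dropped first, leaving nothing behind.
na_sum  <- sum(NA, na.rm = TRUE)
na_prod <- prod(NA, na.rm = TRUE)

print(c(empty_sum, empty_prod, null_prod, na_sum, na_prod))
```

<p>So prod(NA, na.rm=TRUE) is the same computation as prod(numeric(0)) once the NA has been removed.</p>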
<p>So now we can all start arguing about what NULL really is :-)<br />
</p>
<p>R-Stats commented (posted 2016-07-19T15:12:31Z, updated 2016-07-21T21:21:46Z):</p>
<p>@Kevin Wright</p>
<p>That makes sense. But let me present a different POV. If you're using </p>
<p>na.rm = TRUE,</p>
<p>shouldn't you be responsible for making sense of the absence of NA? If you do want to keep the NA, you can use </p>
<p>tapply(dat$sales, dat$qtr, FUN=sum, na.rm=FALSE)</p>
<p>which correctly results in </p>
<p>Q1 Q2 Q3 Q4 <br />
NA 11 12 14 </p>
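<p>A possible middle ground between the two settings is a summary function that drops NAs but still reports NA when a group has no observed values at all. The sketch below assumes the same example data as in this thread; <code>sum_or_na</code> is a hypothetical helper, not a base R function.</p>

```r
# Example data from the thread: Q1 sales are missing in both years.
dat <- data.frame(
  yr    = c("Y1", "Y1", "Y1", "Y1", "Y2", "Y2", "Y2", "Y2"),
  qtr   = c("Q1", "Q2", "Q3", "Q4", "Q1", "Q2", "Q3", "Q4"),
  sales = c(NA, 5, 5, 6, NA, 6, 7, 8)
)

# Hypothetical helper: sum the non-missing values, but return NA when
# every value in the group is missing (instead of the empty sum, 0).
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

by_qtr <- tapply(dat$sales, dat$qtr, FUN = sum_or_na)
by_yr  <- tapply(dat$sales, dat$yr,  FUN = sum_or_na)

print(by_qtr)
print(by_yr)
```

<p>With this helper the yearly totals still ignore the missing quarter, while the all-missing Q1 group is reported as NA rather than 0.</p>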
<p>Kevin Wright commented (posted 2016-07-19T14:11:37Z, updated 2016-07-21T21:21:46Z):</p>
<p>Thanks @R-Stats for the link to the Empty_product. This is exactly what I meant about the R language being designed to some ideal standard. But consider the following example. Is there any possible way that you would ever want Q1 sales to print as 0? Wouldn't you want it to be NA? Printing 0 is extremely misleading in my opinion.</p>
<p>R> dat <- data.frame(yr=c("Y1","Y1","Y1","Y1","Y2","Y2","Y2","Y2"),<br />
+ qtr=c("Q1","Q2","Q3","Q4","Q1","Q2","Q3","Q4"),<br />
+ sales=c(NA,5,5,6,NA,6,7,8))</p>
<p>R> tapply(dat$sales, dat$yr, FUN=sum, na.rm=TRUE)<br />
Y1 Y2 <br />
16 21 </p>
<p>R> tapply(dat$sales, dat$qtr, FUN=sum, na.rm=TRUE)<br />
Q1 Q2 Q3 Q4 <br />
0 11 12 14 <br />
</p>
<p>Rakshana commented (posted 2016-07-19T08:58:56Z, updated 2016-07-21T21:21:46Z):</p>
<p>Nice article. I was really impressed by it; it was very interesting and is very useful for Big data training.</p>
<p>R-Stats commented (posted 2016-07-18T23:12:42Z, updated 2016-07-21T21:21:46Z):</p>
<p>Hello Kevin, </p>
<p>I might be able to explain your results:<br />
1) Notice that Infinity*0 is completely undefined, but it is still reasonable to define Infinity^0 as 1; you can try this in R with Inf*0 and Inf^0</p>
<p>2) It's reasonable, and standard, to define the empty product as the multiplicative unit; see https://en.wikipedia.org/wiki/Empty_product</p>
<p>Cliff AB commented (posted 2016-07-18T23:11:53Z, updated 2016-07-21T21:21:46Z):</p>
<p>Annoying counterpoint: if we consider NA to stand in for any number, then the following should be TRUE instead of NA:</p>
<p>R> Inf >= NA</p>
<p>(instead we get NA). However, this counterpoint also counters the previous comment that NA * 0 should be 0; in fact, Inf * 0 is NaN.</p>
<p>This also led to a result that was slightly surprising to me: Inf^0 is 1 (I was expecting NaN!)</p>
<p>Kevin Wright commented (posted 2016-07-18T21:46:38Z, updated 2016-07-21T21:21:46Z):</p>
<p>Just for further example, I can sorta, kinda, maybe, tolerate R doing this:</p>
<p>R> sum(NA, na.rm=TRUE)<br />
[1] 0</p>
<p>But this borders on insanity for real-life analytic scripts:</p>
<p>R> prod(NA, na.rm=TRUE)<br />
[1] 1</p>
<p>Kevin Wright commented (posted 2016-07-18T21:35:10Z, updated 2016-07-21T21:21:46Z):</p>
<p>David, I applaud your attempt, but I think R's handling of NA values defies explanation.</p>
<p>You wrote: "Now think of all of the numbers that could replace NA in the expression NA^0. Any positive number to the power zero is 1."</p>
<p>Allow me to change this slightly: "Now think of all of the numbers that could replace NA in the expression NA*0. Any positive number times zero is 0."</p>
<p>Thus, we expect NA*0 to be 0. Let's check:</p>
<p>R> NA * 0<br />
[1] NA</p>
<p>Argh, no.</p>
<p>I've seen people try to explain R's handling of NA values as being somehow consistent from a computer-science language-design point of view, but as a user who writes R scripts with lots of missing data, I claim there are some inexplicable inconsistencies with NA values in R.</p>
<p>Kevin Wright</p>