
June 06, 2012



Is the Feb 29 valley scaled up by 4x to account for how rarely Feb 29 occurs?
Meanwhile, I do find it amusing that the birth rate is slightly lower (in the heatmap) on the 4th of July, Christmas, and New Year's Day. I bet that first dip applies only in the USA :-) .
And, finally, a pedantic note: given that the calendar year is cyclic and the end of the year is chosen arbitrarily, it's a bad idea to attempt a linear fit, especially after you've overlaid several years' data. What does the fit look like if you plot "true time," i.e. from day 1 of the first year to day 365 of the last year?
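
On the Feb 29 question: a leap day comes around far less often than any other date, so its raw count is not directly comparable with the rest of the calendar. Below is a minimal sketch of a per-occurrence normalization, assuming the 1969 to 1988 range quoted in this thread for the day-level CDC data. The function names are hypothetical, and this is not necessarily how the chart in the post was built.

```python
import calendar


def occurrences_of_date(month, day, first_year, last_year):
    """Count how many times a calendar date occurs in an inclusive year range."""
    count = 0
    for year in range(first_year, last_year + 1):
        # Feb 29 only exists in leap years; every other date occurs once a year.
        if (month, day) == (2, 29) and not calendar.isleap(year):
            continue
        count += 1
    return count


def per_occurrence_rate(total_births, month, day, first_year, last_year):
    """Average births per occurrence of the date (hypothetical helper)."""
    return total_births / occurrences_of_date(month, day, first_year, last_year)


# Over 1969-1988, Feb 29 occurs 5 times and an ordinary date 20 times,
# so the raw Feb 29 total needs a factor of 20 / 5 = 4 to be comparable.
scale = occurrences_of_date(3, 1, 1969, 1988) / occurrences_of_date(2, 29, 1969, 1988)
```

Dividing by the actual number of occurrences, rather than hard-coding 4, keeps the adjustment correct for year ranges that do not contain exactly one leap day per four years.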

Link to punkrockor blog post is broken.

Where is the data? The CDC files linked only include Birth Month and Year, and not the day.

The CDC data includes birth day for 1969 to 1988.

I actually did a very similar simulation a few days ago using that full data and found that the results were nearly identical. A shared birthday is slightly more likely in reality than in the simulation, but only 0.14% more likely at n=23 (and n=23 is still the minimum group size necessary for a probability >= 50%).

Post: http://chmullig.com/2012/06/births-by-day-of-year/
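
For readers who want to try the comparison described above, here is a rough sketch of both calculations: the exact shared-birthday probability under a uniform 365-day year, and a Monte Carlo estimate that accepts arbitrary per-day weights. The function names, trial count, and seed are assumptions made for illustration, and the real per-day birth counts are not included, so the weights must be supplied by the reader.

```python
import random


def exact_shared_birthday_prob(n, days=365):
    """Probability that at least two of n people share a birthday (uniform days)."""
    p_distinct = 1.0
    for k in range(n):
        p_distinct *= (days - k) / days
    return 1.0 - p_distinct


def min_group_size(threshold=0.5, days=365):
    """Smallest n whose uniform shared-birthday probability reaches the threshold."""
    n = 1
    while exact_shared_birthday_prob(n, days) < threshold:
        n += 1
    return n


def simulated_shared_birthday_prob(n, weights, trials=20000, seed=0):
    """Monte Carlo estimate with per-day weights (e.g. observed birth counts)."""
    rng = random.Random(seed)
    days = range(len(weights))
    hits = 0
    for _ in range(trials):
        sample = rng.choices(days, weights=weights, k=n)
        # A set drops duplicates, so a shorter set means a shared birthday.
        if len(set(sample)) < n:
            hits += 1
    return hits / trials


uniform_weights = [1] * 365  # placeholder; swap in real daily counts to compare
estimate = simulated_shared_birthday_prob(23, uniform_weights)
```

Replacing `uniform_weights` with observed daily birth counts gives the "reality" side of the comparison. A non-uniform distribution can only raise the match probability, which is consistent with the small increase reported in the comment.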

@Chris -- thanks! I really liked your time series chart, which I reproduced in this post.

Does Revolution R have a Birthday Problem?

Here is the Matlab answer:



The comments to this entry are closed.
