
March 14, 2014

Comments


> By looking at the rates at which some known victims were not reported by all of the agencies, HRDAG can estimate the number of victims that were identified by nobody, and thereby get a more accurate count of total casualties. (The specific statistical technique used was Random Forests, using the R language. You can read more about the methodology here.)

This description seems a bit off from the presentation: she says they use random forests for the de-duplication process, but she doesn't go into much detail on what they do with the final dataset to estimate how many victims are missing. My guess is that they're using capture-recapture analysis (possibly as implemented in the Rcapture package) to estimate the number of unrecorded victims, since their problem is ideal for that technique, but I could be wrong.
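
To make the capture-recapture idea concrete, here is a minimal sketch of the simplest two-list case (the Lincoln-Petersen estimator, plus Chapman's small-sample correction). It is only an illustration of the general technique, not HRDAG's actual method: their data involve more than two lists, and Rcapture fits log-linear models rather than this closed-form estimate. The function names and the counts in the example are made up, and the estimator assumes the two lists record victims independently of each other.

```python
def lincoln_petersen(n1, n2, m):
    """Two-list estimate of the total population size.

    n1, n2: number of distinct victims on list 1 and on list 2
    m:      number of victims appearing on both lists (after de-duplication)
    """
    if m <= 0:
        raise ValueError("need at least one victim matched across both lists")
    return n1 * n2 / m


def chapman(n1, n2, m):
    """Chapman's variant, less biased when the overlap m is small."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def observed_total(n1, n2, m):
    """Distinct victims seen by at least one list."""
    return n1 + n2 - m


# Hypothetical counts, for illustration only.
n1, n2, m = 600, 400, 150
total = lincoln_petersen(n1, n2, m)
unseen = total - observed_total(n1, n2, m)
print(f"estimated total: {total:.0f}, recorded by nobody: {unseen:.0f}")
```

With these made-up counts, 850 distinct victims are observed, the estimated total is 1600, and so roughly 750 are estimated to be on neither list. The smaller the overlap between the lists, the larger the estimated unrecorded share.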


