
March 25, 2014

Comments


Great post. Awaiting the exploration of these algorithms!

"The idea with Random Decision Forests was to train binary decision trees on random subsets of attributes (random subsets of columns of the training data). Breiman and Cutler’s Random Forests method combined random subsampling of rows (Bagging) with random subsampling of columns."

I think the most important idea behind Random Forest is that sampling columns decorrelates the trees, which yields a better committee classifier. This idea isn't so clear in your text. And after all, Random Forest is just bagging with a heuristic.
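
To make the decorrelation point concrete, here is a minimal sketch, not taken from the post itself. It assumes the trees are identically distributed with variance `sigma2` and pairwise correlation `rho`, in which case the variance of their average is `rho * sigma2 + (1 - rho) * sigma2 / n_trees`. The function names `ensemble_variance` and `sample_columns` are hypothetical, chosen only for illustration.

```python
import random


def ensemble_variance(sigma2, rho, n_trees):
    """Variance of the average of n_trees identically distributed predictors,
    each with variance sigma2 and pairwise correlation rho (an assumption)."""
    return rho * sigma2 + (1.0 - rho) * sigma2 / n_trees


def sample_columns(n_columns, n_selected, rng):
    """Draw a random subset of column indices, as done for each tree (or split)."""
    return sorted(rng.sample(range(n_columns), n_selected))


# Each tree sees a different random subset of columns, which is what
# lowers the correlation between trees.
rng = random.Random(0)
subsets = [sample_columns(10, 3, rng) for _ in range(5)]

# With many trees, the first term (rho * sigma2) dominates: adding trees
# cannot push the variance below it, but lowering rho can.
for rho in (0.9, 0.5, 0.1):
    print(rho, ensemble_variance(1.0, rho, 500))
```

Bagging alone shrinks only the second term as trees are added; column sampling targets the first term by reducing `rho`, which is why the averaged committee improves.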

Octopus CIP (http://www.octopuscip.comcastbiz.net) supports the same approach to optimize data analysis. Models are the foundation of Octopus Cloud Interactive Platform (CIP), and they can be executed in parallel. We call our approach (n + 1)-models: n models run simultaneously to perform data analysis, and one (+1) monitors the performance of those n models. The final decision (result) is either the result of the best-performing model or a weighted average of the results of all n models. The weight of each model is recalculated based on its performance on each iteration of an analysis.
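
The combination step described in the comment above could be sketched roughly as follows. This is not Octopus CIP's actual implementation: the comment does not say how weights are derived from performance, so the inverse-error weighting here is an assumption, and the names `update_weights` and `combine` are hypothetical.

```python
def update_weights(errors, eps=1e-9):
    """Recompute model weights from per-model errors on the latest iteration.
    Assumption: weight is proportional to 1 / error, normalized to sum to 1."""
    inverse = [1.0 / (e + eps) for e in errors]
    total = sum(inverse)
    return [w / total for w in inverse]


def combine(predictions, weights, mode="weighted"):
    """Return the best model's prediction, or the weighted average of all n."""
    if mode == "best":
        best = max(range(len(weights)), key=lambda i: weights[i])
        return predictions[best]
    return sum(p * w for p, w in zip(predictions, weights))


# One monitoring iteration: two models, the first with the lower error.
weights = update_weights([0.1, 0.4])
print(combine([1.0, 3.0], weights))           # weighted average of both models
print(combine([1.0, 3.0], weights, "best"))   # prediction of the best model
```

The monitoring (+1) role corresponds to calling `update_weights` after each iteration with fresh error measurements.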

The comments to this entry are closed.

