I had a great time at the Data Mining Camp hosted by the ACM yesterday. The event was full of energy -- despite original projections of around 100 attendees, more than 200 people from around the region turned up to meet and discuss various aspects of data mining. This was an "unconference": except for a 30-minute panel discussion (with great contributions from folks at Google and LinkedIn, amongst others), there was no pre-set agenda. Instead, people proposed topics for discussion, participants expressed interest with a show of hands, and the talks were allocated to timeslots and rooms accordingly.
I wish I'd taken a photo of the papers stuck to the wall making up the final schedule -- I'd guess in the end there were well over 20 talks on various topics related to data mining. Large-scale data processing with Hadoop and its machine-learning cousin Mahout was a hot topic. So was financial data mining (which was surprising to me, for a West Coast event). There were also talks on the semantic web, natural language processing, and many other topics I can't remember now.
I proposed a topic on Data Mining and Machine Learning with R, and a show of hands indicated that about 50 people were interested. Then someone proposed a topic on Basic R, and more than 80 people signed up for that one. So there was a lot of interest in R at the conference -- lots of people had heard about R but hadn't yet used it. I participated in both those sessions. For the Basics session, I gave an introduction to R resources for beginners and to R syntax. For the Machine Learning session, I relied heavily on Josh Reich's machine learning script and other blog posts about predictive analytics. There was also a general session on open-source data mining tools. In addition to R, there was an interesting demo of KNIME (a workflow-based data mining tool that reminded me of Insightful Miner). It was interesting to see that KNIME can run algorithms from R and Weka, too.
All in all, a really invigorating event. Many thanks to the folks from the ACM for organizing it.
SF Bay ACM: 2009 Silicon Valley Data Mining Camp