Setting up AWS Cluster to use snow in R
November 8, 2011
I wanted to set up an AWS cluster to take a shot at a Kaggle contest, the DunnHumby Challenge:
http://www.kaggle.com/c/dunnhumbychallenge
For this, I found StarCluster to be a great help. It lets you set up AWS nodes in a few lines of code and handles much more, such as choosing AMIs and cluster configurations.
http://web.mit.edu/stardev/cluster/
Make sure you use the Bioconductor AMI, which comes bundled with R and a host of preinstalled packages.
http://www.bioconductor.org/help/bioconductor-cloud-ami/
I used the “snowfall” package, a wrapper around snow, for parallel processing.
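For illustration, here is a minimal sketch of what a snowfall session can look like. It is not the code I ran for the contest. The host list and the toy function slow_square are placeholders: the sketch starts two socket workers on localhost so it runs on a single machine, and on a StarCluster cluster you would presumably swap in the node host names (assumed here to look like "master", "node001", and so on).

```r
library(snowfall)

# Placeholder host list: two workers on the local machine.
# On a real cluster, replace with the node host names
# (e.g. c("master", "node001"); these names are an assumption).
hosts <- rep("localhost", 2)

# Start one R worker per listed host, connected over sockets.
sfInit(parallel = TRUE, cpus = length(hosts), type = "SOCK",
       socketHosts = hosts)

# Toy task standing in for the real per-chunk work (hypothetical).
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

# Spread the calls across the workers; the result is a list,
# in the same order as the input.
result <- sfLapply(1:20, slow_square)

# Always shut the workers down when finished.
sfStop()

squares <- unlist(result)
```

If the real task depends on other objects or packages, snowfall's sfExport() and sfLibrary() are the usual way to make them available on the workers before calling sfLapply().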
A relevant Stack Overflow question I asked:
http://stackoverflow.com/questions/7241244/using-aws-for-parallel-processing-with-r