Below is a step-by-step tutorial on how to set up an HBase cluster (with Thrift) on Amazon’s Elastic MapReduce.
1. Click on “Create Job Flow” in Amazon’s Elastic MapReduce console.
2. Choose HBase as the job flow type.
3. Choose “No” for both restoring from a backup and scheduling backups.
4. Choose the number of master/data nodes that you want. You can choose “spot” instances if you are using this cluster only for testing.
5. Make sure you choose a key-pair in the “Amazon EC2 Key Pair” option – you will need this to start Thrift on the HBase cluster.
6. You do not need any bootstrap actions.
7. Review, and click “Create Job Flow”.
8. Go to EC2, click on “Security Groups”, and in the master node’s security group open port 9090 (the default HBase Thrift port) to the IP addresses (or security groups) that need to reach Thrift.
9. Go to EC2, select your HBase master node, right-click and choose “Connect”. Make sure to replace the default “root” user with “hadoop”.
10. Connect to the master node through SSH, and run this command: “hbase-daemon.sh start thrift”. That will start Thrift on the master node.
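
Once Thrift is running, it can help to confirm from a client machine that the port opened in step 8 is actually reachable before debugging any client code. The following is a minimal sketch in Python using only the standard library. The function name thrift_port_reachable is hypothetical and not part of HBase or EMR. A successful TCP connection only shows that something is listening on that port, not that it speaks Thrift.

```python
import socket


def thrift_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

To use it, call thrift_port_reachable with the master node’s public DNS name and the port you opened in step 8. If it returns False, recheck the security group rule and confirm that the Thrift daemon started in step 10 is still running.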