Using an Amazon Elastic MapReduce Cluster

To use a remote Amazon Elastic MapReduce cluster for deploying, debugging or monitoring, you create a cluster object in the Hadoop Manager presented via NetBeans Services. Karmasphere Studio enables you to run jobs on that cluster and to monitor the status of all Elastic MapReduce clusters, jobs and steps.

Note that setting up the cluster object does NOT create a cluster within Amazon Elastic MapReduce. Within Amazon Elastic MapReduce, clusters are created "on demand", so a cluster is only established when you run a job on it.
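The "on demand" behavior can be pictured with a minimal sketch. The class and method names below (`ClusterObject`, `run_job`, `_provision`) are hypothetical and are not part of Karmasphere Studio or any Amazon API; the sketch only illustrates the idea that defining the object starts nothing, and that the first job triggers creation of the cluster.

```python
class ClusterObject:
    """Hypothetical stand-in for a cluster definition (not a real API)."""

    def __init__(self, name):
        self.name = name
        self.provisioned = False   # nothing exists on the remote side yet
        self.completed_jobs = []

    def run_job(self, job_name):
        # The remote cluster is established only when a job is first run.
        if not self.provisioned:
            self._provision()
        self.completed_jobs.append(job_name)
        return f"{job_name} ran on {self.name}"

    def _provision(self):
        # Placeholder for the real provisioning request, omitted in this sketch.
        self.provisioned = True


cluster = ClusterObject("my-emr-cluster")
print(cluster.provisioned)            # False: defining the object starts nothing
print(cluster.run_job("wordcount"))
print(cluster.provisioned)            # True: the first job triggered provisioning
```

In other words, the object you create in the Services tab behaves like a saved description of a cluster rather than a running cluster.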

Note that Karmasphere Studio also has a built-in object representing a "pretend" cluster emulated on a thread within NetBeans. You can use this to run a job, but here we're going to cover how to create and use an Amazon Elastic MapReduce cluster.

First, we will add a new remote cluster object which uses a JobTracker. In the Services tab, right-click on "Amazon Cluster" and select "Add Cluster...".

Next, give the cluster object a name and select "Amazon Cluster" as the cluster type. Amazon currently uses Hadoop version 0.18.3, so ensure that this version remains selected.

Then associate a default file system with the cluster.

Next, configure your AWS account credentials and SSH key information, and set the options you require for this cluster object.

Just click "Finish" to complete the creation of your Amazon Elastic MapReduce cluster object.
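As a recap of what the wizard collects, the settings can be sketched as a simple record. Everything in this sketch is an assumption made for illustration: the `ClusterSettings` class, its field names and the `missing_fields` helper are hypothetical and do not reflect how Karmasphere Studio stores a cluster object.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterSettings:
    """Hypothetical summary of the values entered in the wizard."""
    name: str
    cluster_type: str
    hadoop_version: str
    default_file_system: str
    aws_credentials: str
    ssh_key_info: str
    options: dict = field(default_factory=dict)  # optional extra settings

    def missing_fields(self):
        # Return the names of required values that were left empty.
        required = {
            "name": self.name,
            "cluster_type": self.cluster_type,
            "hadoop_version": self.hadoop_version,
            "default_file_system": self.default_file_system,
            "aws_credentials": self.aws_credentials,
            "ssh_key_info": self.ssh_key_info,
        }
        return sorted(key for key, value in required.items() if not value)


settings = ClusterSettings(
    name="my-emr-cluster",
    cluster_type="Amazon Cluster",
    hadoop_version="<version selected in the wizard>",
    default_file_system="",            # left empty on purpose
    aws_credentials="<your credentials>",
    ssh_key_info="<your SSH key details>",
)
print(settings.missing_fields())       # lists any step that was skipped
```

A check like this mirrors the order of the wizard pages: name and type first, then the file system, then credentials, SSH key and options.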

Useful next steps: