Deploying a Job from the Job Developer
By now, you have hopefully seen how to use the Job Developer to work with your MapReduce job. If not, you may want to read Walking Through Your First MapReduce Job in the Job Developer first. So how do you deploy this job to a remote cluster?
It is as simple as clicking the "Deploy" button in the Workflow view, which invokes the Hadoop Manager functionality for deploying the job to a remote cluster.
Let's take a look at the dialog box that comes up when you click "Deploy". If you haven't used any of the Hadoop Manager functionality yet, you will first need to set up a cluster and a file system. If you're using Amazon Web Services Elastic MapReduce or S3, you will also need to set up your account credentials.
If you are going to use a Hadoop cluster, read Using the HDFS File System and Using a Hadoop Cluster to learn how to select your Hadoop cluster and HDFS-based data file system.
If you are going to use an Amazon cluster, read Using the Amazon S3 File System and Using an Amazon Elastic MapReduce Cluster to learn how to select your Elastic MapReduce cluster and S3-based data file system.
Note that because the Job Developer runs Hadoop locally, it already knows how to deal with class paths and input and output files. When you deploy the job to a remote cluster, however, you have to tell the job which input and output files to use on the file system you associate with the job.
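To make the idea of input and output locations on a remote file system concrete, here is a minimal Java sketch. It is not part of Karmasphere Studio or Hadoop: the class RemoteJobPaths, the method resolve, the host name, the bucket name, and the paths are all hypothetical and chosen for illustration only. It simply shows how a path can be combined with the base URI of the selected file system (HDFS or S3) to give the full location a remotely deployed job would read from or write to.

```java
import java.net.URI;

// Hypothetical helper for illustration; not a Karmasphere Studio or Hadoop API.
class RemoteJobPaths {

    // Combine the base URI of the file system associated with the job
    // with a plain path such as "user/alice/input" on that file system.
    static URI resolve(String fileSystemBase, String path) {
        // A trailing slash makes URI.resolve append to the base
        // instead of replacing its last path segment.
        String base = fileSystemBase.endsWith("/") ? fileSystemBase : fileSystemBase + "/";
        // Drop leading slashes so the path is always taken relative to the base.
        String relative = path.replaceFirst("^/+", "");
        return URI.create(base).resolve(relative);
    }

    public static void main(String[] args) {
        // Assumed example values: an HDFS name node and an S3 bucket.
        URI input = resolve("hdfs://namenode.example.com:8020", "user/alice/input");
        URI output = resolve("s3n://example-bucket/", "/results/run1");
        System.out.println("Input:  " + input);
        System.out.println("Output: " + output);
    }
}
```

The same input and output names therefore point to different places depending on which file system you select, which is why they have to be supplied explicitly for a remote deployment.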
Once your cluster and data file system are configured and selected, press the "Run" button and Karmasphere Studio will start your job on the selected cluster.