Diffstat (limited to 'doc/admin-guide/en-US/markdown/admin_Hadoop.md')
-rw-r--r--  doc/admin-guide/en-US/markdown/admin_Hadoop.md  60
1 files changed, 18 insertions, 42 deletions
diff --git a/doc/admin-guide/en-US/markdown/admin_Hadoop.md b/doc/admin-guide/en-US/markdown/admin_Hadoop.md
index 2894fa71302..742e8ad6255 100644
--- a/doc/admin-guide/en-US/markdown/admin_Hadoop.md
+++ b/doc/admin-guide/en-US/markdown/admin_Hadoop.md
@@ -1,5 +1,4 @@
-Managing Hadoop Compatible Storage
-==================================
+#Managing Hadoop Compatible Storage
 
 GlusterFS provides compatibility for Apache Hadoop and it uses the
 standard file system APIs available in Hadoop to provide a new storage
@@ -7,54 +6,44 @@ option for Hadoop deployments. Existing MapReduce based applications can
 use GlusterFS seamlessly. This new functionality opens up data within
 Hadoop deployments to any file-based or object-based application.
 
-Architecture Overview
-=====================
+##Architecture Overview
 
 The following diagram illustrates Hadoop integration with GlusterFS:
 
-Advantages
-==========
+
+
+##Advantages
 
 The following are the advantages of Hadoop Compatible Storage with
 GlusterFS:
 
 -   Provides simultaneous file-based and object-based access within
     Hadoop.
-
 -   Eliminates the centralized metadata server.
-
 -   Provides compatibility with MapReduce applications and rewrite is
     not required.
-
 -   Provides a fault tolerant file system.
 
-Preparing to Install Hadoop Compatible Storage
-==============================================
+##Preparing to Install Hadoop Compatible Storage
 
 This section provides information on pre-requisites and list of
 dependencies that will be installed during installation of Hadoop
 compatible storage.
 
-Pre-requisites
---------------
+###Pre-requisites
 
 The following are the pre-requisites to install Hadoop Compatible
 Storage :
 
 -   Hadoop 0.20.2 is installed, configured, and is running on all the
     machines in the cluster.
-
 -   Java Runtime Environment
-
 -   Maven (mandatory only if you are building the plugin from the
     source)
-
 -   JDK (mandatory only if you are building the plugin from the source)
-
 -   getfattr - command line utility
 
-Installing, and Configuring Hadoop Compatible Storage
-=====================================================
+##Installing, and Configuring Hadoop Compatible Storage
 
 This section describes how to install and configure Hadoop Compatible
 Storage in your storage environment and verify that it is functioning
@@ -70,9 +59,8 @@ correctly.
 
     The following files will be extracted:
 
-    -   /usr/local/lib/glusterfs-Hadoop-version-gluster\_plugin\_version.jar
-
-    -   /usr/local/lib/conf/core-site.xml
+        - /usr/local/lib/glusterfs-Hadoop-version-gluster\_plugin\_version.jar
+        - /usr/local/lib/conf/core-site.xml
 
 3.  (Optional) To install Hadoop Compatible Storage in a different
     location, run the following command:
@@ -116,22 +104,13 @@ correctly.
 
     The following are the configurable fields:
 
-      -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-      Property Name          Default Value              Description
-      ---------------------- -------------------------- --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-      fs.default.name        glusterfs://fedora1:9000   Any hostname in the cluster as the server and any port number.
-
-      fs.glusterfs.volname   hadoopvol                  GlusterFS volume to mount.
-
-      fs.glusterfs.mount     /mnt/glusterfs             The directory used to fuse mount the volume.
-
-      fs.glusterfs.server    fedora2                    Any hostname or IP address on the cluster except the client/master.
-
-      quick.slave.io         Off                        Performance tunable option. If this option is set to On, the plugin will try to perform I/O directly from the disk file system (like ext3 or ext4) the file resides on. Hence read performance will improve and job would run faster.
-                                                        > **Note**
-                                                        >
-                                                        > This option is not tested widely
-      -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+	Property Name | Default Value | Description
+	--- | --- | ---	
+	fs.default.name | glusterfs://fedora1:9000 | Any hostname in the cluster as the server and any port number.
+	fs.glusterfs.volname | hadoopvol | GlusterFS volume to mount.
+	fs.glusterfs.mount | /mnt/glusterfs | The directory used to fuse mount the volume.
+	fs.glusterfs.server | fedora2 | Any hostname or IP address on the cluster except the client/master.
+	quick.slave.io | Off | Performance tunable option. If this option is set to On, the plugin will try to perform I/O directly from the disk file system (like ext3 or ext4) the file resides on. Hence read performance will improve and job would run faster. **Note*: This option is not tested widely
 
 5.  Create a soft link in Hadoop’s library and configuration directory
     for the downloaded files (in Step 3) using the following commands:
@@ -141,7 +120,6 @@ correctly.
     For example,
 
     `# ln –s /usr/local/lib/glusterfs-0.20.2-0.1.jar /lib/glusterfs-0.20.2-0.1.jar`
-
     `# ln –s /usr/local/lib/conf/core-site.xml /conf/core-site.xml `
 
 6.  (Optional) You can run the following command on Hadoop master to
@@ -150,8 +128,7 @@ correctly.
 
     `# build-deploy-jar.py -d  -c `
 
-Starting and Stopping the Hadoop MapReduce Daemon
-=================================================
+##Starting and Stopping the Hadoop MapReduce Daemon
 
 To start and stop MapReduce daemon
 
@@ -164,7 +141,6 @@ To start and stop MapReduce daemon
     `# /bin/stop-mapred.sh `
 
 > **Note**
->
 > You must start Hadoop MapReduce daemon on all servers.
 
   []: http://download.gluster.com/pub/gluster/glusterfs/qa-releases/3.3-beta-2/glusterfs-hadoop-0.20.2-0.1.x86_64.rpm
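
Review note: the configurable-fields table touched by this diff lists the properties that go into `core-site.xml`. As a rough illustration only, the sketch below renders those properties in the usual Hadoop `<configuration>`/`<property>` layout. The property names and default values are taken from the table; the helper name `render_core_site` and the override value `fedora3` are hypothetical and not part of the plugin or the guide.

```python
import xml.etree.ElementTree as ET

# Property names and default values copied from the table in the diff.
GLUSTER_DEFAULTS = {
    "fs.default.name": "glusterfs://fedora1:9000",
    "fs.glusterfs.volname": "hadoopvol",
    "fs.glusterfs.mount": "/mnt/glusterfs",
    "fs.glusterfs.server": "fedora2",
    "quick.slave.io": "Off",
}


def render_core_site(overrides=None):
    """Return core-site.xml text for the GlusterFS properties.

    Hypothetical helper for illustration; `overrides` replaces or adds
    individual property values on top of the documented defaults.
    """
    props = dict(GLUSTER_DEFAULTS)
    props.update(overrides or {})

    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "fedora3" is a made-up hostname used only to show an override.
    print(render_core_site({"fs.glusterfs.server": "fedora3"}))
```

The output is a single-line XML document. Whether the plugin reads any further properties is not stated in the diff, so treat the set above as the documented minimum rather than a complete configuration.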
