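Before building an HBase cluster, make sure a Hadoop cluster is already installed and working.

My cluster consists of two machines: hadoopmaster (NameNode, DataNode) and hadoopslaver (DataNode).

hadoopmaster is both the NameNode and a DataNode.
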
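Step 1: Download the HBase package and unpack it: tar -zxvf hbase-0.94.2.tar.gz

Software versions used in this cluster: hadoop-1.0.4.tar.gz, hbase-0.94.2.tar.gz, jdk-8u20-linux-x64.tar.gz

Note: the Hadoop and HBase versions must be compatible.

Because HBase depends on Hadoop, it ships a Hadoop jar under its lib directory. That bundled jar is meant only for standalone mode. In distributed mode, the Hadoop version on the cluster must match the one under HBase. Replace the Hadoop jar in HBase's lib directory with the jar from the Hadoop version your cluster runs to avoid version mismatch problems, and make sure you replace it in every HBase installation on the cluster. Version mismatches show up in different ways, but they generally look like the cluster has hung.
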
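The jar replacement described above can be sketched as a small shell function. This is only an illustration: the function name replace_hadoop_jar is made up here, and it assumes the jar follows the Hadoop 1.x naming hadoop-core-<version>.jar and sits directly under the Hadoop install directory, so check the actual file names on your machines first.

```shell
#!/bin/sh
# Sketch: swap the Hadoop jar bundled with HBase for the cluster's own jar.
# Assumption: the jar is named hadoop-core-<version>.jar, as in Hadoop 1.x.
replace_hadoop_jar() {
    hbase_home=$1
    hadoop_home=$2

    # Locate the jar that the running Hadoop installation ships.
    cluster_jar=$(ls "$hadoop_home"/hadoop-core-*.jar 2>/dev/null | head -n 1)
    if [ -z "$cluster_jar" ]; then
        echo "no hadoop-core jar found in $hadoop_home" >&2
        return 1
    fi

    # Remove the bundled copy, then install the cluster's jar.
    rm -f "$hbase_home"/lib/hadoop-core-*.jar
    cp "$cluster_jar" "$hbase_home/lib/"
}
```

With the paths used in this article it would be called as replace_hadoop_jar /usr/devesoftware/hbase /usr/devesoftware/hadoop, once on every node.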
Step 2: Edit the configuration files in the conf directory:

(1) hbase-env.sh

    #
    #/**
    # * Copyright 2007 The Apache Software Foundation
    # *
    # * Licensed to the Apache Software Foundation (ASF) under one
    # * or more contributor license agreements.  See the NOTICE file
    # * distributed with this work for additional information
    # * regarding copyright ownership.  The ASF licenses this file
    # * to you under the Apache License, Version 2.0 (the
    # * "License"); you may not use this file except in compliance
    # * with the License.  You may obtain a copy of the License at
    # *
    # *     http://www.apache.org/licenses/LICENSE-2.0
    # *
    # * Unless required by applicable law or agreed to in writing, software
    # * distributed under the License is distributed on an "AS IS" BASIS,
    # * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # * See the License for the specific language governing permissions and
    # * limitations under the License.
    # */

    # Set environment variables here.

    # The java implementation to use.  Java 1.6 required.
    export JAVA_HOME=/usr/devesoftware/java

    # Extra Java CLASSPATH elements.  Optional.
    export HBASE_CLASSPATH=/usr/devesoftware/hadoop/conf

    # The maximum amount of heap to use, in MB. Default is 1000.
    # export HBASE_HEAPSIZE=1000

    # Extra Java runtime options.
    # Below are what we set by default.  May only work with SUN JVM.
    # For more on why as well as other possible settings,
    # see http://wiki.apache.org/hadoop/PerformanceTuning
    export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC"

    # Uncomment below to enable java garbage collection logging in the .out file.
    # export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps $HBASE_GC_OPTS"

    # Uncomment below (along with above GC logging) to put GC information in its own logfile (will set HBASE_GC_OPTS)
    # export HBASE_USE_GC_LOGFILE=true

    # Uncomment below if you intend to use the EXPERIMENTAL off heap cache.
    # export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize="
    # Set hbase.offheapcache.percentage in hbase-site.xml to a nonzero value.

    # Uncomment and adjust to enable JMX exporting
    # See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.
    # More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
    #
    # export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
    # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"
    # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
    # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"
    # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"

    # File naming hosts on which HRegionServers will run.  $HBASE_HOME/conf/regionservers by default.
    # export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers

    # File naming hosts on which backup HMaster will run.  $HBASE_HOME/conf/backup-masters by default.
    # export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters

    # Extra ssh options.  Empty by default.
    # export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"

    # Where log files are stored.  $HBASE_HOME/logs by default.
    # export HBASE_LOG_DIR=${HBASE_HOME}/logs

    # Enable remote JDWP debugging of major HBase processes. Meant for Core Developers
    # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
    # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
    # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
    # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"

    # A string representing this instance of hbase. $USER by default.
    # export HBASE_IDENT_STRING=$USER

    # The scheduling priority for daemon processes.  See 'man nice'.
    # export HBASE_NICENESS=10

    # The directory where pid files are stored. /tmp by default.
    # export HBASE_PID_DIR=/var/hadoop/pids

    # Seconds to sleep between slave commands.  Unset by default.  This
    # can be useful in large clusters, where, e.g., slave rsyncs can
    # otherwise arrive faster than the master can service them.
    # export HBASE_SLAVE_SLEEP=0.1

    # Tell HBase whether it should manage it's own instance of Zookeeper or not.
    export HBASE_MANAGES_ZK=true

(2) hbase-site.xml

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
    /**
     * Copyright 2010 The Apache Software Foundation
     *
     * Licensed to the Apache Software Foundation (ASF) under one
     * or more contributor license agreements.  See the NOTICE file
     * distributed with this work for additional information
     * regarding copyright ownership.  The ASF licenses this file
     * to you under the Apache License, Version 2.0 (the
     * "License"); you may not use this file except in compliance
     * with the License.  You may obtain a copy of the License at
     *
     *     http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */
    -->
    <configuration>
      <!-- hbase.rootdir: the directory on HDFS where HBase stores its data -->
      <property>
        <name>hbase.rootdir</name>
        <value>hdfs://hadoopmaster:9000/hbase</value>
      </property>
      <!-- hbase.cluster.distributed: whether the cluster runs in distributed mode; set it to false only for standalone mode -->
      <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
      </property>
      <!-- hbase.master: the host and port of the master -->
      <property>
        <name>hbase.master</name>
        <value>hadoopmaster:60000</value>
      </property>
      <!-- hbase.zookeeper.quorum: the machines that run ZooKeeper, separated by commas -->
      <property>
        <name>hbase.zookeeper.quorum</name>
        <value>hadoopmaster,hadoopslaver</value>
      </property>
    </configuration>

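Since hbase.rootdir has to live on the HDFS instance that Hadoop actually serves, a quick sanity check is to compare it with the NameNode URI in Hadoop's core-site.xml. The sketch below is an illustration only: the function names get_prop and rootdir_on_namenode are invented for this example, it assumes the Hadoop 1.x property name fs.default.name, and it reads the XML with awk rather than a real parser, so it expects each <name> and <value> element to sit on a single line.

```shell
#!/bin/sh
# Sketch: read one property value from a Hadoop-style XML config file.
# Assumption: each <name> and <value> element sits on a single line.
get_prop() {
    awk -v name="$2" '
        index($0, "<name>" name "</name>") { found = 1 }
        found && match($0, /<value>[^<]*<\/value>/) {
            # Strip the 7-char opening tag and the 8-char closing tag.
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}

# Succeeds when hbase.rootdir starts with the NameNode URI from core-site.xml.
# $1 = path to hbase-site.xml, $2 = path to core-site.xml
rootdir_on_namenode() {
    rootdir=$(get_prop "$1" hbase.rootdir)
    namenode=$(get_prop "$2" fs.default.name)
    [ -n "$namenode" ] || return 1
    case "$rootdir" in
        "${namenode%/}"/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For the layout in this article the call would be rootdir_on_namenode /usr/devesoftware/hbase/conf/hbase-site.xml /usr/devesoftware/hadoop/conf/core-site.xml; a non-zero exit status means the two files disagree.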
(3) regionservers

    hadoopmaster
    hadoopslaver

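Because regionservers holds one host name per line, the same file can drive the copy of the HBase directory to the other nodes. The following is a rough sketch, not a tested deployment script: list_regionservers and push_hbase are names made up for this example, and push_hbase assumes passwordless ssh between the nodes and the same install path on every machine.

```shell
#!/bin/sh
# Sketch: print the hosts named in a regionservers file, one per line,
# skipping blank lines and '#' comments and trimming surrounding whitespace.
list_regionservers() {
    sed -e 's/#.*//' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' "$1"
}

# Copy the local HBase directory to the same parent path on every listed host.
# Assumptions: passwordless ssh works and the remote parent directory exists.
push_hbase() {
    hbase_home=$1
    hosts_file=$2
    for host in $(list_regionservers "$hosts_file"); do
        # Skip the machine this script runs on.
        [ "$host" = "$(hostname)" ] && continue
        scp -r "$hbase_home" "$host:$(dirname "$hbase_home")/" || return 1
    done
}
```

On hadoopmaster this would be run as push_hbase /usr/devesoftware/hbase /usr/devesoftware/hbase/conf/regionservers, which copies the configured directory to hadoopslaver.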
(4) Finally, edit Hadoop's hdfs-site.xml:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!-- Put site-specific property overrides in this file. -->
    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
      <property>
        <name>dfs.name.dir</name>
        <value>/home/${user.name}/dfs/name</value>
      </property>
      <property>
        <name>dfs.data.dir</name>
        <value>/home/${user.name}/dfs/data</value>
      </property>
      <!-- Limits how many send and receive tasks a DataNode may run at the same time. The default is 256, and hadoop-defaults.xml usually does not set it; in practice that limit is on the small side. -->
      <property>
        <name>dfs.datanode.max.xcievers</name>
        <value>4096</value>
      </property>
    </configuration>

Step 3: Copy HBase to every node, adjusting the configuration to the actual paths on each machine.

Step 4: Edit /etc/profile and add HBASE_HOME:

    #set java path
    JAVA_HOME=/usr/devesoftware/java
    #set hadoop path
    HADOOP_HOME=/usr/devesoftware/hadoop
    #set hbase path
    HBASE_HOME=/usr/devesoftware/hbase
    CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    PATH=$HBASE_HOME/bin:$HADOOP_HOME/bin:$JAVA_HOME/bin:$PATH
    export HBASE_HOME HADOOP_HOME JAVA_HOME CLASSPATH PATH
    export HADOOP_HOME_WARN_SUPPRESS=1

Step 5: Start the cluster from the NameNode

Startup order: start Hadoop first, then start HBase.

    [hadoop@hadoopmaster ~]$ start-all.sh
    [hadoop@hadoopmaster ~]$ start-hbase.sh
    [hadoop@hadoopmaster ~]$ jps
    4129 TaskTracker
    3906 SecondaryNameNode
    4675 HRegionServer
    4003 JobTracker
    3668 NameNode
    4468 HQuorumPeer
    3784 DataNode
    4745 Jps
    4524 HMaster

Check which processes have started on the DataNode machine:

    [hadoop@hadoopslaver ~]$ jps
    4016 Jps
    3636 HRegionServer
    3461 TaskTracker
    3358 DataNode
    3551 HQuorumPeer

Step 6: Open the HBase shell client

    [hadoop@hadoopmaster conf]$ hbase shell
    HBase Shell; enter 'help<RETURN>' for list of supported commands.
    Type "exit<RETURN>" to leave the HBase Shell
    Version 0.94.2, r1395367, Sun Oct 7 19:11:01 UTC 2012

    hbase(main):001:0> list
    TABLE
    0 row(s) in 1.0800 seconds

    hbase(main):002:0>

At this point the HBase cluster is installed successfully!

If you want to develop MapReduce jobs that work with HBase, go to $HADOOP_HOME/conf and edit hadoop-env.sh so that the Hadoop cluster can see the HBase jars and configuration:

    # Set Hadoop-specific environment variables here.

    # The only required environment variable is JAVA_HOME.  All others are
    # optional.  When running a distributed configuration it is best to
    # set JAVA_HOME in this file, so that it is correctly defined on
    # remote nodes.

    # Extra Java CLASSPATH elements.  Optional.
    # export HADOOP_CLASSPATH=

    # The maximum amount of heap to use, in MB. Default is 1000.
    # export HADOOP_HEAPSIZE=2000

    # Extra Java runtime options.  Empty by default.
    # export HADOOP_OPTS=-server

    # Command specific options appended to HADOOP_OPTS when specified
    # export HADOOP_TASKTRACKER_OPTS=

    # The following applies to multiple commands (fs, dfs, fsck, distcp etc)
    # export HADOOP_CLIENT_OPTS

    # Extra ssh options.  Empty by default.
    # export HADOOP_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR"

    # Where log files are stored.  $HADOOP_HOME/logs by default.
    # export HADOOP_LOG_DIR=${HADOOP_HOME}/logs

    # The java implementation to use.  Required.
    export JAVA_HOME=/usr/devesoftware/java
    export HADOOP_HEAPSIZE=2000
    export HADOOP_PID_DIR=/home/$USER/pids
    export HBASE_HOME=/usr/devesoftware/hbase
    export HADOOP_CLASSPATH=$HBASE_HOME/hbase-0.94.2.jar:$HBASE_HOME/hbase-0.94.2-tests.jar:$HBASE_HOME/conf:$HBASE_HOME/lib:$HBASE_HOME/lib/zookeeper-3.4.5.jar:$HBASE_HOME/lib/protobuf-java-2.4.0a.jar:$HBASE_HOME/lib/guava-11.0.2.jar
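A long hand-written HADOOP_CLASSPATH like the one above is easy to get wrong, for example when a jar version in lib differs from the one typed in. The sketch below checks that every entry of a colon-separated classpath exists on disk. The function name missing_classpath_entries is invented for this illustration and is not part of Hadoop or HBase.

```shell
#!/bin/sh
# Sketch: report entries of a colon-separated classpath that do not exist.
# Prints each missing entry and exits non-zero if any were found.
# The body runs in a subshell so the caller's IFS is left untouched.
missing_classpath_entries() (
    status=0
    IFS=:
    for entry in $1; do
        # Ignore empty segments such as a leading or doubled colon.
        [ -z "$entry" ] && continue
        if [ ! -e "$entry" ]; then
            echo "$entry"
            status=1
        fi
    done
    exit $status
)
```

After sourcing hadoop-env.sh, running missing_classpath_entries "$HADOOP_CLASSPATH" lists any jar or directory that is named in the classpath but absent from the machine.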