Introduction
This article describes how to set up a Hadoop 2.7.3 cluster on Ubuntu 16.04 LTS. The environment was built on the VMware vSphere 5.1 virtualization platform, which makes it easy to clone and deploy virtual machines. Readers can also deploy the Hadoop cluster on a personal PC using virtualization software such as VirtualBox or VMware Workstation. This article assumes that the reader has already deployed a virtual machine running Hadoop in pseudo-distributed mode.
Basic Ubuntu VM Deployment
The article "Configuring a Hadoop Pseudo-Distributed Runtime Environment on Ubuntu" describes how to build a pseudo-distributed Hadoop environment; on a virtualization platform we can clone that virtual machine to quickly build a distributed Hadoop development environment.
In this article we create one master node (10.220.33.37) and three slave nodes (10.220.33.34~10.220.33.36). After powering on each virtual machine, besides the usual IP address change, the hostname and the static host mappings in /etc/hosts also need to be configured; reboot the virtual machine after saving the changes. The detailed configuration of the master node is shown below; configure the slave nodes accordingly:
hadoop@hadoop-master-vm:~$ cat /etc/hostname
hadoop-master-vm
hadoop@hadoop-master-vm:~$ cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 hadoop-master-vm
10.220.33.37 hadoop-master-vm
10.220.33.36 hadoop-slave01-vm
10.220.33.35 hadoop-slave02-vm
10.220.33.34 hadoop-slave03-vm
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
After completing the configuration, run ping tests between the nodes to make sure the static host mappings resolve correctly.
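For example, the following quick check can be run on the master node (the -c 3 option sends three probes and exits); repeat the equivalent checks on each slave node:
hadoop@hadoop-master-vm:~$ ping -c 3 hadoop-slave01-vm # repeat for hadoop-slave02-vm and hadoop-slave03-vm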
Passwordless SSH Login Between Nodes
Since the master and slave nodes were created from the pseudo-distributed Hadoop virtual machine, the SSH key pairs need to be regenerated on both the master and the slave nodes:
hadoop@hadoop-master-vm:~$ cd ~/.ssh # if this directory does not exist, run ssh localhost once first
hadoop@hadoop-master-vm:~/.ssh$ rm ./id_rsa* # remove any previously generated keys (if present)
hadoop@hadoop-master-vm:~/.ssh$ ssh-keygen -t rsa # just press Enter at every prompt
hadoop@hadoop-master-vm:~/.ssh$ cat ./id_rsa.pub >> ./authorized_keys
Once this is done, verify that passwordless login works with the ssh hostname command:
hadoop@hadoop-master-vm:~$ ssh hadoop-master-vm
The authenticity of host 'hadoop-master-vm (127.0.1.1)' can't be established.
ECDSA key fingerprint is SHA256:1YeLhgGTygKaitVVyQCDDXKRCOHb59az/8fj0+nwvUI.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop-master-vm' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 16.04.3 LTS (GNU/Linux 4.10.0-35-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
59 packages can be updated.
9 updates are security updates.
Last login: Sun Feb 12 21:09:20 2017 from 10.220.40.31
hadoop@hadoop-master-vm:~$
To allow the master node to SSH into each slave node without a password, the master's public key must be imported on every slave node. We use scp to copy the master's public key to the slave node and then append it to the authorized keys:
hadoop@hadoop-slave01-vm:~$ scp hadoop@hadoop-master-vm:/home/hadoop/.ssh/id_rsa.pub .
The authenticity of host 'hadoop-master-vm (10.220.33.37)' can't be established.
ECDSA key fingerprint is SHA256:1YeLhgGTygKaitVVyQCDDXKRCOHb59az/8fj0+nwvUI.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop-master-vm,10.220.33.37' (ECDSA) to the list of known hosts.
hadoop@hadoop-master-vm's password:
id_rsa.pub 100% 405 0.4KB/s 00:00
hadoop@hadoop-slave01-vm:~$ cat id_rsa.pub >> .ssh/authorized_keys
hadoop@hadoop-slave01-vm:~$ rm id_rsa.pub
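As a side note, ssh-copy-id can usually achieve the same result in a single step; it appends the local public key to ~/.ssh/authorized_keys on the remote host (run on the master, entering the slave's password once):
hadoop@hadoop-master-vm:~$ ssh-copy-id hadoop@hadoop-slave01-vm # repeat for the other slave nodes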
Once this is done, verify on the master node with ssh hadoop-slave01-vm that the master can log in to the slave node without a password.
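A minimal sketch to verify all three slave nodes in one pass (assuming the hostnames configured above); each ssh call should print the remote hostname without prompting for a password:
hadoop@hadoop-master-vm:~$ for h in hadoop-slave01-vm hadoop-slave02-vm hadoop-slave03-vm; do ssh $h hostname; done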
Configuring the Cluster/Distributed Environment
Now we come to the focus of this article: configuring the Hadoop cluster/distributed environment. The Hadoop configuration files are located under $HADOOP_INSTALL/etc/hadoop; the relevant files are covered one by one below. We start with the following configuration on the hadoop-master-vm host:
1. The slaves configuration file
The slaves file holds the list of DataNode hostnames, one per line, with localhost as the default. In pseudo-distributed mode we did not modify this file, because there the single node acts both as the NameNode and as a DataNode. In a distributed setup you can decide whether the master node should also run as a DataNode. In my environment I removed the localhost entry, mainly because I plan to test DataNode join/leave scenarios later.
hadoop-slave01-vm
hadoop-slave02-vm
hadoop-slave03-vm
2. The core-site.xml configuration
The core-site.xml configuration is similar to the pseudo-distributed one; the main difference is that fs.defaultFS points to the master node instead of localhost:
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>Abase for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop-master-vm:9000</value>
    </property>
</configuration>
3. The hdfs-site.xml configuration
dfs.replication is typically set to 3; since the current environment can have at most 4 DataNode nodes, we keep this value.
<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop-master-vm:50090</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
4. The mapred-site.xml configuration
The MapReduce-related configuration:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop-master-vm:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop-master-vm:19888</value>
    </property>
</configuration>
5. The yarn-site.xml configuration
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop-master-vm</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
After modifying the configuration files above, the configuration needs to be synchronized to the other DataNode nodes. The following command, run on each slave node, pulls the files from the master:
scp -r hadoop@hadoop-master-vm:/home/hadoop/hadoop-2.7.3/etc/hadoop/* /home/hadoop/hadoop-2.7.3/etc/hadoop/
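Alternatively, the files can be pushed from the master to all slaves in one pass; a minimal sketch, assuming the same installation path on every node:
hadoop@hadoop-master-vm:~$ for h in hadoop-slave01-vm hadoop-slave02-vm hadoop-slave03-vm; do scp -r /home/hadoop/hadoop-2.7.3/etc/hadoop/* hadoop@$h:/home/hadoop/hadoop-2.7.3/etc/hadoop/; done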
Next we continue configuring the master node. Because the virtual machines previously ran in pseudo-distributed mode, the old contents of the tmp and logs directories under the installation directory must be deleted on every node to reset the state (see the cleanup sketch after the format command below). Before the first start, the NameNode must be formatted on the master node (this is only required on the very first run):
hadoop@hadoop-master-vm:~$ hdfs namenode -format
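A minimal cleanup sketch to run on every node before the format above; the paths are assumptions based on the hadoop.tmp.dir value configured earlier and the log directory that appears in the startup output below:
hadoop@hadoop-master-vm:~$ rm -rf /usr/local/hadoop/tmp/* # old HDFS data from the pseudo-distributed setup
hadoop@hadoop-master-vm:~$ rm -rf /usr/local/hadoop-2.7.3/logs/* # old log files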
Once the format has completed, Hadoop can be started on the master node.
hadoop@hadoop-master-vm:~$ start-dfs.sh
Starting namenodes on [hadoop-master-vm]
hadoop-master-vm: starting namenode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hadoop-namenode-hadoop-master-vm.out
hadoop-slave02-vm: starting datanode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hadoop-datanode-hadoop-slave02-vm.out
hadoop-slave03-vm: starting datanode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hadoop-datanode-hadoop-slave03-vm.out
hadoop-slave01-vm: starting datanode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hadoop-datanode-hadoop-slave01-vm.out
Starting secondary namenodes [hadoop-master-vm]
hadoop-master-vm: starting secondarynamenode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hadoop-secondarynamenode-hadoop-master-vm.out
hadoop@hadoop-master-vm:~$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop-2.7.3/logs/yarn-hadoop-resourcemanager-hadoop-master-vm.out
hadoop-slave02-vm: starting nodemanager, logging to /usr/local/hadoop-2.7.3/logs/yarn-hadoop-nodemanager-hadoop-slave02-vm.out
hadoop-slave03-vm: starting nodemanager, logging to /usr/local/hadoop-2.7.3/logs/yarn-hadoop-nodemanager-hadoop-slave03-vm.out
hadoop-slave01-vm: starting nodemanager, logging to /usr/local/hadoop-2.7.3/logs/yarn-hadoop-nodemanager-hadoop-slave01-vm.out
hadoop@hadoop-master-vm:~$ mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /usr/local/hadoop-2.7.3/logs/mapred-hadoop-historyserver-hadoop-master-vm.out
From the startup log we can see that the master starts the corresponding services on the slave nodes.
The jps command shows the services running on each node. If everything started correctly, the master node shows the NameNode, ResourceManager, SecondaryNameNode, and JobHistoryServer processes:
hadoop@hadoop-master-vm:~$ jps
3121 JobHistoryServer
2434 NameNode
2826 ResourceManager
2666 SecondaryNameNode
3215 Jps
On the slave nodes you should see the DataNode and NodeManager processes:
hadoop@hadoop-slave01-vm:~/hadoop-2.7.3$ jps
2289 Jps
2070 DataNode
2183 NodeManager
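Beyond jps, the cluster state can be cross-checked from the master node: hdfs dfsadmin -report lists the live DataNodes and yarn node -list shows the registered NodeManagers (in this environment both should report three nodes):
hadoop@hadoop-master-vm:~$ hdfs dfsadmin -report # should report 3 live datanodes
hadoop@hadoop-master-vm:~$ yarn node -list # should list 3 NodeManager nodes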
This shows that Hadoop has started up correctly. To shut Hadoop down, just run the following commands:
hadoop@hadoop-master-vm:~$ stop-yarn.sh
hadoop@hadoop-master-vm:~$ stop-dfs.sh
hadoop@hadoop-master-vm:~$ mr-jobhistory-daemon.sh stop historyserver
The correctness of the startup can also be checked through the web interfaces:
Daemon | Web Interface
NameNode | http://10.220.33.37:50070/
ResourceManager | http://10.220.33.37:8088
MapReduce JobHistory Server | http://10.220.33.37:19888
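As an optional end-to-end check, a small example job can be submitted from the master; a sketch, assuming the bundled examples jar is in its default location under the installation directory:
hadoop@hadoop-master-vm:~$ hadoop jar $HADOOP_INSTALL/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 10 # estimate pi with 2 maps and 10 samples per map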
Common Issues
1. Since this distributed environment is built from an existing pseudo-distributed virtual machine, the old contents of the tmp directory under the installation directory must be removed before the first start; otherwise the DataNodes may fail to start. This is discussed in the post "DataNode fails to start after reformatting the NameNode in Hadoop 2.7.x".
2. A misconfigured /etc/hosts can prevent the DataNodes from connecting to the NameNode; see the post "How to resolve DataNode failing to connect to the NameNode in a Hadoop cluster".
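For the second issue, one quick check (an assumption of mine, not taken from the referenced post) is to confirm on the master that the NameNode RPC port is bound to the cluster-facing address rather than a loopback address:
hadoop@hadoop-master-vm:~$ netstat -tlnp 2>/dev/null | grep 9000 # should show 10.220.33.37:9000, not 127.0.1.1:9000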