2024-04-28 Hadoop Installation and Deployment


Hadoop Installation and Deployment

Download the installation package, place it in the installation directory on the server, and extract it.

It is best to place the package under /opt and extract and install it there, because the data directory is later created under /opt by default. If the installation is not under /opt, you must grant the bxwl user access to /opt; granting access requires switching to the root user and running chmod -R 777 /opt. Otherwise sbin/start-dfs.sh fails with "Cannot set priority of datanode process XX"; the detailed error can also be reproduced by running hdfs namenode -format.
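For reference, the permission fix above as a transcript (a minimal sketch; chmod -R 777 is what this setup used, while the chown line is a tighter alternative and an assumption on my part):

[root@snode028 ~]# chmod -R 777 /opt
// Tighter alternative (assumption): give bxwl ownership of the install tree instead
[root@snode028 ~]# chown -R bxwl:bxwl /opt/hadoop-3.3.2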


1. Configure the HADOOP_HOME environment variable

[bxwl@snode028 bin]$ vim /etc/profile.d/bxwl.sh

export JAVA_HOME=/opt/jdk1.8.0_291

export JRE_HOME=${JAVA_HOME}/jre

export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib

export HADOOP_HOME=/opt/hadoop-3.3.2

#export HADOOP_CONF_DIR=/opt/hadoop-3.3.2/etc

export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:$PATH
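Reload the profile and check that the hadoop binary resolves; hadoop version prints the release line, which should read 3.3.2 here:

[bxwl@snode028 bin]$ source /etc/profile.d/bxwl.sh
[bxwl@snode028 bin]$ hadoop version
// expect the first line of output to be "Hadoop 3.3.2"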

2. Configure the cluster nodes in the workers file

[bxwl@snode028 ~]$ cd /opt/hadoop-3.3.2/etc/hadoop

[bxwl@snode028 hadoop]$ vim workers

snode028

snode029

snode030
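Each line in workers is one hostname; avoid blank lines and trailing spaces. The same installation and configuration must be present on every node; assuming passwordless SSH for bxwl is already in place, a sketch for pushing the config to the other two nodes:

[bxwl@snode028 hadoop]$ scp -r /opt/hadoop-3.3.2/etc/hadoop/* snode029:/opt/hadoop-3.3.2/etc/hadoop/
[bxwl@snode028 hadoop]$ scp -r /opt/hadoop-3.3.2/etc/hadoop/* snode030:/opt/hadoop-3.3.2/etc/hadoop/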

3. Configure core-site.xml

Specify the NameNode address, the Hadoop data storage directory, and bxwl (it must not be root) as the static user for the HDFS web UI.

[bxwl@snode028 hadoop]$ vim core-site.xml

…
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://snode028:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-3.3.2/data</value>
  </property>
  <property>
    <name>hadoop.http.staticuser.user</name>
    <value>bxwl</value>
  </property>
</configuration>

4. Configure hdfs-site.xml

Specify the NameNode HTTP address and the secondary NameNode HTTP address, and enable WebHDFS so file contents can be viewed from the web UI.

[bxwl@snode028 hadoop]$ vim hdfs-site.xml

…
<configuration>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>snode028:9870</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>snode029:9868</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
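The effective values can be double-checked with hdfs getconf; the output below simply mirrors the values configured above:

[bxwl@snode028 hadoop]$ hdfs getconf -confKey fs.defaultFS
hdfs://snode028:8020
[bxwl@snode028 hadoop]$ hdfs getconf -confKey dfs.namenode.http-address
snode028:9870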

5. Configure mapred-site.xml

Run MapReduce programs on Yarn, and set the JobHistory server address and its web UI address.

[bxwl@snode028 hadoop]$ vim mapred-site.xml

…
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>snode028:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>snode028:19888</value>
  </property>
</configuration>

6. Configure yarn-site.xml

Use mapreduce_shuffle as the MR auxiliary service, set the ResourceManager host, whitelist the environment variables containers inherit, enable log aggregation, set the log server address, and set log retention to 7 days (7 × 24 × 3600 = 604800 seconds).

[bxwl@snode028 hadoop]$ vim yarn-site.xml

…
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>snode030</value>
  </property>
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log.server.url</name>
    <value>http://snode028:19888/jobhistory/logs</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
  </property>
</configuration>

7. Format the NameNode (needed only before the first startup)

[bxwl@snode028 hadoop-3.3.2]$ hdfs namenode -format

WARNING: /opt/hadoop-3.3.2/logs does not exist. Creating.

2022-04-15 18:44:40,635 INFO namenode.NameNode: STARTUP_MSG:

/************************************************************

STARTUP_MSG: Starting NameNode

STARTUP_MSG:  host = snode028/192.168.100.28

STARTUP_MSG:  args = [-format]

STARTUP_MSG:  version = 3.3.2

STARTUP_MSG:  …
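If the format succeeded, the NameNode metadata directory should now exist under hadoop.tmp.dir (dfs/name/current is the default Hadoop layout beneath that directory):

[bxwl@snode028 hadoop-3.3.2]$ ls data/dfs/name/current
// expect a VERSION file and an fsimage here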

8. Start HDFS

[bxwl@snode028 hadoop-3.3.2]$ sbin/start-dfs.sh

Starting namenodes on [snode028]

Starting datanodes

snode030: WARNING: /opt/hadoop-3.3.2/logs does not exist. Creating.

snode029: WARNING: /opt/hadoop-3.3.2/logs does not exist. Creating.

Starting secondary namenodes [snode029]

Web access to HDFS

Open directly in a browser: http://192.168.100.28:9870
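A quick command-line smoke test (the directory name /smoke-test is just an example):

[bxwl@snode028 hadoop-3.3.2]$ hdfs dfs -mkdir /smoke-test
[bxwl@snode028 hadoop-3.3.2]$ hdfs dfs -ls /
// the new /smoke-test directory should be listed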

9. Start Yarn

// The ResourceManager is configured on snode030, so Yarn must be started from snode030

[bxwl@snode028 hadoop-3.3.2]$ ssh snode030

[bxwl@snode030 ~]$ cd /opt/hadoop-3.3.2

[bxwl@snode030 hadoop-3.3.2]$ sbin/start-yarn.sh

Starting resourcemanager

Starting nodemanagers

[bxwl@snode030 hadoop-3.3.2]$ ~/bin/jpsall

---------- snode028 jps ----------

6672 DataNode

6521 NameNode

7003 NodeManager

6029 QuorumPeerMain

7101 Jps

---------- snode029 jps ----------

6146 SecondaryNameNode

6306 NodeManager

6036 DataNode

5750 QuorumPeerMain

6406 Jps

---------- snode030 jps ----------

6195 NodeManager

6070 ResourceManager

5595 QuorumPeerMain

5837 DataNode

6527 Jps

[bxwl@snode030 hadoop-3.3.2]$
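The jpsall command used above is a custom helper under ~/bin, not part of Hadoop; a minimal sketch of what it might look like (assuming passwordless SSH and that jps is on the PATH of non-interactive shells on each host):

#!/bin/bash
# Run jps on every node and label each block of output
for host in snode028 snode029 snode030
do
  echo "---------- $host jps ----------"
  ssh $host jps
done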

Web access to Yarn

Open directly in a browser: http://192.168.100.30:8088
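To confirm that MapReduce jobs actually run on Yarn, the bundled example job can be submitted; the jar path below is the standard layout of the Hadoop 3.3.2 distribution:

[bxwl@snode028 hadoop-3.3.2]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.2.jar pi 2 10
// while it runs, the job should appear at http://192.168.100.30:8088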

10. Write a start/stop script for Hadoop

[bxwl@snode028 bin]$ vim hadoop.sh

#!/bin/bash

case $1 in
"start"){
  echo ---------- Starting HDFS ----------
  ssh snode028 "/opt/hadoop-3.3.2/sbin/start-dfs.sh"
  echo ---------- Starting Yarn ----------
  ssh snode030 "/opt/hadoop-3.3.2/sbin/start-yarn.sh"
  echo ---------- Starting JobHistory server ----------
  ssh snode028 "/opt/hadoop-3.3.2/bin/mapred --daemon start historyserver"
};;
"stop"){
  echo ---------- Stopping JobHistory server ----------
  ssh snode028 "/opt/hadoop-3.3.2/bin/mapred --daemon stop historyserver"
  echo ---------- Stopping Yarn ----------
  ssh snode030 "/opt/hadoop-3.3.2/sbin/stop-yarn.sh"
  echo ---------- Stopping HDFS ----------
  ssh snode028 "/opt/hadoop-3.3.2/sbin/stop-dfs.sh"
};;
*)
  echo "Usage: $0 {start|stop}"
;;
esac

// Make the script executable

[bxwl@snode028 bin]$ chmod +x hadoop.sh

// Stop the cluster

[bxwl@snode028 bin]$ hadoop.sh stop

---------- Stopping JobHistory server ----------

---------- Stopping Yarn ----------

Stopping nodemanagers

Stopping resourcemanager

---------- Stopping HDFS ----------

Stopping namenodes on [snode028]

Stopping datanodes

Stopping secondary namenodes [snode029]

// Start the cluster

[bxwl@snode028 bin]$ hadoop.sh start

---------- Starting HDFS ----------

Starting namenodes on [snode028]

Starting datanodes

Starting secondary namenodes [snode029]

---------- Starting Yarn ----------

Starting resourcemanager

Starting nodemanagers

---------- Starting JobHistory server ----------
