Building an Open-Source Big Data Environment on a Single openEuler 24.03 LTS Machine
Table of Contents
Overview
Preparation
Basic VM settings
Stop and disable the firewall
Change the hostname
Static IP
Map the hostname
Create a regular user
Passwordless SSH login
Prepare directories
Install Java
Download Java
Extract
Set environment variables
Install Hadoop
Download Hadoop
Extract
Set environment variables
Check the version
Configure Hadoop
Configure hadoop-env.sh
Configure core-site.xml
Configure hdfs-site.xml
Configure mapred-site.xml
Configure yarn-site.xml
Configure workers
Format the filesystem
Start Hadoop
Test Hadoop
Install MySQL
Remove any existing MySQL and MariaDB
Download MySQL
Extract MySQL
Install the MySQL packages
Start the MySQL service
Enable MySQL at boot
Log in to MySQL
Change the MySQL password
Connect to MySQL remotely
Install Hive
Download Hive
Extract
Set environment variables
Resolve the logging JAR conflict
Download the MySQL driver
Configure Hive
Initialize the Hive metastore database
Change the metastore character set
Set the log level
Start the Hive CLI
Basic Hive usage
Exit the Hive CLI
Install ZooKeeper
Download ZooKeeper
Extract
Set environment variables
Configure ZooKeeper
Start ZooKeeper
Check ZooKeeper status
Install Kafka
Download Kafka
Extract
Set environment variables
Configure Kafka
Start Kafka
Test Kafka
Install Flume
Download Flume
Extract
Set environment variables
Adjust the log configuration
Test Flume
Install HBase
Download HBase
Extract
Set environment variables
Configure HBase
Configure hbase-env.sh
Configure hbase-site.xml
Configure regionservers
Start HBase
View the Web UI
Basic usage
Stop HBase
Overview
This article describes how to build an open-source big data environment on a single openEuler 24.03 LTS Linux machine. The components installed include Hadoop, Hive, ZooKeeper, HBase, Kafka, and more. A real big data cluster usually consists of several machines, which makes it tedious to start and heavy on resources, often slowing the host computer down. The single-machine environment built here therefore serves as a quick substitute for a full cluster, suitable for learning and testing. Of course, if your computer is powerful enough, you can build a multi-machine cluster instead.
Preparation
Prepare one openEuler 24.03 LTS Linux machine; for reference, see: Installing openEuler 24.03 LTS under VMware.
Basic VM settings
Stop and disable the firewall
[root@localhost ~]# systemctl stop firewalld
[root@localhost ~]# systemctl disable firewalld
Change the hostname
# Change the hostname
[root@localhost ~]# hostnamectl set-hostname node1
# Reboot
[root@localhost ~]# reboot
After rebooting and reconnecting, the prompt shows that the hostname has changed to node1
[root@node1 ~]#
Static IP
The machine defaults to DHCP, so its IP address may change over time, which causes unnecessary trouble. Pin the address to a static IP.
[root@node1 ~]# cd /etc/sysconfig/network-scripts/
[root@node1 network-scripts]# ls
ifcfg-ens33
[root@node1 network-scripts]# vim ifcfg-ens33
Modify the file as follows
# Change
BOOTPROTO=static
# Add
IPADDR=192.168.193.128
NETMASK=255.255.255.0
GATEWAY=192.168.193.2
DNS1=192.168.193.2
DNS2=114.114.114.114
The static IP configured here is 192.168.193.128. Note: IPADDR, GATEWAY, and DNS use the 192.168.193.* subnet, which must match the subnet of the NAT network shown in VMware; adjust the values to your actual setup. To look up the subnet, open VMware and choose Edit --> Virtual Network Editor.
Reboot for the change to take effect
reboot
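After the reboot, you can check that the static address is in effect. A quick sanity check, assuming the interface is ens33 as above:
ip addr show ens33 | grep "inet "     # should show 192.168.193.128
ping -c 2 www.baidu.com               # confirms the gateway and DNS settings work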
Map the hostname
Edit /etc/hosts
[liang@node1 software]$ sudo vim /etc/hosts
Append the following line at the end
192.168.193.128 node1
Note: adjust the IP address and hostname to your actual setup.
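A quick way to confirm the mapping resolves correctly:
ping -c 2 node1     # should reply from 192.168.193.128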
Create a regular user
Because root has unrestricted privileges, a careless command can cause irreversible damage, so create a regular user for the remaining big data setup. Here a user named liang is created, with password liang as well. Note: choose your own username and password as needed. The commands are:
useradd liang
passwd liang
Example session
[root@node1 ~]# useradd liang
[root@node1 ~]# passwd liang
更改用户 liang 的密码 。
新的密码:
无效的密码: 密码少于 8 个字符
重新输入新的密码:
passwd:所有的身份验证令牌已经成功更新。
Although passwd warns that the password is invalid (fewer than 8 characters), it has still been updated successfully.
Grant the new user sudo privileges
Edit the /etc/sudoers file
vim /etc/sudoers
Below the %wheel line, add the following line
liang ALL=(ALL) NOPASSWD:ALL
Note: liang is the username; change it to match your actual user.
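A quick way to verify the sudo setup (a simple check; liang is the user created above):
su - liang
sudo whoami     # should print "root" without asking for a password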
Passwordless SSH login
All later installation steps are performed as the regular user, so configure passwordless SSH login for that user.
Log in as the newly created regular user (for example, liang) and run:
[liang@node1 software]$ ssh-keygen -t rsa
After running the command, press Enter three times.
[liang@node1 software]$ ssh-keygen -t rsa Generating public/private rsa key pair. Enter file in which to save the key (/home/liang/.ssh/id_rsa): Created directory '/home/liang/.ssh'. Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /home/liang/.ssh/id_rsa Your public key has been saved in /home/liang/.ssh/id_rsa.pub The key fingerprint is: SHA256:4y7NIz4HqFl7fbq4wsqo0rUlEt6Rm7hDqRPldhmaz4U liang@node1 The key's randomart image is: +---[RSA 3072]----+ | | | | | . | | o + | | + B O S | |. @ E +. . | | * % * =. | |+o* B =o* . | |=.oo ++*+= | +----[SHA256]-----+ [liang@node1 software]$
Check the generated key pair
[liang@node1 software]$ ls ~/.ssh/
id_rsa  id_rsa.pub
Copy the public key to the machine that needs to be logged in to. In this pseudo-distributed setup, node1 must be able to log in to itself without a password, so run the ssh-copy-id node1 command, answer yes when prompted, and then enter the current user's (liang's) login password.
[liang@node1 software]$ ssh-copy-id node1
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/liang/.ssh/id_rsa.pub"
The authenticity of host 'node1 (fe80::20c:29ff:fe87:612e%ens33)' can't be established.
ED25519 key fingerprint is SHA256:+clas7Tv8ltjV7X63z8IsHa21v+dfh0SWPvqNqzzlKE.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
Authorized users only. All activities may be monitored and reported.
liang@node1's password:
Verify passwordless login
[liang@node1 software]$ ssh node1 Authorized users only. All activities may be monitored and reported. Authorized users only. All activities may be monitored and reported. Last login: Mon Mar 10 00:32:12 2025 from 192.168.193.1 Welcome to 6.6.0-72.0.0.76.oe2403sp1.x86_64 System information as of time: 2025年 03月 10日 星期一 00:47:17 CST System load: 0.00 Memory used: 9.5% Swap used: 0% Usage On: 7% IP address: 192.168.193.128 Users online: 4 To run a command as administrator(user "root"),use "sudo <command>". [liang@node1 ~]$ exit 注销 Connection to node1 closed. [liang@node1 software]$
During the ssh login, no password is required; also note how the prompt's working directory changes.
Prepare directories
Directory layout:
1. Software packages go in the /opt/software directory;
2. Software whose install location can be chosen is installed under the /opt/module directory.
Note: adjust this layout as needed.
Create the directories and change their ownership
[root@node1 ~]# mkdir /opt/module
[root@node1 ~]# mkdir /opt/software
[root@node1 ~]# chown liang:liang /opt/module
[root@node1 ~]# chown liang:liang /opt/software
Note: if your regular user is not liang, change the liang in the chown commands accordingly.
Install Java
Java is a prerequisite. First check which Java versions Hadoop supports:
Supported Java Versions
Apache Hadoop 3.3 and upper supports Java 8 and Java 11 (runtime only)
Please compile Hadoop with Java 8. Compiling Hadoop with Java 11 is not supported
Apache Hadoop from 3.0.x to 3.2.x now supports only Java 8
Apache Hadoop from 2.7.x to 2.10.x support both Java 7 and 8
Hadoop 3.3 and later support only Java 8 and Java 11 at runtime, and can only be compiled with Java 8. Newer Java versions require additional adaptation, so Java 8 is used here.
Download Java
Download Java 8; the version used here is jdk-8u271-linux-x64.tar.gz, available at:
https://www.oracle.com/java/technologies/javase/javase8u211-later-archive-downloads.html
Upload jdk-8u271-linux-x64.tar.gz to /opt/software on the Linux machine
[liang@node1 opt]$ ls /opt/software/
jdk-8u271-linux-x64.tar.gz
Extract
[liang@node1 opt]$ cd /opt/software/
[liang@node1 software]$ ls
jdk-8u271-linux-x64.tar.gz
[liang@node1 software]$ tar -zxvf jdk-8u271-linux-x64.tar.gz -C /opt/module/
Set environment variables
[liang@node1 software]$ sudo vim /etc/profile.d/my_env.sh
Add the following (the JDK was extracted to /opt/module above)
#JAVA_HOME
export JAVA_HOME=/opt/module/jdk1.8.0_271
export PATH=$PATH:$JAVA_HOME/bin
Apply the environment variables
[liang@node1 software]$ source /etc/profile
Check the version
[liang@node1 module]$ java -version
You should see java version "1.8.0_271" in the output; if not, recheck the previous steps.
Install Hadoop
Install and configure pseudo-distributed Hadoop on the single Linux machine.
Download Hadoop
Download Hadoop 3.3.4 from the official archive
https://archive.apache.org/dist/hadoop/common/hadoop-3.3.4/hadoop-3.3.4.tar.gz
Upload the Hadoop package to /opt/software on the Linux machine
[liang@node1 opt]$ ls /opt/software/ | grep hadoop
hadoop-3.3.4.tar.gz
Extract
[liang@node1 opt]$ cd /opt/software/
[liang@node1 software]$ tar -zxvf hadoop-3.3.4.tar.gz -C /opt/module/
Set environment variables
[liang@node1 software]$ sudo vim /etc/profile.d/my_env.sh
Append the following at the end of the file
#HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop-3.3.4
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Apply the environment variables immediately
[liang@node1 software]$ source /etc/profile
Check the version
[liang@node1 software]$ hadoop version
Hadoop 3.3.4
Source code repository https://github.com/apache/hadoop.git -r a585a73c3e02ac62350c136643a5e7f6095a3dbb
Compiled by stevel on 2022-07-29T12:32Z
Compiled with protoc 3.7.1
From source with checksum fb9dd8918a7b8a5b430d61af858f6ec
This command was run using /opt/module/hadoop-3.3.4/share/hadoop/common/hadoop-common-3.3.4.jar
Configure Hadoop
Configure Hadoop in pseudo-distributed mode.
Go to the configuration directory and list the configuration files
[liang@node1 software]$ cd $HADOOP_HOME/etc/hadoop/
[liang@node1 hadoop]$ ls
capacity-scheduler.xml            httpfs-env.sh            mapred-site.xml
configuration.xsl                 httpfs-log4j.properties  shellprofile.d
container-executor.cfg            httpfs-site.xml          ssl-client.xml.example
core-site.xml                     kms-acls.xml             ssl-server.xml.example
hadoop-env.cmd                    kms-env.sh               user_ec_policies.xml.template
hadoop-env.sh                     kms-log4j.properties     workers
hadoop-metrics2.properties        kms-site.xml             yarn-env.cmd
hadoop-policy.xml                 log4j.properties         yarn-env.sh
hadoop-user-functions.sh.example  mapred-env.cmd           yarnservice-log4j.properties
hdfs-rbf-site.xml                 mapred-env.sh            yarn-site.xml
hdfs-site.xml                     mapred-queues.xml.template
Configure hadoop-env.sh
[liang@node1 hadoop]$ vim hadoop-env.sh
Set the following
export JAVA_HOME=/opt/module/jdk1.8.0_271
Configure core-site.xml
[liang@node1 hadoop]$ vim core-site.xml
Add the following between <configuration> and </configuration>
<!-- NameNode address -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://node1:8020</value>
</property>
<!-- Hadoop data storage directory -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/module/hadoop-3.3.4/data</value>
</property>
<!-- Static user for the HDFS web UI -->
<property>
    <name>hadoop.http.staticuser.user</name>
    <value>liang</value>
</property>
<!-- Hosts from which the user liang (superuser) may act as a proxy -->
<property>
    <name>hadoop.proxyuser.liang.hosts</name>
    <value>*</value>
</property>
<!-- Groups that the user liang (superuser) may impersonate -->
<property>
    <name>hadoop.proxyuser.liang.groups</name>
    <value>*</value>
</property>
<!-- Users that the user liang (superuser) may impersonate -->
<property>
    <name>hadoop.proxyuser.liang.users</name>
    <value>*</value>
</property>
Note: if your hostname is not node1 or your username is not liang, change them accordingly, here and in all later configuration.
Configure hdfs-site.xml
[liang@node1 hadoop]$ vim hdfs-site.xml
Add the following between <configuration> and </configuration>
<!-- Test environment: set the HDFS replication factor to 1 -->
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
Configure mapred-site.xml
[liang@node1 hadoop]$ vim mapred-site.xml
Likewise, add the following between <configuration> and </configuration>:
<!-- Run MapReduce on the YARN framework -->
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<!-- JobHistory server address -->
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>node1:10020</value>
</property>
<!-- JobHistory server web address -->
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>node1:19888</value>
</property>
Configure yarn-site.xml
[liang@node1 hadoop]$ vim yarn-site.xml
Likewise, add the following between <configuration> and </configuration>:
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node1</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<!-- Environment variable inheritance -->
<property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
<!-- Minimum and maximum memory a single YARN container may be allocated -->
<property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>512</value>
</property>
<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>4096</value>
</property>
<!-- Physical memory the NodeManager may manage -->
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
</property>
<!-- Disable YARN's physical and virtual memory limit checks -->
<property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
</property>
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
<!-- Enable log aggregation -->
<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<!-- Log server address -->
<property>
    <name>yarn.log.server.url</name>
    <value>http://node1:19888/jobhistory/logs</value>
</property>
<!-- Keep aggregated logs for 7 days -->
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
</property>
Configure workers
Configure the machines that run the worker daemons
[liang@node1 hadoop]$ vim workers
Replace localhost with the hostname
node1
Format the filesystem
[liang@node1 hadoop]$ hdfs namenode -format
If you see successfully formatted, the format succeeded; otherwise, recheck the configuration above.
Note: formatting is done only once; after a successful format, do not format again.
Start Hadoop
Start HDFS and YARN
[liang@node1 hadoop]$ start-dfs.sh
[liang@node1 hadoop]$ start-yarn.sh
Check the processes
[liang@node1 hadoop]$ jps
6016 Jps
5377 ResourceManager
4038 SecondaryNameNode
5718 NodeManager
3768 DataNode
3483 NameNode
If a process is missing, go to the $HADOOP_HOME/logs directory and inspect the log file of that process. For example, if DataNode is missing, open the DataNode log and look for Error, Warning, and Exception messages to find and fix the cause.
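For example, a quick way to scan a daemon's latest log for problems (a sketch; the log file name follows the pattern hadoop-<user>-<daemon>-<hostname>.log, so it depends on your username and hostname):
cd $HADOOP_HOME/logs
ls                                                       # locate the log of the missing daemon
tail -n 100 hadoop-liang-datanode-node1.log | grep -iE "error|warn|exception"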
Test Hadoop
Compute pi
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar pi 2 4
[liang@node1 ~]$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar pi 2 4 Number of Maps = 2 Samples per Map = 4 Wrote input for Map #0 Wrote input for Map #1 Starting Job 2025-03-10 17:49:18,652 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at node1/192.168.193.128:8032 2025-03-10 17:49:18,956 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/liang/.staging/job_1741600078175_0001 2025-03-10 17:49:19,468 INFO input.FileInputFormat: Total input files to process : 2 2025-03-10 17:49:19,909 INFO mapreduce.JobSubmitter: number of splits:2 2025-03-10 17:49:20,399 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1741600078175_0001 2025-03-10 17:49:20,400 INFO mapreduce.JobSubmitter: Executing with tokens: [] 2025-03-10 17:49:20,522 INFO conf.Configuration: resource-types.xml not found 2025-03-10 17:49:20,522 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'. 2025-03-10 17:49:20,903 INFO impl.YarnClientImpl: Submitted application application_1741600078175_0001 2025-03-10 17:49:20,940 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1741600078175_0001/ 2025-03-10 17:49:20,940 INFO mapreduce.Job: Running job: job_1741600078175_0001 2025-03-10 17:49:27,024 INFO mapreduce.Job: Job job_1741600078175_0001 running in uber mode : false 2025-03-10 17:49:27,026 INFO mapreduce.Job: map 0% reduce 0% 2025-03-10 17:49:31,090 INFO mapreduce.Job: map 100% reduce 0% 2025-03-10 17:49:36,137 INFO mapreduce.Job: map 100% reduce 100% 2025-03-10 17:49:37,160 INFO mapreduce.Job: Job job_1741600078175_0001 completed successfully 2025-03-10 17:49:37,229 INFO mapreduce.Job: Counters: 54File System CountersFILE: Number of bytes read=50FILE: Number of bytes written=829338FILE: Number of read operations=0FILE: Number of large read operations=0FILE: Number of write operations=0HDFS: Number of bytes read=522HDFS: Number of bytes written=215HDFS: Number of read operations=13HDFS: Number of large read operations=0HDFS: Number of write operations=3HDFS: Number of bytes read erasure-coded=0Job CountersLaunched map tasks=2Launched reduce tasks=1Data-local map tasks=2Total time spent by all maps in occupied slots (ms)=7850Total time spent by all reduces in occupied slots (ms)=3914Total time spent by all map tasks (ms)=3925Total time spent by all reduce tasks (ms)=1957Total vcore-milliseconds taken by all map tasks=3925Total vcore-milliseconds taken by all reduce tasks=1957Total megabyte-milliseconds taken by all map tasks=4019200Total megabyte-milliseconds taken by all reduce tasks=2003968Map-Reduce FrameworkMap input records=2Map output records=4Map output bytes=36Map output materialized bytes=56Input split bytes=286Combine input records=0Combine output records=0Reduce input groups=2Reduce shuffle bytes=56Reduce input records=4Reduce output records=0Spilled Records=8Shuffled Maps =2Failed Shuffles=0Merged Map outputs=2GC time elapsed (ms)=132CPU time spent (ms)=1280Physical memory (bytes) snapshot=821047296Virtual memory (bytes) snapshot=7756972032Total committed heap usage (bytes)=635437056Peak Map Physical memory (bytes)=303165440Peak Map Virtual memory (bytes)=2586820608Peak Reduce Physical memory (bytes)=225595392Peak Reduce Virtual memory (bytes)=2589212672Shuffle ErrorsBAD_ID=0CONNECTION=0IO_ERROR=0WRONG_LENGTH=0WRONG_MAP=0WRONG_REDUCE=0File Input Format CountersBytes Read=236File Output Format CountersBytes Written=97 Job Finished in 18.65 seconds Estimated 
value of Pi is 3.50000000000000000000 [liang@node1 hadoop]$
Run wordcount
Prepare the input data
[liang@node1 ~]$ vim 1.txt
[liang@node1 ~]$ cat 1.txt
hello world
hello hadoop
[liang@node1 ~]$ hdfs dfs -put 1.txt /
[liang@node1 ~]$ hdfs dfs -ls /
Found 3 items
-rw-r--r--   1 liang supergroup         25 2025-03-10 17:56 /1.txt
drwx------   - liang supergroup          0 2025-03-10 17:49 /tmp
drwxr-xr-x   - liang supergroup          0 2025-03-10 17:49 /user
Run
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar wordcount /1.txt /out
[liang@node1 ~]$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar wordcount /1.txt /out 2025-03-10 17:57:00,277 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at node1/192.168.193.128:8032 2025-03-10 17:57:00,588 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/liang/.staging/job_1741600078175_0002 2025-03-10 17:57:00,741 INFO input.FileInputFormat: Total input files to process : 1 2025-03-10 17:57:01,613 INFO mapreduce.JobSubmitter: number of splits:1 2025-03-10 17:57:02,102 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1741600078175_0002 2025-03-10 17:57:02,102 INFO mapreduce.JobSubmitter: Executing with tokens: [] 2025-03-10 17:57:02,221 INFO conf.Configuration: resource-types.xml not found 2025-03-10 17:57:02,221 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'. 2025-03-10 17:57:02,267 INFO impl.YarnClientImpl: Submitted application application_1741600078175_0002 2025-03-10 17:57:02,294 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1741600078175_0002/ 2025-03-10 17:57:02,294 INFO mapreduce.Job: Running job: job_1741600078175_0002 2025-03-10 17:57:06,360 INFO mapreduce.Job: Job job_1741600078175_0002 running in uber mode : false 2025-03-10 17:57:06,362 INFO mapreduce.Job: map 0% reduce 0% 2025-03-10 17:57:10,429 INFO mapreduce.Job: map 100% reduce 0% 2025-03-10 17:57:14,466 INFO mapreduce.Job: map 100% reduce 100% 2025-03-10 17:57:14,485 INFO mapreduce.Job: Job job_1741600078175_0002 completed successfully 2025-03-10 17:57:14,548 INFO mapreduce.Job: Counters: 54File System CountersFILE: Number of bytes read=43FILE: Number of bytes written=552173FILE: Number of read operations=0FILE: Number of large read operations=0FILE: Number of write operations=0HDFS: Number of bytes read=113HDFS: Number of bytes written=25HDFS: Number of read operations=8HDFS: Number of large read operations=0HDFS: Number of write operations=2HDFS: Number of bytes read erasure-coded=0Job CountersLaunched map tasks=1Launched reduce tasks=1Data-local map tasks=1Total time spent by all maps in occupied slots (ms)=2698Total time spent by all reduces in occupied slots (ms)=2910Total time spent by all map tasks (ms)=1349Total time spent by all reduce tasks (ms)=1455Total vcore-milliseconds taken by all map tasks=1349Total vcore-milliseconds taken by all reduce tasks=1455Total megabyte-milliseconds taken by all map tasks=1381376Total megabyte-milliseconds taken by all reduce tasks=1489920Map-Reduce FrameworkMap input records=2Map output records=4Map output bytes=41Map output materialized bytes=43Input split bytes=88Combine input records=4Combine output records=3Reduce input groups=3Reduce shuffle bytes=43Reduce input records=3Reduce output records=3Spilled Records=6Shuffled Maps =1Failed Shuffles=0Merged Map outputs=1GC time elapsed (ms)=60CPU time spent (ms)=780Physical memory (bytes) snapshot=517914624Virtual memory (bytes) snapshot=5170266112Total committed heap usage (bytes)=392691712Peak Map Physical memory (bytes)=299552768Peak Map Virtual memory (bytes)=2580688896Peak Reduce Physical memory (bytes)=218361856Peak Reduce Virtual memory (bytes)=2589577216Shuffle ErrorsBAD_ID=0CONNECTION=0IO_ERROR=0WRONG_LENGTH=0WRONG_MAP=0WRONG_REDUCE=0File Input Format CountersBytes Read=25File Output Format CountersBytes Written=25 [liang@node1 ~]$
Check the output
[liang@node1 ~]$ hdfs dfs -ls /out
Found 2 items
-rw-r--r--   1 liang supergroup          0 2025-03-10 17:57 /out/_SUCCESS
-rw-r--r--   1 liang supergroup         25 2025-03-10 17:57 /out/part-r-00000
[liang@node1 ~]$ hdfs dfs -cat /out/part-r-00000
hadoop	1
hello	2
world	1
Install MySQL
Remove any existing MySQL and MariaDB
sudo systemctl stop mysql mysqld 2>/dev/null
sudo rpm -qa | grep -i 'mysql\|mariadb' | xargs -n1 sudo rpm -e --nodeps 2>/dev/null
sudo rm -rf /var/lib/mysql /var/log/mysqld.log /usr/lib64/mysql /etc/my.cnf /usr/my.cnf
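To confirm that nothing is left behind, the following should print no output:
rpm -qa | grep -i 'mysql\|mariadb'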
Download MySQL
The MySQL version downloaded here is 8.4.2; older MySQL releases may not be compatible with openEuler 24.03.
cd /opt/software
wget https://downloads.mysql.com/archives/get/p/23/file/mysql-8.4.2-1.el9.x86_64.rpm-bundle.tar
Note: if the command-line download is slow, open the https URL in a browser to download, then upload the file to the /opt/software directory on the Linux machine.
Extract MySQL
# Create a mysql directory
[liang@node1 software]$ mkdir mysql
# Extract
[liang@node1 software]$ tar -xvf mysql-8.4.2-1.el9.x86_64.rpm-bundle.tar -C mysql
Remove the unneeded rpm packages
[liang@node1 software]$ cd mysql
[liang@node1 mysql]$ rm -f *debug*
[liang@node1 mysql]$ ls
mysql-community-client-8.4.2-1.el9.x86_64.rpm mysql-community-libs-8.4.2-1.el9.x86_64.rpm
mysql-community-client-plugins-8.4.2-1.el9.x86_64.rpm mysql-community-libs-compat-8.4.2-1.el9.x86_64.rpm
mysql-community-common-8.4.2-1.el9.x86_64.rpm mysql-community-server-8.4.2-1.el9.x86_64.rpm
mysql-community-devel-8.4.2-1.el9.x86_64.rpm mysql-community-test-8.4.2-1.el9.x86_64.rpm
mysql-community-icu-data-files-8.4.2-1.el9.x86_64.rpm
Install the MySQL packages
[liang@node1 mysql]$ sudo rpm -ivh *.rpm --force --nodeps
Installation output
[liang@node1 mysql]$ sudo rpm -ivh *.rpm --force --nodeps 警告:mysql-community-client-8.4.2-1.el9.x86_64.rpm: 头 V4 RSA/SHA256 Signature, 密钥 ID a8d3785c: NOKEY Verifying... ################################# [100%] 准备中... ################################# [100%] 正在升级/安装...1:mysql-community-common-8.4.2-1.el################################# [ 11%]2:mysql-community-client-plugins-8.################################# [ 22%]3:mysql-community-libs-8.4.2-1.el9 ################################# [ 33%]4:mysql-community-client-8.4.2-1.el################################# [ 44%]5:mysql-community-icu-data-files-8.################################# [ 56%]6:mysql-community-server-8.4.2-1.el################################# [ 67%]7:mysql-community-test-8.4.2-1.el9 ################################# [ 78%]8:mysql-community-devel-8.4.2-1.el9################################# [ 89%]9:mysql-community-libs-compat-8.4.2################################# [100%] /usr/lib/tmpfiles.d/dbus.conf:13: Line references path below legacy directory /var/run/, updating /var/run/dbus/containers → /run/dbus/containers; please update the tmpfiles.d/ drop-in file accordingly. [liang@node1 mysql]$
Start the MySQL service
[liang@node1 mysql]$ sudo systemctl start mysqld
Enable MySQL at boot
[liang@node1 mysql]$ sudo systemctl enable mysqld
Log in to MySQL
Look up the temporary password
[liang@node1 mysql]$ sudo grep 'temporary password' /var/log/mysqld.log
Output
2025-03-10T14:00:35.282869Z 6 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: Rh8f67o%F,v>
The temporary password found here is: Rh8f67o%F,v>
Note: your temporary password will differ; log in to MySQL with the password you actually see.
Log in to MySQL
[liang@node1 mysql]$ mysql -uroot -p'Rh8f67o%F,v>'
[liang@node1 mysql]$ mysql -uroot -p'Rh8f67o%F,v>' mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 8 Server version: 8.4.2 Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql>
Change the MySQL password
Relax the password policy
[liang@node1 mysql]$ sudo vim /etc/my.cnf
Add the following under [mysqld]
validate_password.length=4
validate_password.policy=0
Restart MySQL
[liang@node1 mysql]$ sudo systemctl restart mysqld
Log in to MySQL and change the password
[liang@node1 mysql]$ mysql -uroot -p'Rh8f67o%F,v>'
mysql> set password='000000';
Query OK, 0 rows affected (0.00 sec)

mysql> update mysql.user set host='%' where user='root';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> alter user 'root'@'%' identified by '000000';
Query OK, 0 rows affected (0.01 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

mysql> exit;
Bye
Log in with the new password
[liang@node1 mysql]$ mysql -uroot -p000000
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 9
Server version: 8.4.2 MySQL Community Server - GPL

Copyright (c) 2000, 2024, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> exit;
Bye
[liang@node1 mysql]$
Connect to MySQL remotely
Use a tool such as Navicat to connect remotely via the IP address and port.
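Alternatively, a quick connectivity check from another machine that has the mysql client installed might look like this (a sketch; 192.168.193.128 and the password 000000 are the values set above, and the firewall was disabled earlier so port 3306 is reachable):
mysql -h 192.168.193.128 -P 3306 -uroot -p000000 -e "SELECT version();"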
Install Hive
Download Hive
https://archive.apache.org/dist/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz
Upload the Hive package to the /opt/software directory on the Linux machine
Extract
[liang@node1 software]$ tar -zxvf /opt/software/apache-hive-3.1.3-bin.tar.gz -C /opt/module/
Set environment variables
[liang@node1 software]$ sudo vim /etc/profile.d/my_env.sh
Append the following at the end
#HIVE_HOME
export HIVE_HOME=/opt/module/apache-hive-3.1.3-bin
export PATH=$PATH:$HIVE_HOME/bin
Apply the environment variables
[liang@node1 software]$ source /etc/profile
Resolve the logging JAR conflict
Hive's log4j-slf4j binding conflicts with Hadoop's logging JAR; go to the $HIVE_HOME/lib directory and rename it
[liang@node1 software]$ cd $HIVE_HOME/lib/
[liang@node1 lib]$ ls | grep slf4j
log4j-slf4j-impl-2.17.1.jar
[liang@node1 lib]$ mv log4j-slf4j-impl-2.17.1.jar log4j-slf4j-impl-2.17.1.jar.bak
Download the MySQL driver
https://mvnrepository.com/artifact/com.mysql/mysql-connector-j/8.4.0
Click jar to download, then upload the jar file to the /opt/software/mysql directory on the Linux machine
[liang@node1 lib]$ ls /opt/software/mysql | grep connect
mysql-connector-j-8.4.0.jar
Copy the jar into the lib directory under HIVE_HOME
[liang@node1 lib]$ cp /opt/software/mysql/mysql-connector-j-8.4.0.jar $HIVE_HOME/lib/
[liang@node1 lib]$ ls | grep connect
mysql-connector-j-8.4.0.jar
Configure Hive
Go to the Hive configuration directory
[liang@node1 lib]$ cd $HIVE_HOME/conf
[liang@node1 conf]$ ls
beeline-log4j2.properties.template    ivysettings.xml
hive-default.xml.template             llap-cli-log4j2.properties.template
hive-env.sh.template                  llap-daemon-log4j2.properties.template
hive-exec-log4j2.properties.template  parquet-logging.properties
hive-log4j2.properties.template
Create hive-site.xml
[liang@node1 conf]$ vim hive-site.xml
Configuration content
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- MySQL JDBC URL for the Hive metastore (the & characters in the URL must be written as &amp; in XML) -->
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://node1:3306/metastore?useSSL=false&amp;useUnicode=true&amp;characterEncoding=UTF-8&amp;allowPublicKeyRetrieval=true</value>
    </property>
    <!-- JDBC driver class used to connect to MySQL -->
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.cj.jdbc.Driver</value>
    </property>
    <!-- MySQL username -->
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <!-- MySQL password -->
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>000000</value>
    </property>
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.server2.thrift.port</name>
        <value>10000</value>
    </property>
    <property>
        <name>hive.server2.thrift.bind.host</name>
        <value>node1</value>
    </property>
    <property>
        <name>hive.metastore.event.db.notification.api.auth</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
    </property>
</configuration>
The configuration stores Hive metadata in a MySQL database named metastore, so create that database in MySQL first
[liang@node1 conf]$ mysql -uroot -p000000 mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 13 Server version: 8.4.2 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> create database metastore; Query OK, 1 row affected (0.01 sec) mysql> exit; Bye [liang@node1 conf]$
Initialize the Hive metastore database
[liang@node1 conf]$ schematool -initSchema -dbType mysql -verbose
When you see schemaTool completed, initialization succeeded. Initialization essentially creates the metastore tables in the metadata database and seeds them with initial data; you can inspect the generated tables in the MySQL metastore database.
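For example, one way to peek at the tables schematool just created (using the root password set earlier):
mysql -uroot -p000000 -e "use metastore; show tables;"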
Change the metastore character set
[liang@node1 conf]$ mysql -uroot -p000000
...
mysql> use metastore;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> alter table COLUMNS_V2 modify column COMMENT varchar(256) character set utf8;
Query OK, 0 rows affected, 1 warning (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 1

mysql> alter table TABLE_PARAMS modify column PARAM_VALUE mediumtext character set utf8;
Query OK, 0 rows affected, 1 warning (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 1

mysql> quit;
Bye
[liang@node1 conf]$
Set the log level
To avoid excessive INFO output when running HQL statements, set the log level to WARN.
cd $HIVE_HOME/conf
Create log4j.properties
cat > log4j.properties <<EOL
log4j.rootLogger=WARN, CA
log4j.appender.CA=org.apache.log4j.ConsoleAppender
log4j.appender.CA.layout=org.apache.log4j.PatternLayout
log4j.appender.CA.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n
EOL
Start the Hive CLI
Hadoop must be running before the Hive CLI starts; if it is not running yet, start it first
[liang@node1 conf]$ start-dfs.sh
[liang@node1 conf]$ start-yarn.sh
Enter the Hive CLI
[liang@node1 conf]$ hive which: no hbase in (/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/module/jdk1.8.0_271/bin:/opt/module/hadoop-3.3.4/bin:/opt/module/hadoop-3.3.4/sbin:/opt/module/jdk1.8.0_271/bin:/opt/module/hadoop-3.3.4/bin:/opt/module/hadoop-3.3.4/sbin:/opt/module/apache-hive-3.1.3-bin/bin) Hive Session ID = f3862edd-3ad6-433f-9e12-104668037a1c Logging initialized using configuration in jar:file:/opt/module/apache-hive-3.1.3-bin/lib/hive-common-3.1.3.jar!/hive-log4j2.properties Async: true Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases. Hive Session ID = 0c36aabf-bd5f-4400-9ea1-70a9355713cd hive (default)>
Basic Hive usage
List the databases
hive (default)> show databases;
OK
database_name
default
Time taken: 0.609 seconds, Fetched: 1 row(s)
hive (default)>
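As a slightly fuller smoke test, you could create a table and query it (a sketch; test_db and t_user are arbitrary names, and the INSERT and the aggregate query each launch a MapReduce job, so they take a little while):
create database test_db;
use test_db;
create table t_user(id int, name string);
insert into t_user values (1, 'tom'), (2, 'jerry');
select name, count(*) from t_user group by name;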
Exit the Hive CLI
hive (default)> quit;
[liang@node1 conf]$
Install ZooKeeper
Install ZooKeeper in standalone mode.
Download ZooKeeper
Download ZooKeeper and upload the package to the /opt/software directory on the Linux machine
https://archive.apache.org/dist/zookeeper/zookeeper-3.7.1/apache-zookeeper-3.7.1-bin.tar.gz
Extract
Extract the package
[liang@node1 software]$ tar -zxvf apache-zookeeper-3.7.1-bin.tar.gz -C /opt/module
Rename
[liang@node1 software]$ cd /opt/module/
[liang@node1 module]$ ls
apache-hive-3.1.3-bin  apache-zookeeper-3.7.1-bin  hadoop-3.3.4  jdk1.8.0_212  jdk1.8.0_271
# Rename
[liang@node1 module]$ mv apache-zookeeper-3.7.1-bin zookeeper-3.7.1
[liang@node1 module]$ ls
apache-hive-3.1.3-bin  hadoop-3.3.4  jdk1.8.0_212  jdk1.8.0_271  zookeeper-3.7.1
Set environment variables
[liang@node1 module]$ sudo vim /etc/profile.d/my_env.sh
Add the following
#ZOOKEEPER_HOME
export ZOOKEEPER_HOME=/opt/module/zookeeper-3.7.1
export PATH=$PATH:$ZOOKEEPER_HOME/bin
Apply the environment variables
[liang@node1 module]$ source /etc/profile
Configure ZooKeeper
[liang@node1 module]$ cd $ZOOKEEPER_HOME/conf/
[liang@node1 conf]$ ls
configuration.xsl  log4j.properties  zoo_sample.cfg
[liang@node1 conf]$ cp zoo_sample.cfg zoo.cfg
[liang@node1 conf]$ ls
configuration.xsl  log4j.properties  zoo.cfg  zoo_sample.cfg
[liang@node1 conf]$ vim zoo.cfg
Change the following setting
dataDir=/opt/module/zookeeper-3.7.1/tmp
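If the data directory does not exist yet, it is safest to create it before starting ZooKeeper (a small precaution; recent ZooKeeper releases may also create it automatically):
mkdir -p /opt/module/zookeeper-3.7.1/tmp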
Start ZooKeeper
[liang@node1 conf]$ zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper-3.7.1/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
Check ZooKeeper status
[liang@node1 conf]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper-3.7.1/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: standalone
The mode is standalone, i.e. ZooKeeper is running as a single node.
Install Kafka
Download Kafka
https://archive.apache.org/dist/kafka/3.3.1/kafka_2.12-3.3.1.tgz
Upload it to the /opt/software directory on the Linux machine
[liang@node1 conf]$ cd /opt/software/
[liang@node1 software]$ ls | grep kafka
kafka_2.12-3.3.1.tgz
Extract
[liang@node1 software]$ tar -zxvf kafka_2.12-3.3.1.tgz -C /opt/module/
Set environment variables
[liang@node1 module]$ sudo vim /etc/profile.d/my_env.sh
Add the following
#KAFKA_HOME
export KAFKA_HOME=/opt/module/kafka_2.12-3.3.1
export PATH=$PATH:$KAFKA_HOME/bin
Apply the environment variables
[liang@node1 module]$ source /etc/profile
Configure Kafka
[liang@node1 module]$ cd $KAFKA_HOME/config
[liang@node1 config]$ ls
connect-console-sink.properties    connect-mirror-maker.properties  server.properties
connect-console-source.properties  connect-standalone.properties    tools-log4j.properties
connect-distributed.properties     consumer.properties              trogdor.conf
connect-file-sink.properties       kraft                            zookeeper.properties
connect-file-source.properties     log4j.properties
connect-log4j.properties           producer.properties
[liang@node1 config]$ vim server.properties
Find the following settings and change them as shown
log.dirs=/opt/module/kafka_2.12-3.3.1/datas
zookeeper.connect=node1:2181/kafka
Start Kafka
Start ZooKeeper first, then Kafka
[liang@node1 kafka_2.12-3.3.1]$ zkServer.sh start
[liang@node1 kafka_2.12-3.3.1]$ kafka-server-start.sh -daemon config/server.properties
The following warning appears
egrep: warning: egrep is obsolescent; using grep -E
egrep: warning: egrep is obsolescent; using grep -E
...
Cause: the egrep command is deprecated.
Fix: change egrep to grep -E in kafka-run-class.sh
[liang@node1 kafka_2.12-3.3.1]$ cd $KAFKA_HOME/bin/
[liang@node1 bin]$ vim kafka-run-class.sh
In vim, type /egrep and press Enter; the cursor jumps to the matching line (there is only one occurrence in the script). Change egrep to grep -E there.
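Alternatively, the same edit can be made non-interactively (a sketch, relying on egrep appearing only in that one place):
sed -i 's/egrep/grep -E/' $KAFKA_HOME/bin/kafka-run-class.sh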
Restart Kafka
[liang@node1 bin]$ kafka-server-stop.sh
[liang@node1 kafka_2.12-3.3.1]$ kafka-server-start.sh -daemon config/server.properties
Check the processes with jps
[liang@node1 kafka_2.12-3.3.1]$ jps
2338 QuorumPeerMain
8614 Kafka
8665 Jps
Test Kafka
Create a topic
[liang@node1 kafka_2.12-3.3.1]$ kafka-topics.sh --bootstrap-server node1:9092 --create --partitions 1 --replication-factor 1 --topic test1
List the topics
[liang@node1 kafka_2.12-3.3.1]$ kafka-topics.sh --bootstrap-server node1:9092 --list
__consumer_offsets
test1
Producer
Run the following command and type messages to produce them
[liang@node1 kafka_2.12-3.3.1]$ kafka-console-producer.sh --bootstrap-server node1:9092 --topic test1
>kafka
>HADOOP
Consumer
Open a new terminal and start a consumer
[liang@node1 ~]$ kafka-console-consumer.sh --bootstrap-server node1:9092 --topic test1
kafka
HADOOP
The consumer receives the messages sent by the producer, so Kafka is working.
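When you are done testing, the test topic can be removed (optional):
kafka-topics.sh --bootstrap-server node1:9092 --delete --topic test1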
Install Flume
Download Flume
https://archive.apache.org/dist/flume/1.10.1/apache-flume-1.10.1-bin.tar.gz
Upload it to the /opt/software directory on the Linux machine
[liang@node1 software]$ ls
apache-flume-1.10.1-bin.tar.gz     hadoop-3.3.4.tar.gz         kafka_2.12-3.3.1.tgz
apache-hive-3.1.3-bin.tar.gz       jdk-8u212-linux-x64.tar.gz  mysql
apache-zookeeper-3.7.1-bin.tar.gz  jdk-8u271-linux-x64.tar.gz  mysql-8.4.2-1.el9.x86_64.rpm-bundle.tar
[liang@node1 software]$ ls | grep flume
apache-flume-1.10.1-bin.tar.gz
Extract
[liang@node1 software]$ tar -zxvf apache-flume-1.10.1-bin.tar.gz -C /opt/module/
Rename
[liang@node1 software]$ cd /opt/module/
[liang@node1 module]$ ls
apache-flume-1.10.1-bin  hadoop-3.3.4  jdk1.8.0_271      zookeeper-3.7.1
apache-hive-3.1.3-bin    jdk1.8.0_212  kafka_2.12-3.3.1
[liang@node1 module]$ mv apache-flume-1.10.1-bin flume-1.10.1
Set environment variables
Edit the environment file
[liang@node1 module]$ sudo vim /etc/profile.d/my_env.sh
Append the following at the end
#FLUME_HOME
export FLUME_HOME=/opt/module/flume-1.10.1
export PATH=$PATH:$FLUME_HOME/bin
Apply the environment variables
[liang@node1 module]$ source /etc/profile
Adjust the log configuration
[liang@node1 flume-1.10.1]$ cd $FLUME_HOME/conf/
[liang@node1 conf]$ ls
flume-conf.properties.template  flume-env.ps1.template  flume-env.sh.template  log4j2.xml
[liang@node1 conf]$ vim log4j2.xml
Find <Property name="LOG_DIR">.</Property>
and change it to
<Property name="LOG_DIR">/opt/module/flume-1.10.1/log</Property>
Also add a console appender so that logs are printed to the terminal, which is convenient while learning
<AppenderRef ref="Console" />
After the change, the Root logger section looks like
<Root level="INFO">
  <AppenderRef ref="LogFile" />
  <AppenderRef ref="Console" />
</Root>
Test Flume
Test Flume using a netcat source and a logger sink
Create a test configuration file test.conf
vim test.conf
The content is as follows
# Name the agent components
b1.sources = r1
b1.sinks = k1
b1.channels = c1

# Configure the source
b1.sources.r1.type = netcat
b1.sources.r1.bind = localhost
b1.sources.r1.port = 44444

# Configure the sink
b1.sinks.k1.type = logger

# Configure the channel
b1.channels.c1.type = memory
b1.channels.c1.capacity = 1000
b1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
b1.sources.r1.channels = c1
b1.sinks.k1.channel = c1
Start the Flume agent with the following command. Once started it blocks, listening for data; do not close this terminal.
$ flume-ng agent --conf ./ --conf-file test.conf --name b1 -Dflume.root.logger=INFO,console
Send data
Open another terminal and send data with the nc command
[liang@node1 ~]$ nc localhost 44444
Type data in the nc terminal, then watch the terminal running the flume-ng command.
Back in the Flume agent terminal, you can see the received data.
Install HBase
Download HBase
Download HBase
https://archive.apache.org/dist/hbase/2.4.11/hbase-2.4.11-bin.tar.gz
Upload it to the /opt/software directory on the Linux machine
[liang@node1 software]$ ls | grep hbase
hbase-2.4.11-bin.tar.gz
Extract
Extract HBase
[liang@node1 software]$ tar -zxvf hbase-2.4.11-bin.tar.gz -C /opt/module/
Check the extracted files
[liang@node1 software]$ cd /opt/module/
[liang@node1 module]$ ls
apache-hive-3.1.3-bin  hadoop-3.3.4  jdk1.8.0_212  kafka_2.12-3.3.1
flume-1.10.1           hbase-2.4.11  jdk1.8.0_271  zookeeper-3.7.1
[liang@node1 module]$ cd hbase-2.4.11/
[liang@node1 hbase-2.4.11]$ ls
bin         conf  hbase-webapps  lib          NOTICE.txt  RELEASENOTES.md
CHANGES.md  docs  LEGAL          LICENSE.txt  README.txt
[liang@node1 hbase-2.4.11]$ pwd
/opt/module/hbase-2.4.11
Set environment variables
Edit the environment file
[liang@node1 module]$ sudo vim /etc/profile.d/my_env.sh
Add the following
#HBASE_HOME
export HBASE_HOME=/opt/module/hbase-2.4.11
export PATH=$PATH:$HBASE_HOME/bin
Apply the environment variables
[liang@node1 module]$ source /etc/profile
Configure HBase
Configure hbase-env.sh
Edit hbase-env.sh
[liang@node1 conf]$ vim hbase-env.sh
Remove the leading # from the export HBASE_MANAGES_ZK line to uncomment it, and change its value to false.
Also remove the leading # from the export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true" line in hbase-env.sh to uncomment it.
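After these two edits, the relevant lines in hbase-env.sh read:
export HBASE_MANAGES_ZK=false
export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"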
Configure hbase-site.xml
Edit hbase-site.xml
[liang@node1 conf]$ vim hbase-site.xml
Configuration content
<property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
</property>
<property>
    <name>hbase.zookeeper.quorum</name>
    <value>node1</value>
</property>
<property>
    <name>hbase.rootdir</name>
    <value>hdfs://node1:8020/hbase</value>
    <description>The directory shared by RegionServers.</description>
</property>
<property>
    <name>hbase.wal.provider</name>
    <value>filesystem</value>
</property>
Configure regionservers
Edit regionservers
[liang@node1 conf]$ vim regionservers
Change the content to the hostname
node1
Start HBase
Start Hadoop and ZooKeeper first, then HBase (the start-up order matters)
[liang@node1 conf]$ start-dfs.sh
[liang@node1 conf]$ start-yarn.sh
[liang@node1 conf]$ zkServer.sh start
[liang@node1 conf]$ start-hbase.sh
Check the processes with jps
[liang@node1 conf]$ jps
4209 QuorumPeerMain
4433 HMaster
3026 ResourceManager
4674 HRegionServer
2515 DataNode
5161 Jps
3178 NodeManager
2220 NameNode
2781 SecondaryNameNode
View the Web UI
Open the Web UI in a browser
192.168.193.128:16010
Basic usage
[liang@node1 conf]$ hbase shell
2025-03-21 17:16:53,325 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.4.11, r7e672a0da0586e6b7449310815182695bc6ae193, Tue Mar 15 10:31:00 PDT 2022
Took 0.0014 seconds
hbase:001:0> create 't1','f1','f2'
Created table t1
Took 0.9881 seconds
=> Hbase::Table - t1
hbase:002:0> list
TABLE
t1
1 row(s)
Took 0.0188 seconds
=> ["t1"]
hbase:003:0> quit
[liang@node1 conf]$
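You could go one step further and write and read a cell in the new table before quitting (a sketch; r1, f1:c1, and v1 are an arbitrary row key, column, and value):
put 't1','r1','f1:c1','v1'
scan 't1'
get 't1','r1'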
Stop HBase
Stop HBase
[liang@node1 conf]$ stop-hbase.sh
Stop ZooKeeper
[liang@node1 conf]$ zkServer.sh stop
Stop Hadoop
[liang@node1 conf]$ stop-yarn.sh
[liang@node1 conf]$ stop-dfs.sh
Done. Enjoy!