A hands-on guide to building a dynamic fuzzy-matching dropdown menu in Excel (VBA dropdown fuzzy matching)
2022-05-29
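1 Preface
Yesterday I covered building and using Apache Kudu on Huawei Cloud. Today I continue with Apache Impala: this post walks through building a local Impala cluster from source, preloading the 1 GB TPC-DS and TPC-H test datasets, and then running the familiar interactive SQL queries.
Impala depends on many components. Starting the cluster also starts HDFS, KMS, YARN, Hive, HBase, Kudu, Ranger, and Impala itself, which may be one important reason people find Impala daunting.
Note: as before, everything below only needs Ctrl+C and Ctrl+V :)
2 Preparation
Before you start, I recommend purchasing a cloud server on Huawei Cloud. For the later steps to go smoothly, the server should meet the following requirements:
CPU architecture: x86
Flavor: c6.2xlarge.4 (for faster compilation and more memory)
Image: public image, CentOS 8.0 64bit
System disk: High I/O, 100 GB
Elastic IP: billed by traffic (for faster downloads)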
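The requirements above can be sanity-checked from a shell before starting. This is only a sketch: the thresholds (8 vCPUs, 30 GiB of memory, 60 GiB of free disk) are my own assumptions derived from the flavor and disk size listed above, and the function names `check_min` and `preflight` are hypothetical.

```shell
# Pre-flight check for the build host.
# All thresholds below are assumptions, not values stated in this post.

# check_min LABEL ACTUAL REQUIRED: succeed if ACTUAL >= REQUIRED (integers).
check_min() {
  local label="$1" actual="$2" required="$3"
  if [ "$actual" -ge "$required" ]; then
    echo "OK   $label: $actual (need >= $required)"
  else
    echo "FAIL $label: $actual (need >= $required)"
    return 1
  fi
}

# preflight: inspect the current machine and report every failed check.
preflight() {
  local rc=0 cpus mem_gib disk_gib
  cpus=$(nproc)
  # MemTotal in /proc/meminfo is reported in kB.
  mem_gib=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
  # Free space on the root filesystem, in GiB (GNU df).
  disk_gib=$(df -BG --output=avail / | tail -n 1 | tr -dc '0-9')
  [ "$(uname -m)" = "x86_64" ] || { echo "FAIL arch: $(uname -m)"; rc=1; }
  check_min "vCPUs" "$cpus" 8 || rc=1
  check_min "memory GiB" "$mem_gib" 30 || rc=1
  check_min "free disk GiB" "$disk_gib" 60 || rc=1
  return $rc
}

# Example: preflight
```

Running `preflight` on the new server prints one line per check and returns non-zero if any of them fails.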
3 Operating system
Install the software packages
[root@ecs-impala ~]# yum install -y git ant maven.noarch python2.x86_64 python2-devel.x86_64 redhat-rpm-config postgresql postgresql-server lzo-devel cyrus-sasl* krb5-devel.x86_64 krb5-server.x86_64 autoconf automake libtool flex rsync gcc-c++.x86_64 openssl-devel.x86_64
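After the install finishes, it can help to confirm that the main build tools actually ended up on the PATH. A minimal sketch; `missing_tools` is a hypothetical helper, and the command names in the example are the ones I would expect the packages above to provide.

```shell
# missing_tools NAME...: print every command that cannot be found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example:
#   missing_tools git ant mvn python2 psql autoconf automake libtool flex rsync g++
```

Empty output means every listed command was found; any name that is printed points at a package that did not install.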
Use python2
[root@ecs-impala ~]# cd /usr/bin
[root@ecs-impala bin]# ln -s python2.7 python
[root@ecs-kudu bin]# ls -lrt python*
lrwxrwxrwx 1 root root 16 Nov 17 2019 python2-config -> python2.7-config
-rwxr-xr-x 1 root root 1846 Nov 17 2019 python2.7-config
lrwxrwxrwx 1 root root 9 Nov 17 2019 python2 -> python2.7
-rwxr-xr-x 1 root root 10760 Nov 17 2019 python2.7
lrwxrwxrwx 1 root root 32 Nov 21 2019 python3.6m -> /usr/libexec/platform-python3.6m
lrwxrwxrwx 1 root root 31 Nov 21 2019 python3.6 -> /usr/libexec/platform-python3.6
lrwxrwxrwx 1 root root 25 Feb 12 10:34 python3 -> /etc/alternatives/python3
lrwxrwxrwx 1 root root 9 Jun 9 19:03 python -> python2.7
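The `ln -s` above fails if `python` already exists, so a rerun-safe variant may be handy. The sketch below creates the same link only when it is missing; `link_python` is a hypothetical helper name.

```shell
# link_python BIN_DIR: create BIN_DIR/python -> python2.7 unless "python" already exists.
link_python() {
  local bin_dir="$1"
  if [ ! -e "$bin_dir/python2.7" ]; then
    echo "python2.7 not found in $bin_dir" >&2
    return 1
  fi
  if [ -e "$bin_dir/python" ] || [ -L "$bin_dir/python" ]; then
    echo "$bin_dir/python already exists, leaving it alone"
    return 0
  fi
  # Relative link target, same as the manual command in this post.
  ln -s python2.7 "$bin_dir/python"
}

# Example (as root): link_python /usr/bin
```

On CentOS 8 the distribution's own mechanism for this is `alternatives --set python /usr/bin/python2`, which may be preferable to a hand-made link.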
Passwordless SSH
[root@ecs-impala ~]# ssh-keygen -t rsa
[root@ecs-impala ~]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
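Running the `cat >>` step twice appends the same key twice. A minimal sketch of an idempotent version, assuming a single-line public key file; `append_key_once` is a hypothetical helper.

```shell
# append_key_once PUBKEY_FILE AUTH_FILE: add the key to AUTH_FILE only if it is absent.
append_key_once() {
  local pub="$1" auth="$2"
  mkdir -p "$(dirname "$auth")"
  touch "$auth"
  # -F: fixed string, -x: match the whole line, -q: quiet.
  if ! grep -qxF -- "$(cat "$pub")" "$auth"; then
    cat "$pub" >> "$auth"
  fi
  # sshd ignores authorized_keys files with loose permissions.
  chmod 600 "$auth"
}

# Example (non-interactive key generation with an empty passphrase):
#   [ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
#   append_key_once ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
#   chmod 700 ~/.ssh
```

Afterwards `ssh localhost true` should return without asking for a password.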
Create the HDFS directory:
[root@ecs-impala ~]# mkdir -p /var/lib/hadoop-hdfs
Initialize the Hive metastore database
This example uses PostgreSQL. Edit the configuration file and change the following three occurrences of `peer` and `ident` to `trust`, then create the user and grant privileges:
[root@ecs-impala ~]# service postgresql initdb
[root@ecs-impala ~]# vim /var/lib/pgsql/data/pg_hba.conf
# "local" is for Unix domain socket connections only
local all all trust
# IPv4 local connections:
host all all 127.0.0.1/32 trust
# IPv6 local connections:
host all all ::1/128 trust
[root@ecs-impala ~]# service postgresql restart
[root@ecs-impala ~]# sudo -iu postgres
[postgres@ecs-impala ~]$ psql
psql (10.6)
Type "help" for help.
postgres=# CREATE ROLE hiveuser LOGIN PASSWORD 'password';
CREATE ROLE
postgres=# ALTER ROLE hiveuser WITH CREATEDB;
ALTER ROLE
postgres=# \q
[postgres@ecs-impala ~]$ exit
[root@ecs-impala ~]# useradd hiveuser
[root@ecs-impala ~]# sudo -iu hiveuser
[hiveuser@ecs-impala ~]$ psql -dpostgres
psql (10.6)
Type "help" for help.
postgres=> create database "HMS_root_impala_cdp" owner hiveuser;
CREATE DATABASE
postgres=> grant all privileges on database "HMS_root_impala_cdp" to hiveuser;
GRANT
postgres=> \q
[hiveuser@ecs-impala ~]$ exit
logout
[root@ecs-impala ~]#
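The `pg_hba.conf` edit can also be scripted instead of done in vim. The sketch below rewrites `peer` and `ident` to `trust` only on `local` and `host` rules whose database field is `all`, which is my reading of the three lines shown above; `trust_local_auth` is a hypothetical helper, so review the resulting file before restarting PostgreSQL.

```shell
# trust_local_auth FILE: switch peer/ident to trust on "local all ..." and "host all ..." rules.
# Comment lines and rules for other databases (e.g. replication) are left untouched.
trust_local_auth() {
  local conf="$1" tmp rc=0
  tmp=$(mktemp) || return 1
  # Writing back with cat keeps the owner and permissions of the original file.
  sed -E 's/^((local|host)[[:space:]]+all[[:space:]].*[[:space:]])(peer|ident)[[:space:]]*$/\1trust/' \
    "$conf" > "$tmp" && cat "$tmp" > "$conf" || rc=1
  rm -f "$tmp"
  return $rc
}

# Example (as root):
#   trust_local_auth /var/lib/pgsql/data/pg_hba.conf
#   service postgresql restart
```

Note that `trust` disables password checks for local connections, which is acceptable for a throwaway test host but not for a shared one.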
4 Build the hadoop-lzo library
[root@ecs-impala ~]# git clone https://github.com/cloudera/hadoop-lzo.git
[root@ecs-impala ~]# cd ~/hadoop-lzo
[root@ecs-impala ~]# export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.252.b09-2.el8_1.x86_64
[root@ecs-impala ~]# ant package
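The hard-coded `JAVA_HOME` above depends on the exact OpenJDK build that yum installed, so it may differ on your machine. A minimal sketch that derives it from the `javac` binary instead, assuming `javac` is on the PATH and GNU `readlink` is available; `java_home_from` is a hypothetical helper.

```shell
# java_home_from JAVAC_PATH: resolve symlinks and strip the trailing /bin/javac.
java_home_from() {
  local javac_real
  javac_real=$(readlink -f "$1") || return 1
  dirname "$(dirname "$javac_real")"
}

# Example:
#   export JAVA_HOME="$(java_home_from "$(command -v javac)")"
#   echo "$JAVA_HOME"
```

Check the printed path against `ls /usr/lib/jvm` before running `ant package`.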
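5 Build the Impala source
Building the Impala source and loading the test data is very time-consuming, on the order of hours, so be patient. The process may also fail partway through, in which case you need to retry a few times -_-||
[root@ecs-impala ~]# git clone https://github.com/apache/impala.git
[root@ecs-impala ~]# cd impala
[root@ecs-impala impala]# ./buildall.sh -noclean -testdata -format_metastore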
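Since the build may need several attempts, the retries can be automated. This is a generic retry wrapper, not something Impala ships; `retry` and the `RETRY_DELAY` variable are hypothetical names, and the attempt count in the example is arbitrary.

```shell
# retry MAX_ATTEMPTS COMMAND...: rerun COMMAND until it succeeds or attempts run out.
# RETRY_DELAY is the pause between attempts in seconds (default 30).
retry() {
  local max="$1" attempt=1
  shift
  while ! "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed, retrying in ${RETRY_DELAY:-30}s" >&2
    sleep "${RETRY_DELAY:-30}"
    attempt=$((attempt + 1))
  done
}

# Example:
#   retry 3 ./buildall.sh -noclean -testdata -format_metastore
```

If every attempt fails at the same step, retrying further will not help; look at the build output for that step instead.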
6 Testing and verification
Once the build and the test-data load have finished, you can happily start running SQL:
[root@ecs-impala impala]# source bin/impala-config.sh
...
[root@ecs-impala impala]# impala-shell.sh
Starting Impala Shell with no authentication using Python 2.7.16
Opened TCP connection to localhost.localdomain:21000
Connected to localhost.localdomain:21000
Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build f4f7fb53a48f114f520737af7be2433a5afd03d4)
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v4.0.0-SNAPSHOT (f4f7fb5) built on Wed Jun 10 14:32:22 CST 2020)
You can run a single query from the command line using the '-q' option.
***********************************************************************************
[localhost.localdomain:21000] default>
[localhost.localdomain:21000] default> show databases;
...
[localhost.localdomain:21000] default>
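The banner mentions the `-q` option, which is convenient for a scripted smoke test. A minimal sketch: the database and table names `tpch` and `tpch.lineitem` are my assumptions about what the test-data load creates, so confirm them with `show databases` first; `run_query`, `smoke_test`, and the `IMPALA_SHELL` variable are hypothetical names.

```shell
# IMPALA_SHELL can be overridden; the default is the script used in this post.
IMPALA_SHELL="${IMPALA_SHELL:-impala-shell.sh}"

# run_query SQL: run a single statement non-interactively via the -q option.
run_query() {
  "$IMPALA_SHELL" -q "$1"
}

# smoke_test: stop at the first statement that fails.
# The tpch names below are assumptions, not taken from this post.
smoke_test() {
  run_query "show databases" &&
  run_query "show tables in tpch" &&
  run_query "select count(*) from tpch.lineitem"
}

# Example (after: source bin/impala-config.sh):
#   smoke_test
```

A non-zero exit status from `smoke_test` indicates that one of the statements failed, for example because the cluster is not running or the data load did not complete.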