본문 바로가기

Data Engineering/Sqoop

[Sqoop] Sqoop 다운로드 및 실행

sqoop wget으로 다운로드 후 압축 풀기

이름 바꾸기

[root@localhost ~]# mv sqoop-1.4.7.bin__hadoop-2.6.0/ sqoop

환경변수 설정

~/.bashrc 수정

# sqoop path
export SQOOP_HOME=/root/sqoop
export SQOOP_CONF_DIR=/root/sqoop/conf
export PATH=$SQOOP_HOME/bin:$PATH

sqoop-env.sh 파일 생성

[root@localhost ~]# cd sqoop/conf
[root@localhost conf]# cp sqoop-env-template.sh sqoop-env.sh

sqoop-env.sh

#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/root/hadoop-3.3.6
export HADOOP_HOME=/root/hadoop-3.3.6

#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/root/hadoop-3.3.6

#Set the path to where bin/hive is available
export HIVE_HOME=/root/apache-hive

mysql 설치하기

#  dnf install mysql-server
# systemctl start mysql
# mysql

mysql -u root -p 로 하니 패스워드 설정이 안되어서 안들어가져서

[root@localhost ~]# systemctl start mysqld.service
[root@localhost ~]# systemctl status mysqld
[root@localhost ~]# systemctl enable mysqld
[root@localhost ~]# mysql_secure_installation

mysql 접속 성공!

MySQL Connector/J

[root@localhost ~]# wget https://downloads.mysql.com/archives/get/p/3/file/mysql-connector-java-5.1.46.tar.gz
[root@localhost ~]# tar xvzf mysql-connector 어쩌구

[root@localhost mysql-connector-java-5.1.46]# mv mysql-connector-java-5.1.46-bin.jar ~/sqoop/lib/
[root@localhost sqoop]# cp sqoop-1.4.7.jar ~/hadoop-3.3.6/share/hadoop/tools/lib/

Create database

mysql> grant all privileges on hadoopguide.* TO 'root'@'localhost';
Query OK, 0 rows affected (0.00 sec)

mysql> use hadoopguide;
Database changed
mysql> show tables;
Empty set (0.00 sec)

mysql> create table widget(id int not null primary key auto_increment,
    -> widget_name varchar(64) not null, price decimal(10,2), design_date DATE,
    -> version int, design_comment varchar(100)
    -> );
Query OK, 0 rows affected (0.02 sec)

mysql> insert into widget values(null,'sprocket','0.25','2024-03-12',1,'Connects two gizmos');
Query OK, 1 row affected (0.01 sec)

mysql> insert into widget values(null, 'gizmo','4.00','2024-03-11',4,null);

commons-lang

[root@localhost sqoop]# wget https://mirror.navercorp.com/apache//commons/lang/binaries/commons-lang-2.6-bin.tar.gz
[root@localhost sqoop]# tar xvzf commons-lang-2.6-bin.tar.gz

[root@localhost sqoop]# cd commons-lang-2.6/
[root@localhost commons-lang-2.6]# cp commons-lang-2.6.jar ~/sqoop/lib/

[root@localhost commons-lang-2.6]# cd ~/sqoop/lib
[root@localhost lib]# mv commons-lang3-3.4.jar commons-lang3-3.4.jar.bak

mysql 계정 만들기

create user 'hadoopguide'@'localhost' identified by 'password';

grant all privileges on hadoopguide.* TO 'hadoopguide'@'localhost';
flush privileges;

hadoopguide으로 접속 성공

sqoop 접속

[root@localhost sqoop]# sqoop list-databases --connect jdbc:mysql://localhost/hadoopguide --username hadoopguide --P

스키마 정보 확인

demon start 후

sqoop import --connect jdbc:mysql://localhost/hadoopguide --table widget --username hadoopguide --P -m 1

데이터베이스 내용 읽어온 것 확인, 순서대로 가져오진 않았음

sqoop import --connect jdbc:mysql://localhost/hadoopguide --target-dir widget --username hadoopguide --P  --query "select * from widget where \$CONDITIONS order by id" -m 1

이렇게 하면 순서대로 가져올 수 있음 : 쿼리문에 \$CONDITIONS 포함시키기

Import

test 테이블에 데이터 넣은 후 가져오기

[root@localhost sqoop]# sqoop import --connect jdbc:mysql://localhost/hadoopguide --table test --target-dir /user/root/test --username hadoopguide --P -m 1

정렬시켜 가져오려면

sqoop import --connect jdbc:mysql://localhost/hadoopguide 
             --target-dir /user/root/test
             --username hadoopguide 
             --P 
             -query "select * from test where \$CONDITIONS" 
             -m 1

 Table로 Export

[root@localhost sqoop]# sqoop export \
> --connect jdbc:mysql://localhost/hadoopguide \
> --table tbl_target \
> --export-dir /user/root/test/part-m-00000 \
> --username hadoopguide \
> --password password