diff --git a/docs/get-started/3.dockerDeployment.md b/docs/get-started/3.dockerDeployment.md
new file mode 100644
index 000000000..fa0d2bc42
--- /dev/null
+++ b/docs/get-started/3.dockerDeployment.md
@@ -0,0 +1,225 @@
---
id: 'docker-deployment'
title: 'Platform Deployment using Docker'
sidebar_position: 3
---

This tutorial walks through deploying StreamPark with Docker.

## Prerequisites

- Docker 1.13.1+
- Docker Compose 1.28.0+

### 1. Install docker

To start the service with docker, you need to install [docker](https://www.docker.com/) first.

### 2. Install docker-compose

To start the service with docker-compose, you need to install [docker-compose](https://docs.docker.com/compose/install/) first.

## Apache StreamPark™ Deployment

### 1. Apache StreamPark™ deployment based on H2 and docker-compose

- This method is suitable for beginners who want to learn the platform, get familiar with its features, and run it at a lightweight scale.

- H2 is a lightweight, high-performance embedded relational database that is easy to integrate into Java applications. Its syntax is similar to MySQL, and it supports data persistence. StreamPark uses the H2 database by default, so developers can start a project quickly.

- To persist data, configure the datasource.h2-data-dir parameter to specify the storage path of the data files. This ensures that data is not lost even if the service is restarted.

- datasource.h2-data-dir: specifies the storage path of the H2 database files; by default this is the ~/streampark/h2-data/ directory under the user's home directory.

- The H2 console is enabled by default at the path /h2-console and allows access from other machines. The console can be used to execute SQL queries, manage the database, and so on.

- After opening /h2-console, fill in JDBC_URL with the H2 data file directory plus /metadata. For example, if the default configuration has not been modified, enter jdbc:h2:~/streampark/h2-data/metadata.

- Username and password: the default username is admin and the password is streampark.

- You can also configure MySQL or PgSQL for persistence, as described further below; a sketch of the H2 settings follows this list.
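For reference, with the defaults above unchanged, the relevant H2 settings and /h2-console login values would look roughly as follows. This is a sketch: only the datasource.h2-data-dir key, its default value, and the login values come from the notes above, while the exact configuration-file layout is an assumption.

```yaml
# Sketch of the H2 datasource settings (file layout is an assumption;
# key name and default value are taken from the notes above)
datasource:
  h2-data-dir: ~/streampark/h2-data/   # where the H2 data files are stored
# /h2-console login: JDBC URL jdbc:h2:~/streampark/h2-data/metadata,
# username admin, password streampark
```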
### 2. Deployment

```shell
wget https://raw.githubusercontent.com/apache/incubator-streampark/dev/docker/docker-compose.yaml
wget https://raw.githubusercontent.com/apache/incubator-streampark/dev/docker/.env
docker-compose up -d
```

Once the service is started, StreamPark can be accessed at http://localhost:10000, and Flink at http://localhost:8081. The StreamPark link redirects to the login page; the default user and password are admin and streampark, respectively. To learn more about day-to-day operation, please refer to the quick-start section of the user manual.

### 3. Configure Flink home

![](/doc/image/streampark_flinkhome.png)

### 4. Configure a flink-session cluster

![](/doc/image/remote.png)

Note: when configuring the flink-session cluster address, the IP address is not localhost but the host network IP, which can be obtained with ifconfig.

### 5. Submit a Flink job

![](/doc/image/remoteSubmission.png)

#### Use an existing MySQL service

This approach is suitable for enterprise production: you can quickly deploy StreamPark with docker and connect it to an existing online database.

Note: the different deployment flavors are maintained through the .env configuration file; make sure there is one and only one .env file in the directory.

```shell
wget https://raw.githubusercontent.com/apache/incubator-streampark/dev/deploy/docker/docker-compose.yaml
wget https://raw.githubusercontent.com/apache/incubator-streampark/dev/deploy/docker/mysql/.env
vim .env
```

First, create the "streampark" database in MySQL, then manually execute the SQL for this data source found under schema and data, for example as shown below.
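A minimal sketch of that preparation step, assuming the mysql client is available; the script file names below are illustrative, so locate the actual schema and data files in your StreamPark release package:

```shell
# Create the database (name from the step above)
mysql -u root -p -e "CREATE DATABASE IF NOT EXISTS streampark DEFAULT CHARACTER SET utf8mb4;"
# Import the schema and data scripts shipped with StreamPark
# (illustrative paths; check your release package for the real ones)
mysql -u root -p streampark < schema/mysql-schema.sql
mysql -u root -p streampark < data/mysql-data.sql
```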
After that, modify the corresponding connection information:

```shell
SPRING_PROFILES_ACTIVE=mysql
SPRING_DATASOURCE_URL=jdbc:mysql://localhost:3306/streampark?useSSL=false&useUnicode=true&characterEncoding=UTF-8&allowPublicKeyRetrieval=false&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=GMT%2B8
SPRING_DATASOURCE_USERNAME=root
SPRING_DATASOURCE_PASSWORD=streampark
```

```shell
docker-compose up -d
```

#### Use an existing PgSQL service

```shell
wget https://raw.githubusercontent.com/apache/incubator-streampark/dev/deploy/docker/docker-compose.yaml
wget https://raw.githubusercontent.com/apache/incubator-streampark/dev/deploy/docker/pgsql/.env
vim .env
```

Modify the corresponding connection information:

```shell
SPRING_PROFILES_ACTIVE=pgsql
SPRING_DATASOURCE_URL=jdbc:postgresql://localhost:5432/streampark?stringtype=unspecified
SPRING_DATASOURCE_USERNAME=postgres
SPRING_DATASOURCE_PASSWORD=streampark
```

```shell
docker-compose up -d
```

## Build images from source for Apache StreamPark™ deployment

```shell
git clone https://github.com/apache/streampark.git
cd streampark/deploy/docker
vim docker-compose.yaml
```

```yaml
  build:
    context: ../..
    dockerfile: deploy/docker/console/Dockerfile
#  image: ${HUB}:${TAG}
```

![](/doc/image/streampark_source_generation_image.png)

```shell
docker-compose up -d
```

## Docker-Compose Configuration

The docker-compose.yaml file references the configuration from the .env file; the resulting configuration is as follows:

```yaml
version: '3.8'
services:
  ## streampark-console container
  streampark-console:
    ## streampark image
    image: apache/streampark:latest
    ## streampark image startup command
    command: ${RUN_COMMAND}
    ports:
      - 10000:10000
    ## Environment configuration file
    env_file: .env
    environment:
      ## Declare environment variable
      HADOOP_HOME: ${HADOOP_HOME}
    volumes:
      - flink:/streampark/flink/${FLINK}
      - /var/run/docker.sock:/var/run/docker.sock
      - /etc/hosts:/etc/hosts:ro
      - ~/.kube:/root/.kube:ro
    privileged: true
    restart: unless-stopped
    networks:
      - streampark

  ## flink-jobmanager container
  flink-jobmanager:
    image: ${FLINK_IMAGE}
    ports:
      - "8081:8081"
    command: jobmanager
    volumes:
      - flink:/opt/flink
    env_file: .env
    restart: unless-stopped
    privileged: true
    networks:
      - streampark

  ## streampark-taskmanager container
  flink-taskmanager:
    image: ${FLINK_IMAGE}
    depends_on:
      - flink-jobmanager
    command: taskmanager
    deploy:
      replicas: 1
    env_file: .env
    restart: unless-stopped
    privileged: true
    networks:
      - streampark

networks:
  streampark:
    driver: bridge

volumes:
  flink:
```

Finally, execute the start command:

```shell
cd deploy/docker
docker-compose up -d
```

You can use `docker ps` to check whether the installation was successful. If the following information is displayed, the installation succeeded:

![](/doc/image/streampark_docker_ps.png)
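If a container does not show up as expected, tailing its logs is a quick way to diagnose startup problems. A sketch using standard docker-compose commands, with the service names taken from the compose file above:

```shell
# Follow the StreamPark console logs
docker-compose logs -f streampark-console
# Or inspect the Flink JobManager
docker-compose logs -f flink-jobmanager
```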
## Uploading Configuration to the Container

The `env` file above declares `HADOOP_HOME`, whose corresponding directory is `/streampark/hadoop`. You therefore need to upload the `etc/hadoop` directory from the Hadoop installation package to `/streampark/hadoop` in the container. The commands are as follows:

```shell
## Upload Hadoop resources (copy the entire etc/hadoop directory)
docker cp <hadoop-home>/etc/hadoop streampark-docker_streampark-console_1:/streampark/hadoop
## Enter the container
docker exec -it streampark-docker_streampark-console_1 bash
## Check
ls
```

![](/doc/image/streampark_docker_ls_hadoop.png)

In addition, other configuration files, such as Maven's `settings.xml` file, are uploaded in the same manner.

diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/get-started/3.dockerDeployment.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/get-started/3.dockerDeployment.md
new file mode 100644
index 000000000..2d8f43325
--- /dev/null
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/get-started/3.dockerDeployment.md
@@ -0,0 +1,227 @@
---
id: 'docker-deployment'
title: 'Platform Installation (Docker)'
sidebar_position: 3
---

This tutorial walks through deploying the StreamPark platform with Docker.

## Prerequisites

- Docker 1.13.1+
- Docker Compose 1.28.0+

### 1. Install docker

To start the service with docker, you need to install [docker](https://www.docker.com/) first.

### 2. Install docker-compose

To start the service with docker-compose, you need to install [docker-compose](https://docs.docker.com/compose/install/) first.

## Deploy Apache StreamPark™

### 1. Deploy Apache StreamPark™ based on H2 and docker-compose

- This method is suitable for beginners who want to learn the platform, get familiar with its features, and run it at a lightweight scale.

- H2 is a lightweight, high-performance embedded relational database that is easy to embed in Java applications. Its syntax is similar to MySQL, and it supports data persistence. StreamPark uses H2 by default, so developers can start a project quickly.

- Starting from 2.1.5, StreamPark supports local persistence of the H2 data files (data survives a system restart). You can configure the datasource.h2-data-dir parameter to specify the storage path of the data files, so data is not lost even if the service is restarted.

- datasource.h2-data-dir: specifies the storage path of the H2 database files; by default this is the ~/streampark/h2-data/ directory under the user's home directory.

- The H2 console is enabled by default at the path /h2-console and allows access from other machines. The console can be used to execute SQL queries, manage the database, and so on.

- After opening /h2-console, fill in JDBC_URL with the H2 data file directory plus /metadata. For example, if the default configuration has not been modified, enter jdbc:h2:~/streampark/h2-data/metadata.

- Username and password: the default username is admin and the password is streampark.

- You can also configure MySQL or PgSQL for persistence, as described below.

### 2. Deployment

```shell
wget https://raw.githubusercontent.com/apache/incubator-streampark/dev/docker/docker-compose.yaml
wget https://raw.githubusercontent.com/apache/incubator-streampark/dev/docker/.env
docker-compose up -d
```

Once the service is started, StreamPark can be accessed at http://localhost:10000, and Flink at http://localhost:8081. The StreamPark link redirects to the login page; the default user and password are admin and streampark, respectively. To learn more about day-to-day operation, please refer to the quick-start section of the user manual.

![](/doc/image/streampark_docker-compose.png)

This deployment automatically starts a flink-session cluster for running Flink jobs, and also mounts the local docker service and ~/.kube so that jobs can be submitted in Kubernetes mode.

### 3. Configure Flink home

![](/doc/image/streampark_flinkhome.png)

### 4. Configure a flink-session cluster

![](/doc/image/remote.png)

Note: when configuring the flink-session cluster address, the IP address to fill in is not localhost but the host network IP, which can be obtained with ifconfig; see the sketch below.
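The note above mentions ifconfig; here are a couple of equivalent ways to look up the host IP on a typical Linux host (interface names vary by system):

```shell
# Classic tool mentioned above (from the net-tools package)
ifconfig
# Modern equivalent available on most Linux distributions
ip addr show
```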
### 5. Submit a Flink job

![](/doc/image/remoteSubmission.png)

#### Use an existing MySQL service

This approach is suitable for enterprise production: you can quickly deploy StreamPark with docker and connect it to an existing online database.

Note: the different deployment flavors are maintained through the .env configuration file; make sure there is one and only one .env file in the directory.

```shell
wget https://raw.githubusercontent.com/apache/incubator-streampark/dev/docker/docker-compose.yaml
wget https://raw.githubusercontent.com/apache/incubator-streampark/dev/docker/mysql/.env
vim .env
```

First, create the streampark database in MySQL, then manually execute the SQL for this data source found under schema and data.

After that, modify the corresponding connection information:

```shell
SPRING_PROFILES_ACTIVE=mysql
SPRING_DATASOURCE_URL=jdbc:mysql://localhost:3306/streampark?useSSL=false&useUnicode=true&characterEncoding=UTF-8&allowPublicKeyRetrieval=false&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=GMT%2B8
SPRING_DATASOURCE_USERNAME=root
SPRING_DATASOURCE_PASSWORD=streampark
```

```shell
docker-compose up -d
```

#### Use an existing PgSQL service

```shell
wget https://raw.githubusercontent.com/apache/incubator-streampark/dev/docker/docker-compose.yaml
wget https://raw.githubusercontent.com/apache/incubator-streampark/dev/docker/pgsql/.env
vim .env
```

Modify the corresponding connection information:

```shell
SPRING_PROFILES_ACTIVE=pgsql
SPRING_DATASOURCE_URL=jdbc:postgresql://localhost:5432/streampark?stringtype=unspecified
SPRING_DATASOURCE_USERNAME=postgres
SPRING_DATASOURCE_PASSWORD=streampark
```

```shell
docker-compose up -d
```

## Build images from source for Apache StreamPark™ deployment

```shell
git clone https://github.com/apache/streampark.git
cd streampark/docker
vim docker-compose.yaml
```

```yaml
  build:
    context: ../..
    dockerfile: docker/Dockerfile
#  image: ${HUB}:${TAG}
```

![](/doc/image/streampark_source_generation_image.png)

```shell
docker-compose up -d
```

## docker-compose Configuration

The docker-compose.yaml file references the configuration from the .env file; the resulting configuration is as follows:

```yaml
version: '3.8'
services:
  ## streampark-console container
  streampark-console:
    ## streampark image
    image: apache/streampark:latest
    ## streampark image startup command
    command: ${RUN_COMMAND}
    ports:
      - 10000:10000
    ## Environment configuration file
    env_file: .env
    environment:
      ## Declare environment variable
      HADOOP_HOME: ${HADOOP_HOME}
    volumes:
      - flink:/streampark/flink/${FLINK}
      - /var/run/docker.sock:/var/run/docker.sock
      - /etc/hosts:/etc/hosts:ro
      - ~/.kube:/root/.kube:ro
    privileged: true
    restart: unless-stopped
    networks:
      - streampark

  ## flink-jobmanager container
  flink-jobmanager:
    image: ${FLINK_IMAGE}
    ports:
      - "8081:8081"
    command: jobmanager
    volumes:
      - flink:/opt/flink
    env_file: .env
    restart: unless-stopped
    privileged: true
    networks:
      - streampark

  ## streampark-taskmanager container
  flink-taskmanager:
    image: ${FLINK_IMAGE}
    depends_on:
      - flink-jobmanager
    command: taskmanager
    deploy:
      replicas: 1
    env_file: .env
    restart: unless-stopped
    privileged: true
    networks:
      - streampark

networks:
  streampark:
    driver: bridge

volumes:
  flink:
```

Finally, execute the start command:

```shell
cd docker
docker-compose up -d
```

You can use `docker ps` to check whether the installation was successful. If the following information is displayed, the installation succeeded:

![](/doc/image/streampark_docker_ps.png)

## Uploading Configuration to the Container

The `env` file above declares `HADOOP_HOME`, whose corresponding directory is `/streampark/hadoop`. You therefore need to upload the `etc/hadoop` directory from the Hadoop installation package to `/streampark/hadoop` in the container. The commands are as follows:

```shell
## Upload Hadoop resources (copy the entire etc/hadoop directory)
docker cp <hadoop-home>/etc/hadoop streampark-docker_streampark-console_1:/streampark/hadoop
## Enter the container
docker exec -it streampark-docker_streampark-console_1 bash
## Check
ls
```

![](/doc/image/streampark_docker_ls_hadoop.png)

Other configuration files, such as Maven's `settings.xml`, are uploaded in the same way; see the sketch below.
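For instance, a local Maven settings.xml can be copied in the same fashion. The destination path below is an assumption; adjust it to wherever your Maven configuration is expected inside the container:

```shell
# Illustrative: copy a local Maven settings.xml into the console container
# (the destination path is an assumption, not taken from the docs above)
docker cp ~/.m2/settings.xml streampark-docker_streampark-console_1:/root/.m2/settings.xml
```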