
[Hands-on Series] What DataX Is and Its Use Cases

2024/7/1 14:55:57  Source: https://blog.csdn.net/cc20100608/article/details/139978985

Introduction

DataX is a data synchronization tool open-sourced by Alibaba Group, used to synchronize and migrate data between different data sources. It provides a framework that supports a wide range of sources through plugins, such as MySQL, Oracle, HDFS, and HBase. DataX's core design philosophy is "simple, reliable, efficient", and it aims to solve the complex data-synchronization problems of the big-data domain.

Features of DataX

  1. Plugin-based: DataX uses a plugin architecture, making it easy to extend support to additional data sources.
  2. High performance: DataX is optimized for bulk data transfer, supporting multi-threaded concurrent reads and writes as well as batched transmission.
  3. Stability: DataX ships with thorough error-handling and retry mechanisms to keep synchronization reliable.
  4. Ease of use: DataX offers a simple command-line tool driven by JSON configuration files, so users can define synchronization jobs with little effort.

Use Cases for DataX

1. Data Migration

When an enterprise needs to move data from an old storage system to a new one, DataX serves as an efficient migration tool. By configuring the appropriate source and target plugins, data can be moved between different systems with little effort.

2. Data Backup

DataX can also be used for backups. An enterprise can periodically sync data from the production environment to a backup environment to protect it, and DataX's performance and stability keep the backup process both fast and dependable.
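For instance, a nightly backup could be driven by cron, pointing at a pre-written DataX job file (the paths, schedule, and file names below are hypothetical placeholders):

```
# Run the DataX backup job every day at 02:00
0 2 * * * python /opt/datax/bin/datax.py /opt/datax/job/prod2backup.json >> /var/log/datax-backup.log 2>&1
```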

3. Data Synchronization

DataX can synchronize data between different kinds of data sources. For example, an enterprise may need to keep data in MySQL and HBase in sync to support online analytics and queries; DataX can drive this kind of cross-source synchronization (typically as scheduled batch jobs, since DataX is an offline synchronization tool).
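As a sketch of what such a MySQL-to-HBase job could look like, the skeleton below wires a mysqlreader to an hbase11xwriter. The table, column, ZooKeeper address, and rowkey layout are hypothetical placeholders; consult each plugin's documentation for the full parameter set:

```json
{
  "job": {
    "setting": { "speed": { "channel": 3 } },
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "username": "root",
            "password": "******",
            "column": ["id", "name"],
            "connection": [
              { "table": ["item"], "jdbcUrl": ["jdbc:mysql://127.0.0.1:3306/it"] }
            ]
          }
        },
        "writer": {
          "name": "hbase11xwriter",
          "parameter": {
            "hbaseConfig": { "hbase.zookeeper.quorum": "127.0.0.1:2181" },
            "table": "item",
            "mode": "normal",
            "rowkeyColumn": [{ "index": 0, "type": "string" }],
            "column": [{ "index": 1, "name": "cf:name", "type": "string" }]
          }
        }
      }
    ]
  }
}
```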

4. Data Exchange

In a distributed architecture, different systems often need to share data. DataX can act as the exchange layer, copying data from one system to another; with the right plugins it covers a wide range of exchange scenarios.

5. Data Integration

For big-data analytics that spans multiple sources, DataX also provides strong support: an enterprise can use it to consolidate data from many sources into one unified store for downstream analysis and mining.

Building from Source

Clone the source code and build it with Maven:

git clone https://github.com/alibaba/DataX.git
cd DataX
mvn -U clean package assembly:assembly -Dmaven.test.skip=true

If the build fails, the errors are usually plugin-related. Simply comment out the failing plugin modules in pom.xml, for example otsreader and otswriter. The resulting configuration:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"><modelVersion>4.0.0</modelVersion><groupId>com.alibaba.datax</groupId><artifactId>datax-all</artifactId><version>0.0.1-SNAPSHOT</version><dependencies><dependency><groupId>org.hamcrest</groupId><artifactId>hamcrest-core</artifactId><version>1.3</version></dependency></dependencies><name>datax-all</name><packaging>pom</packaging><properties><jdk-version>1.8</jdk-version><datax-project-version>0.0.1-SNAPSHOT</datax-project-version><commons-lang3-version>3.3.2</commons-lang3-version><commons-configuration-version>1.10</commons-configuration-version><commons-cli-version>1.2</commons-cli-version><fastjson-version>1.1.46.sec01</fastjson-version><guava-version>16.0.1</guava-version><diamond.version>3.7.2.1-SNAPSHOT</diamond.version><!--slf4j 1.7.10 和 logback-classic 1.0.13 是好基友 --><slf4j-api-version>1.7.10</slf4j-api-version><logback-classic-version>1.0.13</logback-classic-version><commons-io-version>2.4</commons-io-version><junit-version>4.11</junit-version><tddl.version>5.1.22-1</tddl.version><swift-version>1.0.0</swift-version><project-sourceEncoding>UTF-8</project-sourceEncoding><project.build.sourceEncoding>UTF-8</project.build.sourceEncoding><project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding><maven.compiler.encoding>UTF-8</maven.compiler.encoding></properties><modules><module>common</module><module>core</module><module>transformer</module><!-- reader 
--><module>mysqlreader</module><module>drdsreader</module><module>sqlserverreader</module><module>postgresqlreader</module><module>oraclereader</module><module>odpsreader</module><module>otsreader</module><!--<module>otsstreamreader</module>--><module>txtfilereader</module><module>hdfsreader</module><module>streamreader</module><module>ossreader</module><module>ftpreader</module><module>mongodbreader</module><module>rdbmsreader</module><module>hbase11xreader</module><module>hbase094xreader</module><module>tsdbreader</module><module>opentsdbreader</module><module>cassandrareader</module><!-- writer --><module>mysqlwriter</module><module>drdswriter</module><module>odpswriter</module><module>txtfilewriter</module><module>ftpwriter</module><module>hdfswriter</module><module>streamwriter</module><!--<module>otswriter</module>--><module>oraclewriter</module><module>sqlserverwriter</module><module>postgresqlwriter</module><module>osswriter</module><module>mongodbwriter</module><module>adswriter</module><module>ocswriter</module><module>rdbmswriter</module><module>hbase11xwriter</module><module>hbase094xwriter</module><module>hbase11xsqlwriter</module><module>hbase11xsqlreader</module><module>elasticsearchwriter</module><module>tsdbwriter</module><module>adbpgwriter</module><module>gdbwriter</module><module>cassandrawriter</module><!-- common support module 
--><module>plugin-rdbms-util</module><module>plugin-unstructured-storage-util</module><module>hbase20xsqlreader</module><module>hbase20xsqlwriter</module></modules><dependencyManagement><dependencies><dependency><groupId>org.apache.commons</groupId><artifactId>commons-lang3</artifactId><version>${commons-lang3-version}</version></dependency><dependency><groupId>com.alibaba</groupId><artifactId>fastjson</artifactId><version>${fastjson-version}</version></dependency><!--<dependency><groupId>com.google.guava</groupId><artifactId>guava</artifactId><version>${guava-version}</version></dependency>--><dependency><groupId>commons-io</groupId><artifactId>commons-io</artifactId><version>${commons-io-version}</version></dependency><dependency><groupId>org.slf4j</groupId><artifactId>slf4j-api</artifactId><version>${slf4j-api-version}</version></dependency><dependency><groupId>ch.qos.logback</groupId><artifactId>logback-classic</artifactId><version>${logback-classic-version}</version></dependency><dependency><groupId>com.taobao.tddl</groupId><artifactId>tddl-client</artifactId><version>${tddl.version}</version><exclusions><exclusion><groupId>com.google.guava</groupId><artifactId>guava</artifactId></exclusion><exclusion><groupId>com.taobao.diamond</groupId><artifactId>diamond-client</artifactId></exclusion></exclusions></dependency><dependency><groupId>com.taobao.diamond</groupId><artifactId>diamond-client</artifactId><version>${diamond.version}</version></dependency><dependency><groupId>com.alibaba.search.swift</groupId><artifactId>swift_client</artifactId><version>${swift-version}</version></dependency><dependency><groupId>junit</groupId><artifactId>junit</artifactId><version>${junit-version}</version></dependency><dependency><groupId>org.mockito</groupId><artifactId>mockito-all</artifactId><version>1.9.5</version><scope>test</scope></dependency></dependencies></dependencyManagement><build><plugins><plugin><artifactId>maven-assembly-plugin</artifactId><configuration><finalName>
datax</finalName><descriptors><descriptor>package.xml</descriptor></descriptors></configuration><executions><execution><id>make-assembly</id><phase>package</phase></execution></executions></plugin><plugin><groupId>org.apache.maven.plugins</groupId><artifactId>maven-compiler-plugin</artifactId><version>2.3.2</version><configuration><source>${jdk-version}</source><target>${jdk-version}</target><encoding>${project-sourceEncoding}</encoding></configuration></plugin></plugins></build>
</project>

If you would rather not build from source, download the pre-built package:

wget -c "http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz"

Extract the archive and it is ready to use. The directory layout looks like this:

# tree -L 2 datax
datax
├── bin   # executables
│   ├── datax.py
│   ├── dxprof.py
│   └── perftrace.py
├── conf  # configuration files
│   ├── core.json
│   ├── logback.xml
│   ├── mysql01.json
│   ├── mysql01.json.splitPk
│   ├── mysql_writer.json
│   └── mysql_writer.json.bak
├── job
│   ├── job.json
│   └── stream2stream.json
├── lib
│   ├── commons-beanutils-1.9.2.jar
│   ├── commons-cli-1.2.jar
│   ├── commons-codec-1.9.jar
│   ├── commons-collections-3.2.1.jar
│   ├── commons-configuration-1.10.jar
│   ├── commons-io-2.4.jar
│   ├── commons-lang-2.6.jar
│   ├── commons-lang3-3.3.2.jar
│   ├── commons-logging-1.1.1.jar
│   ├── commons-math3-3.1.1.jar
│   ├── datax-common-0.0.1-SNAPSHOT.jar
│   ├── datax-core-0.0.1-SNAPSHOT.jar
│   ├── datax-transformer-0.0.1-SNAPSHOT.jar
│   ├── fastjson-1.1.46.sec01.jar
│   ├── fluent-hc-4.4.jar
│   ├── groovy-all-2.1.9.jar
│   ├── hamcrest-core-1.3.jar
│   ├── httpclient-4.4.jar
│   ├── httpcore-4.4.jar
│   ├── janino-2.5.16.jar
│   ├── logback-classic-1.0.13.jar
│   ├── logback-core-1.0.13.jar
│   └── slf4j-api-1.7.10.jar
├── log
│   ├── 2019-11-28
│   └── 2020-04-22
├── log_perf
│   ├── 2019-11-28
│   └── 2020-04-22
├── plugin
│   ├── reader
│   └── writer
├── script
│   └── Readme.md
└── tmp
    └── readme.txt

15 directories, 36 files
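Each subdirectory under plugin/reader and plugin/writer is one installed plugin, so a quick way to see what a given DataX distribution supports is to scan those directories. Below is a small helper sketch (not part of DataX itself; the datax home path passed in is an assumption):

```python
import os

def list_plugins(datax_home):
    """List the reader and writer plugins bundled in a DataX distribution
    by scanning plugin/reader and plugin/writer for subdirectories."""
    plugins = {}
    for kind in ("reader", "writer"):
        d = os.path.join(datax_home, "plugin", kind)
        plugins[kind] = sorted(os.listdir(d)) if os.path.isdir(d) else []
    return plugins

# e.g. list_plugins("/root/datax")
# -> {"reader": ["mysqlreader", ...], "writer": ["mysqlwriter", ...]}
```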

Generating a Job Configuration

Create a job configuration file (JSON format):

# Generate a configuration template for the given reader/writer pair
python datax.py -r oraclereader -w mysqlwriter > oracle2mysql2.json

# Start the synchronization
python datax.py ./oracle2mysql2.json
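Because the job file is plain JSON, a malformed file only fails at launch time. One quick sanity check before running DataX is to round-trip the file through Python's json module, which reports the line and column of any syntax error (a convenience sketch, not part of DataX itself):

```python
import json

def check_job_file(path):
    """Parse a DataX job file and do a few basic shape checks.
    Returns the parsed config, or raises with a readable error."""
    with open(path) as f:
        cfg = json.load(f)  # raises json.JSONDecodeError with line/column info
    for item in cfg["job"]["content"]:
        assert "reader" in item and "writer" in item, \
            "each content item needs a reader and a writer"
    return cfg

# usage: check_job_file("./oracle2mysql2.json")
```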

A Few Examples

Read from MySQL and Print to Standard Output

First, the job configuration file:

{
  "job": {
    "setting": {
      "speed": { "channel": 3 },
      "errorLimit": { "record": 0, "percentage": 0.02 }
    },
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "username": "root",
            "password": "12345678",
            "column": ["id", "name"],
            "connection": [
              {
                "table": ["item"],
                "jdbcUrl": ["jdbc:mysql://127.0.0.1:3306/it"]
              }
            ]
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": { "print": true }
        }
      }
    ]
  }
}
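Conceptually, a job like this runs reader tasks and writer tasks connected by in-memory channels ("channel": 3 caps that concurrency). The toy sketch below mimics that pipeline with one bounded queue; it is purely illustrative, the row data is made up, and no MySQL connection is involved:

```python
import queue
import threading

ROWS = [(1, "Confluence"), (2, "Jira"), (3, "Gitlab"), (4, "XShell")]
SENTINEL = object()  # marks end-of-stream, like a reader task finishing

def reader(channel):
    # mysqlreader's role: fetch records and push them into the channel
    for row in ROWS:
        channel.put(row)
    channel.put(SENTINEL)

def writer(channel, out):
    # streamwriter's role: drain the channel and emit each record
    while True:
        record = channel.get()
        if record is SENTINEL:
            break
        out.append("\t".join(str(v) for v in record))

channel = queue.Queue(maxsize=32)  # bounded, like DataX's capped channel
out = []
t_r = threading.Thread(target=reader, args=(channel,))
t_w = threading.Thread(target=writer, args=(channel, out))
t_r.start(); t_w.start(); t_r.join(); t_w.join()
print("\n".join(out))
```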

Run the job:

[root@ip-192-168-2-56 datax]# python bin/datax.py conf/mysql01.json

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.

2020-04-22 13:29:09.124 [main] INFO  VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2020-04-22 13:29:09.190 [main] INFO  Engine - the machine info  =>

	osInfo:	Oracle Corporation 1.8 25.121-b13
	jvmInfo:	Linux amd64 3.10.0-1062.9.1.el7.x86_64
	cpu num:	4

	totalPhysicalMemory:	-0.00G
	freePhysicalMemory:	-0.00G
	maxFileDescriptorCount:	-1
	currentOpenFileDescriptorCount:	-1

	GC Names	[PS MarkSweep, PS Scavenge]

	MEMORY_NAME                    | allocation_size                | init_size
	PS Eden Space                  | 256.00MB                       | 256.00MB
	Code Cache                     | 240.00MB                       | 2.44MB
	Compressed Class Space         | 1,024.00MB                     | 0.00MB
	PS Survivor Space              | 42.50MB                        | 42.50MB
	PS Old Gen                     | 683.00MB                       | 683.00MB
	Metaspace                      | -0.00MB                        | 0.00MB

2020-04-22 13:29:09.211 [main] INFO  Engine -
{"content":[{"reader":{"name":"mysqlreader","parameter":{"column":["id","name"],"connection":[{"jdbcUrl":["jdbc:mysql://127.0.0.1:3306/it"],"table":["item"]}],"password":"************","username":"root"}},"writer":{"name":"streamwriter","parameter":{"print":true}}}],"setting":{"errorLimit":{"percentage":0.02,"record":0},"speed":{"channel":3}}
}
2020-04-22 13:29:09.235 [main] WARN  Engine - prioriy set to 0, because NumberFormatException, the value is: null
2020-04-22 13:29:09.237 [main] INFO  PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
2020-04-22 13:29:09.238 [main] INFO  JobContainer - DataX jobContainer starts job.
2020-04-22 13:29:09.240 [main] INFO  JobContainer - Set jobId = 0
2020-04-22 13:29:09.949 [job-0] INFO  OriginalConfPretreatmentUtil - Available jdbcUrl:jdbc:mysql://127.0.0.1:3306/it?yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true.
2020-04-22 13:29:10.086 [job-0] INFO  OriginalConfPretreatmentUtil - table:[item] has columns:[id,avatar,dept,description,link,name,type].
2020-04-22 13:29:10.107 [job-0] INFO  JobContainer - jobContainer starts to do prepare ...
2020-04-22 13:29:10.108 [job-0] INFO  JobContainer - DataX Reader.Job [mysqlreader] do prepare work .
2020-04-22 13:29:10.108 [job-0] INFO  JobContainer - DataX Writer.Job [streamwriter] do prepare work .
2020-04-22 13:29:10.110 [job-0] INFO  JobContainer - jobContainer starts to do split ...
2020-04-22 13:29:10.110 [job-0] INFO  JobContainer - Job set Channel-Number to 3 channels.
2020-04-22 13:29:10.115 [job-0] INFO  JobContainer - DataX Reader.Job [mysqlreader] splits to [1] tasks.
2020-04-22 13:29:10.116 [job-0] INFO  JobContainer - DataX Writer.Job [streamwriter] splits to [1] tasks.
2020-04-22 13:29:10.144 [job-0] INFO  JobContainer - jobContainer starts to do schedule ...
2020-04-22 13:29:10.149 [job-0] INFO  JobContainer - Scheduler starts [1] taskGroups.
2020-04-22 13:29:10.152 [job-0] INFO  JobContainer - Running by standalone Mode.
2020-04-22 13:29:10.163 [taskGroup-0] INFO  TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2020-04-22 13:29:10.170 [taskGroup-0] INFO  Channel - Channel set byte_speed_limit to -1, No bps activated.
2020-04-22 13:29:10.170 [taskGroup-0] INFO  Channel - Channel set record_speed_limit to -1, No tps activated.
2020-04-22 13:29:10.192 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2020-04-22 13:29:10.198 [0-0-0-reader] INFO  CommonRdbmsReader$Task - Begin to read record by Sql: [select id,name from item 
] jdbcUrl:[jdbc:mysql://127.0.0.1:3306/it?yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true].
2020-04-22 13:29:10.233 [0-0-0-reader] INFO  CommonRdbmsReader$Task - Finished read record by Sql: [select id,name from item 
] jdbcUrl:[jdbc:mysql://127.0.0.1:3306/it?yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true].
1	Confluence
2	Jira
3	Gitlab
4	XShell
2020-04-22 13:29:10.293 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[107]ms
2020-04-22 13:29:10.294 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] completed it's tasks.
2020-04-22 13:29:20.175 [job-0] INFO  StandAloneJobContainerCommunicator - Total 4 records, 30 bytes | Speed 3B/s, 0 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.044s | Percentage 100.00%
2020-04-22 13:29:20.175 [job-0] INFO  AbstractScheduler - Scheduler accomplished all tasks.
2020-04-22 13:29:20.176 [job-0] INFO  JobContainer - DataX Writer.Job [streamwriter] do post work.
2020-04-22 13:29:20.176 [job-0] INFO  JobContainer - DataX Reader.Job [mysqlreader] do post work.
2020-04-22 13:29:20.177 [job-0] INFO  JobContainer - DataX jobId [0] completed successfully.
2020-04-22 13:29:20.178 [job-0] INFO  HookInvoker - No hook invoked, because base dir not exists or is a file: /root/datax/hook
2020-04-22 13:29:20.180 [job-0] INFO  JobContainer -
	 [total cpu info] =>
		averageCpu                     | maxDeltaCpu                    | minDeltaCpu
		-1.00%                         | -1.00%                         | -1.00%

	 [total gc info] =>
		 NAME                 | totalGCCount       | maxDeltaGCCount    | minDeltaGCCount    | totalGCTime        | maxDeltaGCTime     | minDeltaGCTime
		 PS MarkSweep         | 0                  | 0                  | 0                  | 0.000s             | 0.000s             | 0.000s
		 PS Scavenge          | 0                  | 0                  | 0                  | 0.000s             | 0.000s             | 0.000s

2020-04-22 13:29:20.180 [job-0] INFO  JobContainer - PerfTrace not enable!
2020-04-22 13:29:20.180 [job-0] INFO  StandAloneJobContainerCommunicator - Total 4 records, 30 bytes | Speed 3B/s, 0 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.044s | Percentage 100.00%
2020-04-22 13:29:20.182 [job-0] INFO  JobContainer - 
Job start time                  : 2020-04-22 13:29:09
Job end time                    : 2020-04-22 13:29:20
Total elapsed time              :                 10s
Average traffic                 :                3B/s
Record write speed              :              0rec/s
Total records read              :                   4
Total read/write failures       :                   0

Writing Data into MySQL

First, look at the database schema:

MariaDB [(none)]> use test;
MariaDB [test]> show tables;
+----------------+
| Tables_in_test |
+----------------+
| test           |
+----------------+
1 row in set (0.00 sec)

MariaDB [test]> desc test;
+-------+-------------+------+-----+---------+----------------+
| Field | Type        | Null | Key | Default | Extra          |
+-------+-------------+------+-----+---------+----------------+
| id    | int(11)     | NO   | PRI | NULL    | auto_increment |
| name  | varchar(32) | YES  |     | NULL    |                |
+-------+-------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)

Next, the job configuration file. (The # annotations below are explanatory only; JSON does not allow comments, so remove them from the actual file.)

{
  "job": {
    "setting": {
      "speed": { "channel": 1 }
    },
    "content": [
      {
        "reader": {
          "name": "streamreader",
          "parameter": {
            "column": [
              {
                "value": "DataX",  # the value to insert
                "type": "string"   # its data type
              }
            ],
            "sliceRecordCount": 1000  # how many records to generate
          }
        },
        "writer": {
          "name": "mysqlwriter",
          "parameter": {
            "writeMode": "replace",
            "username": "root",
            "password": "123456789",
            "column": [
              "name"  # the target column in the database
            ],
            "session": ["set session sql_mode='ANSI'"],
            "preSql": ["delete from test"],
            "connection": [
              {
                "jdbcUrl": "jdbc:mysql://127.0.0.1:3306/test?useUnicode=true&characterEncoding=utf8",
                "table": ["test"]
              }
            ]
          }
        }
      }
    ]
  }
}
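With "writeMode": "replace", mysqlwriter issues REPLACE INTO instead of INSERT INTO, so a row whose primary or unique key already exists is replaced rather than causing a duplicate-key error. SQLite's INSERT OR REPLACE behaves analogously, which the self-contained sketch below uses to illustrate the semantics (it does not touch MySQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT)")

# Plain insert: key 1 does not exist yet.
conn.execute("INSERT OR REPLACE INTO test (id, name) VALUES (1, 'old')")

# Same key again: the existing row is replaced instead of raising an error,
# which is the behavior DataX's writeMode=replace relies on in MySQL.
conn.execute("INSERT OR REPLACE INTO test (id, name) VALUES (1, 'DataX')")

rows = conn.execute("SELECT id, name FROM test").fetchall()
print(rows)  # [(1, 'DataX')]
```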

Run it and see:

# python bin/datax.py conf/mysql_writer.json

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.

2020-04-22 13:53:55.201 [main] INFO  VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2020-04-22 13:53:55.209 [main] INFO  Engine - the machine info  =>

	osInfo:	Oracle Corporation 1.8 25.121-b13
	jvmInfo:	Linux amd64 3.10.0-1062.9.1.el7.x86_64
	cpu num:	4

	totalPhysicalMemory:	-0.00G
	freePhysicalMemory:	-0.00G
	maxFileDescriptorCount:	-1
	currentOpenFileDescriptorCount:	-1

	GC Names	[PS MarkSweep, PS Scavenge]

	MEMORY_NAME                    | allocation_size                | init_size
	PS Eden Space                  | 256.00MB                       | 256.00MB
	Code Cache                     | 240.00MB                       | 2.44MB
	Compressed Class Space         | 1,024.00MB                     | 0.00MB
	PS Survivor Space              | 42.50MB                        | 42.50MB
	PS Old Gen                     | 683.00MB                       | 683.00MB
	Metaspace                      | -0.00MB                        | 0.00MB

2020-04-22 13:53:55.230 [main] INFO  Engine -
{"content":[{"reader":{"name":"streamreader","parameter":{"column":[{"type":"string","value":"DataX"}],"sliceRecordCount":1000}},"writer":{"name":"mysqlwriter","parameter":{"column":["name"],"connection":[{"jdbcUrl":"jdbc:mysql://127.0.0.1:3306/test?useUnicode=true&characterEncoding=utf8","table":["test"]}],"password":"************","preSql":["delete from test"],"session":["set session sql_mode='ANSI'"],"username":"root","writeMode":"replace"}}}],"setting":{"speed":{"channel":1}}
}
2020-04-22 13:53:55.249 [main] WARN  Engine - prioriy set to 0, because NumberFormatException, the value is: null
2020-04-22 13:53:55.251 [main] INFO  PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
2020-04-22 13:53:55.251 [main] INFO  JobContainer - DataX jobContainer starts job.
2020-04-22 13:53:55.253 [main] INFO  JobContainer - Set jobId = 0
2020-04-22 13:53:55.642 [job-0] INFO  OriginalConfPretreatmentUtil - table:[test] all columns:[
id,name
].
2020-04-22 13:53:55.656 [job-0] INFO  OriginalConfPretreatmentUtil - Write data [
replace INTO %s (name) VALUES(?)
], which jdbcUrl like:[jdbc:mysql://127.0.0.1:3306/test?useUnicode=true&characterEncoding=utf8&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true]
2020-04-22 13:53:55.657 [job-0] INFO  JobContainer - jobContainer starts to do prepare ...
2020-04-22 13:53:55.658 [job-0] INFO  JobContainer - DataX Reader.Job [streamreader] do prepare work .
2020-04-22 13:53:55.659 [job-0] INFO  JobContainer - DataX Writer.Job [mysqlwriter] do prepare work .
2020-04-22 13:53:55.667 [job-0] INFO  CommonRdbmsWriter$Job - Begin to execute preSqls:[delete from test]. context info:jdbc:mysql://127.0.0.1:3306/test?useUnicode=true&characterEncoding=utf8&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true.
2020-04-22 13:53:55.732 [job-0] INFO  JobContainer - jobContainer starts to do split ...
2020-04-22 13:53:55.732 [job-0] INFO  JobContainer - Job set Channel-Number to 1 channels.
2020-04-22 13:53:55.733 [job-0] INFO  JobContainer - DataX Reader.Job [streamreader] splits to [1] tasks.
2020-04-22 13:53:55.734 [job-0] INFO  JobContainer - DataX Writer.Job [mysqlwriter] splits to [1] tasks.
2020-04-22 13:53:55.754 [job-0] INFO  JobContainer - jobContainer starts to do schedule ...
2020-04-22 13:53:55.759 [job-0] INFO  JobContainer - Scheduler starts [1] taskGroups.
2020-04-22 13:53:55.761 [job-0] INFO  JobContainer - Running by standalone Mode.
2020-04-22 13:53:55.769 [taskGroup-0] INFO  TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2020-04-22 13:53:55.774 [taskGroup-0] INFO  Channel - Channel set byte_speed_limit to -1, No bps activated.
2020-04-22 13:53:55.774 [taskGroup-0] INFO  Channel - Channel set record_speed_limit to -1, No tps activated.
2020-04-22 13:53:55.785 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2020-04-22 13:53:55.801 [0-0-0-writer] INFO  DBUtil - execute sql:[set session sql_mode='ANSI']
2020-04-22 13:53:55.808 [0-0-0-writer] INFO  DBUtil - execute sql:[set session sql_mode='ANSI']
2020-04-22 13:53:55.986 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[202]ms
2020-04-22 13:53:55.987 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] completed it's tasks.
2020-04-22 13:54:05.783 [job-0] INFO  StandAloneJobContainerCommunicator - Total 1000 records, 5000 bytes | Speed 500B/s, 100 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.021s |  All Task WaitReaderTime 0.000s | Percentage 100.00%
2020-04-22 13:54:05.783 [job-0] INFO  AbstractScheduler - Scheduler accomplished all tasks.
2020-04-22 13:54:05.784 [job-0] INFO  JobContainer - DataX Writer.Job [mysqlwriter] do post work.
2020-04-22 13:54:05.785 [job-0] INFO  JobContainer - DataX Reader.Job [streamreader] do post work.
2020-04-22 13:54:05.785 [job-0] INFO  JobContainer - DataX jobId [0] completed successfully.
2020-04-22 13:54:05.786 [job-0] INFO  HookInvoker - No hook invoked, because base dir not exists or is a file: /root/datax/hook
2020-04-22 13:54:05.788 [job-0] INFO  JobContainer -
	 [total cpu info] =>
		averageCpu                     | maxDeltaCpu                    | minDeltaCpu
		-1.00%                         | -1.00%                         | -1.00%

	 [total gc info] =>
		 NAME                 | totalGCCount       | maxDeltaGCCount    | minDeltaGCCount    | totalGCTime        | maxDeltaGCTime     | minDeltaGCTime
		 PS MarkSweep         | 0                  | 0                  | 0                  | 0.000s             | 0.000s             | 0.000s
		 PS Scavenge          | 0                  | 0                  | 0                  | 0.000s             | 0.000s             | 0.000s

2020-04-22 13:54:05.788 [job-0] INFO  JobContainer - PerfTrace not enable!
2020-04-22 13:54:05.789 [job-0] INFO  StandAloneJobContainerCommunicator - Total 1000 records, 5000 bytes | Speed 500B/s, 100 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.021s |  All Task WaitReaderTime 0.000s | Percentage 100.00%
2020-04-22 13:54:05.790 [job-0] INFO  JobContainer - 
Job start time                  : 2020-04-22 13:53:55
Job end time                    : 2020-04-22 13:54:05
Total elapsed time              :                 10s
Average traffic                 :              500B/s
Record write speed              :            100rec/s
Total records read              :                1000
Total read/write failures       :                   0

Check the data in the database:

MariaDB [test]> select * from test limit 10;
+------+-------+
| id   | name  |
+------+-------+
| 4001 | DataX |
| 4002 | DataX |
| 4003 | DataX |
| 4004 | DataX |
| 4005 | DataX |
| 4006 | DataX |
| 4007 | DataX |
| 4008 | DataX |
| 4009 | DataX |
| 4010 | DataX |
+------+-------+
10 rows in set (0.00 sec)
