The second stress test involves transferring 6 million records from a flat file to a table in MySQL.
As in most of the tests, there are several cases that use different techniques and variable settings with the same objective: satisfy the requirement and improve performance. The first case is always the best one; the remaining cases are jobs whose settings were not the most appropriate, but which helped along the way toward better performance.
Results were good from the first run because Talend has a specific plugin for MySQL that performs very well (this differentiates it from other tools such as DS, SSIS, ...). As in the other tests, the aim is to find the point of equilibrium between saturating the resources and the benefit obtained.
LINKS
CASE 1:[939 secs.] CASE 2:[1031 secs.] |
|
CASE 1: -Xms256M, -Xmx1024M + 200K commit size + 50K array size
Objective: | To measure the elapsed time for reading 6 million rows from a flat file and writing them to a MySQL database. The database runs on another machine. The benefit is clear: Talend runs alone in the virtual machine. The drawback is that the data travels over the network, using the generic drivers installed by the VM. |
Rows: | 6,024,000 |
Columns: | 37 |
Resources: | Virtual machine with 2 GB RAM, with Talend as the main process on the virtual platform. The resources used are anecdotal: today, any production environment has enough processing power for current and future requirements. The objective here is to build, execute, and measure in the same environment (regardless of the limited resources). |
Structure: (Metadata) |
|
Design & Run |
|
Elapsed time (s) | 939 |
Rows per sec (avg) | 6,415 |
How to improve performance |
- Adjust the parameters -Xms, -Xmx, commit size (CSize), and array size (ArraySize), as shown in the figure above |
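The interplay between array size and commit size can be illustrated outside Talend. The following is only a minimal sketch, not the code Talend generates: it assumes that "array size" means the number of rows sent per batch and "commit size" the number of rows per transaction, it uses Python's built-in `sqlite3` in-memory database as a stand-in for the remote MySQL server, and the `target` table, the `load_rows` helper, and the row contents are hypothetical. The CASE 1 settings are scaled down by a factor of 1,000 so the example runs instantly.

```python
import sqlite3
from itertools import islice


def load_rows(conn, rows, array_size, commit_size):
    """Insert rows in batches of array_size, committing every commit_size rows.

    Returns (batches_sent, commits_issued).
    """
    cur = conn.cursor()
    it = iter(rows)
    batches = commits = pending = 0
    while True:
        batch = list(islice(it, array_size))  # next "array" of rows
        if not batch:
            break
        cur.executemany("INSERT INTO target (id, payload) VALUES (?, ?)", batch)
        batches += 1
        pending += len(batch)
        if pending >= commit_size:  # commit interval reached
            conn.commit()
            commits += 1
            pending = 0
    if pending:  # commit the final partial interval
        conn.commit()
        commits += 1
    return batches, commits


# Stand-in database (assumption): in-memory SQLite instead of remote MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER, payload TEXT)")

# CASE 1 settings scaled down by 1,000: 6,024 rows, 50-row arrays, 200-row commits.
rows = ((i, f"row-{i}") for i in range(6_024))
batches, commits = load_rows(conn, rows, array_size=50, commit_size=200)
total = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(batches, commits, total)
```

Larger arrays mean fewer round trips to the database, and larger commit intervals mean fewer transactions, at the cost of more memory held on the client side, which is where the -Xms and -Xmx settings come in.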
CASE 2: -Xms256M, -Xmx1024M + 100K commit size + 10K array size
Objective: | To measure the elapsed time for reading 6 million rows from a flat file and writing them to a MySQL database. The database runs on another machine. The benefit is clear: Talend runs alone in the virtual machine. The drawback is that the data travels over the network, using the generic drivers installed by the VM. |
Rows: | 6,024,000 |
Columns: | 37 |
Resources: | Virtual machine with 2 GB RAM, with Talend as the main process on the virtual platform. The resources used are anecdotal: today, any production environment has enough processing power for current and future requirements. The objective here is to build, execute, and measure in the same environment (regardless of the limited resources). |
Structure: (Metadata) |
|
Design & Run |
|
Elapsed time (s) | 1031 |
Rows per sec (avg) | 5,847 |
How to improve performance |
- Adjust the parameters -Xms, -Xmx, commit size (CSize), and array size (ArraySize), as shown in the figure above |
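To compare the two cases side by side, the arithmetic behind the figures can be sketched as follows. The row count, elapsed times, commit sizes, and array sizes come from the tables above; the batch and commit counts are derived under the assumption that every batch and every commit interval is filled completely except the last one, and the `summarize` helper is hypothetical. Throughput derived from the rounded elapsed time may differ by a few rows per second from the average reported by the tool.

```python
import math

TOTAL_ROWS = 6_024_000

# Settings and elapsed times as reported for each case.
cases = {
    "CASE 1": {"elapsed_s": 939, "commit_size": 200_000, "array_size": 50_000},
    "CASE 2": {"elapsed_s": 1031, "commit_size": 100_000, "array_size": 10_000},
}


def summarize(case):
    """Derive throughput and the number of batches and commits for one case."""
    return {
        "rows_per_sec": TOTAL_ROWS / case["elapsed_s"],
        "batches": math.ceil(TOTAL_ROWS / case["array_size"]),
        "commits": math.ceil(TOTAL_ROWS / case["commit_size"]),
    }


summary = {name: summarize(case) for name, case in cases.items()}

# Relative extra time CASE 2 needs compared with CASE 1.
slowdown = cases["CASE 2"]["elapsed_s"] / cases["CASE 1"]["elapsed_s"] - 1

for name, s in summary.items():
    print(f"{name}: {s['rows_per_sec']:.0f} rows/sec, "
          f"{s['batches']} batches, {s['commits']} commits")
print(f"CASE 2 takes {slowdown:.1%} longer than CASE 1")
```

Under these assumptions CASE 2 sends roughly five times as many batches and twice as many commits as CASE 1, and takes about 9.8% longer, which is consistent with the smaller settings costing more round trips over the network.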