The second stress test loads 6 million records from a flat file into a MySQL table. |
|
Loading through ODBC gives very poor elapsed times. Without a MySQL-specific plugin, performance is low. These differences will show up clearly when DS is compared with Talend.
The stage used is the ODBC Enterprise stage. The newer ODBC Connector stage raises an error when the Array Size is modified, as the following IBM problem report describes:
JR37389: THE ODBC CONNECTOR JOBS FAILING WHILE CONNECTING TO MYSQL SERVER AND WHEN THE ARRAYSIZE IS GREATER THAN 1.
Error description
While using the ODBC Connector with the MySQL server, the connector jobs fail with a fatal exception and log the following message in the job log: "Caught unknown exception from runLocally()". This happens when the arraysize is greater than 1. The job works fine when the arraysize is set to 1.
Local fix
Use Array Size of 1.
Problem conclusion
In a couple of places in the code, the arraysize was hardcoded to 1 when the Database Type is MySQL, since older versions of the MySQL drivers did not support the array functionality. This caused the current problem when the arraysize was set to a value greater than 1. After analysing the code, I have corrected it to set the arraysize properly at all the relevant places in the code flow. Customers using the older drivers need either to upgrade to the new drivers or to set the arraysize explicitly to 1 if they must keep the old drivers.
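To illustrate why an Array Size fixed at 1 is costly for a load of this size, here is a minimal Python sketch of array-style batching. It is not DataStage or ODBC driver code: the names `batches` and `round_trips` and the batch size of 2000 are hypothetical, and it assumes that each batch corresponds to one send to the database, which depends on the driver.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batches(rows: Iterable, array_size: int) -> Iterator[List]:
    """Yield consecutive lists of at most array_size rows (hypothetical helper)."""
    if array_size < 1:
        raise ValueError("array_size must be >= 1")
    it = iter(rows)
    while True:
        chunk = list(islice(it, array_size))
        if not chunk:
            return
        yield chunk


def round_trips(total_rows: int, array_size: int) -> int:
    """Number of batches needed to send total_rows rows (ceiling division)."""
    if array_size < 1:
        raise ValueError("array_size must be >= 1")
    return -(-total_rows // array_size)


if __name__ == "__main__":
    total = 6_024_000  # row count used in this test
    for size in (1, 2000):  # 2000 is an illustrative value, not a recommendation
        print(f"array size {size}: {round_trips(total, size)} sends")
```

Under these assumptions, an Array Size of 1 means one send per row, while a larger value divides the number of sends by roughly that factor.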
CASE 1:
Objective: | To measure the elapsed time of reading 6 million rows from a flat file and writing them to a MySQL database. The DB runs on another machine. The benefit is clear: DS runs alone in the virtual machine. The drawback is that the data crosses the network, using the generic drivers installed on the VM. |
Rows: | 6,024,000 |
Columns: | 37 Columns |
Resources: | Virtual machine with 2 GB RAM, with DataStage as the main process on the virtual platform. The resources used are incidental: today, any production environment has enough processing power for current and future requirements. The objective here is to build, execute, and measure in the same environment, regardless of the limited resources. |
Structure (Metadata) |
|
Design & Run |
|
Elapsed time (s) |
2832 sec. |
Rows per sec (avg) |
2,100 rows/s |
Summary of log |
|
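The average rate in the table follows directly from the row count and the elapsed time. A small Python check, using only the figures reported above (the function name `rows_per_second` is hypothetical):

```python
def rows_per_second(rows: int, elapsed_seconds: float) -> float:
    """Average throughput: total rows divided by elapsed wall-clock seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return rows / elapsed_seconds


if __name__ == "__main__":
    rows = 6_024_000   # rows loaded in CASE 1
    elapsed = 2832     # elapsed time in seconds
    print(f"{rows_per_second(rows, elapsed):.0f} rows/s")  # prints "2127 rows/s"
    print(f"{elapsed / 60:.1f} minutes")                   # prints "47.2 minutes"
```

The exact quotient is about 2,127 rows per second, consistent with the rounded average reported in the table.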