Another useful analysis is to measure the time taken by the "lookup" and "filter" transformations, using the same file as in the previous tests.
There are 2 ways to approach this analysis; one of them involves splitting the solution into 2 parts, because instead of referencing ODBC directly you can use a Fileset. The second option has many benefits; however, with the 2 results in hand, I can say that there was not much gain in elapsed time. It only remains to wait for the second goal of BI-LAB to be carried out (the same tests, but on a monster of an infrastructure) to see the differences. That is still some time away; I hope that by then I will have at least one visitor on the page, otherwise I will dedicate myself to cooking blogs ;).
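To make the difference between the two approaches concrete, here is a minimal Python sketch. It is only an analogy, not DataStage code: `sqlite3` stands in for the MySQL table reached through ODBC, an in-memory CSV stands in for the Fileset, and the table name, columns, sample rows and the filter condition (`amount >= 20`) are all made up for illustration.

```python
import csv
import io
import sqlite3


def make_reference_db() -> sqlite3.Connection:
    """Hypothetical reference table (stand-in for the MySQL lookup table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ref (code TEXT PRIMARY KEY, label TEXT)")
    conn.executemany("INSERT INTO ref VALUES (?, ?)",
                     [("A", "alpha"), ("B", "beta")])
    return conn


def load_reference_from_db(conn: sqlite3.Connection) -> dict:
    """Approach 1: build the lookup directly from the database."""
    return dict(conn.execute("SELECT code, label FROM ref"))


def stage_reference(conn: sqlite3.Connection) -> str:
    """Approach 2, first job: dump the reference table to a staging file."""
    buf = io.StringIO()
    csv.writer(buf).writerows(conn.execute("SELECT code, label FROM ref"))
    return buf.getvalue()


def load_reference_from_stage(staged: str) -> dict:
    """Approach 2, second job: build the lookup from the staged file."""
    return {code: label for code, label in csv.reader(io.StringIO(staged))}


def lookup_and_filter(rows, reference):
    """Attach the lookup attribute, then keep rows passing the filter."""
    for code, amount in rows:
        label = reference.get(code)
        # Assumed filter: matched rows with amount >= 20 (illustrative only).
        if label is not None and amount >= 20:
            yield code, amount, label


main_flow = [("A", 10), ("B", 20), ("C", 30)]  # stand-in for the flat file
conn = make_reference_db()

direct = list(lookup_and_filter(main_flow, load_reference_from_db(conn)))
staged = list(lookup_and_filter(
    main_flow, load_reference_from_stage(stage_reference(conn))))

print(direct == staged)  # both approaches produce the same output rows
```

Both paths yield the same rows; what changes is where the reference data is read from while the main flow runs, which is the part the two cases below measure.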
CASE 1: Using ODBC as Reference
Objective: | To measure the elapsed time of reading 6 million rows from a flat file, joining the main flow with a lookup table (MySQL) to take attributes, filtering the flow and writing a txt file. |
Rows: | 6.024.000 |
Columns: | 37 Columns |
Resources: | Virtual machine with 2 GB RAM, running IBM DataStage 8.1 as the main process on the virtual platform. The resources used are anecdotal: today, any production environment has enough processing power for current and future requirements. The objective here is to build, execute and measure in the same environment, regardless of the limited resources. |
Structure |
|
Design & Run |
Reference: |
Elapsed time (s) |
57 sec. |
Rows per sec (avg) |
105.684 r/s |
Summary of log |
|
CASE 2: Using FILESET as Reference
Objective: | To measure the elapsed time of reading 6 million rows from a flat file, joining the main flow with a lookup table (MySQL) to take attributes, filtering the flow and writing a txt file. |
Rows: | 6.024.000 |
Columns: | 37 Columns |
Resources: | Virtual machine with 2 GB RAM, running IBM DataStage 8.1 as the main process on the virtual platform. The resources used are anecdotal: today, any production environment has enough processing power for current and future requirements. The objective here is to build, execute and measure in the same environment, regardless of the limited resources. |
Structure |
|
Design & Run |
Job A: Job B: |
Elapsed time (s) |
Job A (5 sec) + Job B (51 sec) = 56 sec. |
Rows per sec (avg) |
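118.000 r/s (this figure appears to correspond to Job B alone; over the full 56 sec the average is about 107.571 r/s) |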
Summary of log |
|
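The rows-per-second averages can be checked from the row count and the elapsed times reported above. The short Python sketch below does that arithmetic; the function and variable names are mine, and only the row count and the timings come from the two cases.

```python
def rows_per_second(rows: int, elapsed_seconds: float) -> float:
    """Average throughput: rows processed divided by elapsed seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return rows / elapsed_seconds


ROWS = 6_024_000  # row count reported for both cases

case1_odbc = rows_per_second(ROWS, 57)        # Case 1: single job, ODBC reference
case2_job_b = rows_per_second(ROWS, 51)       # Case 2: Job B only
case2_total = rows_per_second(ROWS, 5 + 51)   # Case 2: Job A + Job B

print(f"Case 1 (ODBC):          {case1_odbc:,.0f} r/s")   # 105,684 r/s
print(f"Case 2 (Job B only):    {case2_job_b:,.0f} r/s")  # 118,118 r/s
print(f"Case 2 (Job A + Job B): {case2_total:,.0f} r/s")  # 107,571 r/s
```

Measured over the full 56 seconds, Case 2 is only about 2% faster than Case 1, which matches the observation that the 2 approaches give nearly the same elapsed time in this environment.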