Fourth test: CloverETL, measuring the interaction between the ETL tool and Oracle via conventional inserts. Performance comparison between the results obtained by Talend, DataStage, Pentaho, and CloverETL. Talend is dethroned from second place. Coming soon, the tests of heavyweights such as Informatica and ODI...
Continuing with the 4th test (impact on Oracle), which exercises parameters common to all the tools, such as batch size and commit size, CloverETL reached an excellent time. Leaving aside the 50 seconds achieved by DataStage with the bulk loader, until now the ranking in Stress Test 4 (Flat file to Oracle) was: Pentaho (3rd place) with 196 seconds, Talend (2nd place) with 168 seconds, and DataStage 8.1 (1st place) with 80 seconds.
After trying different configurations, CloverETL reached a time of 143 seconds, taking 2nd place, with COMMIT_SIZE = 60.000 and BATCH_SIZE = 20.000.
It seemed fair to run the Talend test again (Talend having lost 2nd place) with these same parameters; however, the times were much higher than those recorded previously. Talend achieved its best time with a larger batch size.
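The interplay of batch size and commit size can be sketched generically. The snippet below (a minimal sketch, using SQLite from the Python standard library as a stand-in for Oracle, with the sizes from the CloverETL run) shows the pattern all four tools expose: accumulate rows into batches, send each batch in one bulk insert, and commit only after COMMIT_SIZE rows have accumulated.

```python
import sqlite3

BATCH_SIZE = 20_000   # rows sent per bulk-insert round trip
COMMIT_SIZE = 60_000  # rows per transaction (a multiple of BATCH_SIZE)

conn = sqlite3.connect(":memory:")  # stand-in for an Oracle connection
conn.execute("CREATE TABLE target (id INTEGER, payload TEXT)")

rows = ((i, f"row-{i}") for i in range(120_000))  # simulated flat-file reader

batch, since_commit = [], 0
for row in rows:
    batch.append(row)
    if len(batch) == BATCH_SIZE:
        conn.executemany("INSERT INTO target VALUES (?, ?)", batch)
        since_commit += len(batch)
        batch.clear()
        if since_commit >= COMMIT_SIZE:
            conn.commit()  # one commit per COMMIT_SIZE rows, not per batch
            since_commit = 0
if batch:  # flush the trailing partial batch
    conn.executemany("INSERT INTO target VALUES (?, ?)", batch)
conn.commit()

print(conn.execute("SELECT COUNT(*) FROM target").fetchone()[0])  # prints 120000
```

Larger batches reduce round trips to the database; a larger commit size reduces commit overhead but holds more undo/redo per transaction, which is why each tool has its own sweet spot.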
TIMES:
Generally, comparisons between tools made by one of the interested parties are hard to accept as valid because of the interests involved. That is one of the reasons that motivated me to create this free and independent benchmark. That said, one of the articles written by CloverETL is very interesting: DataDirect's Oracle JDBC Driver Speed Test. In this section, different ways of carrying out the test are implemented, looking for the best-performing solution.
The fastest way of loading data is unquestionably direct-path loading with the sqlldr utility; even when staging the data through a pipe, sqlldr is about 50% faster than any other method. We can also see how allocating more resources in our tests produces better times, even when the number of rows is six times higher.
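For reference, a direct-path SQL*Loader run might look like the following sketch; the file names, table, columns, and credentials are hypothetical, not taken from the benchmark:

```
-- load.ctl: hypothetical SQL*Loader control file
LOAD DATA
INFILE 'stress_test.dat'
APPEND
INTO TABLE stress_test
FIELDS TERMINATED BY ';'
(id, name, amount)

-- invoked with direct-path loading enabled:
-- sqlldr userid=scott/tiger control=load.ctl log=load.log direct=true
```

With `direct=true`, sqlldr writes formatted data blocks straight into the datafiles, bypassing most of the SQL engine, which is what makes it faster than any INSERT-based path.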
ARCHITECTURE:

Environment: infrastructure composed of 3 nodes:
1) ESXi 5.0:
   a) Physical datastore 1: VM ETL CloverETL (10 GB RAM, 2 cores × 2 sockets)
   b) Physical datastore 2: VM database server Oracle (6 GB RAM, 2 cores × 2 sockets)
2) Performance monitor: VM monitor ESXi + SQL Server 2008 (4 GB RAM)
3) ETL operator: ESXi client (4 GB RAM)
CASE 1: CONVENTIONAL INSERTS
Objective:
- Measure the elapsed time of reading 6 million rows from a flat file and writing them to an Oracle database.
- Analyze the use of resources.
ETL Tool: CloverETL
Rows: 6.024.000
Columns: 37
Structure (metadata):

Design & run parameters:

Log:
Elapsed time: 143 s
Rows p/s (avg): 42.125 r/s
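As a sanity check, the average throughput follows directly from the figures above (integer division matches the truncated value reported):

```python
rows = 6_024_000      # rows loaded from the flat file
elapsed_s = 143       # elapsed time in seconds
print(rows // elapsed_s)  # prints 42125 (rows per second)
```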
How to Improve Performance |
- Adjust the parameters: COMMIT_SIZE and BATCH_SIZE.
USE OF RESOURCES: VM CLOVER
CPU:
CPU monitoring, passive and active states, across different executions. Last execution: 19:34-19:37. CPU legend:
Memory:
Memory monitoring. Last execution: 19:34-19:37. Memory legend:
Network:
Network monitoring. Last execution: 19:34-19:37. Network legend:
Datastore:
Datastore legend:
USE OF RESOURCES: VM ORACLE
|