Jul
13
2010 Posted by DB2 Geek

DB2 pureScale – Insert Load & Replication

Following on from James Gill’s recent blog post about the Coupling Facility today I’m sharing with you our experiences of working with Insert Load & Replication in DB2 pureScale. 

In order to test the load capacity of the IBM DB2 pureScale machine we needed some test data and lots of it!  Being somewhat familiar with the capabilities of the CF on z/OS I knew that a few thousand records would not be enough.

After some checking about, I decided to “stand on the shoulders of giants” and use the EMPLOYEE table from the DB2 SAMPLE database.  Simple, elegant and to the point.  The only problem was that with the current “out of the box” table definition, there would only be 999,999 records that could be produced.

Column name Schema  Data type name Length Scale Nulls
EMPNO   SYSIBM CHARACTER  6  0 No
FIRSTNME                                                 SYSIBM VARCHAR      12   0 No
MIDINIT                                                 SYSIBM CHARACTER  1 0 YES
LASTNAME                                            SYSIBM  VARCHAR  15 0 NO
WORKDEPT              SYSIBM CHARACTER           3 0 YES
PHONENO                      SYSIBM     CHARACTER    4 0 YES
HIREDATE                                               SYSIBM DATE  4 0 YES
JOB                                              SYSIBM   CHARACTER   8 0 YES
GENDER SYSIBM  CHARACTER     1 0 YES
BIRTHDATE                                       SYSIBM DATE    4 0 YES
SALARY                                           SYSIBM DECIMAL    9 2 YES
BONUS                                      SYSIBM DECIMAL  9 2 YES
COMM                                             SYSIBM     DECIMAL    9 2 YES

14 record(s) selected.

With a little transformation, we changed the EMPNO to be CHAR(12).  This would allow for the increase of data.  (a possible 999,999,999,999 records)

Column name Schema  Data type name Length Scale Nulls
EMPNO                                SYSIBM  CHARACTER     12 0 NO
FIRSTNME                                      SYSIBM VARCHAR    12 0 NO
MIDINIT                           SYSIBM    CHARACTER        1 0 YES
LASTNAME   SYSIBM  VARCHAR  15 0 NO
WORKDEPT           SYSIBM   CHARACTER    3 0 YES
PHONENO  SYSIBM  CHARACTER   4 0 YES
HIREDATE       SYSIBM   DATE       4 0 YES
JOB      SYSIBM  CHARACTER     8 0 YES
EDLEVEL   SYSIBM   SMALLINT 2 0 NO
GENDER SYSIBM  CHARACTER     1 0 YES
BIRTHDATE  SYSIBM   DATE       4 0 YES
SALARY     SYSIBM  DECIMAL         9 2 YES
BONUS     SYSIBM   DECIMAL    9 2 YES
COMM    SYSIBM  DECIMAL 9 2 YES

14 record(s) selected. 

We exported the data into a text delimited file (all 41 records), created a table “LIKE” the employee table, then altered the definition.

I wrote a Q&D Perl script to iterate through the delimited data file and reassign the new EMPNO value.  After a little testing, I just started the program and then turned my attention to items.

After a period of time, I checked the progress and we had approximately 8.5 million records.  I then set up a db2batch script to load the data from the single file.

Other loading strategies attempted were made by splitting the 8.5 million records into multiple files and loading from a single member, then from multiple members.   Autocommit was initially on for the first run, but was switched off and commits were performed at every 10K records.  For the multiple member loads the commit count was at every 500 records.

After the initial load, James suggested that the data file be partitioned three ways for the three data members and to test for contention on the single table.

This was set up quickly and we were good to go.

« | »
Have a Question?

Get in touch with our expert team and see how we can help with your IT project, call us on +44(0) 870 2411 550 or use our contact form…