We have a huge partitioned table (25 partitions) on db1 and need to copy it to another database, db2. The most straightforward way I could think of is to insert the data in parallel over the network, so run this command on db2:
insert /*+ append parallel(a,4) */ into tab1 a select /*+ parallel(b,4) */ * from tab1@db1;
At the remote DB side, db1, I could see 4 slaves waiting on “PX Deq Credit: send blkd” and the coordinator waiting on “SQL*Net more data to client”. That means the 4 slaves are waiting for the coordinator to consume their data, and the coordinator is waiting for the data to be sent over the network. The bottleneck here is the network. To improve the transfer speed over the network, I need to kick off several non-parallel SQL statements instead of one SQL statement with parallel.
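The wait picture described above can be confirmed from the dictionary. A minimal sketch, run on db1 while the load is in progress (the exact column list is up to you):

```sql
-- Sessions of the parallel load and what they are waiting on.
-- Expect the PX slaves on "PX Deq Credit: send blkd" and the
-- query coordinator on "SQL*Net more data to client".
select sid, program, event
from   v$session
where  event in ('PX Deq Credit: send blkd',
                 'SQL*Net more data to client');
```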
For a partitioned table with 25 partitions, if I use 5 separate non-parallel SQL statements (each loading 5 partitions), then we get 5 inter-database connections instead of 1, and in this way better network throughput. (Update 25 Dec, 2009: I have seen many cases where the client side used 100% of a single CPU, so the network throughput can’t increase even though there is still plenty of free bandwidth; increasing the number of sessions helps there too.)
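Rather than typing 25 statements by hand, the per-partition inserts can be generated from the remote dictionary. A sketch, assuming the table is owned by SCOTT (adjust owner/table names). Note that some Oracle versions do not allow a partition-extended name to reference a remote object (ORA-14100); if you hit that, a common workaround is to create one view per partition on db1 and select from those views over the database link instead.

```sql
-- Generate one direct-path insert per partition, in partition order.
select 'insert /*+ append */ into tab1 select * from tab1@db1 partition ('
       || partition_name || ');'
from   dba_tab_partitions@db1
where  table_owner = 'SCOTT'          -- assumed owner
and    table_name  = 'TAB1'
order by partition_position;
```

Split the generated statements into 5 script files of 5 statements each, one file per session.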
So what we need to do is:
1: Remove the parallel(b,4) hint. This hint is of no use, because reading the data from the table is quicker than sending it over the network; the full table scan (FTS) is not the bottleneck.
2: Make each SQL statement insert the data from one partition, then kick off 5 sessions, each running 5 of the following statements (commit after each direct-path insert, otherwise the next append insert into the same table in the same session fails with ORA-12838):
insert /*+ append */ into tab1 a select * from tab1@db1 partition (part00);
insert /*+ append */ into tab1 a select * from tab1@db1 partition (part01);
insert /*+ append */ into tab1 a select * from tab1@db1 partition (part02);
insert /*+ append */ into tab1 a select * from tab1@db1 partition (part03);
...
insert /*+ append */ into tab1 a select * from tab1@db1 partition (part24);
After the change, the amount of data transferred per second increased a lot.
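The transfer rate over the database link can be measured on db2 rather than guessed. A sketch: sample these statistics twice, a known interval apart, and divide the delta of the byte counters by the interval.

```sql
-- Cumulative SQL*Net traffic over database links since instance start.
select name, value
from   v$sysstat
where  name like '%dblink%';
```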