plsql – dannorris.com blog

I recently had an opportunity to do a deeper dive into a performance issue related to the performance of a PIPELINED function. The results weren’t quite what I had expected, so I thought they may be of general interest. As with many performance issues, the issues encountered here are likely data- and database-specific. This was testing performed on 10.2.0.3 64-bit non-RAC on Solaris 9.

The PIPELINED function was called like this:

select * from TABLE(pkg1.f1(arg1,arg2,arg3));

And pkg1 is defined like this:

CREATE OR REPLACE PACKAGE pkg1 AS
TYPE coll_table IS TABLE OF custom_persistent_obj_type;
FUNCTION f1 (
p1        IN   NUMBER,
p2             IN   NUMBER,
p3        IN   NUMBER
) RETURN coll_table PIPELINED;
END;
/

The function pkg1.f1 returns a collection of object types, is a little more than 1400 lines long, and contains more than 10 distinct, separate queries, each of them significantly complex involving at least 2 tables, many of them with multiple levels of subqueries. Anyhoo, my chore was to open the idea bin and see if there were some possible enhancements that hadn’t been considered yet.

While reviewing overall performance metrics for the instance, I noted that the DBMS_PICKLER package appeared in the top 10 resource-intensive SQL statements on a regular basis. More digging identified this package as being the primary workhorse behind the TABLE function. So, then the test was clear. Let’s see what happens if we remove PIPELINED, remove TABLE and call the function in a more conventional way. So, my version of the pkg.f would create a collection, add records to it throughout the function’s execution, then return the whole collection at the end with a RETURN statement. The calling function would then need to process this collection into some form to make it more usable by the calling application (similar to the functionality that TABLE provides). For this task, I wrote a stand alone function called DAN.

The new package, pkg2, is defined like this:

CREATE OR REPLACE PACKAGE pkg2 AS
  TYPE coll_table IS TABLE OF custom_persistent_obj_type;
  FUNCTION f2 (
    p1        IN   NUMBER,
    p2             IN   NUMBER,
    p3        IN   NUMBER
  )  RETURN coll_table;
END;
/

DAN is defined as:

CREATE OR REPLACE FUNCTION dan (p_b_tab  IN  pkg2.coll_table,
    p_delimiter     IN  VARCHAR2 DEFAULT ',')
  RETURN VARCHAR2 IS
    l_string     VARCHAR2(32767);
  BEGIN
    for i in p_b_tab.FIRST .. p_b_tab.LAST LOOP
      l_string := l_string || to_char(p_b_tab(i).id1)  ||p_delimiter
                           ||         p_b_tab(i).id2   ||p_delimiter
                           || to_char(p_b_tab(i).array_id)   ||p_delimiter
                           ||         p_b_tab(i).value
      ;
    END LOOP;
  RETURN l_string;
END dan;
/

So, my new units would be called like this:

DECLARE
  l_string      VARCHAR2(32767);
BEGIN
  l_string := dan(pkg2.f2(arg1,arg2,arg3));
  DBMS_OUTPUT.PUT_LINE(l_string);
END;
/

Of course, performance is our primary objective in this case, so timing these two examples side-by-side is the primary chore. I was satisfied that they performed equivalent work and provided equivalent results using different methods. The calling program (custom java, in this case) would have to be modified to handle the changes required, but that may be worth it depending on the time difference.

So, I devised a script to do some very rough testing. The script looked like this:

set timing on echo on termout on pages 0 lines 450 feedb on serveroutput on size 100000
select * from table(pkg1.f1(1095482041,60983,1));
DECLARE
  l_string      VARCHAR2(32767);
BEGIN
  l_string := dan(pkg2.f2(1095482041,60983,1));
  DBMS_OUTPUT.PUT_LINE(l_string);
END;
/

The results were somewhat surprising. Keep in mind that all I’m changing is the way in which data is returned and processed from the function. The data returned, though in slightly different formats, is exactly the same and is not filtered further after it is returned (so there’s no benefit in having it available to run SQL against it).

SELECT (sec)	PL/SQL (sec)	Difference (sec)	Percent Improvement
0.15	0.09	0.06	40.00%
0.14	0.09	0.05	35.71%
0.14	0.10	0.04	28.57%
0.16	0.09	0.07	43.75%
0.15	0.09	0.06	40.00%
0.14	0.12	0.02	14.29%
0.18	0.12	0.06	33.33%
0.18	0.10	0.08	44.44%
0.25	0.12	0.13	52.00%
0.14	0.10	0.04	28.57%
0.18	0.10	0.08	44.44%
0.15	0.10	0.05	33.33%
0.17	0.10	0.07	41.18%
0.18	0.10	0.08	44.44%
0.14	0.10	0.04	28.57%
0.17	0.11	0.06	35.29%
0.17	0.10	0.07	41.18%

0.16	0.10	0.06	37.01%	Average
0.14	0.09	0.02	14.29%	Min
0.25	0.12	0.13	52.00%	Max
			8.84%	Standard Deviation

As you can see, the results are pretty clear that there is a performance improvement in this case when we avoid using PIPELINED functions. I will also say that the result set from the function body is only about 30 rows. From my review of common uses for PIPELINED functions, it appears that larger collections (result sets) may benefit more from PIPELINED functions as they can likely take more advantage of the parallelism allowed with PIPELINED functions.

Tag: plsql

PIPELINED PL/SQL function performance