I recently had an opportunity to do a deeper dive into a performance issue related to the performance of a PIPELINED function. The results weren’t quite what I had expected, so I thought they may be of general interest. As with many performance issues, the issues encountered here are likely data- and database-specific. This was testing performed on 10.2.0.3 64-bit non-RAC on Solaris 9.

The PIPELINED function was called like this:

select * from TABLE(pkg1.f1(arg1,arg2,arg3));

And pkg1 is defined like this:

CREATE OR REPLACE PACKAGE pkg1 AS
TYPE coll_table IS TABLE OF custom_persistent_obj_type;
FUNCTION f1 (
p1        IN   NUMBER,
p2             IN   NUMBER,
p3        IN   NUMBER
) RETURN coll_table PIPELINED;
END;
/

The function pkg1.f1 returns a collection of object types, is a little more than 1400 lines long, and contains more than 10 distinct, separate queries, each of them significantly complex involving at least 2 tables, many of them with multiple levels of subqueries. Anyhoo, my chore was to open the idea bin and see if there were some possible enhancements that hadn’t been considered yet.

While reviewing overall performance metrics for the instance, I noted that the DBMS_PICKLER package appeared in the top 10 resource-intensive SQL statements on a regular basis. More digging identified this package as being the primary workhorse behind the TABLE function. So, then the test was clear. Let’s see what happens if we remove PIPELINED, remove TABLE and call the function in a more conventional way. So, my version of the pkg.f would create a collection, add records to it throughout the function’s execution, then return the whole collection at the end with a RETURN statement. The calling function would then need to process this collection into some form to make it more usable by the calling application (similar to the functionality that TABLE provides). For this task, I wrote a stand alone function called DAN.

The new package, pkg2, is defined like this:

CREATE OR REPLACE PACKAGE pkg2 AS
  TYPE coll_table IS TABLE OF custom_persistent_obj_type;
  FUNCTION f2 (
    p1        IN   NUMBER,
    p2             IN   NUMBER,
    p3        IN   NUMBER
  )  RETURN coll_table;
END;
/

DAN is defined as:

CREATE OR REPLACE FUNCTION dan (p_b_tab  IN  pkg2.coll_table,
    p_delimiter     IN  VARCHAR2 DEFAULT ',')
  RETURN VARCHAR2 IS
    l_string     VARCHAR2(32767);
  BEGIN
    for i in p_b_tab.FIRST .. p_b_tab.LAST LOOP
      l_string := l_string || to_char(p_b_tab(i).id1)  ||p_delimiter
                           ||         p_b_tab(i).id2   ||p_delimiter
                           || to_char(p_b_tab(i).array_id)   ||p_delimiter
                           ||         p_b_tab(i).value
      ;
    END LOOP;
  RETURN l_string;
END dan;
/

So, my new units would be called like this:

DECLARE
  l_string      VARCHAR2(32767);
BEGIN
  l_string := dan(pkg2.f2(arg1,arg2,arg3));
  DBMS_OUTPUT.PUT_LINE(l_string);
END;
/

Of course, performance is our primary objective in this case, so timing these two examples side-by-side is the primary chore. I was satisfied that they performed equivalent work and provided equivalent results using different methods. The calling program (custom java, in this case) would have to be modified to handle the changes required, but that may be worth it depending on the time difference.

So, I devised a script to do some very rough testing. The script looked like this:

set timing on echo on termout on pages 0 lines 450 feedb on serveroutput on size 100000
select * from table(pkg1.f1(1095482041,60983,1));
DECLARE
  l_string      VARCHAR2(32767);
BEGIN
  l_string := dan(pkg2.f2(1095482041,60983,1));
  DBMS_OUTPUT.PUT_LINE(l_string);
END;
/

The results were somewhat surprising. Keep in mind that all I’m changing is the way in which data is returned and processed from the function. The data returned, though in slightly different formats, is exactly the same and is not filtered further after it is returned (so there’s no benefit in having it available to run SQL against it).

SELECT (sec) PL/SQL (sec) Difference (sec) Percent Improvement
0.15 0.09 0.06 40.00%
0.14 0.09 0.05 35.71%
0.14 0.10 0.04 28.57%
0.16 0.09 0.07 43.75%
0.15 0.09 0.06 40.00%
0.14 0.12 0.02 14.29%
0.18 0.12 0.06 33.33%
0.18 0.10 0.08 44.44%
0.25 0.12 0.13 52.00%
0.14 0.10 0.04 28.57%
0.18 0.10 0.08 44.44%
0.15 0.10 0.05 33.33%
0.17 0.10 0.07 41.18%
0.18 0.10 0.08 44.44%
0.14 0.10 0.04 28.57%
0.17 0.11 0.06 35.29%
0.17 0.10 0.07 41.18%
0.16 0.10 0.06 37.01% Average
0.14 0.09 0.02 14.29% Min
0.25 0.12 0.13 52.00% Max
8.84% Standard Deviation

As you can see, the results are pretty clear that there is a performance improvement in this case when we avoid using PIPELINED functions. I will also say that the result set from the function body is only about 30 rows. From my review of common uses for PIPELINED functions, it appears that larger collections (result sets) may benefit more from PIPELINED functions as they can likely take more advantage of the parallelism allowed with PIPELINED functions.

  • http://www.ToadWorld.com/SF Steven Feuerstein

    Dan,

    I am glad to see you talking about pipelined functions; it is a feature of PL/SQL that is unfamiliar to most developers.

    One thing I have noticed from my trainings regarding pipelined functions, however, is that PL/SQL developers seem to use them when what they really mean to use is a table function.

    A table function is a function that can be called in the FROM clause of a query, and that fact offers all sorts of tantalizing possibilities. Generally, you can hide complex transformations inside a function’s body and expose the result as nothing more than rows and columns in a query.

    A pipelined function is a specialized form of the table function that returns data back to the calling query (it must be called inside a query) while the function is still executing.

    A pipelined function will not reduce the total elapsed time of a function’s execution; in fact, in my experience non-pipelined table functions often perform a little bit faster than their pipelined equivalent.

    The special thing about the PF is that the calling query can start using data from the function before it has processed all its data – and this means that you can reduce the perceived elapsed time to the user (a webpage built around a PF can display the first 100 rows while the next 10,000 are being retrieved) as well as parallelize execution of table functions.

    So…don’t use a pipelined function unless you actually need its special properties!

    Steven Feuerstein
    http://www.ToadWorld.com/SF

  • http://www.dannorris.com/ Dan

    Hi Steven,

    I didn’t get in to the reasons why the customer had originally used PF, but for them, I don’t think it was as much about the processing time (since the complete record set was only about 30-40 rows). In this case, they had several different queries that needed to run, but they wouldn’t know which queries to run until runtime. So, it was relatively easy for them to just run a query and PIPE the rows out at runtime. For my test, I just added those rows to a collection and then returned the whole collection at the end.

    Either way, I found this interesting and I’m grateful that you stopped by to comment!

  • Bradd Piontek

    Dan,
    Excellent test case. I’ve been pushing Table Function and Pipelined Functions on an internal application that launched a few weeks ago. Mostly for Search Functions where the .NET code was attempting do do everything in one big SQL statement. Large performance gains were seen. (I was very open with the developers and said they could have implemented the same logical solution in their language of choice, I prefer PL/SQL).

    Regardless, I’m intrigued by your numbers and may start doing some performance numbers between a Pipelined table-function (what I had the developers implement) and a regular table-functions.

  • http://www.dannorris.com/ Dan

    Hi Bradd,

    I was a bit surprised by the numbers as well and that’s why I decided to come out of blog-hiding and write it up. I suppose some of the “less interesting” results would have made good blog material too, but I don’t seem to be able to form blogging habits very well. Anyway, thanks for your comments and I would love to hear how your testing goes–especially if the results are different than mine.

  • Bradd Piontek

    I am a bit confused about Steven’s comment. I can’t find any examples of non-pipelined table functions (table function and pipeline table function look to be synonymous). Maybe I’m missing something.

  • Michael OShea

    I think I’m confused by the plethora of tables here. So is a table function just any function that returns a table type? But for a function to be used with the TABLE operator it must be pipelined? So a non-pipelined table function can only be used where its being called by PLSQL (as per dan’s example), yes?

  • Pingback: Log Buffer #93: a Carnival of the Vanities for DBAs()

  • http://www.dannorris.com/ Dan

    I saw Steven at Collaborate earlier this week and asked him about this. He mentioned that PIPELINED functions will almost always have a longer overall runtime as compared to non-PIPELINED table functions.

    To @Michael OShea’s question, yes, a table function is any function that returns a collection.

    The benefits of a PIPELINED table function are that the client processing of rows from the function can be overlapped with the function’s runtime, so the application can start doing something useful instead of waiting for the function to complete. I believe that the true “power” of PIPELINED functions are that they can be used in conjunction with parallelization which can be a real performance boost–however, I don’t say that confidently as I haven’t worked with them enough to know all those details. I requested that Steven follow up here when he has time, so hopefully we’ll get some more expert input soon.

  • http://www.club-penguin.org/ Club Penguin Cheats

    Mostly for Search Functions where the .NET code was attempting do do everything in one big SQL statement. Large performance gains were seen.

    I was very open with the developers and said they could have implemented the same logical solution in their language of choice, I prefer PL/SQL.

  • http://www.club-penguine.org club penguine cheats

    Mostly for Search Functions where the .NET code was attempting do do everything in one big SQL statement. Large performance gains were seen.

    I was very open with the developers and said they could have implemented the same logical solution in their language of choice, I prefer PL/SQL.

  • Fox