Monday, July 15, 2013

BULK COLLECT & FORALL

Bulk Processing in PL/SQL

  • BULK COLLECT : SELECT statements that retrieve multiple rows with a single fetch, improving the speed of data retrieval
  • FORALL : INSERTs, UPDATEs, and DELETEs that use collections to change multiple rows of data very quickly

The bulk processing features of PL/SQL are designed specifically to reduce the number of context switches required to communicate from the PL/SQL engine to the SQL engine.
Use the BULK COLLECT clause to fetch multiple rows into one or more collections with a single context switch.
Use the FORALL statement when you need to execute the same DML statement repeatedly for different bind variable values.

About BULK COLLECT
To take advantage of bulk processing for queries, you simply put BULK COLLECT before the INTO keyword and then provide one or more collections after the INTO keyword. Here are some things to know about how BULK COLLECT works:

  • It can be used with all three types of collections: associative arrays, nested tables, and VARRAYs.
  • You can fetch into individual collections (one for each expression in the SELECT list) or a single collection of records.
  • The collection is always populated densely, starting from index value 1.
  • If no rows are fetched, then the collection is emptied of all elements.
An example of fetching values for two columns into a collection of records.

DECLARE
   TYPE two_cols_rt IS RECORD (
                          employee_id   employees.employee_id%TYPE,
                          salary        employees.salary%TYPE
                       );

   TYPE employee_info_t IS TABLE OF two_cols_rt;

   l_employees   employee_info_t;
BEGIN
   SELECT   employee_id, salary
     BULK   COLLECT
     INTO   l_employees
     FROM   employees
    WHERE   department_id = 10;

END;

If you are fetching lots of rows, the collection that is being filled could consume too much session memory and raise an error. To help you avoid such errors, Oracle Database offers a LIMIT clause for BULK COLLECT. Suppose that, for example, there could be tens of thousands of employees in a single department and my session does not have enough memory available to store 20,000 employee IDs in a collection.Instead use the approach mentioned below.

Fetching up to the number of rows specified

DECLARE
   c_limit          PLS_INTEGER := 100;

   CURSOR employees_cur
   IS
      SELECT   employee_id
        FROM   employees
       WHERE   department_id = department_id_in;

   TYPE employee_ids_t IS TABLE OF employees.employee_id%TYPE;

   l_employee_ids   employee_ids_t;
BEGIN
   OPEN employees_cur;

   LOOP
      FETCH employees_cur BULK COLLECT INTO   l_employee_ids LIMIT c_limit;

      EXIT WHEN l_employee_ids.COUNT = 0;
   END LOOP;

END;

With this approach, I open the cursor that identifies all the rows I want to fetch. Then, inside a loop, I use FETCH-BULK COLLECT-INTO to fetch up to the number of rows specified by the c_limit constant (set to 100). Now, no matter how many rows I need to fetch, my session will never consume more memory than that required for those 100 rows, yet I will still benefit from the improvement in performance of bulk querying.

About FORALL

Whenever you execute a DML statement inside of a loop, you should convert that code to use FORALL. The performance improvement will amaze you and please your users.
The FORALL statement is not a loop; it is a declarative statement to the PL/SQL engine: “Generate all the DML statements that would have been executed one row at a time, and send them all across to the SQL engine with one context switch.”

FORALL statement looks just like a numeric FOR loop, yet there are no LOOP or END LOOP keywords.

Here are some things to know about FORALL:

  • Each FORALL statement may contain just a single DML statement. If your loop contains two updates and a delete, then you will need to write three FORALL statements.
  • PL/SQL declares the FORALL iterator as an integer, just as it does with a FOR loop. You do not     need to—and you should not—declare a variable with this same name.
  • In at least one place in the DML statement, you need to reference a collection and use the FORALL   iterator as the index value in that collection.
  • When using the IN low_value . . . high_value syntax in the FORALL header, the collections referenced inside the FORALL statement must be densely filled. That is, every index value between the low_value and high_value must be defined.
  • If your collection is not densely filled, you should use the INDICES OF or VALUES OF syntax in your FORALL header. 

FORALL and DML Errors

Suppose that I’ve written a program that is supposed to insert 10,000 rows into a table. After inserting 9,000 of those rows, the 9,001st insert fails with a DUP_VAL_ON_INDEX error (a unique index violation). The SQL engine passes that error back to the PL/SQL engine, and if the FORALL statement is written like the one in Listing 4, PL/SQL will terminate the FORALL statement. The remaining 999 rows will not be inserted.

If you want the PL/SQL engine to execute as many of the DML statements as possible, even if errors are raised along the way, add the SAVE EXCEPTIONS clause to the FORALL header. Then, if the SQL engine raises an error, the PL/SQL engine will save that information in a pseudo collection named SQL%BULK_EXCEPTIONS, and continue executing statements. When all statements have been attempted, PL/SQL then raises the ORA-24381 error.

You can—and should—trap that error in the exception section and then iterate through the contents of SQL%BULK_EXCEPTIONS to find out which errors have occurred. You can then write error information to a log table and/or attempt recovery of the DML statement.

An example of using SAVE EXCEPTIONS in a FORALL statement;
In this case, I simply display on the screen the index in the l_eligible_ids collection on which the error occurred, and the error code that was raised by the SQL engine.

Using SAVE EXCEPTIONS with FORALL

BEGIN
   FORALL indx IN 1 .. l_eligible_ids.COUNT
   SAVE EXCEPTIONS
      UPDATE   employees emp
         SET   emp.salary = emp.salary + emp.salary * increase_pct_in
       WHERE   emp.employee_id = l_eligible_ids (indx);
EXCEPTION
   WHEN OTHERS
   THEN
      IF SQLCODE = -24381
      THEN
         FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
         LOOP
            DBMS_OUTPUT.put_line(   SQL%BULK_EXCEPTIONS(indx).ERROR_INDEX
                                 || ': '
                                 || SQL%BULK_EXCEPTIONS(indx).ERROR_CODE);
         END LOOP;
      ELSE
         RAISE;
      END IF;

END;

From SQL to PL/SQL
This article talks mostly about the context switch from the PL/SQL engine to the SQL engine that occurs when a SQL statement is executed from within a PL/SQL block. It is important to remember that a context switch also takes place when a user-defined PL/SQL function is invoked from within an SQL statement.
Suppose that I have written a function named betwnstr that returns the string between a start and end point. Here’s the header of the function:

FUNCTION betwnstr (string_in   IN VARCHAR2,
                   start_in    IN INTEGER,
                   end_in      IN INTEGER)

   RETURN VARCHAR2

I can then call this function as follows:

SELECT   betwnstr (last_name, 2, 6)
  FROM   employees
 WHERE   department_id = 10;

If the employees table has 100 rows and 20 of those have department_id set to 10, then there will be 20 context switches from SQL to PL/SQL to run this function.
You should, consequently, play close attention to all invocations of user-defined functions in SQL, especially those that occur in the WHERE clause of the statement. Consider the following query:

SELECT   employee_id
  FROM   employees

 WHERE   betwnstr (last_name, 2, 6) = 'MITHY'

In this query, the betwnstr function will be executed 100 times—and there will be 100 context switches.
FORALL with Sparse Collections
If you try to use the IN low_value .. high_value syntax with FORALL and there is an undefined index value within that range, Oracle Database will raise the “ORA-22160: element at index [N] does not exist” error.
To avoid this error, you can use the INDICES OF or VALUES OF clauses. To see how these clauses can be used, let’s go back to the code in Listing 4. In this version of increase_salary, I declare a second collection, l_eligible_ids, to hold the IDs of those employees who are eligible for a raise.
Instead of doing that, I can simply remove all ineligible IDs from the l_employee_ids collection, as follows:

   FOR indx IN 1 .. l_employee_ids.COUNT
   LOOP
      check_eligibility (l_employee_ids (indx),
                         increase_pct_in,
                         l_eligible);

      IF NOT l_eligible
      THEN
         l_employee_ids.delete (indx);
      END IF;
   END LOOP;

But now my l_employee_ids collection may have gaps in it: index values that are undefined between 1 and the highest index value populated by the BULK COLLECT.
No worries. I will simply change my FORALL statement to the following:

FORALL indx IN INDICES OF l_employee_ids

UPDATE   employees emp
   SET   emp.salary =
            emp.salary + emp.salary * increase_salary.increase_pct_in

 WHERE   emp.employee_id = l_employee_ids (indx);

Now I am telling the PL/SQL engine to use only those index values that are defined in l_employee_ids, rather than specifying a fixed range of values. Oracle Database will simply skip any undefined index values, and the ORA-22160 error will not be raised.

Content posted in this blog post is referred from the article published in oracle technical forum.

No comments:

Post a Comment