Use a Single Connection in Oracle- Developing Successful Oracle Applications-2

If you use bind variables, then everyone who submits the same exact query that references the same object will use the compiled plan from the pool. You will compile your subroutine once and use it over and over again. This is very efficient and is the way the database intends you to work. Not only will you use fewer resources (a soft parse is much less resource-intensive), but also you will hold latches for less time and need them less frequently. This increases your performance and greatly increases your scalability.

Just to give you a tiny idea of how huge a difference this can make performance-wise, you only need to run a very small test. In this test, we’ll just be inserting some rows into a table; the simple table we will use is

SQL> drop table t purge;

SQL> create table t ( x int );

Table created.

Now we’ll create two very simple stored procedures. They both will insert the numbers 1 through 10,000 into this table; however, the first procedure uses a single SQL statement with a bind variable:

Procedure created.

The second procedure constructs a unique SQL statement for each row to be inserted:

SQL> create or replace procedure proc2 as

Now, the only difference between the two is that one uses a bind variable and the other does not. Both are using dynamic SQL and the logic is otherwise identical. The only difference is the use of a bind variable in the first.

Note For details on runstats and other utilities, see the “Setting Up Your Environment” section at the beginning of this book. You may not observe exactly the same values for CPU or any metric. Differences are caused by different Oracle versions, different operating systems, and different hardware platforms. The idea will be the same, but the exact numbers will undoubtedly be marginally different.

We are ready to evaluate the two approaches, and we’ll use runstats, a simple tool
I’ve developed, to compare the two in detail:

Now, the preceding result clearly shows that based on CPU time (measured in hundredths of seconds), it took significantly longer and significantly more resources to insert 10,000 rows without bind variables than it did with them. In fact, it took more than a magnitude more CPU time to insert the rows without bind variables. For every insert without bind variables, we spent the vast preponderance of the time to execute the statement simply parsing the statement! But it gets worse. When we look at other information, we can see a significant difference in the resources utilized by each approach:

The runstats utility produces a report that shows differences in latch utilization as well as differences in statistics. Here, I asked runstats to print out anything with a difference greater than 9500. You can see that we hard parsed 3 times in the first approach using bind variables and that we hard parsed 10,000 times without bind variables (once for each of the inserts). But that difference in hard parsing is just the tip of the iceberg. You can see here that we used an order of magnitude as many “latches” in the nonbind variable approach as we did with bind variables. That difference might beg the question “What is a latch?”

Let’s answer that question. A latch is a type of lock that is used to serialize access to shared data structures used by Oracle. The shared pool is an example; it’s a big, shared data structure found in the System Global Area (SGA), and this is where Oracle stores parsed, compiled SQL. When you modify anything in this shared structure, you must take care to allow only one process in at a time. (It is very bad if two processes or threads attempt to update the same in-memory data structure simultaneously— corruption would abound.) So, Oracle employs a latching mechanism, a lightweight locking method to serialize access. Don’t be fooled by the word lightweight. Latches are serialization devices, allowing access (to a memory structure) one process at a time. The latches used by the hard parsing implementation are some of the most used latches out there. These include the latches for the shared pool and for the library cache. Those are “big time” latches that people compete for frequently. What all this means is that as we increase the number of users attempting to hard parse statements simultaneously, our performance gets progressively worse over time. The more people parsing, the more people waiting in line to latch the shared pool, the longer the queues, the longer the wait.

Use a Single Connection in Oracle- Developing Successful Oracle Applications-1

Now, in SQL Server it is a very common practice to open a connection to the database for each concurrent statement you want to execute. If you are going to do five queries, you might well see five connections in SQL Server. In Oracle, on the other hand, if you want to do 5 queries or 500, the maximum number of connections you want to open is one. So, a practice that is common in SQL Server is something that is not only not encouraged in Oracle, it is actively discouraged; having multiple connections to the database (when you can use just one) is just something you don’t want to do.

But do it they did. A simple web-based application would open 5, 10, 15, or more connections per web page, meaning that their server could support only 1/5, 1/10, or 1/15 the number of concurrent users that it should have been able to. My recommendation to them was to re-architect the application to allow it to take advantage of the connection to generate a page, not somewhere between 5 and 15 connections.

This is the only solution that would actually solve the problem. As you can imagine, this is not an “OK, we’ll do that this afternoon” sort of solution. It is a nontrivial solution to a problem that could have most easily been corrected during the database port phase while you were in the code poking around and changing things in the first place. Furthermore, a simple test to scale before rolling out to production would have caught such issues prior to the end users feeling the pain.

Use Bind Variables

If I were to write a book about how to build nonscalable Oracle applications, “Don’t Use Bind Variables” would be the first and last chapter. Not using bind variables is a major cause of performance issues and a major inhibitor of scalability—not to mention a security risk of huge proportions. The way the Oracle shared pool (a very important shared memory data structure) operates is predicated on developers using bind variables in most cases. If you want to make a transactional Oracle implementation run slowly, even grind to a total halt, just refuse to use them.

A bind variable is a placeholder in a query. For example, to retrieve the record for employee 123, I can query

SQL> select * from emp where empno = 123;

Alternatively, I can query

SQL> select * from emp where empno = :empno;

In a typical system, you would query up employee 123 maybe once or twice and then never again for a long period of time. Later, you would query up employee 456, then 789, and so on. Or, foregoing SELECT statements, if you do not use bind variables in your insert statements, your primary key values will be hard-coded in them, and I know for a fact that these insert statements can’t ever be reused later!!! If you use literals (constants) in the query, then every query is a brand-new query, never before seen by the database. It will have to be parsed, qualified (names resolved), security-checked, optimized, and so on. In short, each and every unique statement you execute will have to be compiled every time it is executed.

The second query uses a bind variable, :empno, the value of which is supplied at query execution time. This query is compiled once, and then the query plan is stored in a shared pool (the library cache), from which it can be retrieved and reused. The difference between the two in terms of performance and scalability is huge, dramatic even.

From the preceding description, it should be fairly obvious that parsing unique statements with hard-coded variables (called a hard parse) will take longer and consume many more resources than reusing an already parsed query plan (called a soft parse).

What may not be so obvious is the extent to which the former will reduce the number of users your system can support. Obviously, this is due in part to the increased resource consumption, but an even more significant factor arises due to the latching mechanisms for the library cache. When you hard parse a query, the database will spend more time holding certain low-level serialization devices called latches (see Chapter 6 for more details).

These latches protect the data structures in Oracle’s shared memory from concurrent modifications by two sessions (otherwise, Oracle would end up with corrupt data structures) and from someone reading a data structure while it is being modified. The longer and more frequently you have to latch these data structures, the longer the queue to get these latches will become.

You will start to monopolize scarce resources. Your machine may appear to be underutilized at times, and yet everything in the database is running very slowly. The likelihood is that someone is holding one of these serialization mechanisms and a line is forming—you are not able to run at top speed.

It only takes one ill-behaved application in your database to dramatically affect the performance of every other application. A single, small application that does not use bind variables will cause the relevant SQL of other well-tuned applications to get discarded from the shared pool over time. You only need one bad apple to spoil the entire barrel.