The Black Box Approach- Developing Successful Oracle Applications-2

The very idea that developers building a database application should be shielded from the database is amazing to me, but that attitude persists. Many people still insist that developers can’t take the time to get trained in the database and, basically, that they shouldn’t have to know anything about the database. Why? Well, more than once I’ve heard “but Oracle is the most scalable database in the world, my people don’t have to learn about it, it’ll just work.” That’s true; Oracle is the most scalable database in the world. However, I can write bad code that does not scale in Oracle as easily—if not more easily—as I can write good, scalable code in Oracle. You can replace Oracle with any piece of software and the same is true. This is a fact: it is easier to write applications that perform poorly than it is to write applications that perform well. It is sometimes too easy to build a single-user system in the world’s most scalable database if you don’t know what you are doing. The database is a tool, and the improper use of any tool can lead to disaster. Would you take a nutcracker and smash walnuts with it as if it were a hammer? You could, but it wouldn’t be a proper use of that tool and the result would be a mess (and probably some seriously hurt fingers). Similar effects can be achieved by remaining ignorant of your database.

I was called into a project that was in trouble. The developers were experiencing massive performance issues—it seemed their system was serializing many transactions, that is to say—so instead of many people working concurrently, everyone was getting into a really long line and waiting for everyone in front of them to complete. The application architects walked me through the architecture of their system—the classic three-tier approach. They would have a web browser talk to a middle tier application server running Java Server Pages (JSPs). The JSPs would in turn utilize another layer— Enterprise JavaBeans (EJBs)—that did all of the SQL. The SQL in the EJBs was generated by a third-party tool and was done in a database-independent fashion.

Now, in this system it was very hard to diagnose anything, as none of the code was instrumented or traceable. Instrumenting code is the fine art of making every other line of developed code be debug code of some sort—so when you are faced with performance or capacity or even logic issues, you can track down exactly where the problem is. In this case, we could only locate the problem somewhere between the browser and the database—in other words, the entire system was suspect. The Oracle database is heavily instrumented, but the application needs to be able to turn the instrumentation on and off at appropriate points—something it was not designed to do.

So, we were faced with trying to diagnose a performance issue with not too many details, just what we could glean from the database itself. Fortunately, in this case it was fairly easy. When someone who knew the Oracle V$ tables (the V$ tables are one way Oracle exposes its instrumentation, its statistics, to us) reviewed them, it became apparent that the major contention was around a single table—a queue table of sorts.

The application would place records into this table, while another set of processes would pull the records out of this table and process them. Digging deeper, we found a bitmap index on a column in this table (See Chapter 11 on indexing for more information about bitmapped indexes). The reasoning was that this column, the processed-flag column, had only two values—Y and N. As records were inserted, they would have a value of N for not processed. As the other processes read and processed the record, they would update the N to Y to indicate that processing was done. The developers needed to find the N records rapidly and hence knew they wanted to index that column. They had read somewhere that bitmap indexes are for low-cardinality columns—columns that have but a few distinct values—so it seemed a natural fit. (Go ahead, use Google to search for when to use bitmap indexes; low cardinality will be there over and over. Fortunately, there are also many articles refuting that too simple concept today.)

But that bitmap index was the cause of all of their problems. In a bitmap index, a single key entry points to many rows, hundreds or more of them. If you update a bitmap index key (and thus locking it), the hundreds of records that key points to are effectively locked as well. So, someone inserting the new record with N would lock the N record in the bitmap index, effectively locking hundreds of other N records as well. Meanwhile, the process trying to read this table and process the records would be prevented from modifying some N record to be a Y (processed) record, because in order for it to update this column from N to Y, it would need to lock that same bitmap index key. In fact, other sessions just trying to insert a new record into this table would be blocked as well, as they would be attempting to lock the same bitmap key entry. In short, the developers had created a table that at most one person would be able to insert or update against at a time! We can see this easily using a simple scenario.

Note If you haven’t done so already, visit the “Setting Up Your Environment” section of the front matter of this book. This section contains the code to create the EODA and SCOTT users. These users are used extensively in the examples in this book. The “Setting Up Your Environment” section also contains the source code for many of the utilities used throughout this book. For your convenience, the setup source code can also be downloaded/cloned from the GitHub site.

The Black Box Approach- Developing Successful Oracle Applications-1

I have an idea, borne out by first-hand personal experience (meaning I made the mistake myself), as to why database-backed software development efforts so frequently fail.

Let me be clear that I’m including here those projects that may not be documented as failures, but nevertheless take much longer to roll out and deploy than originally planned because of the need to perform a major rewrite, re-architecture, or tuning effort. Personally, I call such delayed projects failures: more often than not they could have been completed on schedule (or even faster).

The single most common reason for failure is a lack of practical knowledge of the database—a basic lack of understanding of the fundamental tool that is being used. The black box approach involves a conscious decision to protect the developers from the database. They are actually encouraged not to learn anything about it! In many cases, they are prevented from exploiting it. The reasons for this approach appear to be FUD related (Fear, Uncertainty, and Doubt). Developers have heard that databases are “hard,” that SQL, transactions, and data integrity are “hard.” The solution: don’t make anyone do anything hard. They treat the database as a black box and have some software tool generate all of the code. They try to insulate themselves with many layers of protection so that they don’t have to touch this “hard” database.

This is an approach to database development that I’ve never been able to understand, in part because, for me, learning Java and C was a lot harder than learning the concepts behind the database. I’m now pretty good at Java and C, but it took a lot more hands-on experience for me to become competent using them than it did to become competent using the database. With the database, you need to be aware of how it works, but you don’t have to know everything inside and out. When programming in C or Java/J2EE, you do need to know everything inside and out—and these are huge languages.

If you are building a database application, the most important piece of software is the database. A successful development team will appreciate this and will want its people to know about it, to concentrate on it. Many times I’ve walked into a project where almost the opposite was true. A typical scenario would be as follows:

•\ The developers were fully trained in the GUI tool or the language they were using to build the front end (such as Java). In many cases, they had had weeks if not months of training in it.

•\ The team had zero hours of Oracle training and zero hours of Oracle experience. Most had no database experience whatsoever. They would also have a mandate to be “database independent”—a mandate (edict from management or learned through theoretical academic instruction) they couldn’t hope to follow for many reasons. The most obvious one is they didn’t know enough about what databases are or what they do to even find the lowest common denominator among them.

•\ The developers encountered massive performance problems, data integrity problems, hanging issues, and the like (but very pretty screens).

As a result of the inevitable performance problems, I now get called in to help solve the difficulties (in the past, as a learning developer I was sometimes the cause of such issues).

Note Even today, I often find that the developers of database applications have spent no time reading the documentation. On my website, asktom.oracle. com, I frequently get questions along the lines of “what is the syntax for…” coupled with “we don’t have the documentation so please just tell us.” I refuse to directly answer many of those questions, but rather point them to the online documentation freely available to anyone, anywhere in the world. In the last 15 years, the excuses like “We don’t have documentation,” or “We don’t have access to resources,” have disappeared. Sites like www.oracle.com/technical-resources (Oracle Technical Resources, formerly known as the Oracle Technology Network) make it inexcusable to not have a full set of documentation at your fingertips! Today, everyone has access to all of the documentation; they just have to read it or—even easier—Google it.