"Java Liaison" column
Learning to Love Loss of Control
Many people get nervous when they get on an airplane, and even more nervous anytime it does something unexpected. Even though they rationally know that air travel is statistically the safest form of transportation, they're afraid the plane will crash. These same people get into a car casually, without a care in the world, even though automobile travel is one of the most dangerous forms of transportation and, mile for mile, far more dangerous than air travel.
Why? When you're driving a car, you're in control. In an airplane, not only are you not in control, usually you can't even see what the person in control is actually doing! I think much the same thing is what causes many C and C++ programmers to reject Java so vigorously. Loss of control.
Two of the biggest complaints I've heard about Java from C++ programmers are "It's an interpreted language" and "it's a garbage-collected language," as if these things were inherently bad. The primary argument against both of these techniques is that they hurt performance. To some degree, both interpretation and garbage collection have earned their bad reputations; many people's first experiences with interpreted languages were early BASIC implementations, and many people's first exposures to garbage collection were early LISP or Smalltalk implementations. And yes, they were horribly slow. But that was twenty years ago.
Even though the state of the art in both interpretation and garbage collection has progressed light years beyond those early examples, the first impressions still hold. The only way to get decent performance, it's said, is to do everything yourself. After all, you know more about your program than the computer's ever going to know, and should be able to hand-optimize it far beyond the capabilities of any automatic optimization system.
I tend to think there's also a certain element of macho behind these arguments: You're not a "real programmer" if you don't manage your own memory, for example; somehow, you're "cheating." Doing things by hand is what initiates you into "the club." Suffering builds character.
Well, I have some sympathy for all these arguments, but you've also got to take a pragmatic attitude: What's going to allow me to get something workable out the door quickly? How much time am I willing to spend tracking down bugs in my program's memory-management code, or my hand optimizations, or my ports to different platforms? What am I willing to pay for faster development time and more stable code? How much control am I willing to give up? How much can I trust the people who are doing things like memory management for me?
These are all good and necessary questions, and there have been excellent reasons to ask them about Java. But it isn't fair to dismiss Java out of hand just because it's an interpreted, garbage-collected language. Let's take a closer look at both of these things.
Consider UCSD Pascal. Unlike the early BASIC interpreters, it was based on a compiler, just as most languages are. The difference was that instead of compiling to native machine code, it compiled to pseudomachine code (or "p-code" for short), a kind of machine code for a hypothetical system that can be easily emulated on a real system. To run the program, you needed a p-code interpreter for the target platform.
The advantage of doing it this way is ease of porting. To implement Pascal for a new machine, you didn't have to write a whole new compiler (or even a whole new compiler back end); you just had to rewrite the p-code interpreter (a theoretically easier task, since much of the interpreter's work would have already been done by the compiler). Even better, compiled Pascal code could be moved from one machine to another without recompilation.
On top of this, because half of the work of a regular native-code compiler has already been done by the compiler and doesn't have to be done by the interpreter, you get much better run-time performance than you would from interpreting source code directly. Unfortunately, back in the late seventies and early eighties when UCSD Pascal came out, interpretation technology (not to mention hardware technology) was nowhere near where it is now, and UCSD Pascal slowly faded out because of the performance issue. Today all commercial Pascal compilers are typical native-code compilers.
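To make the idea of pseudomachine code a little more concrete, here's a minimal sketch of an interpreter loop for a made-up stack machine. The opcodes (PUSH, ADD, MUL, HALT) and the class name TinyVm are invented for this illustration; they are not UCSD p-code, and a real p-code interpreter handles far more than this.

```java
// A toy stack-machine interpreter. The instruction set is hypothetical;
// it is neither UCSD p-code nor any real pseudomachine format.
final class TinyVm {
    static final int PUSH = 0; // push the next code word onto the stack
    static final int ADD  = 1; // pop two values, push their sum
    static final int MUL  = 2; // pop two values, push their product
    static final int HALT = 3; // stop and return the top of the stack

    static int run(int[] code) {
        int[] stack = new int[64];
        int sp = 0; // index of the next free stack slot
        int pc = 0; // index of the next instruction to execute
        while (true) {
            int op = code[pc++];
            switch (op) {
                case PUSH:
                    stack[sp++] = code[pc++];
                    break;
                case ADD: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case MUL: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a * b;
                    break;
                }
                case HALT:
                    return stack[sp - 1];
                default:
                    throw new IllegalStateException("bad opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // The "compiled" form of (2 + 3) * 4. Parsing and operator
        // precedence were handled by the compiler, not the interpreter.
        int[] program = { PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT };
        System.out.println(run(program)); // prints 20
    }
}
```

Notice that the loop never looks at source text. Porting this to a new machine means rewriting only run(), and the program array moves over unchanged.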
But the designers of Java have rediscovered the advantages of pseudomachine code. If your object is to write a language whose target deployment mechanism is the Web, where clients could want to execute the code on any of myriad system architectures, or whose target deployment mechanism is TV set-top boxes and other embedded applications (which run on an even greater variety of hardware and operating systems), or which allows you to write server software you won't have to rewrite every time you move up to bigger iron, you need to produce portable object code. This, of course, is exactly what UCSD p-code was good for.
Of course, Java doesn't actually use UCSD Pascal p-code; its runtime mechanism is similar in concept, but not the same thing. In Java, they refer to the code that is interpreted at runtime as "Java byte code."
By itself, even on today's fast hardware, interpreting byte code is still drastically slower than running native code. Many Java devotees might claim that this is a reasonable price to pay for code that doesn't have to be recompiled for each platform, but most developers used to writing full-fledged applications in regular compiled languages such as C or C++ won't buy this argument. Early Java implementations worked this way because it's simple to implement. Fortunately, it's possible to do better.
The first thing you can do to speed up Java code is to blow off the whole virtual-machine idea in the first place and just compile it straight to native code. Of course, you lose the whole benefit of portable object code, but you get performance. For some kinds of application development, this may well be the right answer, but it also would make Java as bad for the things Java was designed for as other languages are.
But we can also do better while retaining the benefits of portability. The job of a Java byte code interpreter is substantially the same job that is done by software designed to emulate one real machine architecture on another. There's been a lot of activity in this area recently, much of it in the Mac world, and it's all directly applicable to Java byte code.
When the PowerPC-based Macs first came out, one thing Apple did to improve the performance of their 680x0 emulator was to port the most-frequently-used OS calls into native PowerPC code. This meant that you'd spend the bulk of your time in native PowerPC code even when executing an application in 680x0 machine code. One thing that a JVM vendor could do to speed things up is to compile key parts of the Java runtime into native code. All Java implementations do this at their lowest levels, or in APIs that call through to the host OS, but nobody really goes beyond this. Partially this is because you can lose the benefits of dynamic linking (making it harder to upgrade the runtime to apply bug fixes, for example), and partially this is because you can actually get some performance benefits by waiting until run time to compile.
The next step is to package a byte-code-to-native-code compiler as part of the virtual machine and compile the byte code into native code as each class is loaded. This technique is called "flash compilation," and is used by at least one current Java implementation I know of. You can also wait to compile a function until that function is actually called. This spreads the cost of compiling the code more evenly across the program's execution time, and also saves you the expense of compiling functions you never execute. This technique, which has become quite popular, is called "just-in-time compilation," or "JITting" for short. (The compiler is called a "JIT.")
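Here's a rough sketch of the compile-on-first-call idea. Everything in it is a stand-in: LazyJit, define, and call are names I made up, and "compiling" just means fetching a function object and counting that we did so, where a real JIT would generate machine code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

// Simulates just-in-time compilation. All names here are hypothetical.
final class LazyJit {
    // The "byte code" for each function, as loaded.
    private final Map<String, IntUnaryOperator> loaded = new HashMap<>();
    // The "native code" for functions that have been compiled so far.
    private final Map<String, IntUnaryOperator> compiled = new HashMap<>();
    int compileCount = 0;

    void define(String name, IntUnaryOperator body) {
        loaded.put(name, body); // loading a function does NOT compile it
    }

    int call(String name, int arg) {
        IntUnaryOperator code = compiled.get(name);
        if (code == null) {          // first call: compile now, then cache
            code = compile(name);
            compiled.put(name, code);
        }
        return code.applyAsInt(arg);
    }

    private IntUnaryOperator compile(String name) {
        IntUnaryOperator body = loaded.get(name);
        if (body == null) {
            throw new IllegalArgumentException("no such function: " + name);
        }
        compileCount++;              // stand-in for the real cost of compiling
        return body;
    }

    public static void main(String[] args) {
        LazyJit jit = new LazyJit();
        jit.define("square", x -> x * x);
        jit.define("cube", x -> x * x * x); // defined but never called
        System.out.println(jit.call("square", 3)); // prints 9
        System.out.println(jit.call("square", 4)); // prints 16
        System.out.println(jit.compileCount);      // prints 1
    }
}
```

The cube function is loaded but never compiled, which is exactly the saving that compiling at class-load time can't give you.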
A JIT can be tuned to optimize for either speed or memory usage by controlling the size of the cache where compiled code is kept. The larger the cache, the less often a function will have to be compiled twice. Compiling twice is bad, of course, because it eats up a huge part of the advantage of running natively. Conversely, a big cache can be bad because it increases the program's memory footprint, which can degrade performance or make the program cooperate poorly with other applications.
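The speed-versus-memory trade-off is easy to see in a small simulation. This is only a sketch: the CodeCache class is hypothetical, the cache is measured in functions where a real one would count bytes, and the least-recently-used eviction policy is my assumption, not something any particular JIT is known to use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A bounded cache of "compiled" functions with LRU eviction (an assumed policy).
final class CodeCache {
    private final Map<String, String> cache;
    int compilations = 0;

    CodeCache(final int capacity) {
        // accessOrder = true makes iteration order least-recently-used first.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > capacity; // evict once we exceed the budget
            }
        };
    }

    String lookup(String function) {
        String code = cache.get(function);
        if (code == null) {
            compilations++;               // cache miss: compile (or recompile)
            code = "native:" + function;  // placeholder for generated code
            cache.put(function, code);
        }
        return code;
    }

    // Runs a sequence of calls through a fresh cache and reports how many
    // compilations it took.
    static int compilationsFor(int capacity, String[] calls) {
        CodeCache c = new CodeCache(capacity);
        for (String f : calls) {
            c.lookup(f);
        }
        return c.compilations;
    }

    public static void main(String[] args) {
        String[] calls = { "a", "b", "c", "a", "b", "c" };
        System.out.println(compilationsFor(3, calls)); // prints 3
        System.out.println(compilationsFor(2, calls)); // prints 6
    }
}
```

With room for all three functions, each is compiled once. Shrink the cache by a single slot and this particular call pattern recompiles on every call, which is the worst case the paragraph above warns about.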
JIT technology has been improving dramatically in the last few years; many companies (including IBM, where I happen to work) are working very hard on this technology. Much of the motivation has come from the desire to improve performance of Java, and much has come from work that's been done on other emulation technology (such as 680x0 on PowerPC or 80x86/Pentium on PowerPC emulators). Some of this work has gone into speeding up the compiler itself, and much of it has gone into improving optimization. (Of course, these two go hand in hand: the faster the compiler is, the more time you have to do optimization. Early JITs couldn't do much optimization in the time they allotted for compilation.)
One of the cool things you can do by waiting until the last second to compile is to take advantage of the runtime environment to help you optimize. At run time, you know things about how the code is being used that you don't know at compile time. For instance, if you know the type of the object you're calling a virtual method on, you don't have to go through the virtual function dispatch mechanism. You can also optimize out conditional branches that aren't taken, and you can inline much more aggressively. Since most optimizers can't optimize across function calls, the more inlining you can do, the more other optimizations are possible, which helps even more. You could have different versions of the same function that are based on different assumptions about the runtime environment, and you can recompile a function on the fly when assumptions you made the last time you compiled it turn out not to be true anymore.
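To show what skipping the virtual dispatch might look like, here's the transformation written out by hand in Java. This is a sketch of the general idea, not a description of what any particular JIT emits; the Shape, Square, Rect, and GuardedSite types are all invented for the example.

```java
// Hand-written illustration of a guarded, inlined call. All types are hypothetical.
final class GuardedDispatchDemo {
    interface Shape { double area(); }

    static final class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        @Override public double area() { return side * side; }
    }

    static final class Rect implements Shape {
        final double width, height;
        Rect(double width, double height) { this.width = width; this.height = height; }
        @Override public double area() { return width * height; }
    }

    // Stands in for the code a JIT might produce for "s.area()" at a call
    // site where, so far, every receiver it has seen was a Square.
    static final class GuardedSite {
        int fastPathHits = 0;
        int slowPathHits = 0;

        double area(Shape s) {
            if (s.getClass() == Square.class) { // cheap check of the assumption
                Square q = (Square) s;
                fastPathHits++;
                return q.side * q.side;         // inlined body, no dispatch
            }
            slowPathHits++;                     // assumption failed
            return s.area();                    // ordinary virtual dispatch
        }
    }

    public static void main(String[] args) {
        GuardedSite site = new GuardedSite();
        System.out.println(site.area(new Square(3)));  // prints 9.0
        System.out.println(site.area(new Rect(2, 5))); // prints 10.0
        System.out.println(site.fastPathHits + " fast, " + site.slowPathHits + " slow");
    }
}
```

In a real system the slow-path counter would be the signal to recompile: if Rects start showing up often, the assumption baked into the fast path is no longer paying for itself.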
Sun's HotSpot JVM goes one step further by packaging both a fast optimizing JIT and a fast interpreter together and deferring compilation of a function until it's been called a sufficient number of times. This spreads the cost of compilation out even more evenly across the program's running time, and means the JVM concentrates its efforts where they're most likely to do good, avoiding wasting the time and memory compiling functions that are going to be called one or two times. Since the cost of compilation generally has to be amortized across several calls, running interpreted can actually be faster in these situations, and it can also improve startup time by avoiding large numbers of up-front compilation delays when a program is initializing.
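The counting scheme can be sketched in a few lines. I don't know the details of how HotSpot actually decides what to compile, so treat this purely as an illustration of the idea: the MixedModeVm class, the per-function call counter, and the threshold of 3 are all assumptions of mine.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Interpret cold functions; compile a function only once it proves to be hot.
// The policy and every name here are hypothetical.
final class MixedModeVm {
    private final int threshold;
    private final Map<String, Integer> callCounts = new HashMap<>();
    private final Set<String> compiled = new HashSet<>();

    MixedModeVm(int threshold) {
        this.threshold = threshold;
    }

    // Returns how this particular call was executed.
    String invoke(String function) {
        if (compiled.contains(function)) {
            return "native";
        }
        int calls = callCounts.merge(function, 1, Integer::sum);
        if (calls >= threshold) {
            compiled.add(function); // hot enough: later calls run compiled code
        }
        return "interpreted";
    }

    int compiledCount() {
        return compiled.size();
    }

    public static void main(String[] args) {
        MixedModeVm vm = new MixedModeVm(3);
        for (int i = 0; i < 5; i++) {
            System.out.println("hot:  " + vm.invoke("hot"));
        }
        System.out.println("cold: " + vm.invoke("cold"));
        System.out.println(vm.compiledCount()); // prints 1
    }
}
```

In this run the first three calls to hot are interpreted and the last two run natively, while cold never costs a compilation at all.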
These are typical of the kinds of techniques that are being used in the most advanced JITs that the various Java vendors are working on. In some circumstances, these runtime optimizations can actually make Java code run faster than C++ code. For all but the most demanding applications, these techniques should bring performance of Java code up into the same ballpark as C and C++ code.
It's important to note that as I write this, all of the truly advanced JVM technologies from the various vendors are still under development, and everybody's being pretty tight-lipped about exactly what they're doing and how well it's really working. Sun is planning to release the HotSpot technology as an add-on to the regular JVM sometime in the first half of this year, and I suspect equally advanced JVMs will be coming out of the other Java vendors in the same general time frame.
The other big complaint is that Java is garbage collected. I plan to devote a whole feature article to garbage collection sometime in the near future, so I won't go into the same level of detail here. But you can follow many of the same lines of reasoning that we followed when we looked at interpretation.
The beauty of garbage collection is that it frees the programmer from many of the biggest hassles of memory management. Garbage collection isn't a panacea, but it drastically cuts down on memory leaks and makes other memory-management bugs both more benign and easier to find. More importantly, it can dramatically improve modularity of object-oriented code and dramatically cut down on the bookkeeping the programmer has to do.
Garbage collection gets a bad rap because it deprives the programmer of control: "How can I trust the garbage collector to do a good job of managing memory in my program?" Well, this distrust was certainly justified in the early days of garbage-collected languages, but as people have become more aware of the difficulty of manual memory management in large-scale applications and garbage collection has increased in popularity, a lot of work has gone into improving on those early implementations. Early versions of Java used a relatively simple and inefficient garbage collector, again because it was easy to implement, but the technology has been steadily improving since.
Modern automatic memory management systems can often do just as good (and fast) a job as programmers managing memory manually, and the problem early garbage collectors had, where the program would screech to a halt for uncomfortably long intervals at inconvenient times, has in many situations been eliminated. A good garbage collector can often improve a program's spatial locality and, with it, its cache and virtual-memory performance.
Even the common complaint that Java and languages like it don't have stack-allocated memory and allocate everything on the heap has largely been taken care of. In a generational garbage collector, the cost of allocating and releasing heap memory for short-lived objects comes pretty close to the cost of stack allocation in C++. (Heap allocation for longer-lived objects is roughly comparable in both environments, depending on the actual application, but you spend the time in different places.)
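One reason short-lived objects can be so cheap is that a young-generation area can hand out memory by simply advancing a pointer, much as a stack does. Here's a sketch of that idea. The Nursery class is hypothetical, and it makes one big simplifying assumption, flagged in the code: that every object in the nursery is dead by the time it fills up. A real collector would first copy any survivors to an older generation.

```java
// Bump-pointer allocation in a simulated young generation. Addresses are
// just offsets into an imaginary block of memory; all names are hypothetical.
final class Nursery {
    private final int capacity;
    private int top = 0;       // next free offset; everything below is in use
    int minorCollections = 0;

    Nursery(int capacity) {
        this.capacity = capacity;
    }

    // Returns the offset of the new object. In the common case this is one
    // comparison and one addition, about what a stack allocation costs.
    int allocate(int size) {
        if (size <= 0 || size > capacity) {
            throw new IllegalArgumentException("bad allocation size: " + size);
        }
        if (top + size > capacity) {
            minorCollect();
        }
        int address = top;
        top += size;
        return address;
    }

    // SIMPLIFYING ASSUMPTION: nothing in the nursery is still reachable, so
    // reclaiming every object at once is just a pointer reset.
    private void minorCollect() {
        minorCollections++;
        top = 0;
    }

    public static void main(String[] args) {
        Nursery nursery = new Nursery(100);
        System.out.println(nursery.allocate(40)); // prints 0
        System.out.println(nursery.allocate(40)); // prints 40
        System.out.println(nursery.allocate(40)); // prints 0, after a collection
        System.out.println(nursery.minorCollections); // prints 1
    }
}
```

Note that releasing the dead objects costs nothing per object; the whole batch goes away in one step, which is where the comparison with popping a stack frame comes from.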
There are costs, of course: Garbage collection often requires a larger address space and a larger working set. The less memory available, the more frequently the collector runs, hurting performance. And often there is more copying of data in garbage-collected languages than there strictly needs to be. Early versions of Java didn't handle memory-based caches well, and didn't provide the programmer much control over the behavior of the collector, but this has been improved in JDK 1.2 and will probably continue to improve.
Again, all the Java vendors are working furiously away on garbage collectors, so this technology will probably improve rapidly, but their best efforts still aren't shipping as I write this. Again, Sun says HotSpot will include a significantly better garbage collector.
I don't mean to finesse the arguments against Java, and it's entirely likely that the best Java will ever do in performance will still lag behind the best that C++ can do. For applications where absolute maximum performance or absolute control is truly necessary, this will mean C++ (or even assembly language) is likely always to be the better language to use. But for others, the advantages of programming in Java (which, in many cases, boil down to getting a good product out the door quicker with lower porting costs) will be worth it. The cost of Java may have been too high when it first exploded onto the scene, but it's coming down rapidly.