Getting Java ready for the world: A brief history of IBM and Sun's internationalization efforts

Laura Werner
Unicode Technology Group, IBM
July 1999

Originally published on http://www-106.ibm.com/developerworks/java/library/j-sun-ibm-java.html

The Java programming language and the Unicode character standard are well teamed to help developers compete in a global marketplace, but the road to internationalization has been a long one. Since the Java Development Kit (JDK) was introduced four years ago, both Sun and IBM have worked constantly to improve its support for Unicode and internationalization.
Contents
Java's first steps
Ready for prime time?
Teaming up for international acceptance
More work ahead
About the author

Java's first steps

Let's roll back the clock for a moment--all the way back to 1996. Java had been out for little more than a year, and it was already supported in browsers such as Netscape. The new language's popularity was beginning to snowball, and most of the programmers I knew had already decided they wanted to learn it.

The Java language, tools, and library were much easier to use than most existing C++ tools, and many programmers felt they could be more productive writing in Java. In addition, Java was already "cool." Java applets ran on the Web, so you could create Web pages with animation, embedded programs, and so on.

Enter IBM. IBM was early to recognize the importance of Java--way back in 1995. At the time, "client/server" development was all the rage. Traditionally, big hardware companies like IBM have made their money on servers: S/390s, AS/400s, and so on. But that's only half the picture. All the mainframe databases in the world aren't much use if you don't have a client that allows people to do something meaningful with the data.

That's where Java comes in: Java is easy to learn, efficient to develop, and portable--the perfect platform for writing clients that can run on multiple platforms and access data from multiple sources. And if there are more and better clients, there will be more servers, more software tools for creating them, and more opportunities for service and consulting businesses.

Ready for prime time?
But at the same time, a lot of people felt Java wasn't "real" yet. Sure, you could use it to create cool little toys to embed in your Web pages, but that seemed to be about all anyone was doing with it. Most people weren't writing Java programs that did anything particularly useful yet, and for several reasons:

First, there was the time factor. Java hadn't been out for all that long, and many developers still viewed it as an experiment--something to learn and to play with, but not to do real work in yet. And second, Java was fairly slow. This was in the days before JITs, so Java bytecode was always interpreted (rather than compiled to machine language). Finally, the support needed by business developers was still embryonic: database access, security, internationalization, and so on.

Yes, internationalization. Throughout the '80s and '90s, business has become more global. The explosive growth of the Internet over the last five years has contributed to this trend, and many businesses now operate in multiple countries and languages. The software used to run those businesses must also be global, with support for the different locales in which companies operate. Java, in its JDK 1.0 incarnation, had the beginnings of international support, because a Java char is stored as a Unicode character. But that was all. You couldn't enter or display non-Latin characters, and there were no facilities for language-sensitive formatting, sorting, and so on.

Issues such as performance, database access, and security were being addressed in other ways by Sun, IBM, JIT developers, database companies, and others. But nobody was addressing the internationalization issue. Sun recognized there was a problem, but didn't really have the resources to solve it on its own.

Teaming up for international acceptance
Meanwhile, in early 1996, IBM had assumed full ownership of Taligent, the joint venture started by IBM and Apple and later joined by HP. Taligent had been developing an object-oriented application framework and operating system called CommonPoint, which included extensive international support. When IBM took over, CommonPoint was mothballed, but much of the underlying technology was integrated into other IBM products.

Fairly quickly, the management of IBM and Taligent came to a realization: Java was missing international support. But Taligent had great international technology, talented engineers--including Dr. Mark Davis, president of the Unicode Consortium--and a location about 100 meters from Sun's JavaSoft division in Cupertino, California. Thus, a partnership was born: IBM arranged for Taligent's Text and International group to contribute international classes to Sun's Java Development Kit, making Java powerful enough for real-world business applications.

For JDK 1.1, Taligent provided the new java.text package, plus a number of new classes in java.util. The java.text package included Format and all its subclasses for formatting dates, times, numbers, and messages; Collator, for language-sensitive string sorting; and BreakIterator, for determining line, word, and sentence boundaries in Unicode text. In java.util, Taligent contributed parts of ResourceBundle, as well as the Calendar and TimeZone classes (which provide flexible, international-friendly date and time support). In addition, IBM contributed a large collection of locale data from its National Language Technical Center in Toronto. Together, this API and data provide a standard way to handle the requirements of different countries and languages, so developers don't have to code those differences by hand.
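To give a feel for what these classes do, here is a minimal sketch that uses NumberFormat (a Format subclass), Collator, and BreakIterator as they exist in today's JDK. The class name I18nSketch and its helper methods are hypothetical names chosen for illustration; they are not part of the JDK, and the sketch makes no claim about how the JDK 1.1 versions of these classes behaved in detail.

```java
import java.text.BreakIterator;
import java.text.Collator;
import java.text.NumberFormat;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Illustrative sketch only: the class and method names are hypothetical.
class I18nSketch {

    // Format a number using the grouping and decimal conventions of a locale.
    static String formatNumber(double value, Locale locale) {
        NumberFormat nf = NumberFormat.getNumberInstance(locale);
        return nf.format(value);
    }

    // Return a copy of the list sorted in the locale's dictionary order,
    // rather than by raw character codes as String.compareTo would.
    static List<String> sortForLocale(List<String> words, Locale locale) {
        List<String> sorted = new ArrayList<String>(words);
        Collator collator = Collator.getInstance(locale);
        Collections.sort(sorted, collator);
        return sorted;
    }

    // Count the words in a string by walking the word boundaries that
    // BreakIterator reports and skipping segments that are only
    // punctuation or whitespace.
    static int countWords(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        int count = 0;
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            if (Character.isLetterOrDigit(text.charAt(start))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The same value, formatted for two locales.
        System.out.println(formatNumber(1234567.891, Locale.US));
        System.out.println(formatNumber(1234567.891, Locale.GERMANY));

        // A plain code-point sort would put "Banana" before "apple".
        System.out.println(sortForLocale(Arrays.asList("cherry", "Banana", "apple"), Locale.US));

        System.out.println(countWords("Hello, world!", Locale.US));
    }
}
```

The point of the design is that the calling code never mentions separators, alphabets, or word rules; it only names a Locale, and the locale data supplies the rest.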

While reimplementing the international classes in Java, the developers took advantage of the opportunity to redesign the frameworks for easier use. The design of the Java language itself allowed for a simpler application programming interface (API) and implementation. (The effort was led by Mark Davis, Taligent's international architect, with participants from both Taligent and JavaSoft.) The initial phase of this work was completed in early 1997 and released in JDK 1.1. Taligent continued to maintain the code, contributing several bug fixes to releases up through JDK 1.1.4.

That brings us to the end of 1997, when IBM absorbed Taligent and its employees completely. The Text and International group survived the transition intact, becoming the Unicode Technology Group at the new IBM Center for Java Technology in Cupertino. We're at the other end of the parking lot from Taligent's old building--and still just down the street from Sun. We continued to work closely with Sun to maintain and enhance the international features in the JDK, contributing more bug fixes in JDK 1.1.6.

At the same time, IBM worked closely with Sun to design and implement enhancements for Java 2, under development throughout 1998. The Unicode Technology Group was no exception: We added several APIs and features to Java 2, including support for bidirectional Hebrew and Arabic text. We also extended the classes we'd contributed for 1.1, adding enhancements such as Collator search support, better TimeZone support, and an improved DecimalFormat.
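For readers who haven't used them, the following sketch shows two of the classes named above, DecimalFormat and TimeZone, through calls available in the present-day JDK. It does not try to show which specific behaviors were added for Java 2; the class name FormatSketch and its methods are hypothetical, and the pattern and zone ID are simply example inputs.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;
import java.util.TimeZone;

// Illustrative sketch only: the class and method names are hypothetical.
class FormatSketch {

    // Apply an explicit DecimalFormat pattern, taking the actual grouping
    // and decimal symbols from the given locale.
    static String formatWithPattern(String pattern, double value, Locale locale) {
        DecimalFormat df = new DecimalFormat(pattern, new DecimalFormatSymbols(locale));
        return df.format(value);
    }

    // Look up a time zone by ID and return its standard (non-daylight)
    // offset from GMT in whole hours. Zones with half-hour offsets would
    // be truncated by the integer division.
    static int rawOffsetHours(String zoneId) {
        TimeZone tz = TimeZone.getTimeZone(zoneId);
        return tz.getRawOffset() / (60 * 60 * 1000);
    }

    public static void main(String[] args) {
        // One pattern, two locales: only the symbols change.
        System.out.println(formatWithPattern("#,##0.00", 1234.5, Locale.US));
        System.out.println(formatWithPattern("#,##0.00", 1234.5, Locale.GERMANY));

        System.out.println(rawOffsetHours("America/Los_Angeles"));
    }
}
```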

More work ahead
But Java 2 wasn't the end of our work on internationalization in Java. We're still working closely with Sun's international team: We now have an IBM engineer on site at Sun full-time, and others work remotely, occasionally visiting the Sun site. We've continued to fix bugs in the JDK--both in JDK 1.2.2 and 1.3--and we're involved in Sun design processes. We're also working on new features for possible inclusion in future JDKs, releasing preliminary versions of some of these technologies on the IBM alphaWorks Web site.

Finally, we're working on a few new features for IBM's own versions of the JDK. Since IBM and Sun sometimes have different priorities, there are international features that IBM feels are important but that aren't as high on Sun's list. In the past, IBM has usually waited until we could persuade Sun to add each new feature into their reference release of Java, but there are now a few features we've decided simply can't wait.

On the international front, these new features include support for Hindi and Thai, full bidirectional-orientation support in Swing, and a few new locales and code pages. Though this work will appear first in an IBM JDK, we're still working closely with Sun to ensure that what we do remains compatible with the enhancements Sun is making in 1.3 and beyond. In fact, the Swing-orientation work is being done jointly by Sun and IBM engineers; it will appear in Sun's JDK 1.3 at around the same time it surfaces in an IBM release.


About the author
Laura Werner is the manager for the Java Internationalization effort at the IBM Center for Java Technology in Cupertino, CA. After receiving bachelor's degrees in geological sciences and integrated science from Northwestern University, she worked at SPSS, Inc. and UC Berkeley before joining Taligent in 1994. Now at IBM, she is the project lead for IBM's international contributions to the JDK and an architect for other international projects.