In this book, you'll find information about a set of libraries
developed within Apache Commons (also referred to as "Commons"). Commons
(http://commons.apache.org/) is a set of small,
popular components that forms a top-level project at the Apache Software
Foundation. Ranging from the elementary to the complex, many would consider
some of these libraries indispensable to any Java project. These components
are so widespread, they may already be on your classpath. If you develop an
application using Wicket, Maven, Struts, Tomcat, Spring, Hibernate, or any
other popular Java library, you likely have Commons Lang and Commons
BeanUtils in your classpath. If you just installed Red Hat Enterprise Linux
with the default configuration, you've got Commons libraries somewhere in
/usr. While Apache Commons may be
everywhere, many are still unaware of the capabilities these components
provide. This book is an attempt to provide some documentation for these
popular components.
This book focuses on tactical implementation details, answering such questions as: How do we parse XML? How do we serialize beans? Is there an easier way to work with Collections? How do we work with HTTP and keep track of cookies? In enterprise software development, the tactical is often sacrificed for the strategic. Consider a complex enterprise-scale system with a solid, well-conceived architecture. The strategic (or high-level) design appears reasonable from 40,000 feet, but as soon as you drill into the details, you notice that every component contains pages upon pages of unmaintainable and unnecessary code because the developers were not aware of some valuable time-saver like BeanUtils, Collections, or the Digester. Or, worse, the developer may have spent a week reimplementing most of the capabilities of Commons BeanUtils even though BeanUtils was already in the classpath. While a familiarity with Apache Commons may not directly affect the architecture of your application, knowing what Apache Commons can do often helps to inform decisions made at the class level.
Few application developers would consider writing a custom XML parser,
but developers will frequently write custom components that duplicate freely
available libraries. Take, as an example, a set of static utility methods
that seems to pop up in almost every complex project. A common process such
as reading a file to a String may be refactored into a CommonFileUtils class,
or turning a DOM Document into a set of beans may be accomplished with a set
of classes in some custom code. Apache
Commons provides solutions to both of these problems and many more, and
reading this book may help you avoid unnecessary wheel reinvention.
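To make the first example concrete, here is a minimal sketch of the kind of hand-rolled utility class that tends to accumulate in a project. The class name CommonFileUtils comes from the paragraph above; the method name, the use of the JDK's java.nio.file API, and the UTF-8 encoding are assumptions made for illustration. Commons IO ships a FileUtils.readFileToString method that covers this case, which is why a class like this is usually unnecessary.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical hand-rolled utility of the kind described in the text.
// It uses only the JDK; nothing here is part of Apache Commons.
final class CommonFileUtils {

    private CommonFileUtils() {
        // static utility class, not meant to be instantiated
    }

    // Reads the whole file into memory and decodes it as UTF-8 (an assumed encoding).
    static String readFileToString(Path path) throws IOException {
        byte[] bytes = Files.readAllBytes(path);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Write a small temporary file, read it back, then clean up.
        Path temp = Files.createTempFile("commons-demo", ".txt");
        try {
            Files.write(temp, "hello, commons".getBytes(StandardCharsets.UTF_8));
            System.out.println(readFileToString(temp));
        } finally {
            Files.deleteIfExists(temp);
        }
    }
}
```

Every project that writes its own version of this class also has to test and maintain it, which is the duplication the paragraph above warns about.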
Many people know of these components in a general sense, but few have the months or weeks necessary to sit down and read the relevant tutorials, FAQs, blogs, and archived mailing lists associated with each component. The amount of work involved in keeping up-to-date with an array of open source communities is not trivial. This is why I've tried to compact much of this information into easily accessible recipes. These recipes were written to provide you with the information you need to start using Commons in a few minutes, but the Discussion and See Also sections give you an opportunity to dig deeper into the motivation behind each Commons component if you need more information.
The tools introduced herein save you serious time and provide you with a set of alternatives you may not currently be aware of. I wish I had read a book like this five years ago; it would have accelerated my learning and helped me to avoid some costly design decisions. Use this text as you will; if you are only interested in Commons Collections, you should be able to quickly familiarize yourself with Collections by browsing Chapter 5. On the other hand, if you are looking for a survey of some of the major projects in Apache Commons, read this book from start to finish. Part structured reference, part prose, the cookbook format lets you customize your reading experience, and I hope this book is as interesting to read as it was to write.
This book covers components from Apache Commons, along with a few projects from outside of Commons and, in one case, outside of the Apache Software Foundation. This book covers the following components:
Apache Commons BeanUtils
Apache Commons Betwixt
Apache Commons CLI
Apache Commons Codec
Apache Commons Collections
Apache Commons Configuration
Apache Commons Digester
Apache Commons HttpClient
Apache Commons ID
Apache Commons IO
Apache Commons JEXL
Apache Commons JXPath
Apache Commons Lang
Apache Commons Logging
Apache Commons Math
Apache Commons Net
Apache Log4J
Apache Velocity
FreeMarker
Apache Lucene
Apache Slide
All of these projects are covered in detail in the following chapters. Here's what's in each chapter:
This chapter introduces Commons Lang. Automation of toString(), working with arrays,
formatting and rounding dates, working with enumerations, generating
identifiers, and measuring time are some of the topics discussed in
this chapter. This chapter also covers the generation of unique
identifiers with Commons ID.
While Java does not have the extensive text manipulation
capabilities of a scripting language like Perl, Commons Lang's
StringUtils has a number of
utility methods that can be used to manipulate text. This chapter
deals with StringUtils, WordUtils, and Commons Codec.
Beans appear throughout Java; from Apache Struts to Hibernate, beans are a unit of information in an object model. This chapter introduces Commons BeanUtils, one of the most widely used components from Apache Commons.
Functors are a fundamental way of thinking about programming as a set of functional objects. Commons Collections introduced predicates, transformers, and closures: functors that can be used to model control structures and loops. This chapter demonstrates how one would apply functors to any program.
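As a rough illustration of the idea, the sketch below models a filter-and-loop structure out of three function objects. It deliberately uses the JDK's java.util.function interfaces (Predicate, Function, Consumer) as stand-ins for the predicate, transformer, and closure roles; it is not the Commons Collections API, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch: JDK functional interfaces standing in for functor roles.
final class FunctorSketch {

    private FunctorSketch() {
        // static helpers only
    }

    // A loop expressed with functors: the predicate decides, the transformer
    // converts, and the closure acts on each converted element.
    static <T, R> void forEachMatching(Iterable<T> items,
                                       Predicate<T> predicate,
                                       Function<T, R> transformer,
                                       Consumer<R> closure) {
        for (T item : items) {
            if (predicate.test(item)) {
                closure.accept(transformer.apply(item));
            }
        }
    }

    // Collects the upper-cased form of every word longer than four characters.
    static List<String> shoutLongWords(List<String> words) {
        List<String> out = new ArrayList<>();
        Predicate<String> isLong = w -> w.length() > 4;          // predicate role
        Function<String, String> toUpper = w -> w.toUpperCase(); // transformer role
        Consumer<String> collect = out::add;                     // closure role
        forEachMatching(words, isLong, toUpper, collect);
        return out;
    }

    public static void main(String[] args) {
        // prints [DIGESTER, BETWIXT]
        System.out.println(shoutLongWords(Arrays.asList("lang", "digester", "io", "betwixt")));
    }
}
```

Because the decision, the conversion, and the action are separate objects, each can be swapped or reused without touching the loop itself.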
Iterators, filtering with predicates, buffers, queues, bidirectional maps, type-safe collections, constraining collections, lazy maps, and set operations are a few of the topics introduced in this chapter. This chapter deals with Commons Collections, the new collection types it introduces, and the application of functors to various collections.
If you are constantly parsing or creating XML documents, this chapter introduces some alternatives to the standard parser APIs (SAX, DOM, and JDOM). This chapter introduces Commons Digester, Commons Betwixt, and Commons JXPath.
Commons Configuration is introduced as a way to parse properties files and XML configuration files. Other recipes in this chapter show how Commons CLI can be used to parse a complex set of required and optional command-line options. This chapter also details the configuration and use of Commons Logging and Apache Log4J.
This chapter focuses on simple mathematical capabilities in both Commons Lang and Commons Math. This chapter introduces classes to work with fractions, complex numbers, matrices, and simple univariate statistics.
This chapter ranges from simple expression languages such as Commons JEXL to more complex templating engines such as Apache Velocity and FreeMarker. This chapter also demonstrates the integration of both Velocity and FreeMarker with a J2EE servlet container such as Apache Tomcat.
This chapter introduces Commons IO, which contains a number of utilities for working with streams and files, and Commons Net, which contains simple clients for the FTP, POP, and SMTP protocols.
If you need to communicate with anything over HTTP, read this chapter, which deals with Apache HttpClient and the WebDAV client library from Apache Slide.
Commons JXPath can be used to apply XPath expressions to collections and object graphs. Apache Lucene is a fully functional search engine that can index any structured document. This chapter demonstrates the use of Lucene with Commons Digester.
Limited time and resources forced me to make some decisions about which projects to include in this text. Projects like Velocity, FreeMarker, and Log4J, while not Commons components, were included because they fit the mold of a small, easily reusable component. Other Commons components were not included in this book because they were still being developed at the time of writing, or because a short recipe would have been impossible without a detailed 30-page introduction. Commons DbUtils, DBCP, Discovery, Jelly, Launcher, Modeler, Pool, Primitives, Chain, and promising sandbox components could fill another entire volume. Some projects, such as Apache HiveMind, started as components in the Commons Sandbox only to be promoted directly to subproject status of the Apache project. Classification of projects and components in Apache can also be somewhat arbitrary; Apache ORO and Apache RegExp would both seem to be prime candidates for the Apache Commons, but they are both subprojects of Apache. Other projects, such as Apache Commons HttpClient, have recently been promoted to be subprojects of Apache, leaving the Commons entirely. Think of this book as focusing on Apache Commons with some other projects thrown in to liven up the discussion. I apologize in advance if I left your favorite project out.
Writing a book about a series of frequently released components is reminiscent of a game of whack-a-mole. Just when you finish updating a chapter for a new release, another component has been released. On average, one Commons component is released every one or two weeks; therefore, a few of the versions in this book may be obsolete as soon as this book hits the shelves. In general, Apache Commons makes a concerted effort to preserve backward compatibility and keep a stable public interface. Lessons learned on Commons BeanUtils 1.6 should remain applicable to Commons BeanUtils 1.7. If you find that a more recent version of a component has been released, you should download that more recent version and check the Discursive site for updates related to this book.