"Are you amazed by the fast response you get while searching the Web with Google or Yahoo? Have you ever wondered how these services manage to search millions of pages and return your results in milliseconds or less? The algorithms that drive both of these major-league search services originated with Google's MapReduce framework. While MapReduce is proprietary technology, the Apache Foundation has implemented its own open source map-reduce framework, called Hadoop. Hadoop is used by Yahoo and many other services whose success is based on processing massive amounts of data. In this article we'll help you discover whether it might also be a good solution for your distributed data processing needs."
"The Google environment is customized for their needs and to fit their operational model. For example, Google uses a proprietary file system for storing files that?s optimized for the type of operations that their MapReduce implementations are likely to perform. Enterprise applications, on the other hand, are built on top of Java or similar technologies, and rely on existing file systems, communication protocols, and application stacks."
"It?s easy to include endpoints in systems across a geographically distributed network or that implement application or context-specific business logic that?s useful for a specific reduction task but that it?s decoupled from the MapReduce system. Virtual addressability of such endpoints, and event-driven logic built using the Mule facilities, can provide the foundation of a resource-oriented architecture that merges distributed computing techniques with SOA at a low cost and with minimal risk for the organization tasked with building the system. An overview of such a system is described in this TheServerSide Java Symposium presentation Son of SOA: Resource Oriented Computing from the 2008 Las Vegas conference."