Example usage for org.apache.hadoop.conf Configured subclass-usage

Introduction

This page collects usage examples for org.apache.hadoop.conf Configured subclass-usage, that is, classes that extend Configured.

Usage

From source file com.github.gaoyangthu.demo.mapred.MultiFileWordCount.java

/**
 * MultiFileWordCount is an example to demonstrate the usage of 
 * MultiFileInputFormat. This example counts the occurrences of
 * words in the text files under the given input directory.
 */
public class MultiFileWordCount extends Configured implements Tool {
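
The counting described in the comment above can be sketched in plain Java without any Hadoop dependency. This is only an illustration under stated assumptions: the class name WordCountSketch and its methods are hypothetical, the "files" are in-memory lists of lines rather than paths under an input directory, and words are split on whitespace. It is not the MultiFileWordCount implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the word count described above; not Hadoop code.
final class WordCountSketch {

    // "Map" side: count the words of a single file, given as its lines.
    static Map<String, Integer> countFile(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // "Reduce" side: sum the per-file counts into one total per word.
    static Map<String, Integer> countAll(List<List<String>> files) {
        Map<String, Integer> totals = new TreeMap<>();
        for (List<String> file : files) {
            countFile(file).forEach((word, n) -> totals.merge(word, n, Integer::sum));
        }
        return totals;
    }

    public static void main(String[] args) {
        List<List<String>> files = Arrays.asList(
                Arrays.asList("the quick fox", "the dog"),
                Arrays.asList("the end"));
        System.out.println(countAll(files));
    }
}
```

Splitting the work into a per-file count and a merge step mirrors the map and reduce phases, which is why the sketch keeps them as separate methods.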

From source file com.github.gaoyangthu.demo.mapred.PiEstimator.java

/**
 * A Map-reduce program to estimate the value of Pi
 * using a quasi-Monte Carlo method.
 *
 * Mapper:
 *   Generate points in a unit square
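
To make the idea of quasi-Monte Carlo estimation concrete, here is a minimal single-process sketch in plain Java. It rests on assumptions the truncated comment does not confirm: points come from a Halton sequence with bases 2 and 3, and a point counts as a hit when it falls inside the circle inscribed in the unit square. The class PiSketch and its methods are hypothetical names, and nothing here is distributed.

```java
// Hypothetical single-process sketch; not the PiEstimator implementation.
final class PiSketch {

    // Radical inverse of index in the given base: the digits of index are
    // mirrored around the radix point (e.g. base 2: 1 -> 0.5, 2 -> 0.25).
    static double radicalInverse(int index, int base) {
        double result = 0.0;
        double fraction = 1.0 / base;
        for (int i = index; i > 0; i /= base) {
            result += (i % base) * fraction;
            fraction /= base;
        }
        return result;
    }

    // Generate numPoints Halton points in the unit square and count those
    // inside the inscribed circle (centre (0.5, 0.5), radius 0.5).
    static double estimate(int numPoints) {
        long inside = 0;
        for (int i = 1; i <= numPoints; i++) {
            double x = radicalInverse(i, 2) - 0.5;
            double y = radicalInverse(i, 3) - 0.5;
            if (x * x + y * y <= 0.25) {
                inside++;
            }
        }
        // area(circle) / area(square) = pi / 4
        return 4.0 * inside / numPoints;
    }

    public static void main(String[] args) {
        System.out.println(estimate(100_000));
    }
}
```

A low-discrepancy sequence covers the square more evenly than pseudo-random points, so the estimate tends to converge faster for the same number of samples.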

From source file com.github.gaoyangthu.demo.mapred.RandomTextWriter.java

/**
 * This program uses map/reduce to just run a distributed job where there is
 * no interaction between the tasks and each task writes a large unsorted
 * random sequence of words.
 * In order for this program to generate data for terasort with 5-10 words
 * per key and 20-100 words per value, have the following config:

From source file com.github.gaoyangthu.demo.mapred.RandomWriter.java

/**
 * This program uses map/reduce to just run a distributed job where there is
 * no interaction between the tasks and each task writes a large unsorted
 * random binary sequence file of BytesWritable.
 * In order for this program to generate data for terasort with 10-byte keys
 * and 90-byte values, have the following config:

From source file com.github.gaoyangthu.demo.mapred.SleepJob.java

/**
 * Dummy class for testing MR framework. Sleeps for a defined period
 * of time in mapper and reducer. Generates fake input for map / reduce 
 * jobs. Note that the number of generated input pairs is on the order
 * of <code>numMappers * mapSleepTime / 100</code>, so the job uses
 * some disk space.

From source file com.github.gaoyangthu.demo.mapred.Sort.java

/**
 * This is the trivial map/reduce program that does absolutely nothing
 * other than use the framework to fragment and sort the input values.
 *
 * To run: bin/hadoop jar build/hadoop-examples.jar sort
 *            [-m <i>maps</i>] [-r <i>reduces</i>]

From source file com.github.gaoyangthu.demo.mapred.terasort.TeraGen.java

/**
 * Generate the official terasort input data set.
 * The user specifies the number of rows and the output directory and this
 * class runs a map/reduce program to generate the data.
 * The format of the data is:
 * <ul>

From source file com.github.gaoyangthu.demo.mapred.terasort.TeraSort.java

/**
 * Generates the sampled split points, launches the job, and waits for it to
 * finish. 
 * <p>
 * To run the program: 
 * <b>bin/hadoop jar hadoop-examples-*.jar terasort in-dir out-dir</b>

From source file com.github.gaoyangthu.demo.mapred.terasort.TeraValidate.java

/**
 * Generate one mapper per file that checks to make sure the keys
 * are sorted within each file. The mapper also generates
 * "$file:begin", first key and "$file:end", last key. The reduce verifies that
 * all of the start/end items are in order.
 * Any output from the reduce is a problem report.
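
The two checks described above (order within each file, then order across file boundaries) can be illustrated with a small in-memory sketch. It assumes each file is a list of string keys and that files are presented in their intended global order. The class ValidateSketch and the wording of its report messages are hypothetical and are not taken from TeraValidate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory sketch of the validation idea; not Hadoop code.
final class ValidateSketch {

    // Returns problem reports; an empty list means all keys are in order.
    static List<String> validate(List<List<String>> files) {
        List<String> problems = new ArrayList<>();
        String previousEnd = null;
        for (int f = 0; f < files.size(); f++) {
            List<String> keys = files.get(f);
            if (keys.isEmpty()) {
                continue;
            }
            // "Map" step: adjacent keys within one file must be non-decreasing.
            for (int i = 1; i < keys.size(); i++) {
                if (keys.get(i - 1).compareTo(keys.get(i)) > 0) {
                    problems.add("file " + f + ": misordered key at row " + i);
                }
            }
            String begin = keys.get(0);
            String end = keys.get(keys.size() - 1);
            // "Reduce" step: the previous file's last key must not exceed
            // this file's first key.
            if (previousEnd != null && previousEnd.compareTo(begin) > 0) {
                problems.add("file " + f + ": first key sorts before end of previous file");
            }
            previousEnd = end;
        }
        return problems;
    }

    public static void main(String[] args) {
        List<List<String>> files = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c", "d"));
        System.out.println(validate(files).isEmpty() ? "sorted" : "problems found");
    }
}
```

Only the first and last key of each file are needed for the cross-file check, which is why the comment describes the mapper emitting just those two boundary records per file.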

From source file com.github.karahiyo.hadoop.mapreduce.examples.dancing.DistributedPentomino.java

/**
 * Launch a distributed pentomino solver.
 * It generates a complete list of prefixes of length N with each unique prefix
 * as a separate line. A prefix is a sequence of N integers that denote the 
 * index of the row that is chosen for each column in order. Note that the
 * next column is heuristically chosen by the solver, so it is dependent on