Example usage for subclasses of org.apache.hadoop.mapreduce.Mapper

Introduction

This page collects usage examples for subclasses of org.apache.hadoop.mapreduce.Mapper. Each excerpt is preceded by the name of the source file it comes from.

Usage

From source file minor_MapReduce.SummarizeMapper.java

public class SummarizeMapper extends Mapper<LongWritable, Text, TextArrayWritable, IntWritable> {

    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String[] line_splitted = value.toString().split("\t");
        Text[] my_tmp_key = new Text[line_splitted.length];
        for (int i = 0; i < line_splitted.length; ++i) {
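
The excerpt above stops inside the loop, so its body is not reproduced here. As a side note on the `split("\t")` call it uses, the following plain-Java sketch (no Hadoop types; `TabSplitSketch` and `splitKeepingEmpties` are hypothetical names, not part of the original source) shows how `String.split` treats trailing empty fields, which matters when the number of columns per record must stay fixed.

```java
/** Plain-Java sketch: how String.split handles trailing empty tab-separated fields. */
final class TabSplitSketch {

    /** Splits on tabs and keeps trailing empty fields (limit -1). */
    static String[] splitKeepingEmpties(String line) {
        return line.split("\t", -1);
    }

    public static void main(String[] args) {
        String line = "a\tb\t\t";
        // Default split drops trailing empty strings.
        System.out.println(line.split("\t").length);
        // A negative limit keeps them.
        System.out.println(splitKeepingEmpties(line).length);
    }
}
```

Whether the original mapper needs the trailing fields depends on its input format, which the excerpt does not show.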

From source file ml.shifu.guagua.mapreduce.GuaguaMapper.java

/**
 * {@link GuaguaMapper} is the Hadoop Mapper implementation for both guagua master and guagua workers.
 * 
 * <p>
 * Use <code>(GuaguaInputSplit) context.getInputSplit()</code> to check whether this task is guagua master or guagua
 * worker.

From source file ml.shifu.shifu.core.autotype.AutoTypeDistinctCountMapper.java

/**
 * {@link AutoTypeDistinctCountMapper} is a mapper to get {@link HyperLogLogPlus} statistics per split. Such statistics
 * will be merged in our reducer.
 */
public class AutoTypeDistinctCountMapper extends Mapper<LongWritable, Text, IntWritable, BytesWritable> {

From source file ml.shifu.shifu.core.binning.UpdateBinningInfoMapper.java

/**
 * {@link UpdateBinningInfoMapper} is a mapper that updates local data statistics given a bin boundary list.
 * 
 * <p>
 * The bin boundary list is obtained from the distributed cache. After reading the bin boundary list, the mapper
 * iterates over each record to update the count and weighted value in each bin.
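
The per-bin update described in this comment can be sketched in plain Java as follows. This is not the Shifu implementation: the class name `BinStatsSketch`, the convention that each boundary is the inclusive lower bound of its bin, and the reading of "weighted value" as a running sum of record weights are all assumptions made for illustration.

```java
import java.util.Arrays;

/** Plain-Java sketch (no Hadoop types): per-bin count and weight sum for one numeric column. */
final class BinStatsSketch {
    // Assumption: boundaries[i] is the inclusive lower bound of bin i, sorted ascending.
    private final double[] boundaries;
    final long[] counts;
    final double[] weightSums;

    BinStatsSketch(double[] boundaries) {
        this.boundaries = boundaries.clone();
        this.counts = new long[boundaries.length];
        this.weightSums = new double[boundaries.length];
    }

    /** Returns the index of the last boundary that is <= value (bin 0 for smaller values). */
    int binIndex(double value) {
        int pos = Arrays.binarySearch(boundaries, value);
        if (pos >= 0) {
            return pos; // value sits exactly on a boundary
        }
        int insertion = -pos - 1; // index of the first boundary greater than value
        return Math.max(0, insertion - 1);
    }

    /** Updates the statistics of the bin that contains value. */
    void update(double value, double weight) {
        int bin = binIndex(value);
        counts[bin]++;
        weightSums[bin] += weight; // assumption: "weighted value" is the summed weight
    }
}
```

In a real mapper the boundary array would be loaded once (for example in `setup`) and `update` would be called per record, but that wiring is not shown in the excerpt.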

From source file ml.shifu.shifu.core.correlation.CorrelationMapper.java

/**
 * {@link CorrelationMapper} is used to compute {@link CorrelationWritable} per column per mapper.
 * 
 * <p>
 * Each {@link CorrelationWritable} is sent to the single reducer, which merges them and computes the actual Pearson value.
 * 
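
The pattern described here, partial statistics per mapper that a single reducer merges before computing the Pearson value, can be sketched in plain Java. The fields of the real `CorrelationWritable` are not shown in the excerpt, so the class `PearsonPartial` and its running sums are assumptions chosen to illustrate the idea for one pair of columns.

```java
/** Plain-Java sketch: mergeable running sums from which a Pearson correlation can be derived. */
final class PearsonPartial {
    long n;
    double sumX, sumY, sumXX, sumYY, sumXY;

    /** Mapper side: accumulate one (x, y) observation. */
    void add(double x, double y) {
        n++;
        sumX += x;
        sumY += y;
        sumXX += x * x;
        sumYY += y * y;
        sumXY += x * y;
    }

    /** Reducer side: fold another partial result into this one. */
    void merge(PearsonPartial other) {
        n += other.n;
        sumX += other.sumX;
        sumY += other.sumY;
        sumXX += other.sumXX;
        sumYY += other.sumYY;
        sumXY += other.sumXY;
    }

    /** Pearson correlation from the merged sums; NaN if either column has zero variance. */
    double pearson() {
        double cov = n * sumXY - sumX * sumY;
        double varX = n * sumXX - sumX * sumX;
        double varY = n * sumYY - sumY * sumY;
        return cov / Math.sqrt(varX * varY);
    }
}
```

Because every field is a plain sum, the merge is order-independent, which is why a single reducer can combine the outputs of any number of mappers.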

From source file ml.shifu.shifu.core.correlation.CorrelationMultithreadedMapper.java

/**
 * Copied from MultithreadedMapper with some customization: mapper output results are merged and then written to the reducer.
 * 
 * @author Zhang David (pengzhang@paypal.com)
 */
@InterfaceAudience.Public

From source file ml.shifu.shifu.core.correlation.FastCorrelationMapper.java

/**
 * {@link FastCorrelationMapper} is used to compute {@link CorrelationWritable} per column per mapper.
 * 
 * <p>
 * Each {@link CorrelationWritable} is sent to the single reducer, which merges them and computes the actual Pearson value.
 * 

From source file ml.shifu.shifu.core.correlation.FastCorrelationMultithreadedMapper.java

/**
 * Copied from MultithreadedMapper with some customization: mapper output results are merged and then written to the reducer.
 * 
 * @author Zhang David (pengzhang@paypal.com)
 */
@InterfaceAudience.Public

From source file ml.shifu.shifu.core.posttrain.FeatureImportanceMapper.java

/**
 * {@link FeatureImportanceMapper} computes the most important variables in one model.
 * 
 * <p>
 * For each record, get the top 3 biggest variables in one bin, then send them to the reducer for further statistics.
 *
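
The "top 3 per record" step can be sketched in plain Java with a bounded min-heap. How the original mapper scores variables is not shown in the excerpt, so the input here is simply an array of per-variable scores; `TopKSketch` and `topK` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Plain-Java sketch: indices of the k largest scores in one record. */
final class TopKSketch {

    /** Returns the indices of the k largest values, largest first. */
    static List<Integer> topK(double[] values, int k) {
        // Min-heap ordered by score, so the smallest kept candidate is at the head.
        PriorityQueue<Integer> heap =
                new PriorityQueue<>((a, b) -> Double.compare(values[a], values[b]));
        for (int i = 0; i < values.length; i++) {
            heap.add(i);
            if (heap.size() > k) {
                heap.poll(); // drop the current smallest
            }
        }
        List<Integer> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            result.add(heap.poll()); // ascending by score
        }
        Collections.reverse(result);
        return result;
    }
}
```

A mapper following this approach would emit the selected indices (or variable names) per record and leave the aggregation to the reducer.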