Usage examples for subclasses of org.apache.hadoop.mapreduce.Mapper
From source file minor_MapReduce.SummarizeMapper.java
public class SummarizeMapper extends Mapper<LongWritable, Text, TextArrayWritable, IntWritable> {
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String[] line_splitted = value.toString().split("\t");
        Text[] my_tmp_key = new Text[line_splitted.length];
        for (int i = 0; i < line_splitted.length; ++i) {
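The excerpt above is cut off inside the loop, so what the mapper does with the fields is not shown. One detail of the visible code is worth illustrating on its own: `String.split("\t")` silently drops trailing empty fields, which changes the length of the key array for rows that end in empty columns. The sketch below uses plain Java only; the class name `TabSplitSketch` and its method names are hypothetical and not part of the original source.

```java
import java.util.Arrays;

// Hypothetical helper, not from the original source file.
final class TabSplitSketch {

    // Same call as in the excerpt: trailing empty fields are discarded.
    static String[] splitDroppingTrailing(String line) {
        return line.split("\t");
    }

    // A negative limit keeps every field, including trailing empty ones.
    static String[] splitKeepingAll(String line) {
        return line.split("\t", -1);
    }

    public static void main(String[] args) {
        String line = "a\tb\t\t"; // four columns, the last two empty
        System.out.println(Arrays.toString(splitDroppingTrailing(line)));
        System.out.println(Arrays.toString(splitKeepingAll(line)));
    }
}
```

For the sample line, the first call yields 2 fields and the second yields 4. Whether the original mapper needs the trailing columns is not visible in the excerpt.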
From source file ml.shifu.guagua.mapreduce.GuaguaMapper.java
/**
* {@link GuaguaMapper} is the Hadoop Mapper implementation for both guagua master and guagua workers.
*
* <p>
* Use <code>(GuaguaInputSplit) context.getInputSplit()</code> to check whether this task is the guagua master or a
* guagua worker.
From source file ml.shifu.shifu.core.autotype.AutoTypeDistinctCountMapper.java
/**
* {@link AutoTypeDistinctCountMapper} is a mapper to get {@link HyperLogLogPlus} statistics per split. Such statistics
* will be merged in our reducer.
*/
public class AutoTypeDistinctCountMapper extends Mapper<LongWritable, Text, IntWritable, BytesWritable> {
From source file ml.shifu.shifu.core.binning.UpdateBinningInfoMapper.java
/**
* {@link UpdateBinningInfoMapper} is a mapper that updates local data statistics given a bin boundary list.
*
* <p>
* The bin boundary list is read from the distributed cache. After the list is loaded, the mapper iterates over each
* record and updates the count and weighted value in each bin.
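The per-bin update described in the Javadoc above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the Shifu implementation: the class `BinStatsSketch`, its fields, and the convention that boundaries are ascending lower bounds (with values below the first boundary falling into bin 0) are all hypothetical. The "weighted value" is assumed here to be the sum of record weights per bin.

```java
import java.util.Arrays;

// Hypothetical sketch; names and bin convention are assumptions.
final class BinStatsSketch {
    final double[] boundaries;    // ascending lower bounds; bin i covers [boundaries[i], boundaries[i + 1])
    final long[] counts;          // number of records per bin
    final double[] weightedCounts; // sum of record weights per bin

    BinStatsSketch(double[] boundaries) {
        this.boundaries = boundaries.clone();
        this.counts = new long[boundaries.length];
        this.weightedCounts = new double[boundaries.length];
    }

    // Find the last boundary that is <= value using binary search.
    int binIndex(double value) {
        int pos = Arrays.binarySearch(boundaries, value);
        if (pos >= 0) {
            return pos; // value sits exactly on a boundary
        }
        int insertionPoint = -pos - 1;
        return Math.max(0, insertionPoint - 1); // clamp values below the first boundary
    }

    // Update the count and weighted count for the bin that contains value.
    void update(double value, double weight) {
        int bin = binIndex(value);
        counts[bin]++;
        weightedCounts[bin] += weight;
    }

    public static void main(String[] args) {
        BinStatsSketch stats = new BinStatsSketch(new double[] {0.0, 10.0, 20.0});
        stats.update(5.0, 1.0);
        stats.update(10.0, 2.0);
        stats.update(25.0, 0.5);
        System.out.println(Arrays.toString(stats.counts));
        System.out.println(Arrays.toString(stats.weightedCounts));
    }
}
```

Because counts and weighted counts are plain sums, per-mapper results of this shape can be added together element by element in a later merge step.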
From source file ml.shifu.shifu.core.correlation.CorrelationMapper.java
/**
* {@link CorrelationMapper} is used to compute {@link CorrelationWritable} per column per mapper.
*
* <p>
* The {@link CorrelationWritable} objects are sent to a single reducer, which merges them and computes the final Pearson correlation value.
*
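The map-then-merge pattern described above works for Pearson correlation because the coefficient can be computed from a handful of running sums, and those sums are additive across mappers. The following is a minimal plain-Java sketch of that idea. The class `PearsonPartialSketch` and its fields are hypothetical; the actual contents of `CorrelationWritable` are not shown in the excerpt.

```java
// Hypothetical sketch of mergeable partial sums for Pearson correlation.
final class PearsonPartialSketch {
    long n;
    double sumX, sumY, sumXX, sumYY, sumXY;

    // Fold one (x, y) observation into the running sums (the per-mapper step).
    void add(double x, double y) {
        n++;
        sumX += x;
        sumY += y;
        sumXX += x * x;
        sumYY += y * y;
        sumXY += x * y;
    }

    // Combine another partial result into this one (the reducer-side step).
    void merge(PearsonPartialSketch other) {
        n += other.n;
        sumX += other.sumX;
        sumY += other.sumY;
        sumXX += other.sumXX;
        sumYY += other.sumYY;
        sumXY += other.sumXY;
    }

    // Pearson r from the merged sums; NaN if either variable has no variance.
    double pearson() {
        double cov = n * sumXY - sumX * sumY;
        double varX = n * sumXX - sumX * sumX;
        double varY = n * sumYY - sumY * sumY;
        if (varX <= 0 || varY <= 0) {
            return Double.NaN;
        }
        return cov / Math.sqrt(varX * varY);
    }

    public static void main(String[] args) {
        PearsonPartialSketch a = new PearsonPartialSketch(); // stands in for mapper 1
        a.add(1, 2);
        a.add(2, 4);
        PearsonPartialSketch b = new PearsonPartialSketch(); // stands in for mapper 2
        b.add(3, 6);
        b.add(4, 8);
        a.merge(b);
        System.out.println(a.pearson());
    }
}
```

This sum-based formula is simple to merge but can lose precision when values are large relative to their variance; a production implementation might prefer a numerically stabler update.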
From source file ml.shifu.shifu.core.correlation.CorrelationMultithreadedMapper.java
/**
* Copied from MultithreadedMapper with some customization: mapper output results are merged and then written to the reducer.
*
* @author Zhang David (pengzhang@paypal.com)
*/
@InterfaceAudience.Public
From source file ml.shifu.shifu.core.correlation.FastCorrelationMapper.java
/**
* {@link FastCorrelationMapper} is used to compute {@link CorrelationWritable} per column per mapper.
*
* <p>
* The {@link CorrelationWritable} objects are sent to a single reducer, which merges them and computes the final Pearson correlation value.
*
From source file ml.shifu.shifu.core.correlation.FastCorrelationMultithreadedMapper.java
/**
* Copied from MultithreadedMapper with some customization: mapper output results are merged and then written to the reducer.
*
* @author Zhang David (pengzhang@paypal.com)
*/
@InterfaceAudience.Public
From source file ml.shifu.shifu.core.posttrain.FeatureImportanceMapper.java
/**
* {@link FeatureImportanceMapper} computes the most important variables in one model.
*
* <p>
* For each record, get the top 3 biggest variables in one bin, then send them to the reducer for further statistics.
*
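The excerpt does not show how the top 3 variables are chosen, so the following is only a sketch of one straightforward way to pick the three largest entries from a per-record map of variable scores. The class `TopThreeSketch`, the method `topThree`, and the choice to break ties by variable name are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch; not taken from FeatureImportanceMapper.
final class TopThreeSketch {

    // Return the names of the (up to) three variables with the largest scores.
    // Ties are broken by variable name so the result is deterministic.
    static List<String> topThree(Map<String, Double> scores) {
        return scores.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Double>comparingByKey()))
                .limit(3)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Double> scores = new HashMap<>();
        scores.put("a", 0.1);
        scores.put("b", 0.9);
        scores.put("c", 0.5);
        scores.put("d", 0.7);
        System.out.println(topThree(scores));
    }
}
```

In a mapper, each selected name could then be emitted as a key with a count of 1, leaving the aggregation to the reducer; that emit step is omitted here because it depends on Hadoop types not shown in the excerpt.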