com.ebay.erl.mobius.core.mapred
Class DefaultMobiusReducer

java.lang.Object
  extended by org.apache.hadoop.mapred.MapReduceBase
      extended by com.ebay.erl.mobius.core.datajoin.DataJoinReducer<Tuple,Tuple,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.WritableComparable<?>>
          extended by com.ebay.erl.mobius.core.mapred.DefaultMobiusReducer
All Implemented Interfaces:
java.io.Closeable, org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Reducer<DataJoinKey,DataJoinValue,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.WritableComparable<?>>
Direct Known Subclasses:
TotalSortReducer

public class DefaultMobiusReducer
extends DataJoinReducer<Tuple,Tuple,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.WritableComparable<?>>

Reducer that handles Mobius join and group-by jobs.

This product is licensed under the Apache License, Version 2.0, available at http://www.apache.org/licenses/LICENSE-2.0. This product contains portions derived from Apache Hadoop, which is licensed under the Apache License, Version 2.0, available at http://hadoop.apache.org. © 2007 – 2012 eBay Inc., Evan Chiu, Woody Zhou, Neel Sundaresan


Field Summary
protected  TupleCriterion _persistantCriteria
          The user-specified criteria to apply before the persist step.
protected  Projectable[] _projections
          the final projection functions.
protected  boolean isOuterJoin
          Flag indicating whether this job is an outer join (left outer join or right outer join).
protected  java.util.List<ExtendFunction> multiDatasetExtendFunction
          list of extend functions that need columns from multiple datasets as the input.
protected  java.util.List<GroupFunction> multiDatasetGroupFunction
          list of group functions that need columns from multiple datasets as the input.
protected  java.lang.Object nullReplacement
          The user-specified value that replaces null columns in an outer-join job.
protected  java.lang.String[] outputColumnNames
          The final projected column names, in user specified order.
protected  boolean reporterSet
          Flag indicating whether the Hadoop reporter reference has been set on every projectable function.
protected  boolean requirePreCrossProduct
          When true, at least one projectable function requires columns from different datasets as its input.
protected  java.util.Map<java.lang.String,java.util.List<ExtendFunction>> singleDatasetExtendFunction
          mapping from a datasetID to a list of extend functions that require columns only from that datasetID.
protected  java.util.Map<java.lang.String,BigTupleList> singleDatasetExtendFunResult
          mapping from a datasetID to the result of its extend functions that require only the columns from the dataset.
protected  java.util.Map<java.lang.String,java.util.List<GroupFunction>> singleDatasetGroupFunction
          mapping from a datasetID to a list of group functions that require columns only from that datasetID.
protected  java.util.Map<java.lang.String,BigTupleList> valuesForAllDatasets
          Values keyed by datasetID; only used when requirePreCrossProduct is true.
 
Constructor Summary
DefaultMobiusReducer()
           
 
Method Summary
 void configure(org.apache.hadoop.mapred.JobConf conf)
           
protected  java.lang.String[] getSchemaByDatasetID(java.lang.String datasetID)
           
 void joinreduce(Tuple key, DataJoinValueGroup<Tuple> values, org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.WritableComparable<?>> output, org.apache.hadoop.mapred.Reporter reporter)
           
 
Methods inherited from class com.ebay.erl.mobius.core.datajoin.DataJoinReducer
reduce
 
Methods inherited from class org.apache.hadoop.mapred.MapReduceBase
close
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface java.io.Closeable
close
 

Field Detail

_persistantCriteria

protected TupleCriterion _persistantCriteria
The user-specified criteria to apply before the persist step.


_projections

protected Projectable[] _projections
the final projection functions.


outputColumnNames

protected java.lang.String[] outputColumnNames
The final projected column names, in user specified order.


reporterSet

protected boolean reporterSet
Flag indicating whether the Hadoop reporter reference has been set on every projectable function.


requirePreCrossProduct

protected boolean requirePreCrossProduct
When true, at least one projectable function requires columns from different datasets as its input.


multiDatasetGroupFunction

protected java.util.List<GroupFunction> multiDatasetGroupFunction
list of group functions that need columns from multiple datasets as the input.


multiDatasetExtendFunction

protected java.util.List<ExtendFunction> multiDatasetExtendFunction
list of extend functions that need columns from multiple datasets as the input.


singleDatasetGroupFunction

protected java.util.Map<java.lang.String,java.util.List<GroupFunction>> singleDatasetGroupFunction
mapping from a datasetID to a list of group functions that require columns only from that datasetID.


singleDatasetExtendFunction

protected java.util.Map<java.lang.String,java.util.List<ExtendFunction>> singleDatasetExtendFunction
mapping from a datasetID to a list of extend functions that require columns only from that datasetID.
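
The single-dataset and multi-dataset function fields suggest that functions are sorted by how many datasets they read from. The following is a minimal, self-contained sketch of that idea, not the Mobius implementation: the `Fn` and `Partition` types and the `partition` method are hypothetical stand-ins, and only the names `singleDataset`, `multiDataset` and `requirePreCrossProduct` echo the documented fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class FunctionPartitionSketch {

    /** Hypothetical stand-in for a projectable function: a name plus the dataset IDs it reads. */
    static final class Fn {
        final String name;
        final Set<String> datasetIds;

        Fn(String name, String... datasetIds) {
            this.name = name;
            this.datasetIds = new LinkedHashSet<>(Arrays.asList(datasetIds));
        }
    }

    /** Hypothetical holder that mirrors the documented field names. */
    static final class Partition {
        final Map<String, List<Fn>> singleDataset = new LinkedHashMap<>();
        final List<Fn> multiDataset = new ArrayList<>();
        boolean requirePreCrossProduct = false;
    }

    /** Splits functions by the number of datasets they need columns from. */
    static Partition partition(List<Fn> functions) {
        Partition p = new Partition();
        for (Fn f : functions) {
            if (f.datasetIds.isEmpty()) {
                throw new IllegalArgumentException(f.name + " reads from no dataset");
            }
            if (f.datasetIds.size() == 1) {
                // Needs columns from exactly one dataset: index it under that datasetID.
                String id = f.datasetIds.iterator().next();
                p.singleDataset.computeIfAbsent(id, k -> new ArrayList<>()).add(f);
            } else {
                // Needs columns from several datasets: a cross product is required first.
                p.multiDataset.add(f);
                p.requirePreCrossProduct = true;
            }
        }
        return p;
    }

    public static void main(String[] args) {
        Partition p = partition(Arrays.asList(
                new Fn("max", "0"),
                new Fn("concat", "0", "1")));
        System.out.println("single-dataset IDs: " + p.singleDataset.keySet());
        System.out.println("multi-dataset functions: " + p.multiDataset.size());
        System.out.println("requirePreCrossProduct: " + p.requirePreCrossProduct);
    }
}
```

In this sketch the flag is derived while partitioning, so it is true exactly when the multi-dataset list is non-empty.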


singleDatasetExtendFunResult

protected java.util.Map<java.lang.String,BigTupleList> singleDatasetExtendFunResult
mapping from a datasetID to the result of its extend functions that require only the columns from the dataset.


valuesForAllDatasets

protected java.util.Map<java.lang.String,BigTupleList> valuesForAllDatasets
Values keyed by datasetID; only used when requirePreCrossProduct is true.
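
To illustrate what a cross product over per-dataset values looks like, here is a minimal sketch under stated assumptions. It is not the Mobius code path: plain in-memory lists of strings stand in for `BigTupleList` and `Tuple`, and the `crossProduct` method is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CrossProductSketch {

    /**
     * Builds every combination that takes one row from each dataset.
     * Rows are plain strings here; the map is keyed by datasetID.
     */
    static List<List<String>> crossProduct(Map<String, List<String>> valuesForAllDatasets) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>()); // start with a single empty combination

        for (List<String> rows : valuesForAllDatasets.values()) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> partial : result) {
                for (String row : rows) {
                    // Extend each partial combination with one row of this dataset.
                    List<String> combined = new ArrayList<>(partial);
                    combined.add(row);
                    next.add(combined);
                }
            }
            result = next; // a dataset with no rows leaves no combinations
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> values = new LinkedHashMap<>();
        values.put("0", Arrays.asList("a1", "a2"));
        values.put("1", Arrays.asList("b1"));
        for (List<String> combination : crossProduct(values)) {
            System.out.println(combination);
        }
    }
}
```

The number of combinations is the product of the per-dataset row counts, which is one reason to compute the cross product only when a function actually needs columns from several datasets.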


isOuterJoin

protected boolean isOuterJoin
Flag indicating whether this job is an outer join (left outer join or right outer join).


nullReplacement

protected java.lang.Object nullReplacement
The user-specified value that replaces null columns in an outer-join job.
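
As an illustration of null replacement in an outer join, the sketch below substitutes a user-supplied value for every null column of a joined row. It is a simplified assumption about the behavior, not the Mobius implementation: the row is a plain `Object[]` rather than a `Tuple`, and `replaceNulls` is a hypothetical helper.

```java
import java.util.Arrays;

class NullReplacementSketch {

    /** Returns a copy of the row in which every null column holds nullReplacement. */
    static Object[] replaceNulls(Object[] row, Object nullReplacement) {
        Object[] out = new Object[row.length];
        for (int i = 0; i < row.length; i++) {
            out[i] = (row[i] == null) ? nullReplacement : row[i];
        }
        return out;
    }

    public static void main(String[] args) {
        // A left-outer-join row whose right-hand column found no match.
        Object[] joined = {"key-1", 42, null};
        System.out.println(Arrays.toString(replaceNulls(joined, "N/A")));
    }
}
```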

Constructor Detail

DefaultMobiusReducer

public DefaultMobiusReducer()
Method Detail

configure

public void configure(org.apache.hadoop.mapred.JobConf conf)
Specified by:
configure in interface org.apache.hadoop.mapred.JobConfigurable
Overrides:
configure in class org.apache.hadoop.mapred.MapReduceBase

joinreduce

public void joinreduce(Tuple key,
                       DataJoinValueGroup<Tuple> values,
                       org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.WritableComparable<?>> output,
                       org.apache.hadoop.mapred.Reporter reporter)
                throws java.io.IOException
Specified by:
joinreduce in class DataJoinReducer<Tuple,Tuple,org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.WritableComparable<?>>
Throws:
java.io.IOException

getSchemaByDatasetID

protected java.lang.String[] getSchemaByDatasetID(java.lang.String datasetID)