com.ebay.erl.mobius.core.function.base
Class GroupFunction

java.lang.Object
  extended by com.ebay.erl.mobius.core.function.base.Projectable
      extended by com.ebay.erl.mobius.core.function.base.GroupFunction
All Implemented Interfaces:
java.io.Serializable, org.apache.hadoop.conf.Configurable
Direct Known Subclasses:
AggregateFunction, Top, Unique

public abstract class GroupFunction
extends Projectable

A group function first consumes all the records in a group and then, based on those inputs, produces zero or more rows of output.

Each time a new record arrives, the consume(Tuple) method is called. After all the records in a group have been iterated, the Mobius engine retrieves the output of the group function via the getResult() method.

An implementation of a group function might not need every record in a group to calculate the final result. Mobius still feeds all the records to the consume method in that case; the implementer can choose to ignore any records that are not needed.
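The lifecycle described above can be sketched without the Mobius library. The following is a minimal illustration only: SketchGroupFunction, FirstRecordOnly and LifecycleDemo are hypothetical stand-ins, rows are plain String arrays instead of Tuple objects, and results are kept in an ArrayList instead of a BigTupleList. It shows a function that still receives every record but ignores all except the first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for GroupFunction (not the real Mobius class).
abstract class SketchGroupFunction {
    private final List<String[]> rows = new ArrayList<>();

    // Called once for every record in the group.
    abstract void consume(String[] record);

    // Called by subclasses to emit one output row.
    protected void output(String[] row) {
        rows.add(row);
    }

    // Read after the whole group has been consumed.
    List<String[]> getResult() {
        return rows;
    }

    // Discards the rows emitted for the previous group.
    void reset() {
        rows.clear();
    }
}

// Emits only the first record of each group and ignores the rest.
class FirstRecordOnly extends SketchGroupFunction {
    private boolean seen = false;

    @Override
    void consume(String[] record) {
        if (seen) {
            return; // still invoked for every record, but nothing to do
        }
        seen = true;
        output(record);
    }

    @Override
    void reset() {
        super.reset();
        seen = false;
    }
}

class LifecycleDemo {
    // Drives one group through a function: consume each record,
    // read the result, then reset for the next group.
    static List<String[]> runGroup(SketchGroupFunction f, List<String[]> group) {
        for (String[] record : group) {
            f.consume(record);
        }
        List<String[]> result = new ArrayList<>(f.getResult());
        f.reset();
        return result;
    }
}
```

In this sketch, running a two-record group through FirstRecordOnly yields a single output row holding the first record.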

This product is licensed under the Apache License, Version 2.0, available at http://www.apache.org/licenses/LICENSE-2.0. This product contains portions derived from Apache Hadoop, which is licensed under the Apache License, Version 2.0, available at http://hadoop.apache.org. © 2007 – 2012 eBay Inc., Evan Chiu, Woody Zhou, Neel Sundaresan

See Also:
Serialized Form

Field Summary
protected  BigTupleList rowsToBeOutputted
          The container that holds the results pushed by the output(Tuple) method within a group.
 
Fields inherited from class com.ebay.erl.mobius.core.function.base.Projectable
conf, hashCode, inputs, outputSchema, reporter, requireDataFromMultiDatasets
 
Constructor Summary
GroupFunction(Column... inputs)
          Create a GroupFunction which takes the inputs to compute some result.
 
Method Summary
abstract  void consume(Tuple tuple)
          Consumes a value within a group; to be implemented by subclasses.
 BigTupleList getNoMatchResult(java.lang.Object nullReplacement)
           
 BigTupleList getResult()
          Get the computed result.
protected  BigTupleList getRowsToBeOutputted()
           
protected  void output(Tuple tuple)
          To be called by a subclass when a computed result is ready to be emitted.
 void reset()
          Empties the previous result (rowsToBeOutputted). reset is called once all the values within a group have been iterated.
 
Methods inherited from class com.ebay.erl.mobius.core.function.base.Projectable
calledByCombiner, equals, getConf, getInputColumns, getOutputSchema, getParticipatedDataset, hashCode, init, isCombinable, requireDataFromMultiDatasets, setCalledByCombiner, setConf, setOutputSchema, setReporter, toString, useGroupKeyOnly
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

rowsToBeOutputted

protected transient BigTupleList rowsToBeOutputted
The container that holds the results pushed by the output(Tuple) method within a group.

Constructor Detail

GroupFunction

public GroupFunction(Column... inputs)
Create a GroupFunction which takes the inputs to compute some result. By default, the result schema has one column per input, named this.getClass().getSimpleName()+"_"+aColumn.getOutputName().

The number of output columns doesn't have to match the number of input columns; use Projectable.setOutputSchema(String...) to set the actual output schema.
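The default naming rule can be illustrated in isolation. This is a sketch only: DefaultSchemaSketch and its defaultOutputNames helper are hypothetical and not part of Mobius, the empty Top class is a stand-in used solely so that getSimpleName() returns "Top", and the column names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Empty stand-in class; only its simple name matters here.
class Top {
}

class DefaultSchemaSketch {
    // Hypothetical helper: builds one default output name per input column,
    // following the <simple class name>_<column output name> pattern.
    static List<String> defaultOutputNames(Class<?> functionClass,
                                           String... inputOutputNames) {
        List<String> names = new ArrayList<>();
        for (String columnName : inputOutputNames) {
            names.add(functionClass.getSimpleName() + "_" + columnName);
        }
        return names;
    }
}
```

With these assumptions, defaultOutputNames(Top.class, "price", "qty") returns [Top_price, Top_qty].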

Method Detail

getRowsToBeOutputted

protected BigTupleList getRowsToBeOutputted()

consume

public abstract void consume(Tuple tuple)
Consumes a value within a group; to be implemented by subclasses.


reset

public void reset()
Empties the previous result (rowsToBeOutputted). reset is called once all the values within a group have been iterated.

When overriding this method in a subclass, it is important to call super.reset(); failing to do so will produce wrong results.
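The following sketch suggests why super.reset() matters when a subclass keeps its own state. It does not use the real Mobius classes: ResettableFunction is a hypothetical stand-in reduced to the parts that reset() touches, values are plain long primitives, and emitting the sum from getResult() is a choice made for this example, not something the documentation above prescribes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for GroupFunction (not the real Mobius class).
abstract class ResettableFunction {
    protected final List<Long> rowsToBeOutputted = new ArrayList<>();

    abstract void consume(long value);

    protected void output(long row) {
        rowsToBeOutputted.add(row);
    }

    List<Long> getResult() {
        return rowsToBeOutputted;
    }

    // Base behaviour: drop the rows emitted for the previous group.
    void reset() {
        rowsToBeOutputted.clear();
    }
}

// Sums the values in a group and emits a single row.
class GroupSum extends ResettableFunction {
    private long sum = 0;

    @Override
    void consume(long value) {
        sum += value;
    }

    @Override
    List<Long> getResult() {
        if (rowsToBeOutputted.isEmpty()) {
            output(sum); // emit the total once per group
        }
        return rowsToBeOutputted;
    }

    @Override
    void reset() {
        super.reset(); // without this, the previous group's row would linger
        sum = 0;       // subclass state must be cleared as well
    }

    // Feeds each group through one shared instance, resetting between groups.
    static List<Long> sumGroups(long[][] groups) {
        GroupSum f = new GroupSum();
        List<Long> totals = new ArrayList<>();
        for (long[] group : groups) {
            for (long v : group) {
                f.consume(v);
            }
            totals.add(f.getResult().get(0));
            f.reset();
        }
        return totals;
    }
}
```

In this sketch, sumGroups(new long[][]{{1, 2, 3}, {10}}) returns [6, 10]. If the super.reset() call were removed, the row from the first group would still be present when the second group is read, so the second total would be reported as 6.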


output

protected void output(Tuple tuple)
To be called by a subclass when a computed result is ready to be emitted.

The number of output rows (within a group) of this function equals the number of times this method is called.


getNoMatchResult

public final BigTupleList getNoMatchResult(java.lang.Object nullReplacement)

getResult

public BigTupleList getResult()
Get the computed result.

The computed result is combined as a cross product with the results of other functions (if any).
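As a rough illustration of what a cross product of two functions' results looks like, the sketch below pairs every row from one result with every row from another. CrossProductSketch is hypothetical, rows are modelled as lists of strings rather than Tuple objects, and the way the Mobius engine actually performs this step is not shown in this page.

```java
import java.util.ArrayList;
import java.util.List;

class CrossProductSketch {
    // Pairs every row on the left with every row on the right,
    // concatenating their columns into one combined row.
    static List<List<String>> cross(List<List<String>> left,
                                    List<List<String>> right) {
        List<List<String>> combined = new ArrayList<>();
        for (List<String> l : left) {
            for (List<String> r : right) {
                List<String> row = new ArrayList<>(l);
                row.addAll(r);
                combined.add(row);
            }
        }
        return combined;
    }
}
```

In this sketch, a function that emitted 2 rows and another that emitted 3 rows produce 6 combined rows, and an empty result on either side produces no combined rows.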