com.ebay.erl.mobius.core.function.base
Class AggregateFunction

java.lang.Object
  extended by com.ebay.erl.mobius.core.function.base.Projectable
      extended by com.ebay.erl.mobius.core.function.base.GroupFunction
          extended by com.ebay.erl.mobius.core.function.base.AggregateFunction
All Implemented Interfaces:
java.io.Serializable, org.apache.hadoop.conf.Configurable
Direct Known Subclasses:
SingleInputAggregateFunction

public abstract class AggregateFunction
extends GroupFunction

An aggregate function is a group function that consumes all records in a group and outputs exactly one row. That single row may still contain one or more columns.

To implement an aggregate function, override GroupFunction.consume(Tuple) and getComputedResult(). Mobius calls consume once for each record in a group; implementations can accumulate partially computed results at this stage. getComputedResult() returns a single tuple; implement it to derive the final result from the accumulated partial results.

Note that the schema of the returned tuple must be the same as the schema Mobius retrieves from Projectable.getOutputSchema().
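The consume/getComputedResult split described above can be sketched as follows. This is a minimal, self-contained model rather than Mobius code: SketchAggregate, SumSketch and SumSketchDemo are hypothetical names, and a plain Map stands in for Tuple so that the example runs without the Mobius library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the AggregateFunction contract; NOT the Mobius class.
abstract class SketchAggregate {
    // Called once for every record in the current group.
    abstract void consume(Map<String, Object> record);

    // Called after the last record of the group; returns exactly one row.
    abstract Map<String, Object> getComputedResult();

    // Clears per-group state before the next group starts.
    void reset() {
    }
}

// Sums one numeric column and emits a single one-column row per group.
class SumSketch extends SketchAggregate {
    private final String inputColumn;
    private final String outputColumn;
    private double partialSum; // partial result kept between consume() calls

    SumSketch(String inputColumn, String outputColumn) {
        this.inputColumn = inputColumn;
        this.outputColumn = outputColumn;
    }

    @Override
    void consume(Map<String, Object> record) {
        partialSum += ((Number) record.get(inputColumn)).doubleValue();
    }

    @Override
    Map<String, Object> getComputedResult() {
        // The row's only column matches the declared output column name.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put(outputColumn, partialSum);
        return row;
    }

    @Override
    void reset() {
        partialSum = 0.0;
    }
}

class SumSketchDemo {
    // Drives a function over one group the way a framework might (assumed order of calls).
    static Map<String, Object> aggregate(SketchAggregate fn, List<Map<String, Object>> group) {
        fn.reset();
        for (Map<String, Object> record : group) {
            fn.consume(record);
        }
        return fn.getComputedResult();
    }

    public static void main(String[] args) {
        List<Map<String, Object>> group = new ArrayList<>();
        group.add(Map.<String, Object>of("price", 2.5));
        group.add(Map.<String, Object>of("price", 4.0));
        System.out.println(aggregate(new SumSketch("price", "total"), group));
    }
}
```

In a real subclass, the partial sum would be read from the incoming Tuple and the returned Tuple would follow Projectable.getOutputSchema(); the Tuple accessor methods are deliberately left out here because they are not described on this page.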

This product is licensed under the Apache License, Version 2.0, available at http://www.apache.org/licenses/LICENSE-2.0. This product contains portions derived from Apache hadoop which is licensed under the Apache License, Version 2.0, available at http://hadoop.apache.org. © 2007 – 2012 eBay Inc., Evan Chiu, Woody Zhou, Neel Sundaresan

See Also:
Serialized Form

Field Summary
protected  java.lang.Object aggregateResult
          the computed result.
 
Fields inherited from class com.ebay.erl.mobius.core.function.base.GroupFunction
rowsToBeOutputted
 
Fields inherited from class com.ebay.erl.mobius.core.function.base.Projectable
conf, hashCode, inputs, outputSchema, reporter, requireDataFromMultiDatasets
 
Constructor Summary
AggregateFunction(Column[] inputs)
          Constructor; accepts one or more columns as its input.
 
Method Summary
protected abstract  Tuple getComputedResult()
          Get the computed result for this group.
 BigTupleList getResult()
          Get the computed result.
protected  BigTupleList newBigTupleList()
           
protected  void output(Tuple tuple)
          To be called by the sub-class when a computed result can be populated.
 void reset()
          Clears the previous result (rowsToBeOutputted). reset is called once all values within a group have been iterated.
 
Methods inherited from class com.ebay.erl.mobius.core.function.base.GroupFunction
consume, getNoMatchResult, getRowsToBeOutputted
 
Methods inherited from class com.ebay.erl.mobius.core.function.base.Projectable
calledByCombiner, equals, getConf, getInputColumns, getOutputSchema, getParticipatedDataset, hashCode, init, isCombinable, requireDataFromMultiDatasets, setCalledByCombiner, setConf, setOutputSchema, setReporter, toString, useGroupKeyOnly
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

aggregateResult

protected java.lang.Object aggregateResult
the computed result.

This object is set to null by the reset() method each time a new group starts.

Constructor Detail

AggregateFunction

public AggregateFunction(Column[] inputs)
Constructor; accepts one or more columns as its input.

Method Detail

getComputedResult

protected abstract Tuple getComputedResult()
Get the computed result for this group.

The returned Tuple must have the same schema as the one returned by Projectable.getOutputSchema().


output

protected final void output(Tuple tuple)
To be called by the sub-class when a computed result can be populated.

The number of output rows (within a group) of this function equals the number of times this method is called.

This method can be called only once per group. It is declared final to prevent subclasses from violating this contract.

Overrides:
output in class GroupFunction
Throws:
java.lang.IllegalStateException - if this method is called more than once within a group.
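
The once-per-group rule can be modelled with a simple flag, as in the sketch below. This only illustrates the contract; it is not taken from the Mobius source. OncePerGroupSketch is a hypothetical class, and rows are plain strings instead of Tuple objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the once-per-group output contract; NOT Mobius source.
class OncePerGroupSketch {
    private final List<String> rows = new ArrayList<>();
    private boolean emitted; // true once output() has run for the current group

    // Declared final so that a subclass cannot weaken the check.
    protected final void output(String row) {
        if (emitted) {
            throw new IllegalStateException("output() called more than once within a group");
        }
        rows.add(row);
        emitted = true;
    }

    // Starts a new group: clears the previous row and re-arms the guard.
    public void reset() {
        rows.clear();
        emitted = false;
    }

    public List<String> rows() {
        return rows;
    }

    public static void main(String[] args) {
        OncePerGroupSketch fn = new OncePerGroupSketch();
        fn.output("first row");
        try {
            fn.output("second row");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because the guard is re-armed in reset(), the same function instance can be reused for the next group.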

newBigTupleList

protected BigTupleList newBigTupleList()

getResult

public final BigTupleList getResult()
Get the computed result.

The computed result is combined as a cross product with the results of other functions (if any).

The Tuple returned by getComputedResult() is used as the only output. This method is declared final to prevent subclasses from violating this contract.

Overrides:
getResult in class GroupFunction
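
To illustrate what a cross product with other functions' results means for a one-row result, here is a small sketch. It shows only the general cross-product semantics; how Mobius performs the combination internally is not described on this page. CrossProductSketch, the list-based rows, and the left-then-right column order are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of cross-product semantics; NOT how Mobius is implemented.
class CrossProductSketch {
    // Pairs every row on the left with every row on the right,
    // concatenating their columns (left columns first, by assumption).
    static List<List<Object>> cross(List<List<Object>> left, List<List<Object>> right) {
        List<List<Object>> out = new ArrayList<>();
        for (List<Object> l : left) {
            for (List<Object> r : right) {
                List<Object> joined = new ArrayList<>(l);
                joined.addAll(r);
                out.add(joined);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // An aggregate function contributes exactly one row ...
        List<List<Object>> aggregateRows = List.of(List.<Object>of(6.5));
        // ... while another group function may contribute several.
        List<List<Object>> otherRows = List.of(List.<Object>of("a"), List.<Object>of("b"));
        for (List<Object> row : cross(aggregateRows, otherRows)) {
            System.out.println(row);
        }
    }
}
```

Since the aggregate side always holds a single row, the combined output has as many rows as the other side, each extended with the aggregate's columns.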

reset

public void reset()
Description copied from class: GroupFunction
Clears the previous result (rowsToBeOutputted). reset is called once all values within a group have been iterated.

When overriding this method in a subclass, it is important to call super.reset(); failing to do so produces incorrect results.

Overrides:
reset in class GroupFunction
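
The following sketch shows the shape of a reset() override that clears its own state and still calls super.reset(). GroupBaseSketch and CountSketch are hypothetical stand-ins written for this example, not Mobius classes; only the field name rowsToBeOutputted is borrowed from this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class holding the rows emitted for the current group; NOT Mobius source.
class GroupBaseSketch {
    protected final List<String> rowsToBeOutputted = new ArrayList<>();

    // Empties the rows left over from the previous group.
    public void reset() {
        rowsToBeOutputted.clear();
    }
}

// Counts the records in a group and emits one row with the count.
class CountSketch extends GroupBaseSketch {
    private long count; // subclass-specific per-group state

    void consume(String record) {
        count++;
    }

    void finishGroup() {
        rowsToBeOutputted.add("count=" + count);
    }

    @Override
    public void reset() {
        super.reset(); // without this call, rows from earlier groups would remain
        count = 0;     // then clear the subclass's own state
    }

    // Processes two groups in sequence and returns the rows visible after the second.
    static List<String> runTwoGroups() {
        CountSketch fn = new CountSketch();
        for (String record : List.of("a", "b", "c")) {
            fn.consume(record);
        }
        fn.finishGroup();
        fn.reset();
        fn.consume("d");
        fn.finishGroup();
        return fn.rowsToBeOutputted;
    }

    public static void main(String[] args) {
        System.out.println(runTwoGroups());
    }
}
```

If the super.reset() call were removed, the row from the first group would still be present when the second group finishes, which is the kind of incorrect result the note above warns about.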