com.ebay.erl.mobius.core.function
Class Unique

java.lang.Object
  extended by com.ebay.erl.mobius.core.function.base.Projectable
      extended by com.ebay.erl.mobius.core.function.base.GroupFunction
          extended by com.ebay.erl.mobius.core.function.Unique
All Implemented Interfaces:
java.io.Serializable, org.apache.hadoop.conf.Configurable
Direct Known Subclasses:
UniqueCounts

public class Unique
extends GroupFunction

Returns the unique rows in a group.

Uniqueness is determined only by the values of the columns passed to Unique(Column...), and is evaluated separately within each group.

This product is licensed under the Apache License, Version 2.0, available at http://www.apache.org/licenses/LICENSE-2.0. This product contains portions derived from Apache Hadoop, which is licensed under the Apache License, Version 2.0, available at http://hadoop.apache.org. © 2007 – 2012 eBay Inc., Evan Chiu, Woody Zhou, Neel Sundaresan

See Also:
Serialized Form

Field Summary
protected  BigTupleList temp
          Temporary list that stores the values in a group.
 
Fields inherited from class com.ebay.erl.mobius.core.function.base.GroupFunction
rowsToBeOutputted
 
Fields inherited from class com.ebay.erl.mobius.core.function.base.Projectable
conf, hashCode, inputs, outputSchema, reporter, requireDataFromMultiDatasets
 
Constructor Summary
Unique(Column... columns)
          Creates an instance of Unique that emits the unique rows within a group.
 
Method Summary
 void consume(Tuple tuple)
          Consumes a value within a group; to be implemented by subclasses.
 BigTupleList getResult()
          Returns the computed result.
 void reset()
          Empties the previous result (rowsToBeOutputted); reset is called once all the values within a group have been iterated.
 
Methods inherited from class com.ebay.erl.mobius.core.function.base.GroupFunction
getNoMatchResult, getRowsToBeOutputted, output
 
Methods inherited from class com.ebay.erl.mobius.core.function.base.Projectable
calledByCombiner, equals, getConf, getInputColumns, getOutputSchema, getParticipatedDataset, hashCode, init, isCombinable, requireDataFromMultiDatasets, setCalledByCombiner, setConf, setOutputSchema, setReporter, toString, useGroupKeyOnly
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

temp

protected BigTupleList temp
Temporary list that stores the values in a group.

Constructor Detail

Unique

public Unique(Column... columns)
Creates an instance of Unique that emits the unique rows within a group.

Uniqueness is determined only by the values of the given columns; other columns in a row are ignored.
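
The following is a minimal sketch of the idea of column-restricted uniqueness, not the Mobius implementation. It deliberately avoids the Mobius types: rows are modeled as plain maps from column name to value, and the class name UniqueByColumnsSketch and method uniqueValues are hypothetical names introduced only for illustration. The sketch assumes that one output row is produced per distinct combination of the selected column values.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of the Mobius API.
class UniqueByColumnsSketch {

    // Returns the distinct combinations of values for the selected columns,
    // in first-seen order. Rows are modeled as column-name -> value maps.
    static List<List<Object>> uniqueValues(List<Map<String, Object>> group, String... columns) {
        Set<List<Object>> seen = new LinkedHashSet<>();
        for (Map<String, Object> row : group) {
            List<Object> key = new ArrayList<>();
            for (String column : columns) {
                key.add(row.get(column)); // only the selected columns take part
            }
            seen.add(key); // a combination that was already seen is ignored
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<Map<String, Object>> group = new ArrayList<>();
        group.add(Map.of("user", "alice", "item", "book", "price", 10));
        group.add(Map.of("user", "alice", "item", "book", "price", 12));
        group.add(Map.of("user", "alice", "item", "pen", "price", 2));

        // The first two rows differ only in "price", which is not selected,
        // so they collapse into a single combination.
        System.out.println(uniqueValues(group, "user", "item"));
    }
}
```

In this sketch the first two rows count as duplicates because they agree on every selected column, even though they differ in a column that was not selected.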

Method Detail

consume

public void consume(Tuple tuple)
Description copied from class: GroupFunction
Consumes a value within a group; to be implemented by subclasses.

Specified by:
consume in class GroupFunction

getResult

public BigTupleList getResult()
Description copied from class: GroupFunction
Returns the computed result.

The computed result is combined, as a cross product, with the results from other functions (if any).

Overrides:
getResult in class GroupFunction
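
As an illustration of what a cross product of two function results means, here is a small sketch. It does not use the Mobius types and makes no claim about how Mobius combines results internally: rows are modeled as plain lists, and CrossProductSketch and cross are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; not part of the Mobius API.
class CrossProductSketch {

    // Pairs every row of 'left' with every row of 'right' by concatenating them.
    static List<List<Object>> cross(List<List<Object>> left, List<List<Object>> right) {
        List<List<Object>> out = new ArrayList<>();
        for (List<Object> l : left) {
            for (List<Object> r : right) {
                List<Object> row = new ArrayList<>(l);
                row.addAll(r);
                out.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Two unique rows from one function, one single-row result from another.
        List<List<Object>> uniqueRows = List.of(List.<Object>of("book"), List.<Object>of("pen"));
        List<List<Object>> otherResult = List.of(List.<Object>of(3));
        System.out.println(cross(uniqueRows, otherResult));
    }
}
```

With two rows on one side and one row on the other, the combined output has two rows, each carrying the value from the single-row result.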

reset

public void reset()
Description copied from class: GroupFunction
Empties the previous result (rowsToBeOutputted). reset is called once all the values within a group have been iterated.

When overriding this method in a subclass, always call super.reset(); failing to do so produces incorrect results.

Overrides:
reset in class GroupFunction
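
The sketch below illustrates why an overriding reset() should call super.reset(). It mimics only the lifecycle described above and is not the Mobius code: GroupFunctionSketch, DistinctSketch and ResetLifecycleDemo are hypothetical classes, values are plain strings, and only the field name rowsToBeOutputted is borrowed from this page.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins that mimic the lifecycle only; not the Mobius classes.
abstract class GroupFunctionSketch {
    protected final List<String> rowsToBeOutputted = new ArrayList<>();

    abstract void consume(String value);

    List<String> getResult() {
        return new ArrayList<>(rowsToBeOutputted);
    }

    // Clears the rows buffered for the group that has just finished.
    void reset() {
        rowsToBeOutputted.clear();
    }
}

class DistinctSketch extends GroupFunctionSketch {
    private final Set<String> seen = new HashSet<>();

    @Override
    void consume(String value) {
        if (seen.add(value)) {
            rowsToBeOutputted.add(value);
        }
    }

    @Override
    void reset() {
        super.reset(); // clear the inherited output buffer
        seen.clear();  // then clear this subclass's own per-group state
    }
}

class ResetLifecycleDemo {
    public static void main(String[] args) {
        DistinctSketch fn = new DistinctSketch();
        for (String v : new String[] {"a", "b", "a"}) {
            fn.consume(v);
        }
        System.out.println(fn.getResult()); // result of the first group
        fn.reset();
        for (String v : new String[] {"b", "c"}) {
            fn.consume(v);
        }
        System.out.println(fn.getResult()); // result of the second group
    }
}
```

If DistinctSketch.reset() omitted the call to super.reset(), the rows buffered for the first group would still be present when the second group's result is read, which is the kind of incorrect result the note above warns about.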