com.ebay.erl.mobius.core.datajoin
Class EvenlyPartitioner<K extends org.apache.hadoop.io.WritableComparable,V>

java.lang.Object
  extended by com.ebay.erl.mobius.core.datajoin.EvenlyPartitioner<K,V>
All Implemented Interfaces:
org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Partitioner<K,V>

public class EvenlyPartitioner<K extends org.apache.hadoop.io.WritableComparable,V>
extends java.lang.Object
implements org.apache.hadoop.mapred.Partitioner<K,V>

Most of the code is copied from org.apache.hadoop.mapred.lib.TotalOrderPartitioner.


Field Summary
static java.lang.String DEFAULT_PATH
           
 
Constructor Summary
EvenlyPartitioner()
           
 
Method Summary
 void configure(org.apache.hadoop.mapred.JobConf job)
          Read in the partition file and build indexing data structures.
 int getPartition(K key, V value, int numPartitions)
           
static java.lang.String getPartitionFile(org.apache.hadoop.mapred.JobConf job)
          Get the path to the SequenceFile storing the sorted partition keyset.
static void setPartitionFile(org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.fs.Path p)
          Set the path to the SequenceFile storing the sorted partition keyset.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_PATH

public static final java.lang.String DEFAULT_PATH
See Also:
Constant Field Values
Constructor Detail

EvenlyPartitioner

public EvenlyPartitioner()
Method Detail

configure

public void configure(org.apache.hadoop.mapred.JobConf job)
Read in the partition file and build indexing data structures. If the key type is BinaryComparable and total.order.partitioner.natural.order is not false, a trie of the first total.order.partitioner.max.trie.depth(2) + 1 bytes will be built. Otherwise, keys are located by a binary search of the partition keyset with the RawComparator defined for this job. The input file must be sorted with the same comparator and must contain JobConf.getNumReduceTasks() - 1 keys.

Specified by:
configure in interface org.apache.hadoop.mapred.JobConfigurable
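
The binary-search fallback described for configure can be illustrated without Hadoop. The following is a minimal sketch, not the class's actual implementation: it assumes the partition keyset has already been loaded into a sorted array of R - 1 split points, and the names SplitPointLookup and findPartition are hypothetical. A plain java.util.Comparator stands in for the job's RawComparator.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, not part of Mobius or Hadoop.
public class SplitPointLookup {

    /**
     * Maps a key to a partition index in [0, splitPoints.length],
     * given split points sorted with the same comparator.
     */
    static <T> int findPartition(T[] splitPoints, T key, Comparator<? super T> cmp) {
        int pos = Arrays.binarySearch(splitPoints, key, cmp);
        // Hit at index i: the key equals split point i, so it goes to partition i + 1.
        // Miss: binarySearch returns -(insertionPoint) - 1, and the insertion
        // point is the number of split points below the key, i.e. its partition.
        return pos >= 0 ? pos + 1 : -pos - 1;
    }

    public static void main(String[] args) {
        // Three split points imply four partitions (R = 4, R - 1 = 3).
        String[] splits = {"g", "n", "t"};
        for (String key : new String[] {"a", "g", "m", "z"}) {
            int p = findPartition(splits, key, Comparator.<String>naturalOrder());
            System.out.println(key + " -> partition " + p);
        }
    }
}
```

With three split points the sketch yields partition indices 0 through 3, which is why the keyset must hold exactly one key fewer than the number of reduce tasks.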

getPartition

public int getPartition(K key,
                        V value,
                        int numPartitions)
Specified by:
getPartition in interface org.apache.hadoop.mapred.Partitioner<K extends org.apache.hadoop.io.WritableComparable,V>

setPartitionFile

public static void setPartitionFile(org.apache.hadoop.mapred.JobConf job,
                                    org.apache.hadoop.fs.Path p)
Set the path to the SequenceFile storing the sorted partition keyset. For R reduce tasks, the SequenceFile must contain exactly R-1 keys.
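
To make the R-1 requirement concrete, here is a minimal sketch of one way to derive R - 1 split points from a sorted sample of keys. This is an illustration only: this page does not describe how Mobius produces the partition file, and the names SplitPointSampler and chooseSplitPoints are hypothetical. Writing the result to a SequenceFile is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Mobius or Hadoop.
public class SplitPointSampler {

    /**
     * Picks numReduces - 1 split points at evenly spaced positions
     * of an already sorted sample.
     */
    static <T> List<T> chooseSplitPoints(List<T> sortedSample, int numReduces) {
        if (numReduces < 1) {
            throw new IllegalArgumentException("numReduces must be at least 1");
        }
        if (sortedSample.size() < numReduces - 1) {
            throw new IllegalArgumentException("sample too small for " + numReduces + " reduces");
        }
        List<T> splits = new ArrayList<>(numReduces - 1);
        for (int i = 1; i < numReduces; i++) {
            // Position of the i-th of numReduces equal slices of the sample.
            int idx = (int) ((long) i * sortedSample.size() / numReduces);
            splits.add(sortedSample.get(idx));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h");
        // Four reduces need three split points.
        System.out.println(chooseSplitPoints(sample, 4));
    }
}
```

A single reduce task needs no split points at all, so the sketch returns an empty list in that case.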


getPartitionFile

public static java.lang.String getPartitionFile(org.apache.hadoop.mapred.JobConf job)
Get the path to the SequenceFile storing the sorted partition keyset.

See Also:
setPartitionFile(JobConf,Path)