com.intel.hadoop.graphbuilder.idnormalize.mapreduce
Class HashIdReducer

java.lang.Object
  extended by org.apache.hadoop.mapred.MapReduceBase
      extended by com.intel.hadoop.graphbuilder.idnormalize.mapreduce.HashIdReducer
All Implemented Interfaces:
java.io.Closeable, org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Reducer<org.apache.hadoop.io.IntWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>

public class HashIdReducer
extends org.apache.hadoop.mapred.MapReduceBase
implements org.apache.hadoop.mapred.Reducer<org.apache.hadoop.io.IntWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>

Reducer that turns each (baseid, List<(vid, vdata)>) group into two outputs: a dictionary of (lvid, vid) pairs and a new vertex data file of (lvid, vdata) pairs, where lvid = baseid + vid.index * splitsize. Because the split size is fixed for all mappers through the option mapred.line.input.format.linespermap in the JobConf, this guarantees that all vids are mapped into [0, ..., |V|-1]. The input is assumed to contain no duplicate vertex ids.
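
The formula above can be illustrated with a small standalone sketch. It does not use the Hadoop API and is not the class's actual implementation. It assumes that vid.index is the zero-based position of a vertex within the group passed to reduce. The class name LvidSketch, the helper names, and the tab-separated "lvid, vid" line format are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LvidSketch {

    /** Computes lvid = baseid + index * splitsize, as stated in the class description. */
    public static long lvid(int baseid, int index, int splitsize) {
        // Widen to long before multiplying to avoid int overflow on large inputs.
        return baseid + (long) index * splitsize;
    }

    /**
     * Builds dictionary lines "lvid<TAB>vid" for one group of vertex ids.
     * Assumption: index is the position of the vid inside the group.
     */
    public static List<String> dictionary(int baseid, int splitsize, List<String> vids) {
        List<String> lines = new ArrayList<>();
        for (int index = 0; index < vids.size(); index++) {
            lines.add(lvid(baseid, index, splitsize) + "\t" + vids.get(index));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical group: baseid = 1, splitsize = 4, three vertex ids.
        for (String line : dictionary(1, 4, Arrays.asList("a", "b", "c"))) {
            System.out.println(line);
        }
    }
}
```

With baseid = 1 and splitsize = 4, the three vertices receive the lvids 1, 5 and 9. Under the assumptions above, groups with different baseid values below splitsize cannot produce the same lvid, because each lvid has a different remainder modulo splitsize.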


Constructor Summary
HashIdReducer()
           
 
Method Summary
 void configure(org.apache.hadoop.mapred.JobConf job)
           
 void reduce(org.apache.hadoop.io.IntWritable key, java.util.Iterator<org.apache.hadoop.io.Text> iter, org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text> out, org.apache.hadoop.mapred.Reporter reporter)
           
 
Methods inherited from class org.apache.hadoop.mapred.MapReduceBase
close
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface java.io.Closeable
close
 

Constructor Detail

HashIdReducer

public HashIdReducer()

Method Detail

configure

public void configure(org.apache.hadoop.mapred.JobConf job)
Specified by:
configure in interface org.apache.hadoop.mapred.JobConfigurable
Overrides:
configure in class org.apache.hadoop.mapred.MapReduceBase

reduce

public void reduce(org.apache.hadoop.io.IntWritable key,
                   java.util.Iterator<org.apache.hadoop.io.Text> iter,
                   org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text> out,
                   org.apache.hadoop.mapred.Reporter reporter)
            throws java.io.IOException
Specified by:
reduce in interface org.apache.hadoop.mapred.Reducer<org.apache.hadoop.io.IntWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
Throws:
java.io.IOException