Package com.intel.hadoop.graphbuilder.idnormalize.mapreduce

Class Summary
HashIdMapper This mapper class maps a (vid, vdata) pair into a (lvid, (vid, vdata)) pair.
HashIdMR This MapReduce class maps a list of unique vertices into two outputs: a dictionary from rawId to newId, and a new vertex data file that uses newId.
HashIdReducer This reducer class reduces (baseid, List<(vid, vdata)>) into a dictionary (lvid, vid) and a new vertex data file (lvid, vdata), where lvid = baseid + vid.index * splitsize. Because the splitsize is fixed for all mappers by the option mapred.line.input.format.linespermap in the JobConf, this guarantees that all vids are mapped into [0, ..., |V|-1].
SortDictMapper Mapper for the SortDictMR job.
SortDictMR This MapReduce class partitions the dictionary output of HashIdMR by the hash of its key, the rawId.
SortDictReducer Reducer for the SortDictMR job.
SortEdgeMR This MapReduce class partitions the input edge list by the hash of the source vertex.
SortEdgeMR.SortEdgeMapper This mapper class maps each edge into (h(edge.source), edge).
SortEdgeMR.SortEdgeReducer This reducer class takes the input (hashval, edge) from the mapper and outputs the edge unchanged.
TransEdgeMapper<VidType extends WritableComparable<VidType>> This mapper class maps each edge (u, v, data) into (h(v), (D(u), v, data)), where D is the dictionary that contains the entry for u.
TransEdgeMR This MapReduce class translates the rawIds in the edge list into normalized newIds, using the partitioned edge list output from SortEdgeMR and the partitioned dictionary output from SortDictMR.
TransEdgeReducer<VidType extends WritableComparable<VidType>> This reducer class takes the input (h(v), [(D(u_i), v_i, data), ...]) from the mapper and outputs (D(u_i), D(v_i), data).
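
Example (illustrative only)
The two-step translation described for TransEdgeMapper and TransEdgeReducer can be sketched in plain Java, without any Hadoop dependency. This is a minimal sketch, not the package's implementation: the class EdgeTranslationSketch, its Edge type, the choice of hashCode modulo the partition count for h, and the in-memory lists standing in for the shuffle are all assumptions made for illustration. Ids are held as strings so that an edge can carry a translated source and a raw target between the two steps.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, in-memory illustration of the rawId -> newId edge translation.
class EdgeTranslationSketch {

    // Simple edge holder; ids are strings so raw and new ids can coexist.
    static final class Edge {
        final String source, target, data;
        Edge(String source, String target, String data) {
            this.source = source; this.target = target; this.data = data;
        }
        @Override public String toString() { return source + " " + target + " " + data; }
    }

    // Assumed partition function h: non-negative hash modulo the partition count.
    static int h(String id, int numPartitions) {
        return Math.floorMod(id.hashCode(), numPartitions);
    }

    // Splits the dictionary by h(rawId), mirroring what SortDictMR is described as doing.
    static List<Map<String, Long>> partitionDict(Map<String, Long> dict, int n) {
        List<Map<String, Long>> parts = new ArrayList<>();
        for (int i = 0; i < n; i++) parts.add(new HashMap<>());
        dict.forEach((rawId, newId) -> parts.get(h(rawId, n)).put(rawId, newId));
        return parts;
    }

    // Looks up D(rawId) in the one partition that can contain it.
    static String lookup(List<Map<String, Long>> parts, String rawId, int n) {
        Long newId = parts.get(h(rawId, n)).get(rawId);
        if (newId == null) throw new IllegalArgumentException("No dictionary entry for " + rawId);
        return Long.toString(newId);
    }

    static List<Edge> translate(List<Edge> edges, Map<String, Long> dict, int n) {
        List<Map<String, Long>> parts = partitionDict(dict, n);

        // Mapper step: (u, v, data) -> (h(v), (D(u), v, data)).
        List<List<Edge>> byTargetHash = new ArrayList<>();
        for (int i = 0; i < n; i++) byTargetHash.add(new ArrayList<>());
        for (Edge e : edges) {
            Edge half = new Edge(lookup(parts, e.source, n), e.target, e.data);
            byTargetHash.get(h(e.target, n)).add(half);
        }

        // Reducer step: (h(v), [(D(u_i), v_i, data), ...]) -> (D(u_i), D(v_i), data).
        List<Edge> out = new ArrayList<>();
        for (List<Edge> group : byTargetHash) {
            for (Edge half : group) {
                out.add(new Edge(half.source, lookup(parts, half.target, n), half.data));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Long> dict = new HashMap<>();
        dict.put("alice", 0L);
        dict.put("bob", 1L);
        dict.put("carol", 2L);

        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge("alice", "bob", "follows"));
        edges.add(new Edge("carol", "alice", "likes"));

        // Output order depends on how the targets hash into partitions.
        for (Edge e : translate(edges, dict, 4)) System.out.println(e);
    }
}
```

Each lookup touches only the dictionary partition selected by the hash of the id being translated, which is why the edge list is grouped by h(source) before the mapper step and by h(target) before the reducer step.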