com.ebay.erl.mobius.core.mapred
Class SequenceFileMapper<K,V>
java.lang.Object
org.apache.hadoop.mapred.MapReduceBase
com.ebay.erl.mobius.core.datajoin.DataJoinMapper<IK,IV,org.apache.hadoop.io.WritableComparable<?>,org.apache.hadoop.io.WritableComparable<?>>
com.ebay.erl.mobius.core.mapred.AbstractMobiusMapper<K,V>
com.ebay.erl.mobius.core.mapred.SequenceFileMapper<K,V>
- Type Parameters:
K - key in a sequence file record.
V - value in a sequence file record.
- All Implemented Interfaces:
- java.io.Closeable, org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Mapper<K,V,org.apache.hadoop.io.WritableComparable<?>,org.apache.hadoop.io.WritableComparable<?>>
- Direct Known Subclasses:
- DefaultSeqFileMapper
public abstract class SequenceFileMapper<K,V>
- extends AbstractMobiusMapper<K,V>
The base class for parsing a SequenceFile into Tuples.
Subclasses need to override #parseKey(Object) and #parseValue(Object).
This product is licensed under the Apache License, Version 2.0,
available at http://www.apache.org/licenses/LICENSE-2.0.
This product contains portions derived from Apache Hadoop, which is
licensed under the Apache License, Version 2.0, available at
http://hadoop.apache.org.
© 2007 – 2012 eBay Inc., Evan Chiu, Woody Zhou, Neel Sundaresan
Fields inherited from class com.ebay.erl.mobius.core.mapred.AbstractMobiusMapper
_100MB, _COUNTER_FILTERED_RECORD, _COUNTER_INPUT_RECORD, _COUNTER_INVALIDATE_FORMAT_RECORD, _COUNTER_OUTPUT_RECORD, _IS_MAP_ONLY_JOB, computedColumns, counterThread, currentDatasetID, dataset_display_id, key_columns, projection_order, reporterSet, tuple_criteria, value_columns
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
SequenceFileMapper
public SequenceFileMapper()
parse
public abstract Tuple parse(K inkey,
V invalue)
throws java.lang.IllegalArgumentException,
java.io.IOException
- Parses inkey and invalue, merges them into one Tuple, and sets the
schema using getSchema().
If inkey yields X columns and invalue yields Y columns, the returned
Tuple contains X+Y columns. X+Y must equal the size of getSchema().
The returned Tuple will be a row of this dataset.
- Specified by:
parse
in class AbstractMobiusMapper<K,V>
- Throws:
java.lang.IllegalArgumentException
java.io.IOException
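The X+Y column contract described for parse can be illustrated with a minimal, self-contained sketch. It does not use the Mobius classes: a LinkedHashMap stands in for Tuple, and the class name ParseContractSketch, the static parseKey/parseValue/getSchema helpers, the column names, and the tab-delimited value format are all assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a plain-Java stand-in for the parse(K, V) contract.
// None of these names come from the Mobius API.
public class ParseContractSketch {

    // Hypothetical stand-in for getSchema(): ordered column names.
    static String[] getSchema() {
        return new String[] {"user_id", "item_id", "price"};
    }

    // Hypothetical key parser: the whole key becomes one column (X = 1).
    static Object[] parseKey(String inkey) {
        return new Object[] {inkey};
    }

    // Hypothetical value parser: tab-delimited fields become Y columns.
    static Object[] parseValue(String invalue) {
        return invalue.split("\t", -1);
    }

    // Merge key and value columns into one row, enforcing X + Y == schema size.
    static Map<String, Object> parse(String inkey, String invalue) {
        Object[] keyCols = parseKey(inkey);
        Object[] valueCols = parseValue(invalue);
        String[] schema = getSchema();

        if (keyCols.length + valueCols.length != schema.length) {
            throw new IllegalArgumentException(
                "Expected " + schema.length + " columns but got "
                + (keyCols.length + valueCols.length));
        }

        // LinkedHashMap keeps columns in schema order.
        Map<String, Object> row = new LinkedHashMap<>();
        int i = 0;
        for (Object col : keyCols) {
            row.put(schema[i++], col);
        }
        for (Object col : valueCols) {
            row.put(schema[i++], col);
        }
        return row;
    }

    public static void main(String[] args) {
        // One key column plus two value columns matches the 3-column schema.
        System.out.println(parse("u42", "i7\t9.99"));
    }
}
```

In this sketch a record whose column count does not match the schema is rejected with IllegalArgumentException, which mirrors the exception declared by parse; how a real subclass reports malformed records is not specified here.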
getSchema
protected java.lang.String[] getSchema()
- The schema for the Tuple returned by parse(Object, Object).
This schema is set when the user builds the dataset using SeqFileDatasetBuilder.