java.lang.Object
  com.ebay.erl.mobius.core.builder.AbstractDatasetBuilder<SeqFileDatasetBuilder>
    com.ebay.erl.mobius.core.builder.SeqFileDatasetBuilder
public class SeqFileDatasetBuilder
Reads a SequenceFile with NullWritable as its key and Tuple as its value.
The default Mapper is DefaultSeqFileMapper, which accepts only NullWritable as the key type and Tuple as the value type from the underlying sequence file.
If the sequence file uses different key and value types, specify a different implementation of SequenceFileMapper using setMapper(Class).
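The default-plus-override behaviour described above can be sketched with plain Java stand-ins. This is a minimal illustration, not the Mobius implementation: RecordMapper, NullKeyMapper, KeyedCsvMapper and SketchBuilder are hypothetical names, records are modelled as plain objects rather than Hadoop Writables, and instantiating the mapper reflectively from its class is an assumption about how a configured mapper class might be used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for SequenceFileMapper: turns one key/value record into column values. */
interface RecordMapper {
    List<String> toColumns(Object key, Object value);
}

/** Stand-in for the default mapper: accepts only a null key and a ready-made list of columns. */
class NullKeyMapper implements RecordMapper {
    @Override
    public List<String> toColumns(Object key, Object value) {
        if (key != null || !(value instanceof List)) {
            throw new IllegalArgumentException("expected a null key and a list value");
        }
        List<String> columns = new ArrayList<>();
        for (Object column : (List<?>) value) {
            columns.add(String.valueOf(column));
        }
        return columns;
    }
}

/** Custom mapper for records whose key is an id and whose value is comma-separated text. */
class KeyedCsvMapper implements RecordMapper {
    @Override
    public List<String> toColumns(Object key, Object value) {
        List<String> columns = new ArrayList<>();
        columns.add(String.valueOf(key));
        columns.addAll(Arrays.asList(String.valueOf(value).split(",")));
        return columns;
    }
}

/** Stand-in builder: holds a mapper class with a default, replaceable through a fluent setter. */
class SketchBuilder {
    private Class<? extends RecordMapper> mapperClass = NullKeyMapper.class;

    SketchBuilder setMapper(Class<? extends RecordMapper> mapperClass) {
        this.mapperClass = mapperClass;
        return this; // returning the builder allows call chaining
    }

    /** Instantiates the configured mapper reflectively and applies it to one record. */
    List<String> parse(Object key, Object value) {
        try {
            RecordMapper mapper = mapperClass.getDeclaredConstructor().newInstance();
            return mapper.toColumns(key, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + mapperClass, e);
        }
    }
}
```

With the default left in place, a record with a non-null key is rejected; calling setMapper(KeyedCsvMapper.class) on the stand-in builder switches to a mapper that understands the other record layout, which mirrors the role setMapper(Class) plays for this class.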
This product is licensed under the Apache License, Version 2.0, available at http://www.apache.org/licenses/LICENSE-2.0. This product contains portions derived from Apache Hadoop, which is licensed under the Apache License, Version 2.0, available at http://hadoop.apache.org. © 2007 – 2012 eBay Inc., Evan Chiu, Woody Zhou, Neel Sundaresan
Field Summary | |
---|---|
protected java.lang.Class<? extends SequenceFileMapper> |
mapperClass
Mapper class of this builder; the default is DefaultSeqFileMapper. |
Fields inherited from class com.ebay.erl.mobius.core.builder.AbstractDatasetBuilder |
---|
computedColumns, datasetName, mobiusJob |
Constructor Summary | |
---|---|
protected |
SeqFileDatasetBuilder(MobiusJob aJob,
java.lang.String datasetName)
|
Method Summary | |
---|---|
Dataset |
buildFromPreviousJob(org.apache.hadoop.mapred.JobConf prevJob,
java.lang.Class<? extends org.apache.hadoop.mapred.FileOutputFormat> prevJobOutputFormat,
java.lang.String[] schema)
Called by the Mobius engine to build a dataset from a previous Mobius job; users should not call this method directly. |
protected Dataset |
newDataset(java.lang.String datasetName)
Creates a new Dataset; the returned Dataset
has no state at all (no paths, constraints, etc.). |
static SeqFileDatasetBuilder |
newInstance(MobiusJob job,
java.lang.String name,
java.lang.String[] schema)
Gets a new instance of SeqFileDatasetBuilder to build a dataset
that is stored as a Hadoop sequence file. |
SeqFileDatasetBuilder |
setMapper(java.lang.Class<? extends SequenceFileMapper> mapperClass)
Sets a new implementation of SequenceFileMapper to parse the underlying
sequence file records into tuples. |
Methods inherited from class com.ebay.erl.mobius.core.builder.AbstractDatasetBuilder |
---|
addComuptedColumn, addInputPath, addInputPath, build, checkTouchFile, constraint, getDataset, setSchema |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected java.lang.Class<? extends SequenceFileMapper> mapperClass
Mapper class of this builder; the default is DefaultSeqFileMapper.
Constructor Detail |
---|
protected SeqFileDatasetBuilder(MobiusJob aJob, java.lang.String datasetName)
Method Detail |
---|
public static SeqFileDatasetBuilder newInstance(MobiusJob job, java.lang.String name, java.lang.String[] schema) throws java.io.IOException
Gets a new instance of SeqFileDatasetBuilder to build a dataset that is stored as a Hadoop sequence file.
By default, a SeqFileDatasetBuilder uses DefaultSeqFileMapper to parse the underlying sequence file records into Tuples, and the schema is applied to every Tuple.
Note that the schema does not name the key and value in the sequence file; it names the fields of the parsed results (Tuples).
Parameters:
job - a Mobius job that contains the analysis flow.
name - the name of the dataset to be built.
schema - the schema of this dataset.
Throws:
java.io.IOException
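The distinction between the sequence file's key/value pair and the schema of the parsed result can be illustrated with a small stand-alone sketch. It is only a model under stated assumptions: SchemaSketch and applySchema are hypothetical names, a tuple is represented by an ordered map rather than the Mobius Tuple class, and pairing names with values by position (and rejecting a length mismatch) is assumed for illustration rather than taken from the library.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Mobius: models a schema being applied to parsed values. */
class SchemaSketch {

    /**
     * Pairs each schema name with the parsed value at the same position.
     * The names describe the parsed result, not the key or value of the stored record.
     */
    static Map<String, Object> applySchema(String[] schema, List<?> parsedValues) {
        if (schema.length != parsedValues.size()) {
            // Assumed behaviour for this sketch only: a mismatch is treated as an error.
            throw new IllegalArgumentException(
                "schema has " + schema.length + " names but record has "
                    + parsedValues.size() + " values");
        }
        Map<String, Object> tuple = new LinkedHashMap<>(); // keeps schema order
        for (int i = 0; i < schema.length; i++) {
            tuple.put(schema[i], parsedValues.get(i));
        }
        return tuple;
    }
}
```

In this model, a record parsed into two values and a schema of {"ID", "NAME"} yields a tuple whose fields are addressed as ID and NAME, regardless of what the key and value were called when the file was written.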
public SeqFileDatasetBuilder setMapper(java.lang.Class<? extends SequenceFileMapper> mapperClass)
Sets a new implementation of SequenceFileMapper to parse the underlying sequence file records into tuples.
protected Dataset newDataset(java.lang.String datasetName)
Creates a new Dataset; the returned Dataset has no state at all (no paths, constraints, etc.).
newDataset in class AbstractDatasetBuilder<SeqFileDatasetBuilder>
public Dataset buildFromPreviousJob(org.apache.hadoop.mapred.JobConf prevJob, java.lang.Class<? extends org.apache.hadoop.mapred.FileOutputFormat> prevJobOutputFormat, java.lang.String[] schema) throws java.io.IOException
When prevJobOutputFormat is SequenceFileOutputFormat, Mobius uses this class to build a dataset from prevJob, which holds an intermediate result of a Mobius job.
buildFromPreviousJob in class AbstractDatasetBuilder<SeqFileDatasetBuilder>
Throws:
java.io.IOException