java.lang.Object
  com.ebay.erl.mobius.core.builder.AbstractDatasetBuilder<ACUTAL_BUILDER_IMPL>
Type Parameters:
ACUTAL_BUILDER_IMPL - the implementation of an AbstractDatasetBuilder.
public abstract class AbstractDatasetBuilder<ACUTAL_BUILDER_IMPL>
The base class of all Dataset builders, which build instances of the different Dataset types.
This product is licensed under the Apache License, Version 2.0, available at http://www.apache.org/licenses/LICENSE-2.0. This product contains portions derived from Apache hadoop which is licensed under the Apache License, Version 2.0, available at http://hadoop.apache.org. © 2007 – 2012 eBay Inc., Evan Chiu, Woody Zhou, Neel Sundaresan
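The type parameter ACUTAL_BUILDER_IMPL lets the fluent methods (setSchema, addInputPath, constraint, and so on) return the concrete builder type instead of the base class. The block below is a minimal sketch of that self-typed builder pattern, not the Mobius source: SelfTypedBuilder, TextBuilder, self(), delimiter(...) and describe() are hypothetical names, the bound on the type parameter is an assumption (the real class declares it unbounded), and a String stands in for Dataset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for AbstractDatasetBuilder; not the Mobius source.
abstract class SelfTypedBuilder<IMPL extends SelfTypedBuilder<IMPL>> {
    protected final String datasetName;
    protected final List<String> schema = new ArrayList<>();

    protected SelfTypedBuilder(String datasetName) {
        this.datasetName = datasetName;
    }

    // Subclasses return "this" so that fluent calls keep the concrete type.
    protected abstract IMPL self();

    public IMPL setSchema(String... columns) {
        schema.clear();
        schema.addAll(Arrays.asList(columns));
        return self();
    }
}

// Hypothetical concrete builder that adds one option of its own.
class TextBuilder extends SelfTypedBuilder<TextBuilder> {
    private String delimiter = "\t";

    TextBuilder(String datasetName) {
        super(datasetName);
    }

    @Override
    protected TextBuilder self() {
        return this;
    }

    public TextBuilder delimiter(String d) {
        this.delimiter = d;
        return this;
    }

    public String describe() {
        return datasetName + schema + " delimiter=" + delimiter;
    }
}

class SelfTypedBuilderDemo {
    public static void main(String[] args) {
        // setSchema(...) returns TextBuilder, so delimiter(...) can follow it.
        String s = new TextBuilder("orders").setSchema("id", "amount").delimiter(",").describe();
        System.out.println(s); // prints: orders[id, amount] delimiter=,
    }
}
```

Without the type parameter, setSchema would return the base type and the subclass-specific delimiter(...) call could not be chained after it.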
Field Summary | |
---|---|
protected java.util.List<ComputedColumns> | computedColumns - The ComputedColumns for this dataset. |
protected java.lang.String | datasetName - The name of this dataset. |
protected MobiusJob | mobiusJob - An instance of MobiusJob which contains the analysis flow. |
Constructor Summary | |
---|---|
protected | AbstractDatasetBuilder(MobiusJob aJob, java.lang.String datasetName) - Constructor for creating a dataset builder. |
Method Summary | |
---|---|
ACUTAL_BUILDER_IMPL | addComuptedColumn(ComputedColumns aComputedColumn) - Adds a ComputedColumns to this dataset. |
protected ACUTAL_BUILDER_IMPL | addInputPath(boolean validatePathExistance, org.apache.hadoop.fs.Path... paths) - Adds the paths to the underlying dataset. |
ACUTAL_BUILDER_IMPL | addInputPath(org.apache.hadoop.fs.Path... paths) - Specifies the input path(s) of a Dataset. |
Dataset | build() - Finishes the Dataset building process. |
Dataset | buildFromPreviousJob(org.apache.hadoop.mapred.JobConf prevJob, java.lang.Class<? extends org.apache.hadoop.mapred.FileOutputFormat> prevJobOutputFormat, java.lang.String[] schema) - Called by the Mobius engine to build a dataset from a previous Mobius job; users should not call this method. |
protected boolean | checkTouchFile(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path aPath) - Checks whether a touch file exists within the given folder aPath. |
ACUTAL_BUILDER_IMPL | constraint(TupleCriterion criteria) - Puts a filter on the records of this Dataset; only rows within the Dataset that meet the criteria are output. |
protected Dataset | getDataset() - Gets the dataset; if it is null, newDataset(String) is called and the returned object is assigned to dataset. |
protected abstract Dataset | newDataset(java.lang.String datasetName) - Creates a new Dataset; the returned Dataset has no state at all (no paths, no constraints, etc.). |
ACUTAL_BUILDER_IMPL | setSchema(java.lang.String... schema) - Specifies the schema of this Dataset. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected MobiusJob mobiusJob
An instance of MobiusJob which contains the analysis flow.
protected java.lang.String datasetName
The name of this dataset.
protected java.util.List<ComputedColumns> computedColumns
The ComputedColumns for this dataset.
Constructor Detail |
---|
protected AbstractDatasetBuilder(MobiusJob aJob, java.lang.String datasetName)
Constructor for creating a dataset builder.
Method Detail |
---|
protected Dataset getDataset()
Gets the dataset. If it is null, newDataset(String) is called and the returned object is assigned to dataset.
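The behaviour described for getDataset() is ordinary lazy initialization through a factory hook. A minimal sketch, assuming nothing about the real implementation beyond that description; LazyDataset and LazyBuilder are hypothetical stand-ins for Dataset and AbstractDatasetBuilder:

```java
// Hypothetical stand-in for Dataset; not the Mobius source.
class LazyDataset {
    final String name;

    LazyDataset(String name) {
        this.name = name;
    }
}

// Hypothetical stand-in for the builder base class.
abstract class LazyBuilder {
    private final String datasetName;
    private LazyDataset dataset; // stays null until first requested

    protected LazyBuilder(String datasetName) {
        this.datasetName = datasetName;
    }

    // Factory hook, mirroring newDataset(String).
    protected abstract LazyDataset newDataset(String datasetName);

    // Creates the dataset on first use and returns the same instance afterwards.
    protected LazyDataset getDataset() {
        if (dataset == null) {
            dataset = newDataset(datasetName);
        }
        return dataset;
    }
}

class LazyBuilderDemo {
    public static void main(String[] args) {
        LazyBuilder b = new LazyBuilder("orders") {
            @Override
            protected LazyDataset newDataset(String name) {
                return new LazyDataset(name);
            }
        };
        // Both calls return the one instance created by the first call.
        System.out.println(b.getDataset() == b.getDataset()); // prints: true
    }
}
```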
protected abstract Dataset newDataset(java.lang.String datasetName)
Creates a new Dataset. The returned Dataset has no state at all (no paths, no constraints, etc.).
public Dataset buildFromPreviousJob(org.apache.hadoop.mapred.JobConf prevJob, java.lang.Class<? extends org.apache.hadoop.mapred.FileOutputFormat> prevJobOutputFormat, java.lang.String[] schema) throws java.io.IOException
Called by the Mobius engine to build a dataset from a previous Mobius job; users should not call this method.
Throws:
java.io.IOException
public ACUTAL_BUILDER_IMPL setSchema(java.lang.String... schema)
Specifies the schema of this Dataset.
public ACUTAL_BUILDER_IMPL addComuptedColumn(ComputedColumns aComputedColumn)
Adds a ComputedColumns to this dataset.
Parameters:
aComputedColumn - the ComputedColumns to add
public Dataset build() throws java.lang.IllegalStateException
Finishes the Dataset building process.
Invoke this method to get a reference to a Dataset so it can be used in MobiusJob.innerJoin(Dataset...), MobiusJob.list(Dataset, com.ebay.erl.mobius.core.model.Column...), etc.
Returns:
Dataset
Throws:
java.lang.IllegalStateException - when the user doesn't specify all the required parameters (no input path, for example) during the building process.
public ACUTAL_BUILDER_IMPL addInputPath(org.apache.hadoop.fs.Path... paths) throws java.io.IOException
Specifies the input path(s) of a Dataset.
Parameters:
paths - one or more paths that contain the dataset
Throws:
java.io.IOException
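Taken together, setSchema, addInputPath and build() form a fluent chain in which build() rejects an incomplete configuration with IllegalStateException. The sketch below illustrates that contract only. CheckedBuilder is a hypothetical stand-in that uses plain strings instead of Hadoop Path objects and returns a String instead of a Dataset; treating the schema as a required parameter is an assumption, since the documentation names only the missing input path as an example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in; the real builder works with Hadoop Path objects.
class CheckedBuilder {
    private final String datasetName;
    private final List<String> inputPaths = new ArrayList<>();
    private String[] schema;

    CheckedBuilder(String datasetName) {
        this.datasetName = datasetName;
    }

    CheckedBuilder addInputPath(String... paths) {
        inputPaths.addAll(Arrays.asList(paths));
        return this;
    }

    CheckedBuilder setSchema(String... schema) {
        this.schema = schema.clone();
        return this;
    }

    // Refuses to build until the required settings are present.
    String build() {
        if (inputPaths.isEmpty()) {
            throw new IllegalStateException("no input path set for " + datasetName);
        }
        if (schema == null || schema.length == 0) {
            // Assumption: the schema is also required.
            throw new IllegalStateException("no schema set for " + datasetName);
        }
        return datasetName + " " + inputPaths + " " + Arrays.toString(schema);
    }
}

class CheckedBuilderDemo {
    public static void main(String[] args) {
        String built = new CheckedBuilder("orders")
                .addInputPath("/data/orders")
                .setSchema("id", "amount")
                .build();
        System.out.println(built); // prints: orders [/data/orders] [id, amount]

        try {
            new CheckedBuilder("orders").setSchema("id").build();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```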
protected boolean checkTouchFile(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path aPath)
Checks whether a touch file exists within the given folder aPath.
This method is invoked when the user calls addInputPath(Path...). It returns true by default, i.e., touch files are not checked. A touch file indicates that all the files for a dataset are ready. If the deployed Hadoop system generates a touch file for each Hadoop output folder, the user should override this method to enable touch-file checking.
Parameters:
fs - the file system that holds the folder
aPath - the folder to check for a touch file
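An override of checkTouchFile typically reduces to testing whether a marker file exists in the folder. The sketch below shows that test using java.nio.file instead of the Hadoop FileSystem and Path types so that it runs standalone. TouchFileCheck is a hypothetical class, and the marker name _SUCCESS is an assumption; substitute whatever file your deployment writes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; uses java.nio.file in place of the Hadoop file system API.
class TouchFileCheck {
    // Assumed marker name; use whatever your Hadoop deployment writes.
    static final String MARKER = "_SUCCESS";

    // Returns true only when the folder contains the marker file.
    static boolean checkTouchFile(Path folder) {
        return Files.isRegularFile(folder.resolve(MARKER));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("touch-demo");
        System.out.println(checkTouchFile(dir)); // prints: false (no marker yet)
        Files.createFile(dir.resolve(MARKER));
        System.out.println(checkTouchFile(dir)); // prints: true (marker present)
    }
}
```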
protected ACUTAL_BUILDER_IMPL addInputPath(boolean validatePathExistance, org.apache.hadoop.fs.Path... paths) throws java.io.IOException
Adds the paths to the underlying dataset. The boolean flag validatePathExistance specifies whether Mobius needs to verify that the specified paths exist.
If validatePathExistance is true and one of the paths doesn't exist, an IOException is thrown.
If a path exists and it is a folder, checkTouchFile(FileSystem, Path) is called to see whether a touch file exists under that folder. The default implementation of checkTouchFile always returns true, which means the dataset builder doesn't check for a touch file by default. If touch files need to be checked, the subclass should override that function; when the function returns false, an IOException is thrown here for that specific path.
Throws:
java.io.IOException
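The validation flow described above can be sketched as follows. This is an illustration under stated assumptions, not the Mobius implementation: PathValidatingBuilder is a hypothetical class, java.nio.file replaces the Hadoop file system API, the touch-file check runs for every existing folder regardless of the flag (a literal reading of the text), and all paths are validated before any is recorded.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the builder; not the Mobius source.
class PathValidatingBuilder {
    protected final List<Path> inputPaths = new ArrayList<>();

    // Hook: the default returns true, so touch files are not checked.
    protected boolean checkTouchFile(Path folder) {
        return true;
    }

    protected PathValidatingBuilder addInputPath(boolean validatePathExistance, Path... paths)
            throws IOException {
        for (Path p : paths) {
            if (validatePathExistance && !Files.exists(p)) {
                throw new IOException("input path does not exist: " + p);
            }
            // Only existing folders are handed to the touch-file hook.
            if (Files.isDirectory(p) && !checkTouchFile(p)) {
                throw new IOException("touch file missing under: " + p);
            }
        }
        // Assumption: paths are recorded only after all of them pass.
        inputPaths.addAll(Arrays.asList(paths));
        return this;
    }
}

class PathValidatingBuilderDemo {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("input-demo");
        PathValidatingBuilder builder = new PathValidatingBuilder();
        builder.addInputPath(true, dir);
        System.out.println(builder.inputPaths.size()); // prints: 1

        try {
            builder.addInputPath(true, dir.resolve("missing"));
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A subclass that overrides checkTouchFile to return false for a folder causes addInputPath to throw IOException for that folder even when validatePathExistance is false.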
public ACUTAL_BUILDER_IMPL constraint(TupleCriterion criteria)
Puts a filter on the records of this Dataset; only rows within the Dataset that meet the criteria are output.
Parameters:
criteria - the criteria that rows must meet
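Conceptually, a constraint acts as a row filter applied when the dataset is read. The sketch below shows only that filtering semantics. It does not use the TupleCriterion API: ConstraintSketch is a hypothetical class, java.util.function.Predicate stands in for TupleCriterion, and a Map stands in for a row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical illustration; Predicate stands in for TupleCriterion.
class ConstraintSketch {
    // Keeps only the rows that satisfy the criterion.
    static List<Map<String, String>> apply(List<Map<String, String>> rows,
                                           Predicate<Map<String, String>> criterion) {
        List<Map<String, String>> kept = new ArrayList<>();
        for (Map<String, String> row : rows) {
            if (criterion.test(row)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = List.of(
                Map.of("id", "1", "country", "US"),
                Map.of("id", "2", "country", "DE"));
        // Only the row whose country is US passes the filter.
        System.out.println(apply(rows, r -> "US".equals(r.get("country"))).size()); // prints: 1
    }
}
```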