com.ebay.erl.mobius.core
Class Persistable

java.lang.Object
  extended by com.ebay.erl.mobius.core.Persistable

public class Persistable
extends java.lang.Object

Sets the projections (the columns to be saved on disk) for join or group-by jobs.

This class cannot be instantiated directly. To obtain an instance, use JoinOnConfigure for join jobs or GroupByConfigure for group-by jobs.

See MobiusJob.innerJoin(Dataset...) or MobiusJob.group(Dataset) for information on creating a join or group-by job.

This product is licensed under the Apache License, Version 2.0, available at http://www.apache.org/licenses/LICENSE-2.0. This product contains portions derived from Apache Hadoop, which is licensed under the Apache License, Version 2.0, available at http://hadoop.apache.org.

© 2007 – 2012 eBay Inc., Evan Chiu, Woody Zhou, Neel Sundaresan


Method Summary
 Dataset build(MobiusJob job, java.lang.Class<? extends org.apache.hadoop.mapred.FileOutputFormat> outputFormat, Projectable... projections)
          Build the dataset and store the projections in a temporary path (under hadoop.tmp.dir) in the format of the given outputFormat.
 Dataset build(MobiusJob job, java.lang.Class<? extends org.apache.hadoop.mapred.FileOutputFormat> outputFormat, TupleCriterion criteria, Projectable... projections)
          Build the dataset and store the projections in a temporary path (under hadoop.tmp.dir) in the format of the given outputFormat.
 Dataset build(MobiusJob job, Projectable... projections)
          Build the dataset and store the projections in a temporary path (under hadoop.tmp.dir) in the format of SequenceFileOutputFormat.
 Dataset build(MobiusJob job, TupleCriterion criteria, Projectable... projections)
          Build the dataset and store the projections in a temporary path (under hadoop.tmp.dir) in the format of SequenceFileOutputFormat.
 Dataset save(MobiusJob job, org.apache.hadoop.fs.Path output, java.lang.Class<? extends org.apache.hadoop.mapred.FileOutputFormat> outputFormat, Projectable... projections)
          Save the dataset and store the projections in the specified output path in the format of the given outputFormat.
 Dataset save(MobiusJob job, org.apache.hadoop.fs.Path output, java.lang.Class<? extends org.apache.hadoop.mapred.FileOutputFormat> outputFormat, TupleCriterion criteria, Projectable... projections)
          Save the dataset and store the projections in the specified output path in the format of the given outputFormat.
 Dataset save(MobiusJob job, org.apache.hadoop.fs.Path output, Projectable... projections)
          Save the dataset and store the projections in the specified output path in the format of TextOutputFormat.
 Dataset save(MobiusJob job, org.apache.hadoop.fs.Path output, TupleCriterion criteria, Projectable... projections)
          Save the dataset and store the projections in the specified output path in the format of TextOutputFormat.
 Persistable setConf(java.lang.String name, java.lang.String value)
          Set a property in this job's configuration.
 Persistable setJobName(java.lang.String newJobName)
          Specify the name of this job.
 Persistable setReducersNumber(int reducerNumber)
          Specify the number of reducers for this job.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

setConf

public Persistable setConf(java.lang.String name,
                           java.lang.String value)
Set a property in this job's configuration.

Parameters:
name - a property name in a Hadoop job configuration.
value - the value for the property name in a Hadoop job configuration.

setJobName

public Persistable setJobName(java.lang.String newJobName)
Specify the name of this job.


setReducersNumber

public Persistable setReducersNumber(int reducerNumber)
Specify the number of reducers for this job.
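
Because setConf, setJobName, and setReducersNumber each return a Persistable, the calls can presumably be chained. The sketch below is not the Mobius implementation. It uses a hypothetical stand-in class, ChainedSettingsSketch, and assumes that each setter returns the same instance; the job name and the property io.sort.mb are illustrative values only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for Persistable; NOT the Mobius class.
// Assumption: each setter returns the same instance so calls can be chained.
class ChainedSettingsSketch {
    private final Map<String, String> conf = new LinkedHashMap<>();
    private String jobName = "";
    private int reducers = 1;

    ChainedSettingsSketch setConf(String name, String value) {
        conf.put(name, value);
        return this;
    }

    ChainedSettingsSketch setJobName(String newJobName) {
        this.jobName = newJobName;
        return this;
    }

    ChainedSettingsSketch setReducersNumber(int reducerNumber) {
        if (reducerNumber < 0) {
            throw new IllegalArgumentException("reducerNumber must not be negative");
        }
        this.reducers = reducerNumber;
        return this;
    }

    // Summarize the collected settings in insertion order.
    String describe() {
        return jobName + " reducers=" + reducers + " conf=" + conf;
    }

    public static void main(String[] args) {
        ChainedSettingsSketch settings = new ChainedSettingsSketch()
                .setJobName("daily-join")
                .setReducersNumber(4)
                .setConf("io.sort.mb", "200");
        // prints "daily-join reducers=4 conf={io.sort.mb=200}"
        System.out.println(settings.describe());
    }
}
```

With the real class, the chain would end in one of the build or save calls documented below, which return a Dataset rather than a Persistable.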


build

public Dataset build(MobiusJob job,
                     Projectable... projections)
              throws java.io.IOException
Build the dataset and store the projections in a temporary path (under hadoop.tmp.dir) in the format of SequenceFileOutputFormat.

Throws:
java.io.IOException

build

public Dataset build(MobiusJob job,
                     TupleCriterion criteria,
                     Projectable... projections)
              throws java.io.IOException
Build the dataset and store the projections in a temporary path (under hadoop.tmp.dir) in the format of SequenceFileOutputFormat.

Only the rows that meet the criteria will be stored. The criteria can only evaluate the columns specified in the projections.

Throws:
java.io.IOException

build

public Dataset build(MobiusJob job,
                     java.lang.Class<? extends org.apache.hadoop.mapred.FileOutputFormat> outputFormat,
                     Projectable... projections)
              throws java.io.IOException
Build the dataset and store the projections in a temporary path (under hadoop.tmp.dir) in the format of the given outputFormat.

Throws:
java.io.IOException

build

public Dataset build(MobiusJob job,
                     java.lang.Class<? extends org.apache.hadoop.mapred.FileOutputFormat> outputFormat,
                     TupleCriterion criteria,
                     Projectable... projections)
              throws java.io.IOException
Build the dataset and store the projections in a temporary path (under hadoop.tmp.dir) in the format of the given outputFormat.

Only the rows that meet the criteria will be stored. The criteria can only evaluate the columns specified in the projections.

Parameters:
job -
outputFormat - the output format class used to store the dataset.
criteria - if specified (not null), only rows that satisfy the given criteria will be saved. Note that the criteria are applied just before the persistence step, so they can only operate on the columns in the output schema of this job.
projections - the columns to be saved in the returned Dataset.
Returns:
a Dataset with the specified columns
Throws:
java.io.IOException
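
The restriction on criteria follows from the order of operations: rows are reduced to the projected columns first, and the criteria are evaluated just before the result is persisted. The sketch below models that order in plain Java. It is only an illustration under stated assumptions: ProjectThenFilterSketch, persist, and row are hypothetical names, rows are modelled as maps, and a java.util.function.Predicate stands in for TupleCriterion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Plain-Java model of "project first, then filter"; none of these names are Mobius APIs.
class ProjectThenFilterSketch {

    // Keep only the projected columns of each row, then keep the rows that satisfy the criteria.
    static List<Map<String, Object>> persist(List<Map<String, Object>> rows,
                                             List<String> projections,
                                             Predicate<Map<String, Object>> criteria) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Map<String, Object> projected = new LinkedHashMap<>();
            for (String column : projections) {
                projected.put(column, row.get(column));
            }
            // A null criteria means "keep every row".
            if (criteria == null || criteria.test(projected)) {
                out.add(projected);
            }
        }
        return out;
    }

    // Build one input row with three columns.
    static Map<String, Object> row(String id, int price, String region) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("id", id);
        r.put("price", price);
        r.put("region", region);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row("a", 5, "US"));
        rows.add(row("b", 50, "EU"));
        List<String> projections = Arrays.asList("id", "price");

        // "price" is projected, so the criteria can evaluate it: prints [{id=b, price=50}]
        System.out.println(persist(rows, projections, r -> (Integer) r.get("price") > 10));

        // "region" is not projected, so the criteria never sees it: prints []
        System.out.println(persist(rows, projections, r -> "EU".equals(r.get("region"))));
    }
}
```

In this model a criterion on a column that was not projected simply matches nothing. How Mobius itself reacts to such a criterion is not stated in this page, so the safe approach is to include every column the criteria needs in the projections.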

save

public Dataset save(MobiusJob job,
                    org.apache.hadoop.fs.Path output,
                    Projectable... projections)
             throws java.io.IOException
Save the dataset and store the projections in the specified output path in the format of TextOutputFormat.

The output path is deleted before the job starts.

Throws:
java.io.IOException
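
Since the output path is removed before the job runs, anything already stored there is lost. The sketch below shows the same delete-then-write behaviour on the local filesystem with java.nio.file. It is an analogy only: OverwriteOutputSketch, its methods, and the file names stale.txt and part-00000 are hypothetical, and Mobius itself works through the Hadoop filesystem API rather than this code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Local-filesystem analogy for "output is deleted before the job starts"; not Mobius code.
class OverwriteOutputSketch {

    // Remove the output directory and everything under it, if it exists.
    static void deleteIfExists(Path output) throws IOException {
        if (!Files.exists(output)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(output)) {
            // Reverse order so that children are deleted before their parent directory.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    // Delete any previous output, then write the new lines into a fresh directory.
    static void save(Path output, List<String> lines) throws IOException {
        deleteIfExists(output);
        Files.createDirectories(output);
        Files.write(output.resolve("part-00000"), lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path output = Files.createTempDirectory("mobius-sketch").resolve("out");
        Files.createDirectories(output);
        Files.write(output.resolve("stale.txt"),
                Arrays.asList("left over from an earlier run"), StandardCharsets.UTF_8);

        save(output, Arrays.asList("a\t5", "b\t50"));

        // The earlier file is gone: prints false
        System.out.println(Files.exists(output.resolve("stale.txt")));
    }
}
```

If results from an earlier run must be kept, write each run to a distinct output path or move the old directory aside first.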

save

public Dataset save(MobiusJob job,
                    org.apache.hadoop.fs.Path output,
                    TupleCriterion criteria,
                    Projectable... projections)
             throws java.io.IOException
Save the dataset and store the projections in the specified output path in the format of TextOutputFormat.

Only the rows that meet the criteria will be stored. The criteria can only evaluate the columns specified in the projections.

The output path is deleted before the job starts.

Throws:
java.io.IOException

save

public Dataset save(MobiusJob job,
                    org.apache.hadoop.fs.Path output,
                    java.lang.Class<? extends org.apache.hadoop.mapred.FileOutputFormat> outputFormat,
                    Projectable... projections)
             throws java.io.IOException
Save the dataset and store the projections in the specified output path in the format of the given outputFormat.

The output path is deleted before the job starts.

Throws:
java.io.IOException

save

public Dataset save(MobiusJob job,
                    org.apache.hadoop.fs.Path output,
                    java.lang.Class<? extends org.apache.hadoop.mapred.FileOutputFormat> outputFormat,
                    TupleCriterion criteria,
                    Projectable... projections)
             throws java.io.IOException
Save the dataset and store the projections in the specified output path in the format of the given outputFormat.

Only the rows that meet the criteria will be stored. The criteria can only evaluate the columns specified in the projections.

The output path is deleted before the job starts.

Throws:
java.io.IOException