Example usage for org.apache.hadoop.mapreduce.lib.input NLineInputFormat getSplitsForFile

Introduction

This page shows an example usage of the getSplitsForFile method of org.apache.hadoop.mapreduce.lib.input.NLineInputFormat.

Prototype

public static List<FileSplit> getSplitsForFile(FileStatus status, Configuration conf, int numLinesPerSplit)
            throws IOException 

Usage

From source file: org.apache.jena.hadoop.rdf.io.input.AbstractNLineFileInputFormat.java

License: Apache License

/**
 * Logically splits the set of input files for the job, splits N lines of
 * the input as one split.
 * 
 * @see FileInputFormat#getSplits(JobContext)
 */
@Override
public final List<InputSplit> getSplits(JobContext job) throws IOException {
    boolean debug = LOGGER.isDebugEnabled();
    if (debug && FileInputFormat.getInputDirRecursive(job)) {
        LOGGER.debug("Recursive searching for input data is enabled");
    }

    List<InputSplit> splits = new ArrayList<InputSplit>();
    int numLinesPerSplit = NLineInputFormat.getNumLinesPerSplit(job);
    for (FileStatus status : listStatus(job)) {
        if (debug) {
            LOGGER.debug("Determining how to split input file/directory {}", status.getPath());
        }
        splits.addAll(NLineInputFormat.getSplitsForFile(status, job.getConfiguration(), numLinesPerSplit));
    }
    return splits;
}
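
The method above delegates the per-file work to getSplitsForFile, which groups every N lines of a file into one split. The following is a minimal, Hadoop-free sketch of that grouping idea only. The class NLineSplitSketch and the method splitByLines are hypothetical names introduced for illustration; they are not part of Hadoop or Jena, and the sketch does not reproduce any offset adjustments or FileSplit construction that the real implementation performs.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not Hadoop code: groups every n lines of a
// text into one {start, length} byte range.
final class NLineSplitSketch {

    static List<long[]> splitByLines(String text, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        List<long[]> splits = new ArrayList<>();
        long begin = 0;   // byte offset where the current split starts
        long length = 0;  // bytes accumulated in the current split
        int lines = 0;    // lines accumulated in the current split

        for (int i = 0; i < data.length; i++) {
            length++;
            // A line ends at a newline or at the last byte of the input.
            boolean endOfLine = data[i] == '\n' || i == data.length - 1;
            if (endOfLine) {
                lines++;
                if (lines == n) {
                    splits.add(new long[] {begin, length});
                    begin += length;
                    length = 0;
                    lines = 0;
                }
            }
        }
        // Emit the remaining lines (fewer than n) as a final split.
        if (lines > 0) {
            splits.add(new long[] {begin, length});
        }
        return splits;
    }

    public static void main(String[] args) {
        for (long[] s : splitByLines("a\nbb\nccc\n", 2)) {
            System.out.println("start=" + s[0] + " length=" + s[1]);
        }
    }
}
```

With three input lines and n = 2, the sketch yields two ranges: one covering the first two lines and one covering the remaining line. In the real API, each range would correspond to a FileSplit returned for the given FileStatus.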