List of usage examples for org.apache.hadoop.mapreduce.lib.input.FileInputFormat (subclass usage)
From source file BamInputFormat.java
@InterfaceAudience.Public
@InterfaceStability.Stable
public class BamInputFormat extends FileInputFormat<Text, Text> {
    public static final String LINES_PER_MAP = "mapreduce.input.lineinputformat.linespermap";
    ////////////////////////////////////////////////
    private static byte[] pipe_buffer = null;
From source file DupleInputFormat.java
/**
* An {@link InputFormat} for plain text files. Files are broken into lines.
* Either a linefeed or a carriage return signals the end of a line. Keys are
* the position in the file, and values are the line of text.
*/
@InterfaceAudience.Public
@InterfaceStability.Stable
public class DupleInputFormat extends FileInputFormat<LongWritable, Text> {
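The key/value contract described in the comment above (key = position in the file, value = the line) can be illustrated without Hadoop. The following is a minimal sketch, not code from DupleInputFormat: the class name `LineOffsetsSketch` and its `Record` type are hypothetical, and it assumes that "\n", "\r", and "\r\n" each count as one line terminator.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not code from DupleInputFormat.
final class LineOffsetsSketch {

    // One (key, value) pair: byte offset where the line starts, and the line text.
    static final class Record {
        final long offset;
        final String line;

        Record(long offset, String line) {
            this.offset = offset;
            this.line = line;
        }
    }

    // Splits on "\n", "\r", or "\r\n" and records where each line begins.
    static List<Record> read(byte[] data) {
        List<Record> out = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\n' || data[i] == '\r') {
                out.add(new Record(start,
                        new String(data, start, i - start, StandardCharsets.UTF_8)));
                // Treat "\r\n" as a single terminator.
                if (data[i] == '\r' && i + 1 < data.length && data[i + 1] == '\n') {
                    i++;
                }
                start = i + 1;
            }
        }
        if (start < data.length) { // final line without a terminator
            out.add(new Record(start,
                    new String(data, start, data.length - start, StandardCharsets.UTF_8)));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = "ab\ncd\r\nef".getBytes(StandardCharsets.UTF_8);
        for (Record r : read(data)) {
            System.out.println(r.offset + "\t" + r.line);
        }
    }
}
```

For the sample input the three lines start at byte offsets 0, 3, and 7, which is the kind of key a line-oriented input format would pass to the mapper.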
From source file ZipFileInputFormat.java
/**
* Extends the basic FileInputFormat class provided by Apache Hadoop to accept ZIP files. It should be noted that ZIP
* files are not 'splittable' and each ZIP file will be processed by a single Mapper.
*/
public class ZipFileInputFormat extends FileInputFormat<Text, BytesWritable> {
    /** See the comments on the setLenient() method */
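Because each ZIP file goes to a single mapper, one reader walks the whole archive from start to finish. The sketch below shows that sequential walk with only `java.util.zip`, producing (entry name, entry bytes) pairs that loosely mirror the `<Text, BytesWritable>` types. It is an illustration under assumptions, not the record reader that ships with ZipFileInputFormat; `ZipEntriesSketch`, `readAll`, and `sampleZip` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration; not code from ZipFileInputFormat.
final class ZipEntriesSketch {

    // Reads every non-directory entry in order and returns name -> contents.
    static Map<String, byte[]> readAll(byte[] zipBytes) {
        Map<String, byte[]> records = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            byte[] chunk = new byte[8192];
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                int n;
                while ((n = zip.read(chunk)) != -1) { // -1 marks the end of this entry
                    buf.write(chunk, 0, n);
                }
                records.put(entry.getName(), buf.toByteArray());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    // Builds a small two-entry archive in memory for demonstration.
    static byte[] sampleZip() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("a.txt"));
            zip.write("hello".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("b.txt"));
            zip.write("world".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        for (Map.Entry<String, byte[]> e : readAll(sampleZip()).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().length + " bytes");
        }
    }
}
```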
From source file FastqInputFormat.java
/**
* This class defines an InputFormat for FASTQ files for the
* Hadoop MapReduce framework.
*
* @author Mahmoud Parsian
* @author José M. Abuín
From source file FastqInputFormatDouble.java
/**
* This class defines an InputFormat for FASTQ files for the
* Hadoop MapReduce framework.
*
* @author José M. Abuín
*/
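As background for the two FASTQ formats above: a FASTQ record in its common form spans four lines (an `@` header, the sequence, a `+` separator, and a quality string of the same length as the sequence). The sketch below groups lines into such records. It assumes the four-line layout with no wrapped sequences, and `FastqRecordsSketch` is a hypothetical name; the listed classes may group and key their records differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not code from FastqInputFormat.
final class FastqRecordsSketch {

    // Groups lines into {header, sequence, separator, quality} records.
    static List<String[]> parse(List<String> lines) {
        if (lines.size() % 4 != 0) {
            throw new IllegalArgumentException("expected a multiple of 4 lines");
        }
        List<String[]> records = new ArrayList<>();
        for (int i = 0; i < lines.size(); i += 4) {
            String header = lines.get(i);
            String sequence = lines.get(i + 1);
            String separator = lines.get(i + 2);
            String quality = lines.get(i + 3);
            if (!header.startsWith("@") || !separator.startsWith("+")) {
                throw new IllegalArgumentException("malformed record at line " + (i + 1));
            }
            if (sequence.length() != quality.length()) {
                throw new IllegalArgumentException("sequence/quality length mismatch at line " + (i + 1));
            }
            records.add(new String[] {header, sequence, separator, quality});
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("@read1", "ACGT", "+", "IIII", "@read2", "TTGA", "+", "FFFF");
        for (String[] rec : parse(lines)) {
            System.out.println(rec[0] + " " + rec[1]);
        }
    }
}
```

A record reader built on this idea has to make sure a split never begins in the middle of a four-line group, which is the main reason FASTQ needs its own input format rather than a plain line reader.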
From source file be.uantwerpen.adrem.hadoop.util.SplitByKTextInputFormat.java
/**
* Input format that splits a file in a number of chunks given by Config.NUMBER_OF_MAPPERS_KEY.
*/
public class SplitByKTextInputFormat extends FileInputFormat<LongWritable, Text> {
    public static final String NUMBER_OF_CHUNKS = "number_of_chunks";
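One way to picture "split a file into a given number of chunks" is as a list of (offset, length) byte ranges. The sketch below computes such ranges so that they cover the file exactly and differ in size by at most one byte. This is an assumption about the general idea only: the actual SplitByKTextInputFormat may well align its chunks to line boundaries, and `ChunkRangesSketch` is a hypothetical name.

```java
// Hypothetical illustration; not code from SplitByKTextInputFormat.
final class ChunkRangesSketch {

    // Returns k rows of {offset, length} that together cover [0, fileLength).
    static long[][] chunks(long fileLength, int k) {
        if (k <= 0 || fileLength < 0) {
            throw new IllegalArgumentException("k must be positive and fileLength non-negative");
        }
        long base = fileLength / k;      // minimum size of every chunk
        long remainder = fileLength % k; // the first 'remainder' chunks get one extra byte
        long[][] ranges = new long[k][2];
        long offset = 0;
        for (int i = 0; i < k; i++) {
            long length = base + (i < remainder ? 1 : 0);
            ranges[i][0] = offset;
            ranges[i][1] = length;
            offset += length;
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (long[] r : chunks(10, 3)) {
            System.out.println("offset=" + r[0] + " length=" + r[1]);
        }
    }
}
```

For a 10-byte file and three chunks this yields the ranges (0, 4), (4, 3), and (7, 3).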
From source file brush.InterleavedFastqInputFormat.java
/**
* This class is a Hadoop reader for "interleaved fastq" -- that is,
* fastq with paired reads in the same file, interleaved, rather than
* in two separate files. This makes it much easier to Hadoopily slice
* up a single file and feed the slices into an aligner.
* The format is the same as fastq, but records are expected to alternate
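To illustrate the interleaved layout described above, the sketch below takes the lines of an interleaved file and pairs consecutive four-line records, so each pair spans eight lines. It is a simplified illustration that assumes every read has exactly one mate and that records are not wrapped; `InterleavedPairsSketch` is a hypothetical name and the snippet is not taken from InterleavedFastqInputFormat.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not code from InterleavedFastqInputFormat.
final class InterleavedPairsSketch {

    // Returns {firstRead, secondRead} pairs, each read being 4 lines joined by '\n'.
    static List<String[]> pairs(List<String> lines) {
        if (lines.size() % 8 != 0) {
            throw new IllegalArgumentException("expected a multiple of 8 lines (two 4-line reads per pair)");
        }
        List<String[]> out = new ArrayList<>();
        for (int i = 0; i < lines.size(); i += 8) {
            String first = String.join("\n", lines.subList(i, i + 4));
            String second = String.join("\n", lines.subList(i + 4, i + 8));
            out.add(new String[] {first, second});
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "@r1/1", "ACGT", "+", "IIII",
                "@r1/2", "TGCA", "+", "IIII");
        for (String[] pair : pairs(lines)) {
            System.out.println(pair[0].split("\n")[0] + " <-> " + pair[1].split("\n")[0]);
        }
    }
}
```

Keeping both mates in one record is what lets a split be handed to an aligner without looking up the partner read in a second file.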
From source file bucket_sort.NLineInputFormat.java
/**
* NLineInputFormat which splits N lines of input as one split.
*
* In many "pleasantly" parallel applications, each process/mapper
* processes the same input file(s), but the computations are
* controlled by different parameters (referred to as "parameter sweeps").
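The "N lines per split" idea can be sketched in plain Java: if each input line holds one parameter set, grouping the lines N at a time decides how many parameter sets each mapper receives. The class name `NLineGroupsSketch` is hypothetical, and the sketch ignores the byte offsets that a real Hadoop split would carry.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not code from NLineInputFormat.
final class NLineGroupsSketch {

    // Groups lines into consecutive batches of at most n lines each.
    static List<List<String>> group(List<String> lines, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<List<String>> splits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i += n) {
            int end = Math.min(i + n, lines.size());
            splits.add(new ArrayList<>(lines.subList(i, end))); // copy so each batch is independent
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> parameterSets = List.of("alpha=0.1", "alpha=0.2", "alpha=0.3", "alpha=0.4", "alpha=0.5");
        List<List<String>> splits = group(parameterSets, 2);
        for (int i = 0; i < splits.size(); i++) {
            System.out.println("split " + i + ": " + splits.get(i));
        }
    }
}
```

With five lines and N = 2 there are three splits, the last one holding the single leftover line.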
From source file ca.sparkera.adapters.mapreduce.MainframeVBInputFormat.java
/**
* MainframeVBInputFormat is an input format used to read input files from
* mainframe VB files. The content of a record needs to be binary data. Users
* must configure the record by FTPing the data with RDW bytes
*/
@InterfaceAudience.Public
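The RDW bytes mentioned above are what make variable-length (VB) records parseable after a binary transfer. The sketch below assumes the conventional layout: a 4-byte Record Descriptor Word whose first two bytes hold the big-endian record length (including the 4 RDW bytes themselves), followed by two reserved bytes. It is an illustration under that assumption, not the reader used by MainframeVBInputFormat, and `RdwRecordsSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not code from MainframeVBInputFormat.
final class RdwRecordsSketch {

    private static final int RDW_SIZE = 4;

    // Splits a byte stream into record payloads using each record's RDW length.
    static List<byte[]> split(byte[] data) {
        List<byte[]> records = new ArrayList<>();
        int pos = 0;
        while (pos < data.length) {
            if (data.length - pos < RDW_SIZE) {
                throw new IllegalArgumentException("truncated RDW at offset " + pos);
            }
            // Big-endian 16-bit length that counts the RDW itself (assumed layout).
            int length = ((data[pos] & 0xFF) << 8) | (data[pos + 1] & 0xFF);
            if (length < RDW_SIZE || pos + length > data.length) {
                throw new IllegalArgumentException("invalid record length " + length + " at offset " + pos);
            }
            records.add(Arrays.copyOfRange(data, pos + RDW_SIZE, pos + length));
            pos += length;
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = {0, 6, 0, 0, 'A', 'B', 0, 5, 0, 0, 'C'};
        for (byte[] record : split(data)) {
            System.out.println(record.length + " payload byte(s)");
        }
    }
}
```

Real mainframe payloads are typically EBCDIC or packed binary rather than ASCII, so the payload bytes are returned untouched here and decoding is left to the caller.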
From source file ca.uwaterloo.iss4e.hadoop.io.CartesianInputFormat.java
/**
* Copyright (c) 2014 Xiufeng Liu ( xiufeng.liu@uwaterloo.ca )
*
* This file is free software: you may copy, redistribute and/or modify it
* under the terms of the GNU General Public License version 2
* as published by the Free Software Foundation.