List of usage examples for org.apache.hadoop.mapred.FileOutputFormat (subclass usage)
From source file org.apache.trevni.avro.AvroTrevniOutputFormat.java
/** An {@link org.apache.hadoop.mapred.OutputFormat} that writes Avro data to
* Trevni files.
*
* <p>Writes a directory of files per task, each comprising a single filesystem
* block. To reduce the number of files, increase the default filesystem block
* size for the job. Each task also requires enough memory to buffer a
From source file org.archive.hadoop.PerMapOutputFormat.java
/**
* OutputFormat that directs the output to a file named according to
* the input file. For instance, if the input file is "foo", then the
* output file is also named "foo". A suffix can be easily added, or
* a regex+replace applied to the input filename to produce an output
* filename.
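The suffix and regex+replace naming described above can be sketched in plain Java. The property names `permap.regex`, `permap.replace`, and `permap.suffix` follow the one visible in the org.archive.jbs.util.PerMapOutputFormat snippet further down (`permap.regex`) plus two assumed companions; the helper itself is a hypothetical illustration, not the class's actual code:

```java
import java.util.Map;

public class OutputNameSketch {
    // Hypothetical helper: derive an output filename from the input filename
    // using an optional regex + replacement, or an optional suffix, mirroring
    // the behavior the PerMapOutputFormat javadoc describes.
    // Property names "permap.replace" and "permap.suffix" are assumptions.
    static String outputName(String inputName, Map<String, String> conf) {
        String regex = conf.get("permap.regex");
        String replace = conf.get("permap.replace");
        String suffix = conf.get("permap.suffix");
        if (regex != null && replace != null) {
            return inputName.replaceAll(regex, replace);
        }
        if (suffix != null) {
            return inputName + suffix;
        }
        return inputName; // default: output file named after the input file
    }

    public static void main(String[] args) {
        // Suffix mode: "foo" -> "foo.idx"
        System.out.println(outputName("foo", Map.of("permap.suffix", ".idx")));
        // Regex mode: "crawl-001.arc.gz" -> "crawl-001.cdx"
        System.out.println(outputName("crawl-001.arc.gz",
                Map.of("permap.regex", "\\.arc\\.gz$", "permap.replace", ".cdx")));
    }
}
```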
From source file org.archive.jbs.lucene.LuceneOutputFormat.java
/**
 * This class is inspired by Nutch's LuceneOutputFormat class. It does
 * three things of primary interest:
*
* 1. Creates a Lucene index in a local (not HDFS) ${temp} directory,
* into which the documents are added.
From source file org.archive.jbs.solr.SolrOutputFormat.java
/**
* This class is inspired by the technique used in Nutch's
* LuceneOutputFormat class. However, rather than creating a Lucene
* index and writing documents to it, we use the SolrJ API to send the
* documents to a (remote) Solr server.
*
From source file org.archive.jbs.util.PerMapOutputFormat.java
/** */
public class PerMapOutputFormat<K, V> extends FileOutputFormat<K, V> {

  private String getOutputFilename(JobConf job) throws IOException {
    String regex = job.get("permap.regex", null);
From source file org.commoncrawl.mapred.ec2.parser.ParserOutputFormat.java
/**
* OutputFormat that splits the output from the ParseMapper into a bunch of
* distinct files, including:
*
 * (1) A metadata file, that contains the crawl status, the HTTP headers, the
 * meta tags, title, links (if HTML), and feed-related data (if RSS/ATOM) in a
From source file org.hbasene.index.create.mapred.IndexOutputFormat.java
/**
 * Create a local index, unwrap Lucene documents created by reduce, add them to
 * the index, and copy the index to the destination.
 */
@Deprecated
public class IndexOutputFormat extends
    FileOutputFormat<ImmutableBytesWritable, LuceneDocumentWrapper> {
From source file org.saarus.service.hadoop.util.JsonOutputFormat.java
/** A {@link FileOutputFormat} that writes JSON objects as output. */
public class JsonOutputFormat<K, V> extends FileOutputFormat<K, V> {

  protected static class JsonRecordWriter<K, V> implements RecordWriter<K, V> {
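The snippet above only shows the JsonRecordWriter declaration. A hypothetical sketch of what such a writer likely does, without the Hadoop types: emit each key/value pair as one JSON object per line. A real implementation would use a JSON library for escaping; this sketch handles only quotes and backslashes.

```java
public class JsonLineWriterSketch {
    private final StringBuilder out = new StringBuilder();

    // Minimal JSON string escaping (backslash first, then quote).
    static String esc(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Append one key/value pair as a single-line JSON object.
    public void write(String key, String value) {
        out.append("{\"").append(esc(key)).append("\": \"")
           .append(esc(value)).append("\"}\n");
    }

    public String contents() { return out.toString(); }

    public static void main(String[] args) {
        JsonLineWriterSketch w = new JsonLineWriterSketch();
        w.write("url", "http://example.org/");
        System.out.print(w.contents()); // {"url": "http://example.org/"}
    }
}
```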
From source file org.zuinnote.hadoop.office.format.mapred.AbstractSpreadSheetDocumentFileOutputFormat.java
public abstract class AbstractSpreadSheetDocumentFileOutputFormat<K>
    extends FileOutputFormat<NullWritable, K> {

  @Override
  public abstract RecordWriter<NullWritable, K> getRecordWriter(FileSystem ignored,
      JobConf conf, String name, Progressable progress) throws IOException;
From source file parquet.hadoop.mapred.DeprecatedParquetOutputFormat.java
@SuppressWarnings("deprecation")
public class DeprecatedParquetOutputFormat<V>
    extends org.apache.hadoop.mapred.FileOutputFormat<Void, V> {

  private static final Log LOG = Log.getLog(DeprecatedParquetOutputFormat.class);

  public static void setWriteSupportClass(Configuration configuration, Class<?> writeSupportClass) {
    configuration.set(ParquetOutputFormat.WRITE_SUPPORT_CLASS, writeSupportClass.getName());
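A possible caller-side use of the setWriteSupportClass helper shown above; `MyWriteSupport` is a hypothetical WriteSupport subclass, not part of the snippet:

```java
// Sketch: register a custom WriteSupport for the deprecated mapred-API
// Parquet output format. MyWriteSupport is a hypothetical placeholder.
Configuration conf = new Configuration();
DeprecatedParquetOutputFormat.setWriteSupportClass(conf, MyWriteSupport.class);
```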