key « hadoop « Java Database Q&A

1. What is the use of the 'key K1' in the org.apache.hadoop.mapred.Mapper? stackoverflow.com

I'm learning Apache Hadoop and I was looking at the WordCount example org.apache.hadoop.examples.WordCount. I understand this example; however, I can see that the variable LongWritable key is not used ...

2. Using set/list data types for intermediate keys in Hadoop stackoverflow.com

In an Apache Hadoop map-reduce program, what are the options for using sets/lists as keys in the output from the mapper? My initial idea was to use ArrayWritable as key type, but ...

3. Using Hadoop, are my reducers guaranteed to get all the records with the same key? stackoverflow.com

I'm running a Hadoop job (using Hive, actually) that is supposed to uniq lines across a lot of text files. More specifically, it chooses the most recently timestamped record for ...
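
As background to this question, here is a minimal sketch of the hash partitioning rule that Hadoop's default HashPartitioner applies: clear the sign bit of the key's hash code, then take it modulo the number of reduce tasks. Because the result depends only on the key, equal keys map to the same reducer. The class name, key strings, and reducer count below are hypothetical, and a Hive-generated job may configure partitioning differently, so treat this as an illustration rather than a description of that job.

```java
public class PartitionSketch {
    // Same arithmetic as Hadoop's default HashPartitioner:
    // mask off the sign bit so the result is non-negative, then take the modulus.
    static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int reducers = 4; // hypothetical reducer count
        String[] keys = {"alpha", "beta", "alpha", "gamma", "beta"}; // hypothetical keys
        for (String k : keys) {
            // Repeated keys always print the same reducer index.
            System.out.println(k + " -> reducer " + getPartition(k, reducers));
        }
    }
}
```

The guarantee relies on the key's hashCode() being consistent across JVMs, which is worth checking for custom key classes.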

4. Which key class is suitable for secondary sort? stackoverflow.com

In Hadoop you can use the secondary-sort mechanism to sort the values before they are sent to the reducer. The way this is done in Hadoop is that you add the value ...
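
The idea of a composite key for secondary sort can be sketched in plain Java, without the Hadoop classes. The sketch assumes a hypothetical CompositeKey that holds the natural key plus a secondary field taken from the value. It sorts on both fields but partitions on the natural key only. In a real job the key would implement WritableComparable, and the partitioner and grouping comparator would be registered on the Job; none of that is shown here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class CompositeKeySketch {
    // Hypothetical composite key: the natural key plus the field to sort values by.
    static final class CompositeKey implements Comparable<CompositeKey> {
        final String naturalKey;
        final long secondary;

        CompositeKey(String naturalKey, long secondary) {
            this.naturalKey = naturalKey;
            this.secondary = secondary;
        }

        // Sort order: natural key first, then the secondary field.
        @Override
        public int compareTo(CompositeKey other) {
            int c = naturalKey.compareTo(other.naturalKey);
            return c != 0 ? c : Long.compare(secondary, other.secondary);
        }

        // Partitioning looks at the natural key only, so all records for one
        // natural key reach the same reducer regardless of the secondary field.
        int partition(int numReduceTasks) {
            return (naturalKey.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
        }

        @Override
        public String toString() {
            return naturalKey + "#" + secondary;
        }
    }

    // Returns a sorted copy, standing in for the framework's sort phase.
    static List<CompositeKey> sorted(List<CompositeKey> keys) {
        List<CompositeKey> copy = new ArrayList<>(keys);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<CompositeKey> keys = new ArrayList<>();
        keys.add(new CompositeKey("b", 2));
        keys.add(new CompositeKey("a", 9));
        keys.add(new CompositeKey("a", 1));
        System.out.println(sorted(keys)); // prints [a#1, a#9, b#2]
    }
}
```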

5. Parsing bulk text with Hadoop: best practices for generating keys stackoverflow.com

I have a 'large' set of line-delimited full sentences that I'm processing with Hadoop. I've developed a mapper that applies some of my favorite NLP techniques to it. ...

6. hadoop + one key to every reducer stackoverflow.com

Is there a way in Hadoop to ensure that every reducer gets only one key that is output by the mapper?

7. Specify Hadoop mapreduce input keys directly (not from a file) stackoverflow.com

I'd like to generate some data using a mapreduce. I'd like to invoke the job with one parameter N, and get Map called with each integer from 1 to N, ...

8. implementing inheritance for Hadoop Key class stackoverflow.com

In Hadoop (Java) map/reduce, how do I extend one Key class from another Key class? The Mapper uses the Key1 class; the Reducer uses the Key2 class, which is an extension of Key1.

public class Key1 extends ...

9. FileInputFormat where filename is KEY and text contents are VALUE stackoverflow.com

I'd like to use an entire file as a single record for MAP processing, with the filename as the key. I've read the following post: How to get Filename/File ...

10. Hadoop seems to modify my key object during an iteration over values of a given reduce call stackoverflow.com

Hadoop Version: 0.20.2 (on Amazon EMR). Problem: I have a custom key that I write during the map phase, which I have added below. During the reduce call, I do some simple aggregation on ...

11. Can Hadoop mapper produce multiple keys in output? stackoverflow.com

Can a single Mapper class produce multiple key-value pairs (of the same type) in a single run? We output the key-value pair in the mapper like this:

context.write(key, value);

Here's a trimmed down (and exemplified) ...

12. What is the input order for keys in reduce() method stackoverflow.com

I have a simple use case. In my input file I just need to calculate the percentage distribution of total number of words. For example word1 is present 10 times, word2 ...

13. hadoop mapper emits unique key. can I perform reducer after per map? stackoverflow.com

My mapper emits a 'unique key' - 'very large value' pair. My reducer doesn't know the key is unique, so it waits until all the mappers have completed. I tried to use a combiner, but it is ...

14. Add Entire Files Text as Map Key in Hadoop stackoverflow.com

I am looking for a way to load an entire file's text into my map, not a single line at a time like TextInputFormat does. So that when I ...

15. How can I get an integer index for a key in hadoop? stackoverflow.com

Intuitively, Hadoop is doing something like this to distribute keys to mappers, in Python-esque pseudocode:

# data is a dict with many key-value pairs
keys = data.keys()
key_set_size = len(keys) / num_mappers
index = 0
mapper_keys ...

16. Hadoop streaming example failed Type mismatch in key from map stackoverflow.com

Possible Duplicate:
hadoop-streaming example failed to run - Type mismatch in key from map

When I ran the Hadoop streaming example, it failed with Type mismatch in ...

17. Type mismatch in key from map when replacing Mapper with MultithreadMapper stackoverflow.com

I'd like to implement a MultithreadMapper for my MapReduce job. For this I replaced Mapper with MultithreadMapper in working code. Here's the exception I'm getting:

java.io.IOException: Type mismatch in key from map: expected ...

18. hadoop-streaming example failed to run - Type mismatch in key from map stackoverflow.com

I was running $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
    -D stream.map.output.field.separator=. \
    -D stream.num.map.output.key.fields=4 \
    -input myInputDirs \
    -output ...

19. Can Hadoop read arbitrary key binary file stackoverflow.com

It looks like Hadoop MapReduce requires a key-value pair structure in the text or binary input. In reality we might have files to be split into chunks to be processed. ...

20. Hadoop Sort map and reduce key value stackoverflow.com

If I had a file with random integers on each line and wanted to sort the file using Hadoop, what would my mapper and reducer's input/output key and value be?
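
One common way to frame this is sketched below in plain Java rather than with the Hadoop API: the mapper emits each integer as the key with an empty value, the framework's sort delivers keys to the reducer in order, and the reducer writes each key once per occurrence. The class and method names are hypothetical, a TreeMap stands in for the shuffle and sort phase, and the sketch assumes a single reducer. With several reducers, a total-order partitioner would be needed to get a globally sorted result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SortSketch {
    static List<Integer> sortLines(List<String> lines) {
        // "Map" step: each line becomes an integer key; the value carries no
        // information, so only an occurrence count is kept per key.
        // The TreeMap stands in for the shuffle, which hands keys to the
        // reducer in sorted order.
        Map<Integer, Integer> shuffled = new TreeMap<>();
        for (String line : lines) {
            shuffled.merge(Integer.parseInt(line.trim()), 1, Integer::sum);
        }

        // "Reduce" step: write each key once per occurrence, preserving duplicates.
        List<Integer> out = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : shuffled.entrySet()) {
            for (int i = 0; i < e.getValue(); i++) {
                out.add(e.getKey());
            }
        }
        return out;
    }
}
```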

21. Is the input to a Hadoop reduce function complete with regards to its key? stackoverflow.com

I'm looking at solutions to a problem that involves reading keyed data from more than one file. In a single map step I need all the values for a particular ...

22. Hadoop key mismatch coderanch.com

Hello, I hope this is the correct forum for a Hadoop question. I have a file with a bunch of lines like this:

education:43 Alabama
education:11 Alaska
education:44 Arizona

It continues on for all 50 states, then there is another word, like politics:30 Virginia ... etc. I want to do a distributed sort on this using MapReduce. I know MapReduce sorts between ...