cache « hadoop « Java Database Q&A

1. Hadoop map/reduce output file (part-00000) and distributed cache

The value output from my map/reduce is a bytewritable array, which is written to the output file part-00000 (Hadoop does so by default). I need this array for my next map ...

2. Distributed cache

I am working with Hadoop 19 on openSUSE Linux. I am not using a cluster; I am running my Hadoop code on my own machine. I am following the standard technique on ...

3. FileNotFoundException when using Hadoop distributed cache

This time, could someone please reply? I am struggling to run my code using the distributed cache. I already have the files on HDFS, but when I run this code:

import java.awt.image.BufferedImage;
import ...

4. Can the Hadoop distributed cache addFileToClassPath .class files or is it limited to .jar files?

I've tried the following:

DistributedCache.addFileToClassPath(new Path("something.jar"), config);
DistributedCache.addFileToClassPath(new Path("something.class"),config);
The first one works, the second doesn't. Does addFileToClassPath only work for jars? This seems weird because there's also an addArchiveToClassPath method.
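One possible explanation, offered as an assumption rather than a confirmed answer: a JVM classpath entry is normally a directory or a jar/zip archive, so a bare .class file is not picked up as an entry on its own. A workaround sketch is to wrap the class file in a small jar first and then add that jar the same way as something.jar. The helper below uses only the JDK; the class name ClassFileJar and the method wrap are hypothetical and are not part of Hadoop.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical helper (not a Hadoop API): wraps one compiled class in a jar.
class ClassFileJar {

    // entryName must mirror the package layout of the class,
    // e.g. "com/example/Something.class" for com.example.Something.
    static void wrap(Path classFile, String entryName, Path jarOut) throws IOException {
        try (OutputStream out = Files.newOutputStream(jarOut);
             JarOutputStream jar = new JarOutputStream(out)) {
            jar.putNextEntry(new JarEntry(entryName));
            Files.copy(classFile, jar); // stream the class bytes into the jar entry
            jar.closeEntry();
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ClassFileJar <classFile> <entryName> <jarOut>
        wrap(Paths.get(args[0]), args[1], Paths.get(args[2]));
    }
}
```

The resulting jar would then be passed to DistributedCache.addFileToClassPath exactly like the first call above, assuming it has been copied to a location the job can read.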

5. Adding multiple files to Hadoop distributed cache?

I am trying to add multiple files to the Hadoop distributed cache. I don't actually know the file names; they will be named like part-0000*. Can someone tell me how to do ...
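A common approach to this kind of problem is to expand the glob into concrete paths first and then add each path to the cache individually. The sketch below shows only the expansion step, against a local directory and using only the JDK, so it runs without a Hadoop installation; the class name PartFileGlob and the method matching are hypothetical. In a real job the equivalent calls would presumably be FileSystem.globStatus for the expansion and DistributedCache.addCacheFile(uri, conf) for each result, but that part is an assumption about the asker's setup and is not shown here.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists the files in a local directory that match a glob.
class PartFileGlob {

    static List<URI> matching(Path dir, String glob) throws IOException {
        List<URI> uris = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path p : stream) {
                uris.add(p.toUri());
            }
        }
        Collections.sort(uris); // directory order is unspecified, so sort for stable output
        return uris;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PartFileGlob <outputDir>
        for (URI uri : matching(Paths.get(args[0]), "part-0000*")) {
            // In a Hadoop job, each URI would be handed to the distributed cache here.
            System.out.println(uri);
        }
    }
}
```

Expanding the pattern up front means the job never needs to know the file names in advance, only the directory and the naming convention.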

6. Hadoop Distributed Cache (Cloudera CDH3)

I am trying to run a simple example using a binary executable and the cached archive and it does not seem to be working: The example I am trying to run has a ...

7. Hive: remove stuff from distributed cache

I can add stuff to the distributed cache via

add file largelookuptable
and then run a bunch of HQL. Now, when I have a series of commands like the following:
add file largelookuptable1;
select blah from blahness ...

8. Hadoop java mapper job executing on slave node, directory issue

As part of my Java mapper I have a command that executes some standalone code on a local slave node. When I run the code it executes fine, unless it is ...

9. Hadoop Distributed Cache FileNotFound

I have a very frustrating FileNotFound issue in deployment. In an abstract class that all of my mappers and reducers extend, I have the following in the configure method:

public void configure(JobConf ...

10. Using Distributed Cache with Pig on Elastic Map Reduce

I am trying to run my Pig script (which uses UDFs) on Amazon's Elastic Map Reduce. I need to use some static files from within my UDFs. I do something like this in ...

11. Hadoop and distributed transactional cache

Hadoop also has its own cache implementation, which is meant to bring big data to the right place at the right time, but it is not a transactional cache. Distributed transactional caches are used in transactional systems to communicate state between the parts of a process that run on different nodes but share the same transactional scope. While the MapReduce framework is not suitable or ...