I'm looking for some general information about how other people are using Hadoop or other MapReduce-like technologies. In general, I am curious as to whether you are writing MR applications ... |
So, I've been looking at Hadoop with keen interest, and to be honest I'm fascinated; things don't get much cooler.
My only minor issue is I'm a C# developer and ... |
We have a huge data set of about 300 million records, which will be updated every 3-6 months. We need to query this data (continuously, in real time) to get some information. What are the options ... |
Can somebody outline the various differences between the various Hadoop Distributions available:
using the Apache Hadoop distro as a baseline.
Is there a ... |
When I run a mapreduce program using Hadoop, I get the following error.
10/01/18 10:52:48 INFO mapred.JobClient: Task Id : attempt_201001181020_0002_m_000014_0, Status : FAILED
java.io.IOException: Task process exit with nonzero status of ... |
I can't find a single example of submitting a Hadoop job that does not use the deprecated JobConf class. JobClient, which hasn't been deprecated, still only supports methods that take ... |
I want to build a Hadoop application that reads words from one file and searches for them in another file.
If the word exists - it has to write to one output file
If ... |
Consider the following log file format:
id v1 v2 v3
1 ...
|
What is the closest thing to Hadoop in C++?
In particular, I want to do distributed computing using MapReduce.
Thanks!
|
I am playing around with Hadoop and have set up a two node cluster on Ubuntu. The WordCount example runs just fine.
Now I'd like to write my own MapReduce program to ... |
I need to do a project for a Computational Linguistics course. Is there any interesting "linguistic" problem that is data-intensive enough to work on using Hadoop MapReduce? Solution or algorithm ... |
I learnt Hadoop a few months back and managed to do a very introductory programming project on it. I want to do a small - medium sized project or series of ... |
My reducer class produces outputs with TextOutputFormat (the default OutputFormat given by Job). I would like to consume these outputs after the MapReduce job completes in order to aggregate them. In addition to ... |
Is it correct to say that the parallel computation with iterative MapReduce can be justified mainly when the training data size is too large for the non-parallel computation for the same ... |
I'm interested in learning techniques for distributed computing. As a Java developer, I'm probably willing to start with Hadoop. Could you please recommend some books/tutorials/articles to begin with?
|
I launched a hadoop cluster and submitted a job to the master. The jar file is only contained in the master. Does hadoop ship the jar to all the slave machines ... |
This is a conceptual question involving Hadoop/HDFS. Let's say you have a file containing 1 billion lines. And for the sake of simplicity, let's consider that each line is of the ... |
I tried printing out values using System.out.println(), but they won't appear on the console. How do I print out the values in a map/reduce application for debugging purposes using Hadoop?
Thanks,
Deepak.
|
Can someone walk me through the basic workflow of reading and writing data with classes generated from DDL?
I have defined some struct-like records using DDL. For example:
class Customer {
...
|
My program follows an iterative map/reduce approach, and it needs to stop if certain conditions are met. Is there any way I can set a global variable that can be distributed across ... |
I'm trying to run a Hadoop job (version 18.3) on my Windows machine but I get the following error:
Caused by: javax.security.auth.login.LoginException: Login failed: CreateProcess: bash -c groups error=2
...
|
I have a Hadoop job with tasks that are expected to run for a significant length of time (a few minutes). However, Hadoop starts speculative execution too soon. I do not want to turn ... |
I need to split my MapReduce jar file into two jobs in order to get two different output files, one from the reducer of each job.
I mean that the ... |
I'm working with a team of mine on a small application that takes a lot of input (logfiles of a day) and produces useful output after several (now 4, in the ... |
I want to debug a MapReduce script and, without going to much trouble, tried to put some print statements in my program. But I can't seem to find them in any ... |
I'm using Hadoop on Windows and I've configured everything correctly (installing Cygwin, passwordless ssh, etc.).
I've compiled the WordCount program into WC.jar and tried to run it. It's running perfectly in standalone ... |
Given that the complexities of the map and reduce tasks are O(map)=f(n) and O(reduce)=g(n), has anybody taken the time to write down how the Map/Reduce intrinsic operations (sorting, shuffling, sending data, ... |
Is there a distance calculation implementation using Hadoop map/reduce? I am trying to calculate the distance between a given set of points.
Looking for any resources ..
//edited ............
This is a very intelligent ... |
I'm a beginner in Hadoop.
I've understood the WordCount program. Now I have a problem: I don't want the output of all the words.
- Words_I_Want.txt -
hello
echo
raj
- Text.txt -
hello eveyone. I ... |
My team built a Java application using the Hadoop libraries to transform a bunch of input files into useful output.
Given the current load a single multicore server will do fine for ... |
I have a weird problem: DistributedCache appears to change the names of my files. It uses the original name as the parent folder and adds the file as a child.
i.e. ... |
I was trying to find the sum of any given points using hadoop, but my problem is on getting all values from a given key in a single reducer. It is ... |
I'm a newbie in Hadoop. I'm trying out the WordCount program.
Now, to try out multiple output files, I use MultipleOutputFormat. This link helped me in doing it. |
Lately, I have been reading a lot about MapReduce/Hadoop and think this is where the industry is currently moving.
I want to start learning MapReduce/Hadoop and I thought the best way ... |
- I have some experience with Lucene. I'm trying to understand how the data is actually stored on the slave servers in the Hadoop framework.
- Do we create an index in Slave Server with set ...
|
I have inherited a mapreduce codebase which mainly calculates the number of unique user IDs seen over time for different ads. To me it doesn't look like it is being done ... |
I have been trying to understand the MapReduce concept and apply it to my current situation. What is my situation? Well, I have an ETL tool here, in which data transformation ... |
I want to chain 2 Map/Reduce jobs. I am trying to use JobControl to achieve the same. My problem is -
JobControl needs org.apache.hadoop.mapred.jobcontrol.Job which in turn needs org.apache.hadoop.mapred.JobConf which is deprecated. ... |
Just wondering if anybody has done, or is aware of, encoding/compressing large images into JPEG2000 format using Hadoop?
There is also this http://code.google.com/p/matsu-project/ which uses map reduce to process the image.
Image size ... |
Given Hadoop 0.21.0, what assumptions does the framework make regarding the number of open file descriptors relative to each individual map and reduce operation? Specifically, what suboperations cause Hadoop ... |
Is it possible to use one Hadoop job run to output data to different directories based on keys?
My use case is server access logs. Say I have them all together, ... |
I need to implement a custom (service) input source for a Hadoop MapReduce app. I google'd and SO'd and found that one way to proceed is to implement a custom InputFormat. ... |
How do I define an ArrayWritable for a custom Hadoop type? I am trying to implement an inverted index in Hadoop, with custom Hadoop types to store the data.
I have ... |
What are the hidden features of Hadoop MapReduce that every developer should be aware of?
One hidden feature per answer, please.
|
How do we design a mapper/reducer if I have to transform a text file line by line into another text file?
I wrote a simple map/reduce program which did a small transformation but the requirement ... |
I searched the web, but all I found was a site that claimed that it could be done. It didn't say how.
|
Is there a way to use the relation name in MapReduce's Map and Reduce? I am trying to do Set difference using Hadoop's MapReduce.
Input: 2 files R and S containing list ... |
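For the set-difference question above, one common approach is to have the mapper tag every record with the name of the relation it came from and let the reducer keep only records tagged exclusively with R. The following is a minimal local sketch in plain Python, not Hadoop code: the function names and the in-memory "shuffle" are assumptions made for illustration, and on a real cluster the mapper would still need some way to learn the relation name (for example from the input file name), which this sketch does not cover.

```python
from collections import defaultdict


def map_tagged(relation_name, record):
    """Emit the record as the key and the name of its source relation as the value."""
    yield record, relation_name


def reduce_difference(record, relation_names):
    """Keep a record only if it was seen in R and never in S."""
    if set(relation_names) == {"R"}:
        yield record


def set_difference(r_records, s_records):
    """Compute R minus S with a map step, an in-memory grouping, and a reduce step."""
    grouped = defaultdict(list)  # hypothetical stand-in for the shuffle phase
    for name, records in (("R", r_records), ("S", s_records)):
        for record in records:
            for key, value in map_tagged(name, record):
                grouped[key].append(value)

    result = []
    for key in sorted(grouped):  # reducers see keys in sorted order
        result.extend(reduce_difference(key, grouped[key]))
    return result


if __name__ == "__main__":
    print(set_difference(["a", "b", "c"], ["b", "d"]))
```

Because the record itself is the key, every copy of a record from both relations reaches the same reduce call, which is what makes the membership test possible.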
I recently started to use Hadoop and I have a problem while using a MapFile as an input to a MapReduce job.
The following working code writes a simple MapFile called "TestMap" ... |
What happens when the datanode the map/reduce is using goes down? Shouldn't the job be redirected to another datanode? How should my code handle this exceptional condition?
|
I am interested in what can be considered a good throughput per node for lightweight text data processing in Hadoop. To be more specific, I would ask:
Let's say I ... |
How do I start learning Hadoop and MapReduce?
Is there any tutorial on hardware requirements and on setting up the development requirements? I am planning to use C++ and Java. Many thanks.
|
I am working on the parallelization of an algorithm, which roughly does the following:
- Read several text documents with a total of 10k words.
- Create an object for every word in the text corpus.
- Create ...
|
Is it possible to parallelize SVD computation using, for example, Hadoop's MapReduce?
Could you provide a simple example of it?
|
I've heard of Hadoop, but what else can I use to get started in this topic?
- What other APIs are there?
- In general, what is needed to start programming here?
- what do you recommend to ...
|
Greetings to all,
Today I came across a strange problem with non-root users in Linux (CentOS).
I am able to compile and run a Java program properly through the commands below:
[root@cuda1 ...
|
I am looking to work on the Hadoop open source implementation and I was wondering if there is a distributed profiler for Hadoop? If so, could someone point me to any links ... |
I would like to implement a TreeWritable class to represent a Tree structure.
I have tried the following implementation but I'm getting a mapred.MapTask: Record too large for in-memory buffer error.
How should ... |
I have to pass a third argument to a MapReduce program.
I have to read a file given by the user in the MapReduce program.
|
I was trying to use a static object in Hadoop.
This object is used in both map and reduce.
My program is :
- read 100000 lines, thus 100000 maps.
- for each mapper, a static attribute ...
|
All three constructors of org.apache.hadoop.mapreduce.Job
are deprecated, is there a way to construct a Job class the non-deprecated way?
Thanks.
|
I have chained 2 mappers followed by 1 reducer. Is it possible to write the intermediate outputs (o/p of each mapper in the chain) to HDFS? I tried setting the OutputPath ... |
I'm trying to chain map and reduce phases in one job. The problem is that I'm running under Hadoop 0.20.2 and the package org.apache.hadoop.mapred.lib.Chain seems to be deprecated and replaced by ... |
Exercise 4, Chapter 4 in Hadoop in Action is about implementing a linear filter computing the moving average of a time series. That is, given N and a series of timestamped ... |
I've been studying hadoop's scheduler mechanism recently.
Using 0.20.2(fair&capacity included)
Have read some papers, LATE\Deadline Scheduler...
Has anyone tried?
or is there a guide?
thx anyway
|
Hi,
I'm a third-year college student majoring in software engineering and have a little experience with Hadoop. I'm looking for an idea for a small to medium-sized project with Hadoop. I want to do ... |
I couldn't find any documentation on how Hadoop handles spilled records. Is there a link that can be found online?
Thanks for your time.
|
I have two Map/Reduce classes, named MyMapper1/MyReducer1 and MyMapper2/MyReducer2, and want to use the output of MyReducer1 as the input of MyMapper2, by setting the input path of job2 to the ...
|
I am new to HDFS and MapReduce and trying to calculate survey statistics. Input file is in this format: Age Points Sex Category - all 4 of them are numbers. Is ... |
I am using Hadoop Map/Reduce with Java.
Suppose, I have completed a whole map/reduce job. Is there any way I could repeat the whole map/reduce part only, without ending the job. I mean, ... |
Is there any work going on to port Hadoop pipes from mapred to mapreduce package?
Thanks,
Meg
|
The Map-Reduce programming model stems from the map and reduce functions which are present in functional languages like Lisp and Scheme dating back many many years.
I remember from university (early 90's) ... |
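As a small illustration of the functional-language `map` and `reduce` primitives that the question above refers to, here is a minimal sketch in plain Python. It is not Hadoop code, and the word list is made up for the example.

```python
from functools import reduce

# Made-up input for illustration.
words = ["hadoop", "map", "reduce"]

# map: apply one function independently to every element.
lengths = list(map(len, words))

# reduce: fold the mapped values into a single result, starting from 0.
total = reduce(lambda acc, n: acc + n, lengths, 0)

if __name__ == "__main__":
    print(lengths, total)
```

The map step has no dependencies between elements, which is the property that lets a framework spread it across machines; the reduce step is where the per-element results are combined.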
I know it's my OCD, but I can't stand to have a deprecated reference in my code.
That said, the Hadoop tutorials, including "The Definitive Guide" book, use only deprecated classes ... |
My question is about MapReduce programming in Java.
Suppose I have the WordCount.java example, a standard mapreduce program. I want the map function to collect some information, and return to the ... |
Everywhere I go to learn about Hadoop I see the wordcount example. I want to look at some more code that has been written to solve some other ... |
I'm new to hadoop pipes.
Can anyone tell me how to run two map reduce together in a single job (program) in hadoop pipes?
My problem is that I want to ... |
Not sure if anyone has run into this issue. I am trying to use Oozie for running a simple MapReduce job that searches for a string value in an HDFS location and ... |
In the Hadoop 'grep' example (that comes with the Hadoop package), what is the group parameter? Can you give me an example of it?
|
I would like to write multiple output files.
How do I do this using Job instead of JobConf?
|
I am trying to use Hadoop under Windows and I am running into a problem when I want to start the tasktracker. For example:
$bin/start-all.sh
then the log reads:
2011-06-08 16:32:18,157 ERROR org.apache.hadoop.mapred.TaskTracker: Can not ...
|
After reading this and this paper, I decided I want to implement a distributed volume rendering setup for large datasets on MapReduce as my undergraduate thesis work. ... |
I have a use case where I want to process data and generate output of a fixed size, say 1 GB, i.e. the output of each MapReduce job should be 1 GB.
Does anybody ... |
Is there a way to generate permutations with MapReduce?
input file:
1 title1
2 title2
3 title3
my goal:
1,2 title1,title2
1,3 title1,title3
2,3 title2,title3
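One way to approach the goal shown above is to send every record to a single shared key, so that one reduce call sees the whole list and can emit each pair once. What follows is a minimal local sketch in plain Python rather than Hadoop code: `map_record`, `reduce_pairs`, `run_local`, and the shared key `"all"` are names assumed for illustration, and funnelling everything through one key will not scale to large inputs.

```python
from itertools import combinations, groupby
from operator import itemgetter


def map_record(line):
    """Emit every record under one shared key so a single reduce call sees them all."""
    record_id, title = line.split()
    yield "all", (record_id, title)


def reduce_pairs(_key, records):
    """Emit each unordered pair of records exactly once."""
    # sorted() compares ids as text here; numeric ordering would need int(record_id).
    for (id_a, title_a), (id_b, title_b) in combinations(sorted(records), 2):
        yield f"{id_a},{id_b} {title_a},{title_b}"


def run_local(lines):
    """Tiny in-process stand-in for the map, shuffle/sort, and reduce phases."""
    mapped = [pair for line in lines for pair in map_record(line)]
    mapped.sort(key=itemgetter(0))
    output = []
    for key, group in groupby(mapped, key=itemgetter(0)):
        output.extend(reduce_pairs(key, [value for _, value in group]))
    return output


if __name__ == "__main__":
    for row in run_local(["1 title1", "2 title2", "3 title3"]):
        print(row)
```

With the three input lines from the question, this produces the three rows listed under "my goal".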
|
I'm using Cloudera's Hadoop distribution CDH-0.20.2CDH3u0.
Is there any way I could get information such as jobtracker status, tasktracker status, and counters using a Java program running outside of the Hadoop framework? I tried ... |
I am currently trying to figure out what happens when you run a MapReduce job by placing some System.out.println() calls at certain places in the code, but none of those print statement ... |
I am new to Hadoop and I am learning by using a few examples. I am currently trying to pass a file with random integers in it. For each and every number ... |
I want to convert the code below to run in Hadoop. Basically, what I want to achieve is to run a mapper a number of times. Assuming the array is my ... |
I am working on a MapReduce program and was thinking about designing computations of the form
a1/b1, (a1+a2)/(b1+b2), (a1+a2+a3)/(b1+b2+b3) ...
where a1, b1 are the values associated with a key.
So at ... |
I am currently implementing a parallel-for on Hadoop to iterate the mapper a number of times as specified by the user. Can someone help me with a useful example that I ... |
I am working with two large input files on the order of 5 GB each.
It is the output of Hadoop MapReduce, but as I am not able to do dependency ... |
I am processing a file with 7+ million lines (~59 MB) on an Ubuntu 11.04 machine with this configuration:
Intel(R) Core(TM)2 Duo CPU E8135 @ 2.66GHz, 2280 MHz
Memory: ... |
I am analyzing a large number of files in a Hadoop MapReduce job, with the input files being in .txt format. Both my mapper and my reducer are written in Python.
However, ... |
I'm new to Apache Hadoop. I installed the prerequisite software, configured everything, and set up the Eclipse plugins as well, but when I click the new Hadoop location it's not ... |
I'm using Hadoop's MapReduce. I have a file as an input to the map function, the map function does something (not relevant for the question). I'd like my ... |
How efficient are open-source distributed computation frameworks like Hadoop? By efficiency, I mean CPU cycles that can be used for the "actual job" in tasks that are mostly pure computation. In ... |
I have a MapReduce job.
My Map class code:
public static class MapClass extends Mapper {
@Override
public void map(Text key, Text value, Context ...
|
I am trying to access a data file from a public class, both of which are located within a JAR file. However, when I execute the jar on a Hadoop cluster, ... |
I am trying to run a MapReduce job on Hadoop but I am facing an error and I am not sure what is going wrong. I have to pass library jars which ... |
I want to create a directory inside the working directory of a MapReduce job in Hadoop.
For example by using:
... |
I'm using Clojure to pull ten XML files hourly, each file is about 10 MB. This script is running on a server machine.
XML files are parsed and stored into RDBMS ... |
I would like to know the details (architecture and design documents) about the next generation Apache MapReduce. Where are the sources to get more information about it?
|