1. Life without JOINs... understanding, and common practices stackoverflow.com
Lots of "BAW"s (big-ass websites) use data storage and retrieval techniques that rely on huge indexed tables, with queries that won't or can't use JOINs (BigTable, HQL, ...
2. Hadoop: intervals and JOIN stackoverflow.com
I'm very new to Hadoop and I'm currently trying to join two sources of data where the key is an interval (say [date-begin/date-end]). For example: input1:
3. Hadoop's Map-side join implements Hash join? stackoverflow.com
I am trying to implement a hash join in Hadoop. However, Hadoop already seems to have a map-side join and a reduce-side join implemented. What is the difference between these techniques and ...
4. Is Hadoop a good open-source project to join? stackoverflow.com
I've been learning Java for the last 2 months with a Core Java book. Now I want to write something real, but at first I decided that I need to improve ...
5. Similarity join using Hadoop stackoverflow.com
I'm new to Hadoop. I'd like to run some approaches I came up with by you.
6. How would you suggest performing "Join" with Hadoop streaming? stackoverflow.com
I have two files in the following formats: (field1, field2, field3) and (field4, field1, field5), where each field number indicates a different meaning. I want to join the two files using Hadoop Streaming based on the mutual ...
7. Understanding SQL joins within WHERE clause stackoverflow.com
I have a query in SQL that I'm trying to translate into Pig Latin (for use on a Hadoop cluster). Most of the time I have no problem moving the ...
8. Combine MapReduce result with data stackoverflow.com
How could I combine these two files with map/reduce: File1. Data.
File2. MR computed result.
9. Implementing cross join in Hadoop stackoverflow.com
I am trying to implement a cross join using Hadoop in Java. Both sides of the join are large enough that I can't keep either of them in memory. I have tried ...
10. Is a collocated join (à la Netezza) theoretically possible in Hive? stackoverflow.com
When you join tables that are distributed on the same key and use these key columns in the join condition, each SPU (machine) in Netezza works 100% independently of the ...
11. JOIN vs COGROUP in Pig stackoverflow.com
Are there any advantages (with respect to performance or number of MapReduce jobs) when I use COGROUP instead of JOIN in Pig? http://developer.yahoo.com/hadoop/tutorial/module6.html talks about the difference in ...
12. How can I do this inner join properly in Apache Pig? stackoverflow.com
I have two files, one called a-records
and the other file called b-records
You can see in file A that I have the token 123 one time. In file B it's in there ...
13. How to do outer join on two columns in Pig Latin stackoverflow.com
I do outer joins on single columns in Pig like this
How do I join on two columns, something like -