Join Query in Hive


I want to create a table C containing every customer_id from table A that does not appear in table B (both tables have a customer_id column). I wrote the following query, but it did not populate any data.
create table C AS
select *
from (
select customer_id
from A al
join B bl
on al.customer_id=bl.customer_id
where bl.customer_id is null
) x;
This query shows 0 results.

SELECT a1.customer_id
FROM
A a1 LEFT OUTER JOIN
B b1 ON a1.customer_id = b1.customer_id
WHERE b1.customer_id IS NULL;
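That should do it. An inner join keeps only the rows that match in both tables, so bl.customer_id can never be NULL in your query; a LEFT OUTER JOIN keeps the unmatched rows from A, with the B columns set to NULL. Wrap the SELECT in CREATE TABLE C AS to build the table.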
Regards,
Dino
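
To see why the inner join returns nothing while the left outer join works, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for Hive, and the sample rows are invented for illustration; the join semantics shown are standard SQL, but Hive itself is not exercised here.

```python
import sqlite3

# In-memory SQLite database standing in for Hive; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (customer_id INTEGER);
    CREATE TABLE B (customer_id INTEGER);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (2);
""")

# Inner join: rows of A without a match in B are dropped by the join itself,
# so the IS NULL filter has nothing left to keep.
inner_rows = conn.execute("""
    SELECT a1.customer_id
    FROM A a1 JOIN B b1 ON a1.customer_id = b1.customer_id
    WHERE b1.customer_id IS NULL
""").fetchall()

# Left outer join: unmatched rows of A survive with b1.customer_id = NULL,
# and the WHERE clause keeps exactly those rows.
left_rows = conn.execute("""
    SELECT a1.customer_id
    FROM A a1 LEFT OUTER JOIN B b1 ON a1.customer_id = b1.customer_id
    WHERE b1.customer_id IS NULL
    ORDER BY a1.customer_id
""").fetchall()

print(inner_rows)  # []
print(left_rows)   # [(1,), (3,)]
```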

Related

Oracle SQL: Get all users in one table but not another and join to a third table

I am wondering how to use oracle sql to get all the rows that are in one table but not another. The issue I am having is that the two tables don't have a field in common so I need to join to a third master table.
This is what I've tried. It doesn't produce any errors, but it returns 0 records, which shouldn't be possible, so clearly I've done something wrong.
SELECT a.USER_ID, c.AD_ID, c.CREATED_DATE_ FROM $A$ a, $C$ c, $B$ b
WHERE (b.USER_ID IS NULL AND a.CUSTOMER_ID = c.CUSTOMER_ID)
I have three tables:
Table A has fields CUSTOMER_ID & USER_ID
Table B has field USER_ID
Table C has field CUSTOMER_ID
I need all the users that are in table C but not table B. They are all in Table A because that is the master list of users.
Any insight would be greatly appreciated.

SELECT
*
FROM
table_a
WHERE
NOT EXISTS (SELECT * FROM table_b WHERE table_b.user_id = table_a.user_id )
AND EXISTS (SELECT * FROM table_c WHERE table_c.customer_id = table_a.customer_id)

My solution:
select * from TableC tc
join TableA ta on tc.CUSTOMER_ID=ta.CUSTOMER_ID
left join TableB tb on tb.USER_ID=ta.USER_ID
where tb.USER_ID is null

I think you want:
select a.USER_ID, c.AD_ID, c.CREATED_DATE_
from a join
c
on a.customer_id = c.customer_id
where not exists (select 1 from b where b.user_id = a.user_id);
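
A minimal sketch of the join plus NOT EXISTS approach, assuming SQLite (via Python's sqlite3) in place of Oracle. The table names a, b, c follow the question; the sample rows and the ad_id values are hypothetical. It also runs the comma-join from the question to show why it returns nothing: b contains no row whose user_id is NULL, so the filter rejects every combination.

```python
import sqlite3

# SQLite stand-in for Oracle; all rows below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (customer_id INTEGER, user_id TEXT);  -- master list
    CREATE TABLE b (user_id TEXT);
    CREATE TABLE c (customer_id INTEGER, ad_id TEXT);
    INSERT INTO a VALUES (10, 'u1'), (20, 'u2'), (30, 'u3');
    INSERT INTO b VALUES ('u1');
    INSERT INTO c VALUES (10, 'ad1'), (20, 'ad2');
""")

# The question's comma join: every stored b.user_id is non-NULL, so
# "b.user_id IS NULL" filters out every row of the cross product.
cross_rows = conn.execute("""
    SELECT a.user_id, c.ad_id
    FROM a, c, b
    WHERE (b.user_id IS NULL AND a.customer_id = c.customer_id)
""").fetchall()

# Join a to c, then keep only users with no matching row in b.
anti_rows = conn.execute("""
    SELECT a.user_id, c.ad_id
    FROM a JOIN c ON a.customer_id = c.customer_id
    WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.user_id = a.user_id)
""").fetchall()

print(cross_rows)  # []
print(anti_rows)   # [('u2', 'ad2')]
```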

Hive Query is not working as expected

I am trying a left join in a Hive query, but it does not seem to work. It returns only the columns from the left table:
create table mb.spt_new_var as select distinct customer_id ,target from mb.spt_201603 A
left outer join mb.temp B
on (A.customer_id=B.cust_id);
I tried selecting a few records from table B using some random customer_id values from table A, and it returned records. But when I run the left join on table A, it returns only the columns from table A. Both IDs have the same data type (int). What could be the reason?
Sample Table A:
Customer_account_id target
12356 1
34245 0
12356 1
.... ..
Sample Table B:
Cust_id col1 col2 col3
12356 ..
12567 ..
24426 ..
...
Table A has some 1m records, while table B has some 30m records. There is possibility of some duplicate IDs in table A and Table B.

I'm a bit confused. Hive is returning the columns that you specify in the query:
select distinct a.customer_id, a.target
from mb.spt_201603 a left outer join
mb.temp b
on a.customer_id = b.cust_id;
If you want columns from the second table, you need to select them:
select distinct a.customer_id, a.target, b.col1, b.col2
from mb.spt_201603 a left outer join
mb.temp b
on a.customer_id = b.cust_id;
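
The point about the select list can be checked with a small sketch. It assumes SQLite (via Python's sqlite3) instead of Hive, uses shortened hypothetical table names (spt for mb.spt_201603, temp_b for mb.temp), and invents the values of col1.

```python
import sqlite3

# SQLite stand-in for Hive; table names are shortened and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE spt (customer_id INTEGER, target INTEGER);
    CREATE TABLE temp_b (cust_id INTEGER, col1 TEXT);
    INSERT INTO spt VALUES (12356, 1), (34245, 0), (12356, 1);
    INSERT INTO temp_b VALUES (12356, 'x'), (12567, 'y');
""")

# Only columns of the left table are listed, so only those come back,
# even though the join itself is performed.
narrow = conn.execute("""
    SELECT DISTINCT a.customer_id, a.target
    FROM spt a LEFT OUTER JOIN temp_b b ON a.customer_id = b.cust_id
""")
narrow_width = len(narrow.description)

# Listing b.col1 explicitly brings the right-hand column into the result;
# customers without a match get NULL (None in Python).
wide = conn.execute("""
    SELECT DISTINCT a.customer_id, a.target, b.col1
    FROM spt a LEFT OUTER JOIN temp_b b ON a.customer_id = b.cust_id
    ORDER BY a.customer_id
""")
wide_width = len(wide.description)
wide_rows = wide.fetchall()

print(narrow_width, wide_width)  # 2 3
print(wide_rows)  # [(12356, 1, 'x'), (34245, 0, None)]
```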

The Difference Inner Join Query

I'm just curious. Say I have table a and table b.
Query 1:
SELECT * FROM table a INNER JOIN table b ON table a.id = table b.id
Query 2:
SELECT * FROM table b INNER JOIN table a ON table b.id = table a.id
What is the difference between these two queries?
Thank you

With INNER JOIN, both queries return the same set of rows. The only difference is the order of the columns when SELECT * is used, i.e. when the columns are not listed explicitly.
SELECT *
FROM table a
INNER JOIN table b
ON table a.id = table b.id
returns the columns from table a followed by the columns from table b.
SELECT *
FROM table b
INNER JOIN table a
ON table b.id = table a.id
returns the columns from table b followed by the columns from table a.
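
A short sketch of this behaviour, assuming SQLite via Python's sqlite3 and two hypothetical tables a and b with invented rows. The column names are read from the cursor description; the rows are compared after putting the columns back into a common order.

```python
import sqlite3

# Hypothetical tables and rows, used only to compare the two join orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, a_val TEXT);
    CREATE TABLE b (id INTEGER, b_val TEXT);
    INSERT INTO a VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO b VALUES (1, 'b1'), (3, 'b3');
""")

cur_ab = conn.execute("SELECT * FROM a INNER JOIN b ON a.id = b.id")
cols_ab = [d[0] for d in cur_ab.description]
rows_ab = cur_ab.fetchall()

cur_ba = conn.execute("SELECT * FROM b INNER JOIN a ON b.id = a.id")
cols_ba = [d[0] for d in cur_ba.description]
rows_ba = cur_ba.fetchall()

# Swap the two halves of each b-first row so both results share a layout.
rows_ba_reordered = [row[2:] + row[:2] for row in rows_ba]

print(cols_ab)  # ['id', 'a_val', 'id', 'b_val']
print(cols_ba)  # ['id', 'b_val', 'id', 'a_val']
print(sorted(rows_ab) == sorted(rows_ba_reordered))  # True
```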

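The rows of the second table are matched against the first one, so it can be better to put the smaller table in the second position.
Whether this affects performance depends on the database; many query optimizers choose the join order for inner joins themselves, and the result is the same either way.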

how to select only 1 unique data when joining two tables, if the table structure is one to many?

I am using an Oracle database and I have two tables.
table A
primary key = productid
table B
references productid of table A
primary key = imageid
flow:
each product should have 4 images stored in table B (mandatory)
problem:
some products have only 1, 2 or 3 images,
even though the 4-image rule is enforced at the code level.
Question:
How do I count the unique number of products that have images in table B? Because if I do
select count(*) from tableA join tableB on tableA.productid = tableB.productid
the count is inflated, because the relationship is one to many: one product has many images.
So let's say productID = 12345 has 4 images in table B. When I run my query it counts 4 for that product, when I only want it counted once. So how?

SELECT Count(DISTINCT TableA.productid)
FROM TableA
JOIN TableB ON TableA.productid = TableB.productid;

Do a sub query with where exists
select count(*) from tableA
where exists (select 1 from tableB where tableA.productid = tableB.productid)

SELECT COUNT(*) FROM
(
select A.productId from tableA A join tableB B on A.productid = B.productid
GROUP BY A.productId
HAVING COUNT(B.imageId) >= 1 ) T
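
The difference between counting joined rows and counting products can be sketched as follows, assuming SQLite via Python's sqlite3 rather than Oracle. The table and column names follow the question; the rows are invented.

```python
import sqlite3

# SQLite stand-in for Oracle; the product and image rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (productid INTEGER PRIMARY KEY);
    CREATE TABLE tableB (imageid INTEGER PRIMARY KEY, productid INTEGER);
    INSERT INTO tableA VALUES (12345), (222), (333);
    -- 12345 has 4 images, 222 has 1 image, 333 has none.
    INSERT INTO tableB VALUES (1, 12345), (2, 12345), (3, 12345),
                              (4, 12345), (5, 222);
""")

# COUNT(*) over the join counts one row per image, not per product.
joined_count = conn.execute("""
    SELECT COUNT(*) FROM tableA
    JOIN tableB ON tableA.productid = tableB.productid
""").fetchone()[0]

# COUNT(DISTINCT ...) collapses the repeated product ids.
distinct_count = conn.execute("""
    SELECT COUNT(DISTINCT tableA.productid) FROM tableA
    JOIN tableB ON tableA.productid = tableB.productid
""").fetchone()[0]

# EXISTS tests each product once, so no duplicates arise in the first place.
exists_count = conn.execute("""
    SELECT COUNT(*) FROM tableA
    WHERE EXISTS (SELECT 1 FROM tableB
                  WHERE tableA.productid = tableB.productid)
""").fetchone()[0]

print(joined_count, distinct_count, exists_count)  # 5 2 2
```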

Combining sql select and Count

I have two tables
A and B
A                    B
-----------------    -----------------
a_pk (int)           b_pk (int)
a_name (varchar)     a_pk (int)
                     b_name (varchar)
I could write a query
SELECT a.a_name, b.b_name
FROM a LEFT OUTER JOIN b ON a.a_pk = b.a_pk
This returns a non-distinct list of everything in table a together with its joined table b data. Values from a are repeated wherever several b records share the same a_pk value.
What I want instead is the full list of a_name values from table A, plus a column holding the COUNT of the joined rows in table B.
So if a_pk = 1 and a_name = test, and table b has 5 records with an a_pk value of 1, my result set would be
a_name b_count
------ -------
test 5

The query should look like this:
SELECT
a.a_name,
(
SELECT Count(b.b_pk)
FROM b
Where b.a_pk = a.a_pk
) as b_count
FROM a

SELECT a.a_name, COUNT(b.b_pk) AS b_count
FROM
A a
LEFT JOIN B b
ON a.a_pk = b.a_pk
GROUP BY a.a_pk, a.a_name

SELECT
a.a_name,
(
SELECT COUNT(1)
FROM B b
WHERE b.a_pk = a.a_pk
) AS b_count
FROM A a
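
A minimal sketch comparing the approaches, assuming SQLite via Python's sqlite3 and invented rows (one name with five b records, one with none). It shows that a correlated subquery and a LEFT JOIN with COUNT(b.b_pk) both keep names that have no b rows, while an inner join with COUNT(*) drops them.

```python
import sqlite3

# Hypothetical data: 'test' has 5 rows in b, 'empty' has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (a_pk INTEGER PRIMARY KEY, a_name TEXT);
    CREATE TABLE b (b_pk INTEGER PRIMARY KEY, a_pk INTEGER, b_name TEXT);
    INSERT INTO a VALUES (1, 'test'), (2, 'empty');
    INSERT INTO b VALUES (1, 1, 'b1'), (2, 1, 'b2'), (3, 1, 'b3'),
                         (4, 1, 'b4'), (5, 1, 'b5');
""")

# Correlated subquery: one count per row of a, zero when nothing matches.
subquery_rows = conn.execute("""
    SELECT a.a_name,
           (SELECT COUNT(b.b_pk) FROM b WHERE b.a_pk = a.a_pk) AS b_count
    FROM a
    ORDER BY a.a_name
""").fetchall()

# LEFT JOIN + COUNT(b.b_pk): NULLs from unmatched rows are not counted.
left_join_rows = conn.execute("""
    SELECT a.a_name, COUNT(b.b_pk) AS b_count
    FROM a LEFT JOIN b ON a.a_pk = b.a_pk
    GROUP BY a.a_pk, a.a_name
    ORDER BY a.a_name
""").fetchall()

# Inner join + COUNT(*): names without any b rows disappear entirely.
inner_join_rows = conn.execute("""
    SELECT a.a_name, COUNT(*) AS b_count
    FROM a JOIN b ON a.a_pk = b.a_pk
    GROUP BY a.a_pk, a.a_name
    ORDER BY a.a_name
""").fetchall()

print(subquery_rows)    # [('empty', 0), ('test', 5)]
print(left_join_rows)   # [('empty', 0), ('test', 5)]
print(inner_join_rows)  # [('test', 5)]
```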
