Multiple IN subqueries in WHERE - sql

I am facing issues when trying to translate the following query from Impala to Hive 1.1 on Cloudera 5.8.
SELECT *
FROM
table1 t1,table2 t2
WHERE concat(t1.field1, t1.field2) IN
(SELECT concat(T3.field1, T3.field2)
FROM table3 T3
WHERE T3.field3 = 'value')
AND concat(t1.field3, t1.field4) IN
(SELECT concat(T3.field1, T3.field2)
FROM table3 T3
WHERE T3.field3 = 'value')
AND t1.some_field = t2.some_field
The error I get here states that I can't do multiple subqueries in the where clause.
Only 1 SubQuery expression is supported.
I have tried working around this issue with UNION, but this version only supports UNION ALL. I am also not sure how I could use a join here to fix it.
I would appreciate suggestions on how to rewrite this query so it produces the expected result without throwing errors.

Using Joins and CTE:
with s3 as (SELECT T3.field1, T3.field2
FROM table3 T3
WHERE T3.field3 = 'value')
SELECT *
FROM
table1 t1
inner join table2 t2 on t1.some_field = t2.some_field
left semi join s3 s3a on t1.field1=s3a.field1
and t1.field2=s3a.field2
left semi join s3 s3b on t1.field3=s3b.field1
and t1.field4=s3b.field2

The Hive documentation says that you can use a CTE.
https://cwiki.apache.org/confluence/display/Hive/Common+Table+Expression
Can you try this?
WITH firstConcatResult AS (
SELECT * FROM
table1 t1,table2 t2
WHERE
-- first concat
)
SELECT * FROM firstConcatResult f
WHERE
-- other concat

I would use exists and proper join syntax:
SELECT *
FROM table1 t1 JOIN
table2 t2
ON t1.some_field = t2.some_field
WHERE EXISTS (SELECT 1
FROM table3 T3
WHERE T3.field3 = 'value' AND
T3.field1 = t1.field1 AND t3.field2 = t1.field2
) AND
EXISTS (SELECT 1
FROM table3 T3
WHERE T3.field3 = 'value' AND
T3.field1 = t1.field3 AND t3.field2 = t1.field4
);
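A side note on why comparing the columns one by one (as in the EXISTS and semi join answers) can be safer than comparing concatenated strings. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for Hive (SQLite spells concat as ||), and the id column and sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, field1 TEXT, field2 TEXT);
    CREATE TABLE table3 (field1 TEXT, field2 TEXT, field3 TEXT);
    -- ('ab','c') and ('a','bc') both concatenate to 'abc'
    INSERT INTO table1 VALUES (1, 'ab', 'c'), (2, 'a', 'bc'), (3, 'x', 'y');
    INSERT INTO table3 VALUES ('ab', 'c', 'value');
""")

# Original style: compare concatenated strings with IN.
concat_in = """
    SELECT t1.id FROM table1 t1
    WHERE (t1.field1 || t1.field2) IN
          (SELECT t3.field1 || t3.field2 FROM table3 t3
           WHERE t3.field3 = 'value')
    ORDER BY t1.id
"""

# Rewritten style: compare each column separately with EXISTS.
exists_q = """
    SELECT t1.id FROM table1 t1
    WHERE EXISTS (SELECT 1 FROM table3 t3
                  WHERE t3.field3 = 'value'
                    AND t3.field1 = t1.field1
                    AND t3.field2 = t1.field2)
    ORDER BY t1.id
"""

concat_ids = [row[0] for row in conn.execute(concat_in)]
exists_ids = [row[0] for row in conn.execute(exists_q)]
print(concat_ids)  # [1, 2] -> row 2 matches only because 'a'+'bc' == 'ab'+'c'
print(exists_ids)  # [1]
```

So the two forms are only interchangeable when the concatenation cannot produce such collisions in your data.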

Related

How do I look for non-matching values across 3 SQL tables?

I'm looking to do what I believe is a double-nested check across three tables, but have no idea how to do so.
I have Table1, Table2, and Table3.
All are tied by an ID, with a "Longform" and "Shortform" in Table1.
I'm trying to find:
Entries whose IDs appear in Table2 that have the same Longform as those in Table3, but don't share the same Shortform.
This is about as far as I've gotten:
SELECT T2.Longform,T2.Shortform FROM(
SELECT Table1.Longform,Table1.Shortform,Table1.ID FROM OuterTable1.Table1
LEFT JOIN OuterTable2.Table2 on Table1.ID = Table2.ID
WHERE Table2.ID IS NOT NULL) T2
;
I know I'm probably going to have to do another nested select, or a join, on Outertable3.Table3 but I'm not sure which... Or where...
Any help appreciated as always.
Try the following:
Select * from
(
Select T1.*
from T2
inner join T1
on T1.ID = T2.ID
) as Tab
inner join
(
Select T1.*
from T3
inner join T1
on T1.ID = T3.ID
) as Tab2
on Tab.id = Tab2.id
and Tab.Longform = Tab2.Longform
and Tab.Shortform <> Tab2.Shortform
To get the longform, join table1 to table2 or table3. Then use EXISTS with a subquery to check whether the IDs in table1 are different while the longform is equal.
SELECT *
FROM table2 t21
INNER JOIN table1 t11
ON t11.id = t21.id
WHERE EXISTS (SELECT *
FROM table3 t32
INNER JOIN table1 t12
ON t12.id = t32.id
WHERE t12.id <> t11.id
AND t12.longform = t11.longform);
Assuming ID is unique in all three tables
Select t2.id,t2.shortform, t1.shortform AS shortformTab1, t2.longform
FROM table2 t2
JOIN table3 t3
ON t2.id = t3.id AND t2.longform = t3.longform
JOIN table1 t1
ON t2.id = t1.id AND t2.shortform != t1.shortform
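To make the requirement concrete, here is a small runnable sketch using Python's sqlite3 module. It assumes one reading of the question: Longform and Shortform live only in table1, table2 and table3 just hold IDs, and we want table2 entries whose longform also appears for a table3 entry with a different shortform. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, longform TEXT, shortform TEXT);
    CREATE TABLE table2 (id INTEGER);
    CREATE TABLE table3 (id INTEGER);
    INSERT INTO table1 VALUES
        (1, 'International Business Machines', 'IBM'),
        (2, 'International Business Machines', 'I.B.M.'),
        (3, 'Portable Document Format', 'PDF'),
        (4, 'Portable Document Format', 'PDF'),
        (5, 'Structured Query Language', 'SQL');
    INSERT INTO table2 VALUES (1), (3), (5);
    INSERT INTO table3 VALUES (2), (4);
""")

query = """
    SELECT a.id, a.longform, a.shortform, b.shortform AS other_shortform
    FROM table2 t2
    JOIN table1 a ON a.id = t2.id                 -- entries listed in table2
    JOIN table1 b ON b.longform = a.longform      -- same longform ...
                 AND b.shortform <> a.shortform   -- ... different shortform
    JOIN table3 t3 ON t3.id = b.id                -- and that entry is in table3
    ORDER BY a.id
"""
mismatches = conn.execute(query).fetchall()
print(mismatches)
# [(1, 'International Business Machines', 'IBM', 'I.B.M.')]
```

Only ID 1 is reported: ID 3 has a table3 counterpart with the same shortform, and ID 5 has no counterpart at all.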

Transform select count(*) inside a inner join

My problem here is that I'm modifying an existing query and I cannot use count(*) in the query.
I have to use inner join subqueries.
What I need to "transform" into my inner join is like this (this works):
SELECT count(distinct t1.id)
FROM table1 t1
WHERE t1.column1 = 'value1' AND
t2.column2 = 'value2' AND
EXISTS(select 1 from table2 t2 where t2.id = t1.id)
My global query looks like this:
SELECT [many many column]
FROM table2 t2
INNER JOIN [...]
LEFT OUTER JOIN [...]
--[I NEED MY COUNT HERE, see below for example]
WHERE [some conditions are true]
ORDER BY [some column]
What I found to help me is something like this:
SELECT [many many column], myJoin.Count
FROM table2 t2
INNER JOIN (
SELECT tt2.id, count(distinct tt2.id) as Count
FROM table2 tt2
WHERE EXISTS (SELECT 1 FROM table1 tt1 where tt1.id = tt2.id)
GROUP BY tt2.id) myJoin
on t2.id = myJoin.id;
See what I'm trying to achieve? I need to count the ids, joining 2 tables, but I can't have a count in my main query, because I can't possibly copy-paste all the "group by" conditions that would go with it.
I'm on SQL Server.
If I find the answer I will come back and post it.
Thanks for any advice/tricks about this.
How about the following:
SELECT table2.*, TopQ.MyCount
FROM (
SELECT t2.id, myJoin.MyCount
FROM table2 t2
INNER JOIN (
SELECT tt2.id, count(distinct tt2.id) as MyCount
FROM table2 tt2
WHERE EXISTS
(SELECT 1 FROM table1 tt1 where tt1.id = tt2.id)
GROUP BY tt2.id) AS myJoin
on t2.id = myJoin.id
)AS TopQ
INNER JOIN table2 ON TopQ.id = table2.id
I came across this:
select count(distinct t1.id) over (partition by t1.aColumn) as myCount,
[many many column]
from table2 t2
inner join table1 t1 on [someConditions] = value1 and
[someConditions] = value2 and
t2.id = t1.id;
I get the same results as the first select I posted in my question, without adding a "group by" anywhere or a lot of inner joins that I'm not that familiar with. I'm going to stick with this solution.
Thanks!
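Another way to get the count next to every row without touching the main query's grouping is to compute it once in a derived table and cross join it. The sketch below runs on Python's sqlite3 module rather than SQL Server, assumes both filters apply to table1, and uses a made-up name column and sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, column1 TEXT, column2 TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES
        (1, 'value1', 'value2'),
        (2, 'value1', 'value2'),
        (3, 'value1', 'other'),
        (4, 'value1', 'value2');   -- id 4 has no row in table2
    INSERT INTO table2 VALUES (1, 'a'), (1, 'b'), (2, 'c'), (3, 'd');
""")

query = """
    SELECT t2.id, t2.name, c.my_count
    FROM table2 t2
    CROSS JOIN (
        -- one-row derived table holding the count from the question
        SELECT COUNT(DISTINCT t1.id) AS my_count
        FROM table1 t1
        WHERE t1.column1 = 'value1'
          AND t1.column2 = 'value2'
          AND EXISTS (SELECT 1 FROM table2 x WHERE x.id = t1.id)
    ) c
    ORDER BY t2.name
"""
rows = conn.execute(query).fetchall()
print(rows)
# [(1, 'a', 2), (1, 'b', 2), (2, 'c', 2), (3, 'd', 2)]
```

Because the derived table returns exactly one row, the cross join does not multiply the rows of the main query.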

taking a join with output of other sql

I have 2 tables as follows T1 and T2.
T1 has one field as A and T2 has one field B.
Now I want to do the following: for each value of T1.A, I want to join with T2.B.
Something like:
select * from T1, (select * from T2 where T2.B = T1.A)
Is this correct? When I try this I get an error saying T1.A is an invalid identifier.
I know that I can do select * from T1,T2 where T1.A = T2.B
But my use case is very complex. The query (select * from T2 where T2.B = T1.A) is very complex.
So how do I go ahead with this?
You just need to JOIN the tables:
select *
from T1
inner join T2
on T2.B = T1.A
If you need help learning JOIN syntax, here is a great visual explanation of joins.
I used an INNER JOIN, which will return the rows that match between T1 and T2. You might need to use a LEFT JOIN, which will return all rows in T1 even if there is no matching row in T2.
If you have another query to select from, then you can use a subquery:
select *
from T1
inner join
(
-- place your query here
select *
from T2
) T2
on T2.B = T1.A
If your subquery is only returning one column, then you could use:
select t1.*, (select t2.col1 from T2 t2 where t2.B = t1.A)
from T1 t1
Unless I'm mistaken, can't you just use JOIN:
select *
from t1
join t2 on t1.field = t2.field
Good luck.
You can do
select *
from T1
inner join T2
on T2.B = T1.A
as other people have said.
However, this "ON" version is preferred only for readability.
You can also go ahead and use
select * from T1,T2 where T1.A = T2.B
The optimizer will figure it out and do the exact same thing,
as the queries are equivalent.
You can go ahead and use it, as there is nothing wrong with it.
select * from first_table inner join second_table on first_table.X = second_table.Y
select T1.A, (select T2.B from T2 where T2.B = T1.A) from T1
Will this help?
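Here is a small runnable sketch of the derived-table approach, using Python's sqlite3 module. The payload column and the payload > 10 filter are hypothetical stand-ins for the "very complex" query; the point is that the correlation T2.B = T1.A moves out of the subquery and into the ON clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (A INTEGER);
    CREATE TABLE T2 (B INTEGER, payload INTEGER);
    INSERT INTO T1 VALUES (1), (2), (3);
    INSERT INTO T2 VALUES (1, 5), (2, 20), (3, 30), (4, 40);
""")

query = """
    SELECT T1.A, sub.payload
    FROM T1
    INNER JOIN (
        -- stand-in for the complex query; it does not mention T1 at all
        SELECT B, payload
        FROM T2
        WHERE payload > 10
    ) sub
      ON sub.B = T1.A        -- the link to T1 lives here instead
    ORDER BY T1.A
"""
joined = conn.execute(query).fetchall()
print(joined)  # [(2, 20), (3, 30)]
```

The subquery is evaluated as an independent result set, so it cannot see T1.A; that is the likely reason for the invalid identifier error in the original attempt.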

Using an inner join with subqueries in an update syntax

I am trying to use an inner join in an update statement with a subquery. Can you help me out with the syntax, please? Also, how do you use the AS clause for an alias in SQL Server?
The following is what I am trying to do:
Update Table1
inner join table2
set table1.value1 = (select table2.value1 where table1.value1 ....)
Any ideas?
If you need to use a subquery to perform the UPDATE you can do it this way:
UPDATE t1
SET t1.value = t2.value
FROM Table1 t1
JOIN
(
SELECT id, value
FROM table2
) t2
ON t1.id = t2.id
One way is to alias the table:
update t1
set t1.value1 = t2.value1
from table1 as t1
join table2 as t2
on t1.id = t2.t1_id
You should try
UPDATE table1 SET t1.value1 = t2.value2
FROM table1 t1
INNER JOIN table2 t2
ON t1.field1 = t2.field2
UPDATE Table1 t1
INNER JOIN (
SELECT id, value
FROM table2
) t2 USING(id)
SET t1.value = t2.value
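If the UPDATE ... FROM / UPDATE ... JOIN forms above are not available in your engine, a correlated subquery is a more portable alternative. This is a sketch run on Python's sqlite3 module, not SQL Server, with made-up rows; the EXISTS guard keeps rows without a match from being set to NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, value1 TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, value1 TEXT);
    INSERT INTO table1 VALUES (1, 'old1'), (2, 'old2'), (3, 'old3');
    INSERT INTO table2 VALUES (1, 'new1'), (3, 'new3');
""")

conn.execute("""
    UPDATE table1
    SET value1 = (SELECT t2.value1
                  FROM table2 t2
                  WHERE t2.id = table1.id)
    -- only touch rows that actually have a match in table2
    WHERE EXISTS (SELECT 1 FROM table2 t2 WHERE t2.id = table1.id)
""")

updated = conn.execute(
    "SELECT id, value1 FROM table1 ORDER BY id").fetchall()
print(updated)  # [(1, 'new1'), (2, 'old2'), (3, 'new3')]
```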

Sql Server : How to use an aggregate function like MAX in a WHERE clause

I want to get the maximum value for this record. Please help me:
SELECT rest.field1
FROM mastertable AS m
INNER JOIN (
SELECT t1.field1 field1,
t2.field2
FROM table1 AS T1
INNER JOIN table2 AS t2 ON t2.field = t1.field
WHERE t1.field3=MAX(t1.field3)
-- ^^^^^^^^^^^^^^ Help me here.
) AS rest ON rest.field1 = m.field
As you've noticed, the WHERE clause doesn't allow you to use aggregates in it. That's what the HAVING clause is for.
HAVING t1.field3=MAX(t1.field3)
You could use a sub query...
WHERE t1.field3 = (SELECT MAX(st1.field3) FROM table1 AS st1)
But I would actually move this out of the where clause and into the join statement, as an AND for the ON clause.
The correct way to use max in the having clause is by performing a self join first:
select t1.a, t1.b, t1.c
from table1 t1
join table1 t1_max
on t1.id = t1_max.id
group by t1.a, t1.b, t1.c, t1.date
having t1.date = max(t1_max.date)
The following is how you would join with a subquery:
select t1.a, t1.b, t1.c
from table1 t1
where t1.date = (select max(t1_max.date)
from table1 t1_max
where t1.id = t1_max.id)
Be sure to create a single dataset before using an aggregate when dealing with a multi-table join:
select t1.id, t1.date, t1.a, t1.b, t1.c
into #dataset
from table1 t1
join table2 t2
on t1.id = t2.id
join table2 t3
on t1.id = t3.id
select d.a, d.b, d.c
from #dataset d
join #dataset d_max
on d.id = d_max.id
group by d.a, d.b, d.c, d.date
having d.date = max(d_max.date)
Sub query version:
select t1.id, t1.date, t1.a, t1.b, t1.c
into #dataset
from table1 t1
join table2 t2
on t1.id = t2.id
join table2 t3
on t1.id = t3.id
select a, b, c
from #dataset d
where d.date = (select max(d_max.date)
from #dataset d_max
where d.id = d_max.id)
SELECT t1.field1
FROM mastertable as m
INNER JOIN table1 as t1 on t1.field1 = m.field
INNER JOIN table2 as t2 on t2.field = t1.field
WHERE t1.field3 = (SELECT MAX(field3) FROM table1)
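For reference, here is a runnable sketch of the two subquery patterns (overall maximum and maximum per group), using Python's sqlite3 module in place of SQL Server. The grp column and the sample rows are hypothetical, added only to show the correlated variant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, grp TEXT, field1 TEXT, field3 INTEGER);
    INSERT INTO table1 VALUES
        (1, 'g1', 'a', 10),
        (2, 'g1', 'b', 30),
        (3, 'g2', 'c', 25),
        (4, 'g2', 'd', 20);
""")

# Rows holding the overall maximum of field3 (scalar subquery in WHERE).
overall = [row[0] for row in conn.execute("""
    SELECT t1.field1
    FROM table1 t1
    WHERE t1.field3 = (SELECT MAX(field3) FROM table1)
    ORDER BY t1.field1
""")]

# Rows holding the maximum of field3 within their own group
# (correlated subquery: the inner query refers to the outer row).
per_group = [row[0] for row in conn.execute("""
    SELECT t1.field1
    FROM table1 t1
    WHERE t1.field3 = (SELECT MAX(m.field3)
                       FROM table1 m
                       WHERE m.grp = t1.grp)
    ORDER BY t1.field1
""")]

print(overall)    # ['b']
print(per_group)  # ['b', 'c']
```

In both cases the aggregate sits inside its own SELECT, so the outer WHERE only ever compares against a single value.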
Yes, you need to use a HAVING clause after the GROUP BY clause.
WHERE only filters the data on simple conditions,
whereas GROUP BY followed by HAVING groups the data and then filters it on the basis of an aggregate function.
But it's still giving an error message in Query Builder. I am using SqlServerCe 2008.
SELECT Products_Master.ProductName, Order_Products.Quantity, Order_Details.TotalTax, Order_Products.Cost, Order_Details.Discount,
Order_Details.TotalPrice
FROM Order_Products INNER JOIN
Order_Details ON Order_Details.OrderID = Order_Products.OrderID INNER JOIN
Products_Master ON Products_Master.ProductCode = Order_Products.ProductCode
HAVING (Order_Details.OrderID = (SELECT MAX(OrderID) AS Expr1 FROM Order_Details AS mx1))
I replaced WHERE with HAVING as suggested by @powerlord, but it still shows an error.
Error parsing the query. [Token line number = 1, Token line offset = 371, Token in error = SELECT]