SQL Server OUTER JOIN - sql-server-2012

How do I find the non-matching records from both tables?

You can use NOT EXISTS to find the Ids in each table that have no match in the other, and combine the two results with UNION ALL.
Query
SELECT t1.[Id] FROM [table-1] t1
WHERE NOT EXISTS(
SELECT 1 FROM [table-2] t2
WHERE t1.[Id] = t2.[Id]
)
UNION ALL
SELECT t2.[Id] FROM [table-2] t2
WHERE NOT EXISTS(
SELECT 1 FROM [table-1] t1
WHERE t1.[Id] = t2.[Id]
);
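To see the NOT EXISTS / UNION ALL pattern run end to end, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The sample rows (A to E) and the name values are hypothetical and not taken from the question; only the query shape follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE [table-1] (Id TEXT, name TEXT);
    CREATE TABLE [table-2] (Id TEXT, name TEXT);
    -- Hypothetical rows: A, B, C exist in both tables; D and E in only one.
    INSERT INTO [table-1] VALUES ('A', 'a1'), ('B', 'b1'), ('C', 'c1'), ('D', 'd1');
    INSERT INTO [table-2] VALUES ('A', 'a2'), ('B', 'b2'), ('C', 'c2'), ('E', 'e2');
""")

rows = conn.execute("""
    SELECT t1.Id FROM [table-1] t1
    WHERE NOT EXISTS (SELECT 1 FROM [table-2] t2 WHERE t1.Id = t2.Id)
    UNION ALL
    SELECT t2.Id FROM [table-2] t2
    WHERE NOT EXISTS (SELECT 1 FROM [table-1] t1 WHERE t1.Id = t2.Id)
""").fetchall()

# Sort in Python because UNION ALL gives no ordering guarantee.
unmatched = sorted(r[0] for r in rows)
print(unmatched)
```

With these rows, only D (missing from table-2) and E (missing from table-1) come back.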

Another way with TOP 1 WITH TIES and COUNT OVER:
SELECT TOP 1 WITH TIES *
FROM (
SELECT *
FROM [table-1]
UNION ALL
SELECT *
FROM [table-2]
) u
ORDER BY COUNT(*) OVER (PARTITION BY Id ORDER BY Id)
Output:
Id name
D ...
E ...
F ...
G ...
H ...
I ...
J ...
K ...
COUNT(*) OVER (PARTITION BY Id ORDER BY Id) gives 1 for an Id that occurs once in the combined set and more than 1 for an Id that occurs in both tables. Putting that expression in ORDER BY and adding TOP 1 WITH TIES leaves only the rows with the minimal count.
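The counting idea can be checked with a small sketch. It uses Python's sqlite3 module (window functions need SQLite 3.25 or later) and hypothetical sample rows. SQLite has no TOP ... WITH TIES, so the sketch swaps in a plain filter on the count being 1, which is a different technique from the answer's ORDER BY trick.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE [table-1] (Id TEXT, name TEXT);
    CREATE TABLE [table-2] (Id TEXT, name TEXT);
    -- Hypothetical rows: A, B, C exist in both tables; D and E in only one.
    INSERT INTO [table-1] VALUES ('A', 'a1'), ('B', 'b1'), ('C', 'c1'), ('D', 'd1');
    INSERT INTO [table-2] VALUES ('A', 'a2'), ('B', 'b2'), ('C', 'c2'), ('E', 'e2');
""")

# Step 1: count how often each Id occurs in the combined set.
counted = conn.execute("""
    SELECT Id, name, COUNT(*) OVER (PARTITION BY Id) AS cnt
    FROM (SELECT * FROM [table-1] UNION ALL SELECT * FROM [table-2]) u
""").fetchall()
counts = {row[0]: row[2] for row in counted}

# Step 2: keep only the rows whose Id occurs exactly once.
unmatched = conn.execute("""
    SELECT Id, name
    FROM (
        SELECT Id, name, COUNT(*) OVER (PARTITION BY Id) AS cnt
        FROM (SELECT * FROM [table-1] UNION ALL SELECT * FROM [table-2]) u
    ) c
    WHERE cnt = 1
    ORDER BY Id
""").fetchall()
print(counts, unmatched)
```

One difference worth knowing: if every Id has a match, the minimal count is 2, so TOP 1 WITH TIES would return all rows, while the cnt = 1 filter returns none.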
Another way with FULL OUTER JOIN:
SELECT COALESCE(Id1,Id2) Id,
COALESCE(name1,name2) name
FROM (
SELECT t1.Id Id1,
t1.[name] name1,
t2.Id Id2,
t2.[name] name2
FROM [table-1] t1
FULL OUTER JOIN [table-2] t2
ON t1.Id = t2.Id
WHERE t1.Id IS NULL OR t2.ID IS NULL
) as t
Same output (in a different order).

Related

LEFT Join on a Subquery with specific criteria

I have two tables that I am trying to JOIN
table1
----------------------------
Id Name Num
123X Apple 17
table2
-------------------------------------------------
id EndDt SomeVal
123X 10/1/2021 xxx
123X 3/1/2022 yyy
I am attempting to select from table1 a and LEFT JOIN table2 b on a.id = b.id. However, I only want the table2 row with the MAX(EndDt) for each id.
Select a.*, b.SomeVal
from table1 a
LEFT OUTER JOIN table2 b on a.id=b.id -- and b.MAX(EndDt)
Is something like that doable?
There are a few ways you can do this. I make some assumptions on your data though.
Use a LEFT JOIN with a subquery:
SELECT T1.*,
sq.SomeVal
FROM dbo.Table1 T1
LEFT JOIN (SELECT ROW_NUMBER() OVER (PARTITION BY t2.Id ORDER BY t2.EndDt DESC) AS RN,
t2.Id,
t2.SomeVal
FROM dbo.Table2 T2) sq ON T1.Id = sq.Id
AND sq.RN = 1;
Use APPLY and TOP:
SELECT T1.*,
sq.SomeVal
FROM dbo.Table1 T1
OUTER APPLY (SELECT TOP (1)
t2.Id,
t2.SomeVal
FROM dbo.Table2 T2
WHERE T2.Id = T1.Id
ORDER BY T2.EndDt DESC) sq;
Use a CTE and get the "top 1" row per group:
WITH CTE AS(
SELECT T1.*,
T2.SomeVal,
ROW_NUMBER() OVER (PARTITION BY T1.ID ORDER BY T2.EndDt DESC) AS RN
FROM dbo.Table1 T1
LEFT JOIN dbo.Table2 T2 ON T1.Id = T2.Id)
SELECT *
FROM CTE
WHERE RN = 1;
Use TOP (1) WITH TIES:
SELECT TOP (1) WITH TIES
T1.*,
T2.SomeVal
FROM dbo.Table1 T1
LEFT JOIN dbo.Table2 T2 ON T1.Id = T2.Id
ORDER BY ROW_NUMBER() OVER (PARTITION BY T1.ID ORDER BY T2.EndDt DESC) ASC;
Note that options 3 and 4 won't work as expected if ID is not unique in the table Table1 (hence my assumptions about your data).
I would recommend using the windowed ROW_NUMBER function to pick the latest table2 row per id first, and then joining to that subquery.
;WITH cte AS (
SELECT *, [Row] = ROW_NUMBER() OVER (PARTITION BY b.Id ORDER BY b.EndDt DESC)
FROM table2 b
)
SELECT a.*, cte.SomeVal
FROM table1 a
LEFT JOIN cte ON a.id = cte.id AND cte.[Row] = 1
For a single value, use a correlated sub-query:
SELECT
a.id,
a.name,
a.num,
(
SELECT TOP 1 SomeVal
FROM table2 As b
WHERE b.id = a.id
ORDER BY b.EndDt DESC
) As SomeVal
FROM
table1 a
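As a sanity check, the ROW_NUMBER approach and the correlated sub-query can be compared on the question's sample data. This is a sketch only: it uses Python's sqlite3 module (SQLite 3.25 or later for window functions) instead of SQL Server, writes LIMIT 1 where T-SQL uses TOP 1, stores EndDt as ISO text so that string ordering matches date ordering, and adds a hypothetical second table1 row (456Y) with no table2 match to show the LEFT JOIN behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (Id TEXT, Name TEXT, Num INTEGER);
    CREATE TABLE table2 (id TEXT, EndDt TEXT, SomeVal TEXT);
    -- '456Y' is a hypothetical row without any table2 match.
    INSERT INTO table1 VALUES ('123X', 'Apple', 17), ('456Y', 'Pear', 5);
    INSERT INTO table2 VALUES ('123X', '2021-10-01', 'xxx'),
                              ('123X', '2022-03-01', 'yyy');
""")

# Number the table2 rows per id, newest first, then join only row 1.
via_row_number = conn.execute("""
    WITH cte AS (
        SELECT b.id, b.SomeVal,
               ROW_NUMBER() OVER (PARTITION BY b.id ORDER BY b.EndDt DESC) AS rn
        FROM table2 b
    )
    SELECT a.Id, a.Name, a.Num, cte.SomeVal
    FROM table1 a
    LEFT JOIN cte ON a.Id = cte.id AND cte.rn = 1
    ORDER BY a.Id
""").fetchall()

# Correlated sub-query: fetch the newest SomeVal for each table1 row.
via_subquery = conn.execute("""
    SELECT a.Id, a.Name, a.Num,
           (SELECT b.SomeVal FROM table2 b
            WHERE b.id = a.Id
            ORDER BY b.EndDt DESC
            LIMIT 1) AS SomeVal
    FROM table1 a
    ORDER BY a.Id
""").fetchall()
print(via_row_number)
```

Both queries keep the unmatched table1 row and return NULL (None in Python) for its SomeVal.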

How to Group By all fields nested tables in a Left Join query in BigQuery?

I have about 10 tables that I combine into one big nested table in successive rounds with the following query:
WITH R1 AS(
SELECT ANY_VALUE(Table1).*, ARRAY_AGG(( SELECT AS STRUCT Table2.* EXCEPT(ID))) AS Table2
FROM Table1 LEFT JOIN Table2 USING(ID)
GROUP BY Table1.ID),
R2 AS(
SELECT ANY_VALUE(R1).*, ARRAY_AGG(( SELECT AS STRUCT Table3.* EXCEPT(ID))) AS Table3
FROM R1 LEFT JOIN Table3 USING(ID)
GROUP BY R1.ID),
...
SELECT ANY_VALUE(R9).*, ARRAY_AGG(( SELECT AS STRUCT Table10.* EXCEPT(ID))) AS Table10
FROM R9 LEFT JOIN Table10 USING(ID)
The thing is that, for example, my first table can contain two records with the same ID but different values in other fields. I want to treat them as two distinct records, and therefore group by all the fields of the table when I join.
Then I want to do the same with all the intermediate tables (the R tables in the query), so that I am able to group by all the fields of the nested tables.
How can I do it easily?
I tried GROUP BY Table1.* but it doesn't work...
Thank you in advance
Try to_json_string:
...
FROM Table1 t1
...
GROUP BY to_json_string(t1)
You seem to want something like this:
select *
from table1 t1 left join
(select t2.*
from table2 t2
where true
qualify row_number() over (partition by t2.id order by t2.id) = 1
) t2
using (id)
This uses qualify instead of group by to fetch one row.
If you don't want all rows from table1, you can whittle them down as well:
select *
from (select t1.*
from table1 t1
where true
qualify row_number() over (partition by id, col1, col2 order by id) = 1
) t1 left join
(select t2.*
from table2 t2
where true
qualify row_number() over (partition by t2.id order by t2.id) = 1
) t2
using (id)
How to Group By all fields ...?
I tried GROUP BY Table1.* but it doesn't work...
Consider the example below:
SELECT ANY_VALUE(t1).*,
ARRAY_AGG(( SELECT AS STRUCT t2.* EXCEPT(ID))) AS Table2
FROM Table1 t1 LEFT JOIN Table2 t2 USING(ID)
GROUP BY FORMAT('%t', t1)

where column in from another select results with limit (mysql/mariadb)

When I run this query, it returns all rows whose id exists in the select from table2:
SELECT * FROM table1 WHERE id in (
SELECT id FROM table2 where name ='aaa'
)
But when I add LIMIT or BETWEEN to the second select:
SELECT * FROM table1 WHERE id in (
SELECT id FROM table2 where name ='aaa' limit 4
)
it returns this error:
This version of MariaDB doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
You are using LIMIT without an ORDER BY. This is generally not recommended because that returns an arbitrary set of rows -- and those can change from one execution to another.
You can convert this to a JOIN -- fortunately. If id is not duplicated in table2:
SELECT t1.*
FROM table1 t1 JOIN
(SELECT t2.id
FROM table2 t2
WHERE t2.name = 'aaa'
LIMIT 4
) t2
USING (id);
If id can be duplicated in table2, then:
SELECT t1.*
FROM table1 t1 JOIN
(SELECT DISTINCT t2.id
FROM table2 t2
WHERE t2.name = 'aaa'
LIMIT 4
) t2
USING (id);
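The derived-table join can be tried out with a small sketch. It runs on Python's sqlite3 module, not MariaDB, so it shows only the query shape and not the MariaDB restriction itself. The rows and the label column are hypothetical, and an ORDER BY is added inside the derived table so that LIMIT 4 picks a deterministic set of ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, label TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES (1, 'r1'), (2, 'r2'), (3, 'r3'),
                              (4, 'r4'), (5, 'r5'), (6, 'r6');
    -- id 2 is duplicated on purpose; id 4 has a different name.
    INSERT INTO table2 VALUES (6, 'aaa'), (2, 'aaa'), (2, 'aaa'),
                              (5, 'aaa'), (1, 'aaa'), (3, 'aaa'), (4, 'bbb');
""")

rows = conn.execute("""
    SELECT t1.id
    FROM table1 t1
    JOIN (SELECT DISTINCT t2.id
          FROM table2 t2
          WHERE t2.name = 'aaa'
          ORDER BY t2.id      -- makes the LIMIT deterministic
          LIMIT 4
         ) picked
      ON picked.id = t1.id
    ORDER BY t1.id
""").fetchall()

ids = [r[0] for r in rows]
print(ids)
```

DISTINCT keeps the duplicated id 2 from producing two joined rows, and the four lowest matching ids (1, 2, 3, 5) are the ones selected.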
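Another fun way uses LIMIT in a scalar subquery:
SELECT t1.*
FROM table1 t1
WHERE id <= (SELECT t2.id
FROM table2 t2
WHERE t2.name = 'aaa'
ORDER BY t2.id
LIMIT 1 OFFSET 3
);
LIMIT is allowed in a scalar subquery, which is why ANY is not used here. Note that this compares against the fourth-smallest matching id, so it returns every table1 row with an id at or below that value, not only the matching ones.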
You can use an analytic function such as ROW_NUMBER() to return a single row from the subquery. I suppose this way avoids problems such as a too-many-rows error:
SELECT * FROM
(
SELECT t1.*,
ROW_NUMBER() OVER (ORDER BY t2.id DESC) AS rn
FROM table1 t1
JOIN table2 t2 ON t2.id = t1.id
WHERE t2.name ='aaa'
) t
WHERE rn = 1
P.S.: By the way, the id columns are expected to be the primary keys of your tables, aren't they?
Update (based on the need you described in the comment): consider using:
SELECT * FROM
(
SELECT j.*,
ROW_NUMBER() OVER (ORDER BY j.id DESC) AS rn2
FROM job_forum j
JOIN
( SELECT t2.id,
ROW_NUMBER() OVER (PARTITION BY t2.id ORDER BY t2.id DESC) AS rn1
FROM table2 t2
WHERE t2.name ='aaa' ) t2
ON t2.id = j.id -- a derived table cannot reference j, so the match moves to ON
WHERE t2.rn1 = 1
) jj
WHERE rn2 <= 10

Join data from two tables and top from table 2

I have these tables:
Table1:
id Name
1 Example1
2 Example2
Table2:
id Date..............
1 5.2.2014........
1 6.2.2014.........
1 6.2.2014........
2 16.1.2014.......
2 17.1.2014.......
I need to take id and Name from Table1, join on table1.id = table2.id, and take only the top 1 row per id from Table2.
Example:
id Name Date
1 Example1 5.2.2014
2 Example2 16.1.2014
Is it possible?
You can use row_number() to filter out all but the latest row per id:
select *
from (
select row_number() over (partition by id order by Date desc) as rn
, *
from Table2
) as t2
join Table1 as t1
on t1.id = t2.id
where t2.rn = 1 -- Only latest row
Well, a simple attempt would be
SELECT t1.*,
(SELECT TOP 1 t2.Date FROM Table2 t2 WHERE t2.ID = t1.ID ORDER BY t2.Date) t2Date
FROM Table1 t1
If you were using SQL Server, you could use ROW_NUMBER
Something like
;WITH Vals AS (
SELECT t1.ID,
t1.Name,
t2.Date,
ROW_NUMBER() OVER(PARTITION BY t1.ID ORDER BY t2.Date) RowID
FROM Table1 t1 LEFT JOIN
Table2 t2 ON t1.ID = t2.ID
)
SELECT *
FROM Vals
WHERE RowID = 1
Select t1.id, t1.name , MIN(t2.date)
From table1 t1
Inner Join table2 t2
On t1.id=t2.id
Group By t1.id, t1.name
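The GROUP BY / MIN variant is easy to verify against the expected output in the question. The sketch below uses Python's sqlite3 module and assumes the dates are stored as ISO text (for example '2014-02-05' for 5.2.2014) so that MIN picks the earliest one. The FirstDate alias is added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER, Name TEXT);
    CREATE TABLE Table2 (id INTEGER, Date TEXT);
    INSERT INTO Table1 VALUES (1, 'Example1'), (2, 'Example2');
    -- Dates from the question, rewritten as ISO text (an assumption).
    INSERT INTO Table2 VALUES (1, '2014-02-05'), (1, '2014-02-06'), (1, '2014-02-06'),
                              (2, '2014-01-16'), (2, '2014-01-17');
""")

earliest = conn.execute("""
    SELECT t1.id, t1.Name, MIN(t2.Date) AS FirstDate
    FROM Table1 t1
    INNER JOIN Table2 t2 ON t1.id = t2.id
    GROUP BY t1.id, t1.Name
    ORDER BY t1.id
""").fetchall()
print(earliest)
```

Because this is an inner join, a Table1 row with no Table2 rows would be dropped. A LEFT JOIN would keep it with a NULL date.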

Subquery with multiple select statements

I want to simplify a query that has multiple select statements inside 'not in' conditions.
Eg.
select id from tbl where
id not in (select id from table1) and
id not in (select id from table2) and
id not in (select id from table3)
Instead of repeating the same 'not in' condition on id, I need a subquery that checks multiple tables in one shot.
Please help.
Your query is better expressed as:
SELECT t.id
FROM tbl t
LEFT JOIN table1 t1 on t1.id = t.id
LEFT JOIN table2 t2 on t2.id = t.id
LEFT JOIN table3 t3 on t3.id = t.id
WHERE t1.id IS NULL AND t2.id IS NULL AND t3.id IS NULL
You could use a union, so you just have one in:
select id
from tbl
where id not in
(
select id from table1
union all select id from table2
union all select id from table3
)
Note: not in does not work well with nullable columns, but I assume id is not nullable here.
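The NULL caveat can be illustrated with a short sketch. It uses Python's sqlite3 module with hypothetical rows. The NOT EXISTS rewrite is an alternative brought in for comparison and is not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl (id INTEGER);
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER);
    INSERT INTO tbl VALUES (1), (2), (3);
    INSERT INTO table1 VALUES (1), (NULL);   -- note the NULL id
    INSERT INTO table2 VALUES (2);
""")

# NOT IN over the union: the NULL makes the test UNKNOWN for every
# non-matching id, so no row passes the WHERE clause.
not_in = conn.execute("""
    SELECT id FROM tbl
    WHERE id NOT IN (SELECT id FROM table1
                     UNION ALL
                     SELECT id FROM table2)
""").fetchall()

# NOT EXISTS over the same union is not affected by the NULL.
not_exists = conn.execute("""
    SELECT t.id FROM tbl t
    WHERE NOT EXISTS (
        SELECT 1
        FROM (SELECT id FROM table1
              UNION ALL
              SELECT id FROM table2) u
        WHERE u.id = t.id)
    ORDER BY t.id
""").fetchall()
print(not_in, not_exists)
```

With a NULL present, NOT IN returns nothing, while NOT EXISTS still returns id 3, the only id that is missing from both tables.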
Use UNION ALL, like this:
select f.FIRST_NAME
from farmer f
where f.ID in (
select v.ID from Village v where v.ID in (1,2)
union all
select s.ID from state s where s.ID in (3,4)
)