LEFT Join on a Subquery with specific criteria

LEFT Join on a Subquery with specific criteria - sql

I have two tables that I am trying to JOIN
table1
----------------------------
Id Name Num
123X Apple 17
table2
-------------------------------------------------
id EndDt SomeVal
123X 10/1/2021 xxx
123X 3/1/2022 yyy
I am attempting to Select from table1 a and LEFT JOIN table2 b on a.id = b.id - however, I want to only select on the id in table2 where MAX(EndDt)
Select a.*, b.SomeVal
from table1 a
LEFT OUTER JOIN table2 b on a.id=b.id // and b.MAX(EndDt)
Is something like that doable?

There are a few ways you can do this. I make some assumptions on your data though.
Use a LEFT JOIN with a subquery:
SELECT T1.*,
sq.SomeVal
FROM dbo.Table1 T1
LEFT JOIN (SELECT ROW_NUMBER() OVER (PARTITION BY t2.Id ORDER BY t2.EndDt DESC) AS RN,
t2.Id,
t2.SomeVal
FROM dbo.Table2 T2) sq ON T1.Id = T2.Id
AND T2.RN = 1;
Use APPLY and TOP:
SELECT T1.*,
sq.SomeVal
FROM dbo.Table1 T1
OUTER APPLY (SELECT TOP (1)
t2.Id,
t2.SomeVal
FROM dbo.Table2 T2
WHERE T2.Id = T1.Id
ORDER BY T2.EndDt DESC) sq;
Use a CTE and get the "top 1" row per group:
WITH CTE AS(
SELECT T1.*,
T2.SomeVal,
ROW_NUMBER() OVER (PARTITION BY T1.ID ORDER BY T2.MaxDt DESC) AS RN
FROM dbo.Table1 T1
LEFT JOIN dbo.Table2 T2 ON T1.Id = T2.Id)
SELECT *
FROM CT
WHERE RN = 1;
Use TOP (1) WITH TIES:
SELECT TOP (1) WITH TIES
T1.*,
T2.SomeVal
FROM dbo.Table1 T1
LEFT JOIN dbo.Table2 T2 ON T1.Id = T2.Id
ORDER BY ROW_NUMBER() OVER (PARTITION BY T1.ID ORDER BY T2.MaxDt DESC) ASC;
Note that options 3 and 4 won't work as expected if ID is not unique in the table Table1 (hence my assumptions about your data).

I would recommend using the windowed ROW_NUMBER function to take the max table2 first, and then join into that subquery.
;WITH cte AS (
SELECT *, [Row] = ROW_NUMBER() OVER (PARTITION BY b.Id ORDER BY b.EndDt DESC)
FROM table2 b
)
SELECT a.*, cte.SomeVal
FROM table1 a
LEFT JOIN cte ON a.id = cte.id AND cte.[Row] = 1

For a single value, use a correlated sub-query:
SELECT
a.id,
a.name,
a.num,
(
SELECT TOP 1 SomeValue
FROM table2 As b
WHERE b.id = a.id
ORDER BY b.EndDt DESC
) As SomeVal
FROM
table1 a

Related

How to Group By all fields nested tables in a Left Join query in BigQuery?

I have about 10 tables that I make one big nested tables by rounds with the following query:
R1 AS(
SELECT ANY_VALUE(Table1).*, ARRAY_AGG(( SELECT AS STRUCT Table2.* EXCEPT(ID))) AS Table2
FROM Table1 LEFT JOIN Table2 USING(ID)
GROUP BY Table1.ID),
R2 AS(
SELECT ANY_VALUE(R1).*, ARRAY_AGG(( SELECT AS STRUCT Table3.* EXCEPT(ID))) AS Table3
FROM R1 LEFT JOIN Table3 USING(ID)
GROUP BY R1.ID),
...
SELECT ANY_VALUE(R9).*, ARRAY_AGG(( SELECT AS STRUCT Table10.* EXCEPT(ID))) AS Table10
FROM R9 LEFT JOIN Table10 USING(ID)
The thing is that for example in my first table I can have two records with the same ID but some other fields will be different and I want to consider them as two distinct records and thus group by all the fields of the table while I join.
Then I want to do the same with all the "sub-table" (the R tables in the query), so I will able to group by all the fields of the nested tables.
How can I do it easily ?
I tried GROUP BY Table1.* but it doesn't work...
Thank you in advance

Try to_json_string:
...
FROM Table1 t1
...
GROUP BY to_json_string(t1)

You seem to want something like this:
select *
from table1 t1 left join
(select t2.*
from table2 t2
where true
qualify row_number() over (partition by t2.id order by t2.id) = 0
) t2
using (id)
This uses qualify instead of group by to fetch one row.
If you don't want all rows from from table1, you can whittle them down as well:
select *
from (select t1.*
from table1 t1
where true
qualify row_number() over (partition by id, col1, col2 order by id) = 1
) t1 left join
(select t2.*
from table2 t2
where true
qualify row_number() over (partition by t2.id order by t2.id) = 0
) t2
using (id)

How to Group By all fields ...?
I tried GROUP BY Table1.* but it doesn't work...
Consider below example
SELECT ANY_VALUE(t1).*,
ARRAY_AGG(( SELECT AS STRUCT t2.* EXCEPT(ID))) AS Table2
FROM Table1 t1 LEFT JOIN Table2 t2 USING(ID)
GROUP BY FORMAT('%t', t1)

where column in from another select results with limit (mysql/mariadb)

when i run this query returns all rows that their id exist in select from table2
SELECT * FROM table1 WHERE id in (
SELECT id FROM table2 where name ='aaa'
)
but when i add limit or between to second select :
SELECT * FROM table1 WHERE id in (
SELECT id FROM table2 where name ='aaa' limit 4
)
returns this error :
This version of MariaDB doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'

You are using LIMIT without an ORDER BY. This is generally not recommended because that returns an arbitrary set of rows -- and those can change from one execution to another.
You can convert this to a JOIN -- fortunately. If id is not duplicated in table2:
SELECT t1.*
FROM table1 t1 JOIN
(SELECT t2.id
FROM table2 t2
WHERE t2.name = 'aaa'
LIMIT 4
) t2
USING (id);
If id can be duplicated in table2, then:
SELECT t1.*
FROM table1 t1 JOIN
(SELECT DISTINCT t2.id
FROM table2 t2
WHERE t2.name = 'aaa'
LIMIT 4
) t2
USING (id);
Another fun way uses LIMIT:
SELECT t1.*
FROM table1 t1
WHERE id <= ANY (SELECT t2.id
FROM table2
WHERE t2.name = 'aaa'
ORDER BY t2.id
LIMIT 1 OFFSET 3
);
LIMIT is allowed in a scalar subquery.

You can use an analytic function such as ROW_NUMBER() in order to return one row from the subquery. I suppose, this way no problem would occur like raising too many rows issue :
SELECT * FROM
(
SELECT t1.*,
ROW_NUMBER() OVER (ORDER BY t2.id DESC) AS rn
FROM table1 t1
JOIN table2 t2 ON t2.id = t1.id
WHERE t2.name ='aaa'
) t
WHERE rn = 1
P.S.: Btw, id columns are expected to be primary keys of your tables, aren't they ?
Update ( depending on your need in the comment ) Consider using :
SELECT * FROM
(
SELECT j.*,
ROW_NUMBER() OVER (ORDER BY j.id DESC) AS rn2
FROM job_forum j
CROSS JOIN
( SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY t2.id ORDER BY t2.id DESC) AS rn1
FROM table2 t2
WHERE t2.name ='aaa'
AND t2.id = j.id ) t2
WHERE rn1 = 1
) jj
WHERE rn2 <= 10

SQL Server OUTER JOIN

How to find the non matching records from both the tables?

You can use NOT EXISTS to find out the not matching Id from both the tables and can combine it by UNION ALL.
Query
SELECT t1.[Id] FROM [table-1] t1
WHERE NOT EXISTS(
SELECT 1 FROM [table-2] t2
WHERE t1.[Id] = t2.[Id]
)
UNION ALL
SELECT t2.[Id] FROM [table-2] t2
WHERE NOT EXISTS(
SELECT 1 FROM [table-1] t1
WHERE t1.[Id] = t2.[Id]
);
Demo for reference

Another way with TOP 1 WITH TIES and COUNT OVER:
SELECT TOP 1 WITH TIES *
FROM (
SELECT *
FROM [table-1]
UNION ALL
SELECT *
FROM [table-2]
) u
ORDER BY COUNT(*) OVER (PARTITION BY Id ORDER BY Id)
Output:
Id name
D ...
E ...
F ...
G ...
H ...
I ...
J ...
K ...
COUNT(*) OVER (PARTITION BY Id ORDER BY Id) Gives 1 to unique row and >1 if there are duplicated Ids. If you put that to ORDER BY and add TOP 1 WITH TIES - that will left only Ids with minimal count.
Another way with FULL OUTER JOIN:
SELECT COALESCE(Id1,Id2) Id,
COALESCE(name1,name2) name
FROM (
SELECT t1.Id Id1,
t1.[name] name1,
t2.Id Id2,
t2.[name] name2
FROM [table-1] t1
FULL OUTER JOIN [table-2] t2
ON t1.Id = t2.Id
WHERE t1.Id IS NULL OR t2.ID IS NULL
) as t
Same output (with another order)

Oracle nested correlated subquery problem

Consider table1 and table2 with a one-to-many relationship (table1 is the master table and table2 is the detail table). I want to get records from table1 where some value ('XXX') is the value of the most recent record in table2 of the detail records correlated to table1. What I want to do is this:
select t1.pk_id
from table1 t1
where 'XXX' = (select a_col
from ( select a_col
from table2 t2
where t2.fk_id = t1.pk_id
order by t2.date_col desc)
where rownum = 1)
But, because the reference to table1 (t1) in the correlated subquery is two-levels deep, it pops up with an Oracle error (invalid id t1). I need to be able to rewrite this, but the one caveat is that only the where clause may be changed (i.e. the initial select and from must remain unchanged). Can it be done?

Here's a different analytic approach:
select t1.pk_id
from table1 t1
where 'XXX' = (select distinct first_value(t2.a_col)
over (order by t2.date_col desc)
from table2 t2
where t2.fk_id = t1.pk_id)
And here's the same idea using a ranking function:
select t1.pk_id
from table1 t1
where 'XXX' = (select max(t2.a_col) keep
(dense_rank first order by t2.date_col desc)
from table2 t2
where t2.fk_id = t1.pk_id)

you could use analytics here: join table1 to table2, take the most recent table2 record for each element in table1 and verify that this most recent element has a value of 'XXX':
SELECT *
FROM (SELECT t1.*,
t2.a_col,
row_number() over (PARTITION BY t1.pk
ORDER BY t2.date_col DESC) rnk
FROM table1 t1
JOIN table2 t2 ON t2.fk_id = t1.pk_id)
WHERE rnk = 1
AND a_col = 'XXX'
Update: Without modifying the top-level SELECT, you could write a query like this:
SELECT t1.pk_id
FROM table1 t1
WHERE 'XXX' =
(SELECT a_col
FROM (SELECT a_col,
t2_in.fk_id,
row_number() over(PARTITION BY t2_in.fk_id
ORDER BY t2_in.date_col DESC) rnk
FROM table2 t2_in) t2
WHERE rnk = 1
AND t2.fk_id = t1.pk_id)
Basically you only join (SEMI-JOIN) the rows from table2 that are the most recent for each fk_id

Try this:
select t1.pk_id
from table1 t1
where 'XXX' =
(select a_col
from table2 t2
where t2.fk_id = t1.pk_id
and t2.date_col =
(select max(t3.date_col)
from table2 t3
where t3.fk_id = t2.fk_id)
)

Does this do what you are looking for?
select t1.pk_id
from table1 t1
where 'XXX' = ( select a_col
from table2 t2
where t2.fk_id = t1.pk_id
t2.date_col = (select max(date_col) from table2 where fk_id = t1.pk_id)
)

SQL Query with conditional JOIN

The scenario:
Table1
CatId|Name|Description
Table2
ItId|Title|Date|CatId (foreign key)
I want to return all rows from Table1 and Title,Date from Table2, where
The returned from Table 2 must be the Latest one by the date column.
(in second table there many items with same CatId and I need just the latest)
I have 2 queries but can't merge them together:
Query 1:
SELECT Table1.Name, Table1.Description,
Table2.Title, Table2.Date
FROM
Table1 LEFT JOIN Table2 ON Table1.CatId=Table2.CatId
Query2:
SELECT TOP 1 Table2.Title, Table2.Date
FROM
Table2
WHERE
Table2.CatId = #inputParam
ORDER BY Table2.Date DESC

You can use a UNION, but you'll need to make the columns match up:
OK, after rereading the question, I understand what you're trying to do.
This should do the trick:
SELECT Table1.Name, Table1.Description,
T2.Title, T2.Date
FROM
Table1
LEFT JOIN (
SELECT CatId, Title, Date, ROW_NUMBER() over (ORDER BY CatId, Date DESC) - RANK() over (ORDER BY CatID) as Num
FROM Table2) T2 on T2.CatId = Table1.CatId AND T2.Num = 0

Sounds like you're talking about a groupwise maximum (newest row in Table2 for each matching row in Table1), in which case, the easiest way is use ROW_NUMBER:
WITH CTE AS
(
SELECT
t1.Name, t1.Description, t2.Title, t2.Date,
ROW_NUMBER() OVER (PARTITION BY t1.CatId ORDER BY t2.Date DESC) AS Seq
FROM Table1 t1
LEFT JOIN Table2 t2
ON t2.CatId = t1.CatId
)
SELECT *
FROM CTE
WHERE Seq = 1
OR Date IS NULL

Shouldn't this work?
SELECT Table1.Name, Table1.Description,
T2.Title, T2.Date
FROM
Table1 LEFT JOIN (
SELECT TOP 1 Table2.CatId Table2.Title, Table2.Date
FROM
Table2
WHERE
Table2.CatId = Table1.catId
ORDER BY Table2.Date DESC
) T2
ON Table1.CatId=T2.CatId

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

LEFT Join on a Subquery with specific criteria - sql

For a single value, use a correlated sub-query: SELECT a.id, a.name, a.num, ( SELECT TOP 1 SomeValue FROM table2 As b WHERE b.id = a.id ORDER BY b.EndDt DESC ) As SomeVal FROM table1 a

Related

How to Group By all fields nested tables in a Left Join query in BigQuery?

where column in from another select results with limit (mysql/mariadb)

SQL Server OUTER JOIN

Oracle nested correlated subquery problem

SQL Query with conditional JOIN

Categories

Resources