multiple joins to same sub query - sql

I need to join to the same table repeatedly but this looks ugly. Any suggestions appreciated. Below is simplified SQL, I have 8 tables in the subquery and it produces many duplicate records of the same date, so I need to find only the newest record for each client. (I don't think DB and/or version should matter, but I am using DB2 11.1 LUW)
select c.client_num, a.eff_date, t.trx_date
from client c
join address a on(a.id = c.addr_id)
join transaction t on(t.addr_id = a.id)
{many other joins}
where {many conditions};
select SQ.*
from [ABOVE_SUBQUERY] SQ
join
(select client_num, max(eff_date) AS newest_date from [ABOVE_SUBQUERY] group by client_num) AA
ON(SQ.client_num = AA.client_num and SQ.eff_date = AA.newest_date)
join
(select client_num, max(trx_date) AS newest_date from [ABOVE_SUBQUERY] group by client_num) TT
ON(SQ.client_num = TT.client_num and SQ.trx_date = TT.newest_date)

I need to fine only the newest record for each client.
Can't you just use row_number()?
select t.*
from (select t.*,
row_number() over (partition by client_num order by eff_date desc, eff_time desc) as seqnum
from <whatever> t
) t
where seqnum = 1;

Related

Remove multiple rows with same ID

So I've done some looking around and wasn't unable to find quite what I was looking for. I have two tables.
1.) Table where general user information is stored
2.) Where a status is generated and stored.
The problem is, is that there are multiple rows for the same users and querying these results in multiple returns. I can't just merge them because they aren't all the same status. I need just the newest status from that table.
Example of the table:
SELECT DISTINCT
TOP(50) cam.UserID AS PatientID,
mppi.DisplayName AS Surgeon,
ISNULL(sci.IOPStatus, 'N/A') AS Status,
tkstat.TrackerStatusID AS Stat_2
FROM
Main AS cam
INNER JOIN
Providers AS rap
ON cam.VisitID = rap.VisitID
INNER JOIN
ProviderInfo AS mppi
ON rap.UnvUserID = mppi.UnvUserID
LEFT OUTER JOIN
Inop AS sci
ON cam.CwsID = sci.CwsID
LEFT OUTER JOIN
TrackerStatus AS tkstat
ON cam.CwsID = tkstat.CwsID
WHERE
(
cam.Location_ID IN
(
'SURG'
)
)
AND
(
rap.IsAttending = 'Y'
)
AND
(
cam.DateTime BETWEEN CONCAT(CAST(GETDATE() AS DATE), ' 00:00:00') AND CONCAT(CAST(GETDATE() AS DATE), ' 23:59:59')
)
AND
(
cam.Status_StatusID != 'Cancelled'
)
ORDER BY
cam.UserID ASC
So I need to grab only the newest Stat_2 from each ID so they aren't returning multiple rows. Each Stat_2 also has an update time meaning I can sort by the time/date that column is : StatusDateTime
One way to handle this is to create a calculated row_number for the table where you need the newest record.
Easiest way to do that is to change your TKSTAT join to a derived table with the row_number calculation and then add a constraint to your join where the RN =1
SELECT DISTINCT TOP (50)
cam.UserID AS PatientID, mppi.DisplayName AS Surgeon, ISNULL(sci.IOPStatus, 'N/A') AS Status, tkstat.TrackerStatusID AS Stat_2
FROM Main AS cam
INNER JOIN Providers AS rap ON cam.VisitID = rap.VisitID
INNER JOIN ProviderInfo AS mppi ON rap.UnvUserID = mppi.UnvUserID
LEFT OUTER JOIN Inop AS sci ON cam.CwsID = sci.CwsID
LEFT OUTER JOIN (SELECT tk.CwsID, tk.TrackerStatusId, ROW_NUMBER() OVER (PARTITION BY tk.cwsId ORDER BY tk.CreationDate DESC) AS rn FROM TrackerStatus tk)AS tkstat ON cam.CwsID = tkstat.CwsID
AND tkstat.rn = 1
WHERE (cam.Location_ID IN ( 'SURG' )) AND (rap.IsAttending = 'Y')
AND (cam.DateTime BETWEEN CONCAT(CAST(GETDATE() AS DATE), ' 00:00:00') AND CONCAT(CAST(GETDATE() AS DATE), ' 23:59:59'))
AND (cam.Status_StatusID != 'Cancelled')
ORDER BY cam.UserID ASC;
Note you need a way to derive what the "newest" status is; I assume there is a created_date or something; you'll need to enter the correct colum name
ROW_NUMBER() OVER (PARTITION BY tk.cwsId ORDER BY tk.CreationDate DESC) AS rn
SQL Server doesn't offer a FIRST function, but you can reproduce the functionality with ROW_NUMBER() like this:
With Qry1 (
Select <other columns>,
ROW_NUMBER() OVER(
PARTITION BY <group by columns>
ORDER BY <time stamp column*> DESC
) As Seq
From <the rest of your select statement>
)
Select *
From Qry1
Where Seq = 1
* for the "newest" record.

Correlated subquery pattern is not supported due to internal error - where not exists correlated subquery

I have a query that is giving me the error above. My code is the following:
SELECT *,
dense_rank() OVER (PARTITION BY email
ORDER BY priority_score,
comp) AS r
FROM main_query
WHERE NOT EXISTS
(SELECT name,
event,
email,
report_date
FROM gm
WHERE gm.name= main_query.name
AND gm.event= main_query.event
AND gm.email = main_query.email
AND gm.report_date >= (CURRENT_DATE - 25)::date)
ORDER BY priority_score ASC
One solution that I saw to overcome these types of errors was to be able to transform correlated subqueries in queries not correlated (sherlock). Therefore, I am searching for other ways of using the where not exists statement but without a correlated subquery, i.e., calling the main_query table inside the subquery ((...) from gm left join main_query on(...)). Does anyone know if this is possible and how to do it?
Any advice is more than welcome and thanks a lot in advance!
Does a LEFT JOIN version work?
SELECT mq.*,
dense_rank() OVER (PARTITION BY mq.email
ORDER BY mq.priority_score, comp
) AS r
FROM main_query mq LEFT JOIN
gm
ON gm.name = mq.name AND
gm.event= mq.event AND
gm.email = mq.email AND
gm.report_date >= (CURRENT_DATE - 25)::date)
WHERE gm.name IS NULL
ORDER BY priority_score ASC;
If that doesn't work, it should work like this:
SELECT mq.*,
dense_rank() OVER (PARTITION BY mq.email
ORDER BY mq.priority_score, comp
) AS r
FROM main_query mq LEFT JOIN
(SELECT gm.*
FROM gm
WHERE gm.report_date >= (CURRENT_DATE - 25)::date)
) gm
ON gm.name = mq.name AND
gm.event= mq.event AND
gm.email = mq.email
WHERE gm.name IS NULL
ORDER BY priority_score ASC

Joining Onto Partition Statement HANA SQL

I am trying to join two tables to the following query:
SELECT "NUMBER",
"U_ANALYZED_DATE",
"DV_SALES_ACCOUNT",
"U_USD_TOTAL_POTENTIAL_NNACV"
FROM
(select *, row_number() over ( partition by "DV_SALES_ACCOUNT" order by "U_ANALYZED_DATE" desc ) rownum
from "SURF_RT"."SALES_REQUEST")
WHERE rownum = 1
AND "DV_SALES_CATEGORY" = 'Compliance'
AND "DV_STATE" NOT IN ('Closed Canceled')
AND (YEAR("U_ANALYZED_DATE") = '2019' AND MONTH("U_ANALYZED_DATE") IN ('10','11','12')
OR YEAR("U_ANALYZED_DATE") = '2020' AND MONTH("U_ANALYZED_DATE") IN ('1','2','3'))
AND "U_USD_TOTAL_POTENTIAL_NNACV" > 0
ORDER BY "U_ANALYZED_DATE" desc
The tables should be joined as follows:
JOIN "SURF_RT"."SALES_ACCOUNT" on "SURF_RT"."SALES_ACCOUNT"."NAME" = "SURF_RT"."SALES_REQUEST"."DV_SALES_ACCOUNT"
JOIN "SURF_RT"."SALES_CONTRACT" on "SURF_RT"."SALES_CONTRACT"."DV_ACCOUNT" = "SURF_RT"."SALES_REQUEST"."DV_SALES_ACCOUNT"
I am getting an error no matter what I try and it has to be because of the partition. Does anyone know the solution here?
I suspect that you just need to alias the derived table so you can then refer to join it. In many databases this is mandatory anyway, but apparently not in Hana (otherwise your original query would not run).
But to join on a derived table (the resultset that is generated by the subquery), alias do help:
SELECT
...
FROM
(
select
*,
row_number() over ( partition by "DV_SALES_ACCOUNT" order by "U_ANALYZED_DATE" desc ) rownum
from "SURF_RT"."SALES_REQUEST"
) sr -- table alias
JOIN "SURF_RT"."SALES_ACCOUNT"
ON "SURF_RT"."SALES_ACCOUNT"."NAME" = "SURF_RT"."SALES_REQUEST"."DV_SALES_ACCOUNT"
JOIN "SURF_RT"."SALES_CONTRACT"
ON "SURF_RT"."SALES_CONTRACT"."DV_ACCOUNT" = sr."DV_SALES_ACCOUNT" --reference to the derived table
WHERE
...
Side notes:
you should use table aliases for other tables involved in the query too, in order to make the code more readable
you should also prefix each column in the query with the identifier of the table it belongs to, so the query is unambiguous (and easier to maintain)
The problem with this query is not "that the where rownum = 1 needs to be at the end" but that the OP got confused by the bracketing of SQL expression.
More specifically, trying to reference the sub-query data by specifying join conditions against the base table that is used in the sub-query:
JOIN "SURF_RT"."SALES_ACCOUNT"
on "SURF_RT"."SALES_ACCOUNT"."NAME" = "SURF_RT"."SALES_REQUEST"."DV_SALES_ACCOUNT"
JOIN "SURF_RT"."SALES_CONTRACT"
on "SURF_RT"."SALES_CONTRACT"."DV_ACCOUNT" = "SURF_RT"."SALES_REQUEST"."DV_SALES_ACCOUNT"
Since the sub-query (derived table) is used in the query and should be used for the join, it needs to be referred in the join condition, instead.
So, yes, it needs a table alias here and the join conditions need to refer to it.
SELECT
...
FROM
(select *
, row_number() over
(partition by "DV_SALES_ACCOUNT"
order by "U_ANALYZED_DATE" desc) rownum
from
"SURF_RT"."SALES_REQUEST") sr
INNER JOIN "SURF_RT"."SALES_ACCOUNT" sa
on sa."NAME" = sr."DV_SALES_ACCOUNT"
INNER JOIN "SURF_RT"."SALES_CONTRACT" sc
on sc."DV_ACCOUNT" = sr."DV_SALES_ACCOUNT"
WHERE
sr.rownum = 1
AND "DV_SALES_CATEGORY" = 'Compliance'
AND "DV_STATE" NOT IN ('Closed Canceled')
AND (YEAR("U_ANALYZED_DATE") = '2019'
AND MONTH("U_ANALYZED_DATE") IN ('10','11','12')
OR YEAR("U_ANALYZED_DATE") = '2020'
AND MONTH("U_ANALYZED_DATE") IN ('1','2','3'))
AND "U_USD_TOTAL_POTENTIAL_NNACV" > 0
ORDER BY
"U_ANALYZED_DATE" desc;
With just this little bit of standard SQL syntax and formatting of code the query got a lot easier to understand.
Now it's even obvious that the IN conditions for MONTH and YEAR should in fact be integers, not strings as these functions return integers.
SELECT
...
FROM
(SELECT *
, row_number() over
(partition by "DV_SALES_ACCOUNT"
order by "U_ANALYZED_DATE" desc) rownum
FROM
"SURF_RT"."SALES_REQUEST") sr
INNER JOIN "SURF_RT"."SALES_ACCOUNT" sa
on sa."NAME" = sr."DV_SALES_ACCOUNT"
INNER JOIN "SURF_RT"."SALES_CONTRACT" sc
on sc."DV_ACCOUNT" = sr."DV_SALES_ACCOUNT"
WHERE
sr.rownum = 1
AND "DV_SALES_CATEGORY" = 'Compliance'
AND "DV_STATE" NOT IN ('Closed Canceled')
AND ( YEAR("U_ANALYZED_DATE") = 2019
AND MONTH("U_ANALYZED_DATE") IN (10, 11, 12)
OR YEAR("U_ANALYZED_DATE") = '2020'
AND MONTH("U_ANALYZED_DATE") IN (1 , 2 , 3 )
)
AND "U_USD_TOTAL_POTENTIAL_NNACV" > 0
ORDER BY
"U_ANALYZED_DATE" DESC

SQL Most Recent Register FROM Second Table by Id

I have 2 tables (Opportunity and Stage). I need to get each opportunity with the most recent stage by StageTypeId.
Opportunity: Id, etc
Stage: Id, CreatedOn, OpportunityId, StageTypeId.
Let's suppose I have "opportunity1" and "opportunity2" each one with many Stages added.
By passing the StageTypeId I need to get the opportunity which has this StageTypeId as most recent.
I'm trying the following query but it´s replicating the same Stage for all the Opportunities.
It seems that it's ignoring this line: "AND {Stage}.[OpportunityId] = ID"
SELECT {Opportunity}.[Id] ID,
{Opportunity}.[Name],
{Opportunity}.[PotentialAmount],
{Contact}.[FirstName],
{Contact}.[LastName],
(SELECT * FROM
(
SELECT {Stage}.[StageTypeId]
FROM {Stage}
WHERE {Stage}.[StageTypeId] = #StageTypeId
AND {Stage}.[OpportunityId] = ID
ORDER BY {Stage}.[CreatedOn] DESC
)
WHERE ROWNUM = 1) AS StageTypeId
FROM {Opportunity}
LEFT JOIN {Contact}
ON {Opportunity}.[ContactId] = {Contact}.[Id]
Thank you
Most of DBMS support fetch first clause So, you can do :
select o.*
from Opportunity o
where o.StageTypeId = (select s.StageTypeId
from Stage s
where s.OpportunityId = o.id
order by s.CreatedOn desc
fetch first 1 rows only
);
you can try below way all dbms will support
select TT*. ,o*. from
(
select s1.OpportunityId,t.StageTypeId from Stage s1 inner join
(select StageTypeId,max(CreatedOn) as createdate Stage s
group by StageTypeId
) t
on s1.StageTypeId=t.StageTypeId and s1.CreatedOn=t.createdate
) as TT inner join Opportunity o on TT.OpportunityId=o.id

access - row_number function?

I had this query, which gives me the desired results on postgres
SELECT
t.*,
ROW_NUMBER() OVER (PARTITION BY t."Internal_reference", t."Movement_date" ORDER BY t."Movement_date") AS "cnt"
FROM (SELECT
"Internal_reference",
MAX("Movement_date") AS maxtime
FROM dw."LO-D4_Movements"
GROUP BY "Internal_reference") r
INNER JOIN dw."LO-D4_Movements" t
ON t."Movement_date" = r.maxtime
AND t."Internal_reference" = r."Internal_reference"
Issue is I have to translate the query above on Access where the analytical function does not exist ...
I used this answer to build the query below
SELECT
t."Internal_reference",
t.from_code,
t.to_code,
t."Movement_date",
t.shipment_number,
t."PO_number",
t."Quantity",
t."Movement_value",
t."Site",
t."Import_date",
COUNT(*) AS "cnt"
FROM (
SELECT "Internal_reference",
MAX("Movement_date") AS maxtime
FROM dw."LO-D4_Movements"
GROUP BY "Internal_reference") r
LEFT OUTER JOIN dw."LO-D4_Movements" t
ON t."Movement_date" = r.maxtime AND t."Internal_reference" = r."Internal_reference"
GROUP BY
t.from_code,
t.to_code,
t."Movement_date",
t.shipment_number,
t."PO_number",
t."Quantity",
t."Movement_value",
t."Site",
t."Import_date",
t."Internal_reference"
ORDER BY t.from_code
Issue is I only have 1 in the cnt column.
I tried to tweak it by removing the internal_reference (see below)
SELECT
t.from_code,
t.to_code,
t."Movement_date",
t.shipment_number,
t."PO_number",
t."Quantity",
t."Movement_value",
t."Site",
t."Import_date",
COUNT(*) AS "cnt"
FROM (
SELECT "Internal_reference",
MAX("Movement_date") AS maxtime
FROM dw."LO-D4_Movements"
GROUP BY "Internal_reference") r
LEFT OUTER JOIN dw."LO-D4_Movements" t
ON t."Movement_date" = r.maxtime AND t."Internal_reference" = r."Internal_reference"
GROUP BY
t.from_code,
t.to_code,
t."Movement_date",
t.shipment_number,
t."PO_number",
t."Quantity",
t."Movement_value",
t."Site",
t."Import_date"
ORDER BY t.from_code
However, the results are even worse. The cnt is growing but it gives me the wrong cnt
Any help are more than welcome as I'm slow losing my sanity.
Thanks
Edit: Please find the sqlfiddle
I think Gordon-Linoff's code is close to what you want, but there are some typos I couldn't correct without a rewrite, so here's my attempt
SELECT
t1.Internal_reference,
t1.Movement_date,
t1.PO_Number as Combination_Of_Columns_Which_Make_This_Unique,
t1.Other_columns,
Count(1) AS Cnt
FROM
([LO-D4_Movements] AS t1
INNER JOIN [LO-D4_Movements] AS t2 ON
t1.Internal_reference = t2.Internal_reference AND
t1.Movement_date = t2.Movement_date)
INNER JOIN (
SELECT
t3.Internal_reference,
MAX(t3.Movement_date) AS Maxtime
FROM
[LO-D4_Movements] AS t3
GROUP BY
t3.Internal_reference
) AS r ON
t1.Internal_reference = r.Internal_reference AND
t1.Movement_date = r.Maxtime
WHERE
t1.PO_Number>=t2.PO_Number
GROUP BY
t1.Internal_reference,
t1.Movement_date,t1.PO_Number,
t1.Other_columns
ORDER BY
t1.Internal_reference,
t1.Movement_date,
Count(1);
In addition to within the max(movement_date) subquery, the main table is brought in twice. One version is the one for showing in your results, the other is for counting records to generate the sequence numbers.
Gordon said you need a unique id column for each row. And that's true if by "column" you mean to include derived columns also. Also it only needs to be unique within any combination of "internal_reference" and "Movement_date".
I've assumed, perhaps wrongly, that PO_Number will suffice. If not, concatenate with that (and some delimeters) other fields which will make it unique. The where clause will need updating to compare t1 and t2 for the "Combination of Columns which make this unique".
If, there is no appropriate combination available, I'm not sure it can be done without VBA and/or temp tables as The-Gambill suggested.
This is a real pain in MS Access, as far as I know. One method is a correlated subquery, but you need a unique id column on each row:
SELECT t.*,
(SELECT COUNT(*)
FROM (SELECT "Internal_reference", MAX("Movement_date") AS maxtime
FROM dw."LO-D4_Movements"
GROUP BY "Internal_reference"
) as t2
WHERE t2."Internal_reference" AND t."Internal_reference" AND
t2."Movement_date" = t."Movement_date" AND
t2.?? <= t.??
) as cnt
FROM (SELECT "Internal_reference", MAX("Movement_date") AS maxtime
FROM dw."LO-D4_Movements"
GROUP BY "Internal_reference"
) r INNER JOIN
dw."LO-D4_Movements" t
ON t."Movement_date" = r.maxtime AND
t."Internal_reference" = r."Internal_reference";
The ?? is for the id or creation date or something to allow the counting of rows.