INNER JOIN on a Sub Query

I have a list of tasks in a table called dbo.Task.
In the database, each Task can have one or more rows in the TaskLine table.
TaskLine has a TaskID column that relates the TaskLines to the Task.
A TaskLine has a column called TaskHeadingTypeID.
I need to return all the tasks, each joined to the LAST TaskLine for that Task.
In plain English, I need to display each task with its latest TaskLine heading. So I basically need to join to the TaskLine table like this (which is incorrect and maybe inefficient, but hopefully shows what I am trying to do):
SELECT *
FROM #Results r
INNER JOIN (
SELECT TOP 1 TaskID, TaskHeadingTypeID FROM dbo.TaskLine
ORDER BY TaskLineID DESC
) tl
ON tl.TaskID = r.TaskID
However, the issue is that the subquery brings back only the single last TaskLine row in the whole table, not the last TaskLine for each task.
Edit:
At the moment it's 'working' with the code below, but it seems highly inefficient, because for each task row it has to run two extra queries. Both are on the same table, just reading slightly different columns:
(An extract of the columns in the SELECT clause)
SELECT TaskStatusID,
TaskStatus,
(SELECT TOP 1 TaskHeadingTypeID FROM dbo.TaskLine
WHERE TaskID = r.TaskID
ORDER BY TaskLineID DESC) AS TaskHeadingID,
(SELECT TOP 1 LongName FROM dbo.TaskLine tl
INNER JOIN ref.TaskHeadingType tht
ON tht.TaskHeadingTypeID = tl.TaskHeadingTypeID
WHERE TaskID = r.TaskID
ORDER BY TaskLineID DESC) AS TaskHeading,
PersonInCareID,
ICMSPartyID,
CarerID.... FROM...
EDIT:
Thanks to the ideas and comments below, I have ended up with this, using a CTE:
;WITH ValidTaskLines (RowNumber, TaskID, TaskHeadingTypeID, TaskHeadingType)
AS
(SELECT
ROW_NUMBER() OVER (PARTITION BY tl.TaskID ORDER BY tl.TaskLineID DESC) AS RowNumber,
tl.TaskID,
tl.TaskHeadingTypeID,
LongName AS TaskHeadingType
FROM dbo.TaskLine tl
INNER JOIN ref.TaskHeadingType tht
ON tht.TaskHeadingTypeID = tl.TaskHeadingTypeID
)
SELECT AssignedByBusinessUserID,
BusinessUserID,
LoginName,
Comments,
r.CreateDate,
r.CreateUser,
r.Deleted,
r.Version,
IcmsBusinessUserID,
r.LastUpdateDate,
r.LastUpdateUser,
OverrrideApprovalBusinessUserID,
PlacementID,
r.TaskID,
TaskPriorityTypeID,
TaskPriorityCode,
TaskPriorityType,
TaskStatusID,
TaskStatus,
vtl.TaskHeadingTypeID AS TaskHeadingID,
vtl.TaskHeadingType AS TaskHeading,
PersonInCareID,
ICMSPartyID,
CarerID,
ICMSCarerEntityID,
StartDate,
EndDate
FROM #Results r
INNER JOIN ValidTaskLines vtl
ON vtl.TaskID = r.TaskID
AND vtl.RowNumber = 1

You could use the ROW_NUMBER() function for this:
SELECT *
FROM #Results r
INNER JOIN (SELECT TaskID
, TaskHeadingTypeID
, ROW_NUMBER() OVER (PARTITION BY TaskID ORDER BY TaskLineID DESC) AS RN
FROM dbo.TaskLine
) tl
ON tl.TaskID = r.TaskID
AND tl.RN = 1
The ROW_NUMBER() function assigns a number to each row. PARTITION BY is optional; it restarts the numbering for each distinct value of the partitioning columns. For example, if you PARTITION BY Some_Date, the numbering starts over at 1 for each unique date value. ORDER BY defines the order in which the rows are numbered and is required in the ROW_NUMBER() function.
You may need to adjust the PARTITION BY to suit your query; run the subquery by itself to get an idea of how ROW_NUMBER() works.
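To see the effect of PARTITION BY concretely, here is a small self-contained sketch. It runs against SQLite through Python's built-in sqlite3 module rather than SQL Server (window functions need SQLite 3.25 or newer), and the sample rows are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Task (TaskID INTEGER PRIMARY KEY);
    CREATE TABLE TaskLine (
        TaskLineID INTEGER PRIMARY KEY,
        TaskID INTEGER,
        TaskHeadingTypeID INTEGER
    );
    -- Invented sample data: task 1 has three lines, task 2 has one.
    INSERT INTO Task VALUES (1), (2);
    INSERT INTO TaskLine VALUES (1, 1, 10), (2, 1, 20), (3, 2, 30), (4, 1, 10);
""")

def latest_lines(partition_by):
    # partition_by is interpolated only because it is a fixed, trusted string here.
    query = f"""
        SELECT t.TaskID, tl.TaskLineID, tl.TaskHeadingTypeID
        FROM Task t
        INNER JOIN (
            SELECT TaskID, TaskLineID, TaskHeadingTypeID,
                   ROW_NUMBER() OVER (PARTITION BY {partition_by}
                                      ORDER BY TaskLineID DESC) AS RN
            FROM TaskLine
        ) tl ON tl.TaskID = t.TaskID AND tl.RN = 1
        ORDER BY t.TaskID, tl.TaskLineID
    """
    return conn.execute(query).fetchall()

per_task = latest_lines("TaskID")
per_task_and_heading = latest_lines("TaskID, TaskHeadingTypeID")
print(per_task)              # one row per task
print(per_task_and_heading)  # one row per task and heading type
```

Partitioning by TaskID alone yields exactly one row per task, while adding TaskHeadingTypeID to the partition yields one row per task and heading type, so a task with several heading types appears more than once.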

Related

Remove multiple rows with same ID

So I've done some looking around and wasn't able to find quite what I was looking for. I have two tables:
1.) A table where general user information is stored.
2.) A table where a status is generated and stored.
The problem is that there are multiple status rows for the same user, so querying returns multiple rows per user. I can't just merge them because they aren't all the same status. I need just the newest status from that table.
Here is the query:
SELECT DISTINCT
TOP(50) cam.UserID AS PatientID,
mppi.DisplayName AS Surgeon,
ISNULL(sci.IOPStatus, 'N/A') AS Status,
tkstat.TrackerStatusID AS Stat_2
FROM
Main AS cam
INNER JOIN
Providers AS rap
ON cam.VisitID = rap.VisitID
INNER JOIN
ProviderInfo AS mppi
ON rap.UnvUserID = mppi.UnvUserID
LEFT OUTER JOIN
Inop AS sci
ON cam.CwsID = sci.CwsID
LEFT OUTER JOIN
TrackerStatus AS tkstat
ON cam.CwsID = tkstat.CwsID
WHERE
(
cam.Location_ID IN
(
'SURG'
)
)
AND
(
rap.IsAttending = 'Y'
)
AND
(
cam.DateTime BETWEEN CONCAT(CAST(GETDATE() AS DATE), ' 00:00:00') AND CONCAT(CAST(GETDATE() AS DATE), ' 23:59:59')
)
AND
(
cam.Status_StatusID != 'Cancelled'
)
ORDER BY
cam.UserID ASC
So I need to grab only the newest Stat_2 for each ID so that the query doesn't return multiple rows. Each Stat_2 also has an update time in the StatusDateTime column, so I can sort by that date/time.

One way to handle this is to create a calculated row number for the table where you need the newest record.
The easiest way to do that is to change your tkstat join to a derived table with the ROW_NUMBER() calculation, and then add a condition to the join requiring rn = 1:
SELECT DISTINCT TOP (50)
cam.UserID AS PatientID, mppi.DisplayName AS Surgeon, ISNULL(sci.IOPStatus, 'N/A') AS Status, tkstat.TrackerStatusID AS Stat_2
FROM Main AS cam
INNER JOIN Providers AS rap ON cam.VisitID = rap.VisitID
INNER JOIN ProviderInfo AS mppi ON rap.UnvUserID = mppi.UnvUserID
LEFT OUTER JOIN Inop AS sci ON cam.CwsID = sci.CwsID
LEFT OUTER JOIN (SELECT tk.CwsID, tk.TrackerStatusId,
ROW_NUMBER() OVER (PARTITION BY tk.cwsId ORDER BY tk.CreationDate DESC) AS rn
FROM TrackerStatus tk) AS tkstat
ON cam.CwsID = tkstat.CwsID
AND tkstat.rn = 1
WHERE (cam.Location_ID IN ( 'SURG' )) AND (rap.IsAttending = 'Y')
AND (cam.DateTime BETWEEN CONCAT(CAST(GETDATE() AS DATE), ' 00:00:00') AND CONCAT(CAST(GETDATE() AS DATE), ' 23:59:59'))
AND (cam.Status_StatusID != 'Cancelled')
ORDER BY cam.UserID ASC;
Note that you need a way to determine which status is the "newest". I assumed there is a creation date column; you'll need to substitute the correct column name (StatusDateTime, according to the question) in:
ROW_NUMBER() OVER (PARTITION BY tk.cwsId ORDER BY tk.CreationDate DESC) AS rn

SQL Server doesn't offer a FIRST function, but you can reproduce the functionality with ROW_NUMBER() like this:
With Qry1 As (
Select <other columns>,
ROW_NUMBER() OVER(
PARTITION BY <group by columns>
ORDER BY <time stamp column*> DESC
) As Seq
From <the rest of your select statement>
)
Select *
From Qry1
Where Seq = 1
* for the "newest" record.

Take Sum of time difference and Last value of a column

I have a table in which we store the StartTime and StopTime for a task. One task can be assigned to multiple technicians, and one task can have multiple start and stop times (so multiple rows; DocNum is the primary ID).
I want to calculate the sum of the differences between start time and stop time for each Task ID and Technician ID.
I also need to take the status of the last row to determine the current status of the task for each technician. I tried the following query, but it didn't work:
SELECT T1.TaskId, SUM(DATEDIFF(second, T1.StartTime,T1.StopTime)) as TaskDuration
, T3.TechnicianId, T3.FinalStatus
FROM Tbl_TaskTracking T1
INNER JOIN (
SELECT DISTINCT T2.TaskId, T2.technicianId
, first_Value(T2.Status) OVER (PARTITION BY T2.TASKID, T2.technicianId ORDER BY T2.TASKID, T2.technicianId, T2.docnum desc) AS FinalStatus
FROM Tbl_TaskTracking T2
) AS T3 ON T1.TaskId = T3.TaskId
WHERE T1.TaskId = '2001628'
GROUP BY T1.TaskId, T3.TechnicianId, T3.FinalStatus
My table data looks like this. These rows show the data for one particular Task ID.

You are just missing the following condition in your join's ON clause:
AND T1.TechnicianId = T3.TechnicianId
It's like this:
SELECT T1.TaskId
,SUM(DATEDIFF(second, T1.StartTime,T1.StopTime)) as TaskDuration
,T3.TechnicianId
,T3.FinalStatus
FROM Tbl_TaskTracking T1
INNER JOIN
(
SELECT DISTINCT T2.TaskId
,T2.technicianId
,first_Value(T2.Status) OVER (PARTITION BY T2.TASKID,T2.technicianId ORDER BY T2.TASKID,T2.technicianId,T2.docnum desc) AS FinalStatus
FROM Tbl_TaskTracking T2
) AS T3
ON T1.TaskId = T3.TaskId
AND T1.TechnicianId = T3.TechnicianId
WHERE T1.TaskId = '2001628'
GROUP BY T1.TaskId
,T3.TechnicianId
,T3.FinalStatus
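The effect of the extra join condition can be checked on a tiny data set. The following is only a sketch: it runs against SQLite through Python's sqlite3 module (3.25 or newer for FIRST_VALUE), replaces DATEDIFF with a difference of Unix timestamps, and uses made-up rows and a shortened table name; the column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TaskTracking (
        DocNum INTEGER PRIMARY KEY,
        TaskId TEXT,
        TechnicianId TEXT,
        StartTime TEXT,
        StopTime TEXT,
        Status TEXT
    );
    -- Invented sample rows for one task worked on by technicians A and B.
    INSERT INTO TaskTracking VALUES
        (1, '2001628', 'A', '2020-01-01 08:00:00', '2020-01-01 08:10:00', 'Paused'),
        (2, '2001628', 'B', '2020-01-01 08:00:00', '2020-01-01 08:05:00', 'In Progress'),
        (3, '2001628', 'A', '2020-01-01 09:00:00', '2020-01-01 09:20:00', 'Completed');
""")

totals = conn.execute("""
    SELECT T1.TaskId,
           T1.TechnicianId,
           -- SQLite has no DATEDIFF; subtract Unix timestamps to get seconds.
           SUM(CAST(strftime('%s', T1.StopTime) AS INTEGER)
             - CAST(strftime('%s', T1.StartTime) AS INTEGER)) AS TaskDuration,
           T3.FinalStatus
    FROM TaskTracking T1
    INNER JOIN (
        SELECT DISTINCT TaskId, TechnicianId,
               FIRST_VALUE(Status) OVER (PARTITION BY TaskId, TechnicianId
                                         ORDER BY DocNum DESC) AS FinalStatus
        FROM TaskTracking
    ) T3 ON T1.TaskId = T3.TaskId
        AND T1.TechnicianId = T3.TechnicianId  -- the condition that was missing
    WHERE T1.TaskId = '2001628'
    GROUP BY T1.TaskId, T1.TechnicianId, T3.FinalStatus
    ORDER BY T1.TechnicianId
""").fetchall()
print(totals)
```

Without the TechnicianId condition, every tracking row of the task is matched to every technician's status row, so each technician's total would include the other technician's time.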

SQL Most Recent Register FROM Second Table by Id

I have two tables (Opportunity and Stage). I need to get each opportunity with its most recent stage, filtered by StageTypeId.
Opportunity: Id, etc.
Stage: Id, CreatedOn, OpportunityId, StageTypeId.
Let's suppose I have "opportunity1" and "opportunity2", each with many Stages added.
Given a StageTypeId, I need to get the opportunities whose most recent stage has this StageTypeId.
I'm trying the following query, but it's replicating the same Stage for all the Opportunities.
It seems that it's ignoring this line: "AND {Stage}.[OpportunityId] = ID"
SELECT {Opportunity}.[Id] ID,
{Opportunity}.[Name],
{Opportunity}.[PotentialAmount],
{Contact}.[FirstName],
{Contact}.[LastName],
(SELECT * FROM
(
SELECT {Stage}.[StageTypeId]
FROM {Stage}
WHERE {Stage}.[StageTypeId] = #StageTypeId
AND {Stage}.[OpportunityId] = ID
ORDER BY {Stage}.[CreatedOn] DESC
)
WHERE ROWNUM = 1) AS StageTypeId
FROM {Opportunity}
LEFT JOIN {Contact}
ON {Opportunity}.[ContactId] = {Contact}.[Id]
Thank you

Most DBMSs support the FETCH FIRST clause, so you can do:
select o.*
from Opportunity o
where o.StageTypeId = (select s.StageTypeId
from Stage s
where s.OpportunityId = o.id
order by s.CreatedOn desc
fetch first 1 rows only
);

You can try the approach below, which should work in most DBMSs:
select TT.*, o.* from
(
select s1.OpportunityId,t.StageTypeId from Stage s1 inner join
(select StageTypeId,max(CreatedOn) as createdate from Stage s
group by StageTypeId
) t
on s1.StageTypeId=t.StageTypeId and s1.CreatedOn=t.createdate
) as TT inner join Opportunity o on TT.OpportunityId=o.id

access - row_number function?

I had this query, which gives me the desired results on Postgres:
SELECT
t.*,
ROW_NUMBER() OVER (PARTITION BY t."Internal_reference", t."Movement_date" ORDER BY t."Movement_date") AS "cnt"
FROM (SELECT
"Internal_reference",
MAX("Movement_date") AS maxtime
FROM dw."LO-D4_Movements"
GROUP BY "Internal_reference") r
INNER JOIN dw."LO-D4_Movements" t
ON t."Movement_date" = r.maxtime
AND t."Internal_reference" = r."Internal_reference"
The issue is that I have to translate the query above to Access, where analytic functions do not exist.
I used this answer to build the query below:
SELECT
t."Internal_reference",
t.from_code,
t.to_code,
t."Movement_date",
t.shipment_number,
t."PO_number",
t."Quantity",
t."Movement_value",
t."Site",
t."Import_date",
COUNT(*) AS "cnt"
FROM (
SELECT "Internal_reference",
MAX("Movement_date") AS maxtime
FROM dw."LO-D4_Movements"
GROUP BY "Internal_reference") r
LEFT OUTER JOIN dw."LO-D4_Movements" t
ON t."Movement_date" = r.maxtime AND t."Internal_reference" = r."Internal_reference"
GROUP BY
t.from_code,
t.to_code,
t."Movement_date",
t.shipment_number,
t."PO_number",
t."Quantity",
t."Movement_value",
t."Site",
t."Import_date",
t."Internal_reference"
ORDER BY t.from_code
The issue is that I only get 1 in the cnt column.
I tried to tweak it by removing Internal_reference (see below):
SELECT
t.from_code,
t.to_code,
t."Movement_date",
t.shipment_number,
t."PO_number",
t."Quantity",
t."Movement_value",
t."Site",
t."Import_date",
COUNT(*) AS "cnt"
FROM (
SELECT "Internal_reference",
MAX("Movement_date") AS maxtime
FROM dw."LO-D4_Movements"
GROUP BY "Internal_reference") r
LEFT OUTER JOIN dw."LO-D4_Movements" t
ON t."Movement_date" = r.maxtime AND t."Internal_reference" = r."Internal_reference"
GROUP BY
t.from_code,
t.to_code,
t."Movement_date",
t.shipment_number,
t."PO_number",
t."Quantity",
t."Movement_value",
t."Site",
t."Import_date"
ORDER BY t.from_code
However, the results are even worse. The cnt now grows, but it gives the wrong values.
Any help is more than welcome, as I'm slowly losing my sanity.
Thanks
Edit: Please find the sqlfiddle

I think Gordon-Linoff's code is close to what you want, but there are some typos I couldn't correct without a rewrite, so here's my attempt
SELECT
t1.Internal_reference,
t1.Movement_date,
t1.PO_Number as Combination_Of_Columns_Which_Make_This_Unique,
t1.Other_columns,
Count(1) AS Cnt
FROM
([LO-D4_Movements] AS t1
INNER JOIN [LO-D4_Movements] AS t2 ON
t1.Internal_reference = t2.Internal_reference AND
t1.Movement_date = t2.Movement_date)
INNER JOIN (
SELECT
t3.Internal_reference,
MAX(t3.Movement_date) AS Maxtime
FROM
[LO-D4_Movements] AS t3
GROUP BY
t3.Internal_reference
) AS r ON
t1.Internal_reference = r.Internal_reference AND
t1.Movement_date = r.Maxtime
WHERE
t1.PO_Number>=t2.PO_Number
GROUP BY
t1.Internal_reference,
t1.Movement_date,t1.PO_Number,
t1.Other_columns
ORDER BY
t1.Internal_reference,
t1.Movement_date,
Count(1);
Besides appearing in the MAX(Movement_date) subquery, the main table is brought in twice. One copy (t1) supplies the rows shown in your results; the other (t2) is used to count records and so generate the sequence numbers.
Gordon said you need a unique id column for each row. That's true if by "column" you also include derived columns. It also only needs to be unique within each combination of Internal_reference and Movement_date.
I've assumed, perhaps wrongly, that PO_Number will suffice. If not, concatenate it (with some delimiters) with other fields that together make it unique. The WHERE clause will then need updating to compare t1 and t2 on that "combination of columns which make this unique".
If there is no appropriate combination available, I'm not sure it can be done without VBA and/or temp tables, as The-Gambill suggested.

This is a real pain in MS Access, as far as I know. One method is a correlated subquery, but you need a unique id column on each row:
SELECT t.*,
(SELECT COUNT(*)
FROM (SELECT "Internal_reference", MAX("Movement_date") AS maxtime
FROM dw."LO-D4_Movements"
GROUP BY "Internal_reference"
) as t2
WHERE t2."Internal_reference" = t."Internal_reference" AND
t2."Movement_date" = t."Movement_date" AND
t2.?? <= t.??
) as cnt
FROM (SELECT "Internal_reference", MAX("Movement_date") AS maxtime
FROM dw."LO-D4_Movements"
GROUP BY "Internal_reference"
) r INNER JOIN
dw."LO-D4_Movements" t
ON t."Movement_date" = r.maxtime AND
t."Internal_reference" = r."Internal_reference";
The ?? is for the id or creation date or something to allow the counting of rows.
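As a rough illustration of the correlated-count idea, the sketch below numbers the rows of the latest movement date per reference without any window function. It runs on SQLite through Python's sqlite3 module instead of Access, the table is a simplified stand-in, the rows are invented, and the Id column is a hypothetical unique key playing the role of the ?? placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movements (
        Id INTEGER PRIMARY KEY,      -- hypothetical unique key (the ?? column)
        Internal_reference TEXT,
        Movement_date TEXT
    );
    -- Invented sample rows: REF-A has two movements on its latest date.
    INSERT INTO Movements VALUES
        (1, 'REF-A', '2020-01-01'),
        (2, 'REF-A', '2020-01-02'),
        (3, 'REF-A', '2020-01-02'),
        (4, 'REF-B', '2020-01-05');
""")

numbered = conn.execute("""
    SELECT t.Id, t.Internal_reference, t.Movement_date,
           -- Count the rows in the same group whose key is not larger:
           -- this reproduces ROW_NUMBER() ordered by Id within the group.
           (SELECT COUNT(*)
            FROM Movements t2
            WHERE t2.Internal_reference = t.Internal_reference
              AND t2.Movement_date = t.Movement_date
              AND t2.Id <= t.Id) AS cnt
    FROM Movements t
    INNER JOIN (
        SELECT Internal_reference, MAX(Movement_date) AS maxtime
        FROM Movements
        GROUP BY Internal_reference
    ) r ON t.Internal_reference = r.Internal_reference
       AND t.Movement_date = r.maxtime
    ORDER BY t.Internal_reference, cnt
""").fetchall()
print(numbered)
```

The numbering is only well defined when the comparison column is unique within each Internal_reference and Movement_date group; ties would receive the same count.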

How to speed up update query in SQL Server 2008?

update orders
set tname = (select top 1 t.task
from task t
where prod_typ='2' and sorder_nbr = t.ORDER_NBR
order by t.strt_dt desc)
where Prod_type='2'
update orders
set tname= (select top 1 t.task
from task t
where prod_typ='1' and sorder_nbr=t.ORDER_NBR
order by t.strt_dt desc)
where Prod_type='1'
I am trying to update the tname column of the orders table with the latest task from the task table.
The condition is that prod_typ of the orders table is 1 and that sorder_nbr of the orders table equals order_nbr of the task table.
My first update statement, which touches about 900k rows, works well. The second one touches about 400k rows, yet it ran for more than an hour before I finally cancelled it.

1) Your query and my query:
update orders
set tname = (select top 1 t.task
from task t
where prod_type='2' and order_nbr = t.ORDER_NBR
order by t.strt_dt desc)
where Prod_type='2';
go
update o
set tname = (select top 1 t.task
from task t
where prod_type='2' and o.order_nbr = t.ORDER_NBR
order by t.strt_dt desc)
from dbo.orders o
where Prod_type='2';
go
The actual execution plans:
As you can see, if the default collation of the current database is CI (case insensitive), the predicate order_nbr = t.ORDER_NBR forces SQL Server to compare the values of t.ORDER_NBR with the values of the order_nbr column from the same table, task t, because the unqualified column name resolves to the subquery's own table. Look at the first execution plan, which corresponds to the first query.
To solve just this problem, I've given dbo.orders the alias o and rewritten the predicate as o.order_nbr = t.ORDER_NBR. You can see this in the second execution plan as well.
Depending on how many tasks exist for each order_nbr and prod_type, you could test Solution #1 if there are many tasks, or Solution #2 if there are only a few tasks per order_nbr and prod_type. Again, you need to test with your data to see which solution is better.
2) Solution #1:
UPDATE o
SET tname =
COALESCE(
(SELECT TOP(1) t.task
FROM dbo.task t
WHERE t.prod_type=o.Prod_type
AND o.order_nbr = t.ORDER_NBR
ORDER BY t.strt_dt DESC), tname
)
FROM dbo.orders o
WHERE o.Prod_type IN ('1', '2');
3) Solution #2:
UPDATE o
SET tname = lt.task
FROM dbo.orders o
INNER JOIN
(
SELECT src.order_nbr, src.prod_type, src.task
FROM (
SELECT t.ORDER_NBR, t.prod_type, t.task,
ROW_NUMBER() OVER(PARTITION BY t.ORDER_NBR, t.prod_type ORDER BY t.strt_dt DESC) RowNum
FROM dbo.task t
) src
WHERE src.RowNum = 1
) lt -- last task
ON o.order_nbr = lt.ORDER_NBR AND o.prod_type = lt.prod_type
WHERE o.Prod_type IN ('1', '2');
If you have questions then feel free to ask.
4) An index on dbo.task(order_nbr, prod_type, strt_dt) include (task) should help both solutions.
5) Also you should publish the actual execution plans.
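For anyone who wants to experiment with the Solution #1 pattern on a small scale, here is a minimal sketch. It uses SQLite via Python's sqlite3 module, so TOP(1) becomes LIMIT 1 and the index has no INCLUDE clause; the rows are invented, and the index name ix_task_latest is just an illustrative choice. Every column in the subquery is qualified so that it cannot silently resolve to the wrong table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_nbr INTEGER, prod_type TEXT, tname TEXT);
    CREATE TABLE task (order_nbr INTEGER, prod_type TEXT, task TEXT, strt_dt TEXT);
    -- Index matching the lookup: equality columns first, then the sort column.
    CREATE INDEX ix_task_latest ON task (order_nbr, prod_type, strt_dt);

    -- Invented sample data.
    INSERT INTO orders VALUES (100, '1', NULL), (200, '2', 'old'), (300, '1', 'keep');
    INSERT INTO task VALUES
        (100, '1', 'cut',        '2020-01-01'),
        (100, '1', 'sew',        '2020-01-03'),
        (100, '2', 'wrong type', '2020-01-09'),
        (200, '2', 'pack',       '2020-01-02');
""")

conn.execute("""
    UPDATE orders
    SET tname = COALESCE(
        (SELECT t.task
         FROM task t
         WHERE t.prod_type = orders.prod_type
           AND t.order_nbr = orders.order_nbr
         ORDER BY t.strt_dt DESC
         LIMIT 1),
        tname)  -- keep the old value when the order has no matching task
    WHERE prod_type IN ('1', '2')
""")

updated = conn.execute(
    "SELECT order_nbr, tname FROM orders ORDER BY order_nbr"
).fetchall()
print(updated)
```

Order 300 has no tasks, so the COALESCE leaves its existing tname untouched, and the task with a different prod_type for order 100 is ignored.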

If the data size is large, I suggest using variables to update the table, or using a CTE for the update:
Update a table using CTE and NEWID()
Updating record using CTE?
I hope this will help
with tname (t.task) as
(select top 1 t.task
from task t
where prod_typ='2' and order_nbr = t.ORDER_NBR
order by t.strt_dt desc )
insert into Orders(t.task)

Try something like this. This will update prod_type of 1 and 2 at the same time.
UPDATE orders
SET tname = t1.task
FROM orders o
CROSS APPLY (
SELECT order_nbr, prod_type, t.task, row_number() OVER (PARTITION BY order_nbr, prod_type ORDER BY strt_dt DESC) rownumber
FROM task t
WHERE o.prod_type = t.prod_type
AND o.order_nbr = t.order_nbr) t1
WHERE t1.rownumber = 1
AND o.prod_type in (1,2)

Using a CTE may speed this up, since the subquery is defined once up front rather than once per row, although SQL Server does not guarantee that a CTE is evaluated only once. Here is the sqlfiddle:
;with cteTaskNames as
(
select top 1 t.task
from task t
where prod_type='2' and order_nbr=t.ORDER_NBR
order by t.strt_dt desc
)
update orders
set tname = (select task from cteTaskNames)
where Prod_type='2'
go
Also,
1) Is "prod_type" an integer field or a string field?
2) If you add group by in the cte, you can do an inner join on orders and cte query to run all updates at once instead of doing each query.
