Optimizing query with two MAX columns in the same table - sql

I need to optimize the query below:
SELECT
Id, -- identity
CargoID,
[Status] AS CurrentStatus
FROM
dbo.CargoStatus
WHERE
id IN (SELECT TOP 1 ID
FROM dbo.CargoStatus CS
INNER JOIN STD.StatusMaster S ON CS.ShipStatusID = S.StatusID
WHERE CS.CargoID=CargoStatus.CargoID
ORDER BY YEAR(CS.[Date]) DESC, MONTH(CS.[Date]) DESC,
DAY(CS.[Date]) DESC, S.StatusStageNumber DESC)
There are two tables: CargoStatus and StatusMaster.
StatusMaster has columns StatusID, StatusName, StatusStageNumber (int).
CargoStatus has columns ID, StatusID (FK to the StatusMaster StatusID column), Date.
Is there a better way of writing this query?
I want the latest status for each cargo (only one entry per CargoID).

Since you seem to be using SQL Server 2005 or newer, you can use a CTE with the ROW_NUMBER() windowing function:
;WITH LatestCargo AS
(
SELECT
cs.Id, -- identity
cs.CargoID,
cs.[Status] AS CurrentStatus,
ROW_NUMBER() OVER(PARTITION BY cs.CargoID
ORDER BY cs.[Date] DESC, s.StatusStageNumber DESC) AS RowNum
FROM
dbo.CargoStatus cs
INNER JOIN
STD.StatusMaster s ON cs.ShipStatusID = s.[StatusID]
)
SELECT
Id, CargoID, CurrentStatus
FROM
LatestCargo
WHERE
RowNum = 1
This CTE "partitions" your data by CargoID, and within each partition the ROW_NUMBER() function hands out sequential numbers, starting at 1 and ordered by Date DESC and then StatusStageNumber DESC. The latest row for each CargoID therefore gets RowNum = 1, which is the row the final SELECT picks from the CTE.
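To try the ranking logic without a SQL Server instance, the same ROW_NUMBER() pattern can be run against an in-memory SQLite database from Python. This is only a sketch under assumptions: the sample rows are invented, the STD schema prefix is dropped, dates are stored as ISO text, and SQLite 3.25 or newer is assumed for window-function support.

```python
import sqlite3

# In-memory stand-in for the two tables; every row is invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StatusMaster (
        StatusID INTEGER PRIMARY KEY,
        StatusName TEXT,
        StatusStageNumber INTEGER
    );
    CREATE TABLE CargoStatus (
        Id INTEGER PRIMARY KEY,
        CargoID INTEGER,
        ShipStatusID INTEGER REFERENCES StatusMaster (StatusID),
        [Status] TEXT,
        [Date] TEXT
    );
    INSERT INTO StatusMaster VALUES
        (1, 'Booked', 10), (2, 'Loaded', 20), (3, 'Delivered', 30);
    INSERT INTO CargoStatus VALUES
        (1, 100, 1, 'Booked',    '2013-03-01'),
        (2, 100, 2, 'Loaded',    '2013-03-02'),
        (3, 100, 3, 'Delivered', '2013-03-02'),
        (4, 200, 1, 'Booked',    '2013-03-05');
""")

# Number the rows of each cargo from newest to oldest; ties on the date
# are broken by the higher stage number. Row 1 is the current status.
latest = conn.execute("""
    WITH LatestCargo AS (
        SELECT cs.Id, cs.CargoID, cs.[Status] AS CurrentStatus,
               ROW_NUMBER() OVER (
                   PARTITION BY cs.CargoID
                   ORDER BY cs.[Date] DESC, s.StatusStageNumber DESC
               ) AS RowNum
        FROM CargoStatus cs
        INNER JOIN StatusMaster s ON cs.ShipStatusID = s.StatusID
    )
    SELECT Id, CargoID, CurrentStatus
    FROM LatestCargo
    WHERE RowNum = 1
    ORDER BY CargoID
""").fetchall()

print(latest)  # [(3, 100, 'Delivered'), (4, 200, 'Booked')]
```

Cargo 100 has two rows on the same day, and the second sort key (StatusStageNumber DESC) is what makes the 'Delivered' row win.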

Related

Delete Duplicate Rows in SQL

I have a table with a unique id but duplicate row information.
I can find the rows with duplicates using this query:
SELECT
PersonAliasId, StartDateTime, GroupId, COUNT(*) as Count
FROM
Attendance
GROUP BY
PersonAliasId, StartDateTime, GroupId
HAVING
COUNT(*) > 1
I can manually delete the rows while keeping the one I need with this query:
Delete
From Attendance
Where Id IN(SELECT
Id
FROM
Attendance
Where PersonAliasId = 15
and StartDateTime = '9/24/2017'
and GroupId = 1429
Order By ModifiedDateTIme Desc
Offset 1 Rows)
I am not versed enough in SQL to figure out how to use the rows from the first query to delete the duplicates while leaving behind the most recent one. The first query returns over 3481 records, which is too many to handle one by one manually.
How can I find the duplicate rows like the first query and delete all but the most recent like the second?
You can use a Common Table Expression to delete the duplicates:
WITH Cte AS(
SELECT *,
Rn = ROW_NUMBER() OVER(PARTITION BY PersonAliasId, StartDateTime, GroupId
ORDER BY ModifiedDateTIme DESC)
FROM Attendance
)
DELETE FROM Cte WHERE Rn > 1;
This will keep the most recent record for each PersonAliasId - StartDateTime - GroupId combination.
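The effect of the ranking delete can be checked on a throwaway copy first. The sketch below uses Python's sqlite3 module with invented rows. SQLite cannot delete through a CTE the way SQL Server can, so the row numbers are computed in a subquery and the delete targets the matching Id values; that rewrite is an adaptation for SQLite and not the T-SQL statement above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Attendance (
        Id INTEGER PRIMARY KEY,
        PersonAliasId INTEGER,
        StartDateTime TEXT,
        GroupId INTEGER,
        ModifiedDateTime TEXT
    );
    -- Ids 1, 2 and 3 are duplicates of each other; 4 and 5 are unique.
    INSERT INTO Attendance VALUES
        (1, 15, '2017-09-24', 1429, '2017-09-24 10:00'),
        (2, 15, '2017-09-24', 1429, '2017-09-25 08:00'),
        (3, 15, '2017-09-24', 1429, '2017-09-24 12:00'),
        (4, 16, '2017-09-24', 1429, '2017-09-24 09:00'),
        (5, 15, '2017-10-01', 1429, '2017-10-01 09:00');
""")

# Rn = 1 is the most recently modified row of each duplicate group;
# everything with Rn > 1 is an older duplicate and gets deleted.
deleted = conn.execute("""
    DELETE FROM Attendance
    WHERE Id IN (
        SELECT Id
        FROM (
            SELECT Id,
                   ROW_NUMBER() OVER (
                       PARTITION BY PersonAliasId, StartDateTime, GroupId
                       ORDER BY ModifiedDateTime DESC
                   ) AS Rn
            FROM Attendance
        ) AS ranked
        WHERE Rn > 1
    )
""").rowcount

remaining_ids = [row[0] for row in
                 conn.execute("SELECT Id FROM Attendance ORDER BY Id")]
print(deleted, remaining_ids)  # 2 [2, 4, 5]
```

Rows that never had a duplicate (Ids 4 and 5) always get row number 1 in their own partition, so they are untouched.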
Use the MAX aggregate function to identify the latest ModifiedDateTIme for each PersonAliasId, StartDateTime, GroupId combination that has duplicates. Then delete the rows in that combination that are older than the latest time.
DELETE a
FROM Attendance AS a
INNER JOIN (
SELECT
PersonAliasId, StartDateTime, GroupId, MAX(ModifiedDateTIme) AS LatestTime
FROM
Attendance
GROUP BY
PersonAliasId, StartDateTime, GroupId
HAVING
COUNT(*) > 1
) AS b
ON a.PersonAliasId = b.PersonAliasId AND a.StartDateTime = b.StartDateTime AND a.GroupId = b.GroupId AND a.ModifiedDateTIme < b.LatestTime
This is the same idea as the CTE answer (give Felix the check), written with a derived table:
DELETE tt
FROM ( SELECT rn = ROW_NUMBER() OVER(PARTITION BY PersonAliasId, StartDateTime, GroupId
ORDER BY ModifiedDateTIme DESC)
FROM Attendance
) tt
WHERE tt.rn > 1

Filter the table with latest date having duplicate OrderId

I have the following table:
I need to select the rows whose start date is the latest for their order id. With reference to the given table, rows 2 and 3 should be the output.
Rows 1 and 2 have the same order id and order date, but the start date of row 2 is later than that of row 1. The same goes for rows 3 and 4, so I need row 3. I am trying to write the query in SQL Server. Any help is appreciated. Please let me know if you need more details. Apologies for my poor English.
You can do this easily with a ROW_NUMBER() windowed function:
;With Cte As
(
Select *,
Row_Number() Over (Partition By OrderId Order By StartDate Desc) RN
From YourTable
)
Select *
From Cte
Where RN = 1
But I question the StartDate datatype. It looks like these are being stored as VARCHAR. If that is the case, you need to CONVERT the value to a DATETIME:
;With Cte As
(
Select *,
Row_Number() Over (Partition By OrderId
Order By Convert(DateTime, StartDate) Desc) RN
From YourTable
)
Select *
From Cte
Where RN = 1
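The reason the conversion matters is that text is compared character by character, not chronologically. A small Python sketch of the same effect follows; the m/d/yyyy format and the sample values are assumptions, since the original table is not shown.

```python
from datetime import datetime

# Hypothetical start dates stored as text in m/d/yyyy form.
start_dates = ["9/24/2017", "10/2/2017", "12/1/2016"]

# Compared as text, "9..." sorts after "1...", so the wrong row looks latest.
latest_as_text = max(start_dates)

# Compared as real dates, October 2, 2017 is the latest.
latest_as_date = max(start_dates,
                     key=lambda s: datetime.strptime(s, "%m/%d/%Y"))

print(latest_as_text)  # 9/24/2017
print(latest_as_date)  # 10/2/2017
```

Ordering by the converted value in the OVER clause avoids the same trap on the SQL side.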
Another way using a derived table.
select
t.*
from
YourTable t
inner join
(select OrderId, max(StartDate) dt
from YourTable
group by OrderId) t2 on t2.dt = t.StartDate and t2.OrderId = t.OrderId

How to select max rownumber for each partition in SQL Server

Can anybody tell me how to select the max row number for each partition in SQL Server using a CTE?
Suppose one employee has 4 transaction rows and another has only one row. How do I select the max row for each of those employees?
I have a job table and want to fetch the max row number for each employee in order to get that employee's latest transaction.
I tried the following:
With CTE as (
Select
My fields,
Rownum = row_number() over(partition by emplid order by date) from jobtable
Where
Myconditions
)
Select * from CTE B left outer join
CTE A on A.emplid = B.emplid
Where
A.rownum = (select max(a2.rownum) from jobtable a2)
Is the left join required above, or is it not needed at all?
Please tell me how to fetch the row number when only 1 row exists for an employee, as the above query fetches only the employees that have the greatest row number in the whole table.
With CTE as (
Select
My fields,
Rownum = row_number() over(partition by emplid order by date DESC)
from jobtable
Where
Myconditions
)
SELECT *
FROM
cte
WHERE
RowNum = 1
Just reverse the order of your ROW_NUMBER and select where it equals 1. Row numbers can be assigned in ascending (ASC) or descending (DESC) order. If you want the latest record to get row number 1, use ORDER BY date DESC; if you want the earliest record first, use ORDER BY date ASC (or just ORDER BY date, since ASC is the default).
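A small check of that claim, run on SQLite through Python's sqlite3 module: the table and column names (jobtable, emplid, effdt) and the rows are made up for illustration. Both numberings are computed side by side so it is visible that the row with descending number 1 is the same row that holds the highest ascending number in its partition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobtable (emplid TEXT, effdt TEXT);
    -- E1 has four transactions, E2 has only one.
    INSERT INTO jobtable VALUES
        ('E1', '2020-01-01'), ('E1', '2020-02-01'),
        ('E1', '2020-03-01'), ('E1', '2020-04-01'),
        ('E2', '2020-02-15');
""")

latest_rows = conn.execute("""
    SELECT emplid, effdt, rn_asc, rn_desc
    FROM (
        SELECT emplid, effdt,
               ROW_NUMBER() OVER (PARTITION BY emplid ORDER BY effdt)      AS rn_asc,
               ROW_NUMBER() OVER (PARTITION BY emplid ORDER BY effdt DESC) AS rn_desc
        FROM jobtable
    ) AS numbered
    WHERE rn_desc = 1
    ORDER BY emplid
""").fetchall()

print(latest_rows)
# [('E1', '2020-04-01', 4, 1), ('E2', '2020-02-15', 1, 1)]
```

Filtering on rn_desc = 1 works for every employee regardless of how many rows they have, which is why no self-join or MAX over the row number is needed.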

How can I do "SELECT" query with "WHERE" condition on column that does not exist in the table in SQL Server

To understand the question, I'll show an example: my columns in the Students table are:
stuID, cityID, Name, updateDate
and my SELECT is:
SELECT
ROW_NUMBER() OVER (PARTITION BY cityID ORDER BY updateDate DESC) AS rownumber,
stuID,
cityID,
Name
FROM
Students
WHERE
rownumber = 1
Never mind why I want such a query; this is only an example. How can I put a WHERE condition on rownumber?
The WHERE clause is evaluated right after the FROM clause, before the SELECT list, so SQL Server does not yet know about an alias defined in SELECT when WHERE refers to it.
Use a CTE or subquery if you want to apply a predicate to the row number:
;With cte
as
(
SELECT
ROW_NUMBER() OVER (PARTITION BY cityID ORDER BY updateDate DESC) AS rownumber,
stuID,
cityID,
Name
FROM Students
)
select * from cte where rownumber=1
Below are the logical query processing phases, which dictate how the results of one phase are made available to the clauses in the next phases:
1. FROM
2. ON
3. OUTER
4. WHERE
5. GROUP BY
6. CUBE | ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10. TOP
11. ORDER BY
As you can see, the row number is defined in the SELECT clause, so it is available only to the phases that come after SELECT.
This ordering follows Itzik Ben-Gan's logical query processing flow chart.
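The consequence of that ordering can be observed directly. The sketch below uses SQLite through Python's sqlite3 module with invented student rows; SQLite's error text differs from SQL Server's, but it also rejects a window-function alias in WHERE, while the wrapped version runs. Treat it as an illustration of the phase order, not as SQL Server output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (stuID INTEGER PRIMARY KEY, cityID INTEGER,
                           Name TEXT, updateDate TEXT);
    INSERT INTO Students VALUES
        (1, 1, 'Ann', '2016-01-01'),
        (2, 1, 'Bob', '2016-02-01'),
        (3, 2, 'Cy',  '2016-01-15');
""")

ranked = """
    SELECT ROW_NUMBER() OVER (PARTITION BY cityID
                              ORDER BY updateDate DESC) AS rownumber,
           stuID, cityID, Name
    FROM Students
"""

# Filtering in the same query level: WHERE is processed before SELECT,
# so the row number does not exist yet and the statement is rejected.
try:
    conn.execute(ranked + " WHERE rownumber = 1").fetchall()
    direct_filter_failed = False
except sqlite3.Error:
    direct_filter_failed = True

# Wrapping the query makes rownumber an ordinary column of the derived
# table, so the outer WHERE can see it.
wrapped = conn.execute(
    "SELECT stuID, cityID, Name FROM (" + ranked + ") AS s "
    "WHERE rownumber = 1 ORDER BY cityID"
).fetchall()

print(direct_filter_failed, wrapped)
```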
Any aliases and calculated fields become available if you put your query into a subquery or CTE:
select *
from
(
SELECT
ROW_NUMBER() OVER (PARTITION BY cityID ORDER BY updateDate DESC) AS rownumber,
stuID,
cityID,
Name
FROM Students
) s
WHERE rownumber = 1

How do I return a single row based on the aggregate of more than one column

Sorry for the ambiguous title; I'm not sure how to search for or ask this question.
Let's say we have TableA:
RowID  FkId  Rank  Date
ID1    A     1     2013-3-1
ID2    A     2     2013-3-2
ID3    A     2     2013-3-3
ID4    B     3     2013-3-4
ID5    A     1     2013-3-5
I need to create a view that returns 1 row for each FkId. The row should be the one with the max rank and, within that rank, the max date. So for FkId "A", the query would return the row for "ID3".
I was able to return a single row by using sub-queries; first I get the MAX(Rank), then join to another query that gets MAX(Date) group by FkId & Rank.
SELECT TableA.*
FROM (Select FkId, MAX(Rank) AS Rank FROM TableA GROUP BY FkId) s1
INNER JOIN (Select FkId, Rank, MAX(Date) AS Date FROM TableA GROUP BY FkId,Rank) s2 ON s1.FkId = s2.FkId AND s1.Rank = s2.Rank
INNER JOIN TableA ON s2.FkId = TableA.FkId AND s2.Rank = TableA.Rank AND s2.Date = TableA.Date
Is there a more efficient query that would achieve the same results? Thanks for looking.
Edit: Added ID5 since the last answer. If I tried a normal MAX(Rank), MAX(Date) with GROUP BY FkId, then for "A" I would get A; 2; 2013-3-5. This result would not match any RowID.
You can use ROW_NUMBER with a CTE (presuming sql-server >= 2005):
WITH CTE AS
(
SELECT TableA.*,
RN = ROW_NUMBER() OVER (PARTITION BY FkId Order By Rank Desc, Date DESC)
FROM TableA
)
SELECT RowID,FkId, Rank,Date
FROM CTE WHERE RN = 1
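To see why the ordered ROW_NUMBER() returns a real row while two independent MAX aggregates do not, both can be run on the sample data from the question. The sketch uses SQLite via Python's sqlite3 module; the dates are rewritten as zero-padded ISO text so they sort correctly as strings, which is an assumption about storage and not part of the original table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (RowID TEXT PRIMARY KEY, FkId TEXT,
                         [Rank] INTEGER, [Date] TEXT);
    INSERT INTO TableA VALUES
        ('ID1', 'A', 1, '2013-03-01'),
        ('ID2', 'A', 2, '2013-03-02'),
        ('ID3', 'A', 2, '2013-03-03'),
        ('ID4', 'B', 3, '2013-03-04'),
        ('ID5', 'A', 1, '2013-03-05');
""")

# Each MAX is computed on its own, so the pair can mix values
# taken from different rows.
independent_max = conn.execute("""
    SELECT FkId, MAX([Rank]), MAX([Date])
    FROM TableA
    GROUP BY FkId
    ORDER BY FkId
""").fetchall()

# Ordering by Rank first and Date second picks one existing row per FkId.
top_rows = conn.execute("""
    WITH CTE AS (
        SELECT RowID, FkId, [Rank], [Date],
               ROW_NUMBER() OVER (PARTITION BY FkId
                                  ORDER BY [Rank] DESC, [Date] DESC) AS RN
        FROM TableA
    )
    SELECT RowID, FkId, [Rank], [Date]
    FROM CTE
    WHERE RN = 1
    ORDER BY FkId
""").fetchall()

print(independent_max)  # [('A', 2, '2013-03-05'), ('B', 3, '2013-03-04')]
print(top_rows)         # [('ID3', 'A', 2, '2013-03-03'), ('ID4', 'B', 3, '2013-03-04')]
```

For FkId A, the aggregate version pairs rank 2 with the date of ID5, a combination that exists in no row, while the window version returns ID3.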
Your question (clarified in the comments to this answer) asks for:
1. A single row for each FkId
2. The max Date and Rank
3. The results to correspond to a row in the original data.
In the case that there are FkIds with rows such that the maximum date and maximum rank are in separate rows, you'll have to relax at least one of these requirements.
If you're willing to relax requirement (3), then you can use GROUP BY:
SELECT FkId, MAX(Rank) AS Rank, Max(Date) AS Date
FROM TableA
GROUP BY FkId
Given the extra information in the comments, namely that you want the latest of the highest-ranked entries for each FkId, the following should work:
SELECT FkId, Rank, MAX(Date) AS Date
FROM TableA A
WHERE Rank = (SELECT MAX(Rank)
FROM TableA sub
WHERE A.FkId = sub.FkId
GROUP BY sub.FkId)
GROUP BY FkId, Rank
You can use RANK() and an inline query to achieve it:
select * from TableA
where RowID in (
select rowID from (
select FKID, RowID,
rank() over (partition by FKID order by [Rank] desc, [Date] desc) as RankNumber
from TableA ) A
where A.RankNumber=1 )
You can also be sneaky and accomplish what ljh suggested like this:
select top 1 with ties *
from TableA
order by rank() over (
partition by FKID
order by [Rank] desc, [Date] desc
)
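The trick works because the best row of every partition gets rank 1, so all of those rows tie for first place in the outer ORDER BY, and WITH TIES returns every tied row. SQLite has no TOP ... WITH TIES, so the sketch below only emulates the idea from Python by keeping every row whose sort key equals the smallest one. The data is the sample TableA from the question with ISO dates assumed, and the names keyed and sort_key are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (RowID TEXT PRIMARY KEY, FkId TEXT,
                         [Rank] INTEGER, [Date] TEXT);
    INSERT INTO TableA VALUES
        ('ID1', 'A', 1, '2013-03-01'),
        ('ID2', 'A', 2, '2013-03-02'),
        ('ID3', 'A', 2, '2013-03-03'),
        ('ID4', 'B', 3, '2013-03-04'),
        ('ID5', 'A', 1, '2013-03-05');
""")

# sort_key is what the T-SQL version orders by. "Top 1 with ties" then
# means: keep every row that shares the smallest sort_key.
tied_for_first = conn.execute("""
    WITH keyed AS (
        SELECT RowID, FkId,
               RANK() OVER (PARTITION BY FkId
                            ORDER BY [Rank] DESC, [Date] DESC) AS sort_key
        FROM TableA
    )
    SELECT RowID, FkId, sort_key
    FROM keyed
    WHERE sort_key = (SELECT MIN(sort_key) FROM keyed)
    ORDER BY FkId
""").fetchall()

print(tied_for_first)  # [('ID3', 'A', 1), ('ID4', 'B', 1)]
```

Because RANK() assigns the same number to rows that are equal on both sort columns, a partition with such a tie would return more than one row; ROW_NUMBER() in the same position would return exactly one.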