How to find max value in SQL Server 2014? - sql

I have a table named StatementSummary.
SELECT *
FROM StatementSummary
WHERE AccountID = 1234
Results
StatementId StatementDate AccountId AmountDue
-------------------------------------------------
100 2017-10-16 1234 600
99 2017-09-16 1234 500
98 2017-08-16 1234 400
I have another table that has a list of Accounts. I am trying to produce results that show the last AmountDue for each account.
My code:
SELECT
AccountID,
(SELECT MAX(StatementDate)
FROM StatementSummary
GROUP BY AccountID) LastStatementDate,
AmountDue
FROM
Accounts A
INNER JOIN
StatementSummary S ON A.AccountId = S.AccountId
Basically, I want to show all the details of the last statement for every AccountId.

You can use the SQL Server Windowing functions in cases like this.
SELECT DISTINCT
a.AccountId,
FIRST_VALUE(s.StatementDate) OVER (PARTITION BY s.AccountId ORDER BY s.StatementDate DESC) As LastStatementDate,
FIRST_VALUE(s.AmountDue) OVER (PARTITION BY s.AccountId ORDER BY s.StatementDate DESC) As LastAmountDue
FROM Accounts a
INNER JOIN StatementSummary s
ON a.AccountId = s.AccountId
The OVER clause creates partitions in your data, in this case by account number (these partitions are the windows). We then tell SQL Server to sort the data within each partition by statement date in descending order, so the last statement is at the top of the partition, and FIRST_VALUE grabs the value from that first row.
Finally, since you perform this operation for every account/statement combo between the two tables, you need the DISTINCT to say you just want one copy of each row for each account.
There are quite a few useful things you can do with the windowing functions in SQL Server. This article gives a good introduction to them: https://www.red-gate.com/simple-talk/sql/learn-sql-server/window-functions-in-sql-server/
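To see the partition/sort/first-row behaviour on the sample data, here is a small runnable sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example can be executed anywhere; it needs SQLite 3.25 or later for window functions). Account 5678 and its statements are made-up rows added to show a second partition.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Accounts (AccountId INTEGER PRIMARY KEY);
    CREATE TABLE StatementSummary (
        StatementId   INTEGER PRIMARY KEY,
        StatementDate TEXT,
        AccountId     INTEGER,
        AmountDue     INTEGER
    );
    -- Account 1234 comes from the question; 5678 is hypothetical.
    INSERT INTO Accounts VALUES (1234), (5678);
    INSERT INTO StatementSummary VALUES
        (100, '2017-10-16', 1234, 600),
        (99,  '2017-09-16', 1234, 500),
        (98,  '2017-08-16', 1234, 400),
        (201, '2017-10-01', 5678, 75),
        (200, '2017-09-01', 5678, 50);
""")

# Each account is one partition, sorted newest first; FIRST_VALUE repeats the
# newest values on every row of the partition, and DISTINCT collapses them.
last_statements = conn.execute("""
    SELECT DISTINCT
        a.AccountId,
        FIRST_VALUE(s.StatementDate) OVER (PARTITION BY s.AccountId ORDER BY s.StatementDate DESC) AS LastStatementDate,
        FIRST_VALUE(s.AmountDue) OVER (PARTITION BY s.AccountId ORDER BY s.StatementDate DESC) AS LastAmountDue
    FROM Accounts a
    INNER JOIN StatementSummary s ON a.AccountId = s.AccountId
    ORDER BY a.AccountId
""").fetchall()

for row in last_statements:
    print(row)
```

Without the DISTINCT, account 1234 would come back three times, once per statement, each time carrying the same "last" values.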

Use a derived table with ROW_NUMBER and left join it to the accounts table, so that all accounts are displayed regardless of whether they have a statement:
select a.*, l.statementdate, l.amountdue
from Accounts a
left outer join
(select row_number() over (partition by accountid order by statementdate desc) rn,
accountid, statementdate, amountdue
from StatementSummary
) l
on l.accountid = a.accountid
and l.rn = 1

"That sounds like a job for me," sings the lateral join, a.k.a. CROSS APPLY in T-SQL.
SELECT a.*, last_ss.*
FROM Accounts A
cross apply (
select top 1 *
from StatementSummary S where A.AccountId = S.AccountId
order by StatementDate desc
) last_ss
Alternatively, you can use a CTE to get the last statement date for each account:
; with l as (
select accountid, max(StatementDate) as StatementDate
from StatementSummary
group by accountid
)
select ...
from Accounts a
inner join l on l.accountid = a.accountid
inner join StatementSummary ss on ss.accountid = a.accountid
and l.StatementDate = ss.StatementDate

Related

Reduce the execution time of queries using tables that holds millions of rows

I'm trying to execute a query that gets a count based on certain parameters. The tables it operates on contain 38 million rows as of now. The query takes 6 seconds to execute. I want to bring that down to less than a second, as we use the results for display in a web app.
SELECT
Policy id,Policy name, COUNT(Policy) COUNT
FROM
(SELECT
ItemID, OUID, ItemType, ItemGeneratedBy, CreatedDateTime GeneratedDate,
Month, OU Agency
FROM
ItemMaster) IM
LEFT JOIN
(SELECT ItemId, PolicyId
FROM ItemPolicy) IP ON IP.ItemId = IM.ItemId
LEFT JOIN
(SELECT PolicyId, PolicyName Policy
FROM Policies) P ON P.PolicyId = IP.PolicyID
LEFT JOIN
(SELECT ItemID, ActualActionStr ActionTaken
FROM ItemExtension_McAfee) IEM ON IEM.ItemId = IM.ItemId
LEFT JOIN
(SELECT Id, ItemType Channel FROM ItemType) IT ON IT.Id = IM.ItemType
INNER JOIN
(SELECT ID, LoginName Violator FROM ItemADUser) IAU ON IAU.ID = IM.ItemGeneratedBy
WHERE
IM.OUId IS NOT NULL
AND TRIM(Violator) IN ('cusyuk01')
AND TRIM(ActionTaken) IN ('affirm')
AND TRIM(Policy) IN ('G004-Email To External-Affirm')
AND GeneratedDate >= '2022-06-11 00:00:00'
GROUP BY
Policy, Policy
ORDER BY
COUNT(Policy) DESC
https://www.brentozar.com/pastetheplan/?id=B1VpV8dKi
The above link is the execution plan for this query.
Any ideas on how to achieve this?
Fix your data so you don't have to perform table scans to apply these predicates (wrapping a column in TRIM() prevents an index seek on it):
AND TRIM(Violator) IN ('cusyuk01')
AND TRIM(ActionTaken) IN ('affirm')
AND TRIM(Policy) IN ('G004-Email To External-Affirm')
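A rough illustration of why the TRIM() wrapper hurts, sketched with Python's sqlite3 module rather than SQL Server (an assumption made so it is runnable; the planners differ, but both lose the index lookup once the column is wrapped in a function). The index name IX_ItemADUser_LoginName and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ItemADUser (ID INTEGER PRIMARY KEY, LoginName TEXT);
    CREATE INDEX IX_ItemADUser_LoginName ON ItemADUser (LoginName);
    INSERT INTO ItemADUser (LoginName) VALUES ('cusyuk01'), ('  cusyuk01 '), ('other');
""")

def plan(sql):
    # The fourth column of EXPLAIN QUERY PLAN is the readable plan step.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Predicate on TRIM(column): every row has to be visited and trimmed.
wrapped_plan = plan("SELECT ID FROM ItemADUser WHERE TRIM(LoginName) = 'cusyuk01'")

# One-time data fix: store the values already trimmed...
conn.execute("UPDATE ItemADUser SET LoginName = TRIM(LoginName)")

# ...so the query can compare the bare column and use the index.
direct_sql = "SELECT ID FROM ItemADUser WHERE LoginName = 'cusyuk01' ORDER BY ID"
direct_plan = plan(direct_sql)
matched_ids = [row[0] for row in conn.execute(direct_sql)]

print(wrapped_plan)
print(direct_plan)
print(matched_ids)
```

The first plan is a scan and the second is an index search; after the cleanup both padded and unpadded logins are found without calling TRIM() in the query.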

Returning IDs from two other tables, or null if no IDs found, using a left join in SQL Server

I am wondering if someone could help me. I am trying to join two tables and return an ID if one exists; if there is no ID, I want to return NULL but still return the row for that product rather than drop it. My query below returns twice as many records as expected, and I cannot figure out why.
SELECT
T2.ProductID, FirstChild.SupplierID, SecondChild.AccountID
FROM
Products T2
LEFT OUTER JOIN
(
SELECT TOP(1) SupplierID, Reference,CompanyID, Row_Number() OVER (Partition By SupplierID Order By SupplierID) AS RowNo FROM Suppliers
)
FirstChild ON T2.SupplierReference = FirstChild.Reference AND RowNo = 1AND FirstChild.CompanyID =T2.CompanyID
LEFT OUTER JOIN
(
SELECT TOP(1) AccountID, SageKey,CompanyID, Row_Number() OVER (Partition By AccountID Order By AccountID) AS RowNo2 FROM Accounts
)
SecondChild ON T2.ProductAccountReference = SecondChild.Reference AND RowNo2 = 1 AND SecondChild.CompanyID =T2.CompanyID
Example of what I am trying to do
ProductID SupplierID AccountID
1 5 2
2 6 NULL
3 NULL NULL
OUTER APPLY, and ditching the ROW_NUMBER, seems like a better choice here:
SELECT
p.ProductId
,FirstChild.SupplierId
,SecondChild.AccountId
FROM
Products p
OUTER APPLY (SELECT TOP (1) s.SupplierId
FROM
Suppliers s
WHERE
p.SupplierReference = s.Reference
AND p.CompanyId = s.CompanyId
ORDER BY
s.SupplierId
) FirstChild
OUTER APPLY (SELECT TOP (1) a.AccountId
FROM
Accounts a
WHERE
p.ProductAccountReference = a.Reference
AND p.CompanyId = a.CompanyId
ORDER BY
a.AccountID
) SecondChild
The way your query is written, the derived tables are not correlated with Products. That means each one returns whichever single row SQL Server happens to pick during optimization, and if that row does not match the product, you get no value. You need to relate the subquery to the outer table and then select TOP 1; adding an ORDER BY inside it is how you identify which row you want.
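The effect of correlating the subquery can be tried out with the sketch below. SQLite (used here through Python's sqlite3 so the example runs anywhere) has no OUTER APPLY, so a correlated scalar subquery with ORDER BY ... LIMIT 1 is swapped in; it plays the same role for a single column. All table contents and reference values such as 'S-A' are invented to reproduce the expected output from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products  (ProductId INTEGER, SupplierReference TEXT,
                            ProductAccountReference TEXT, CompanyId INTEGER);
    CREATE TABLE Suppliers (SupplierId INTEGER, Reference TEXT, CompanyId INTEGER);
    CREATE TABLE Accounts  (AccountId INTEGER, Reference TEXT, CompanyId INTEGER);
    -- Hypothetical data: product 1 matches two suppliers and two accounts.
    INSERT INTO Products  VALUES (1, 'S-A', 'A-A', 1), (2, 'S-B', 'A-X', 1), (3, 'S-X', 'A-X', 1);
    INSERT INTO Suppliers VALUES (5, 'S-A', 1), (7, 'S-A', 1), (6, 'S-B', 1);
    INSERT INTO Accounts  VALUES (2, 'A-A', 1), (9, 'A-A', 1);
""")

# Each subquery is evaluated per product row (that is the correlation), picks
# the lowest matching ID, and yields NULL when nothing matches.
rows = conn.execute("""
    SELECT p.ProductId,
           (SELECT s.SupplierId FROM Suppliers s
             WHERE s.Reference = p.SupplierReference AND s.CompanyId = p.CompanyId
             ORDER BY s.SupplierId LIMIT 1) AS SupplierId,
           (SELECT a.AccountId FROM Accounts a
             WHERE a.Reference = p.ProductAccountReference AND a.CompanyId = p.CompanyId
             ORDER BY a.AccountId LIMIT 1) AS AccountId
    FROM Products p
    ORDER BY p.ProductId
""").fetchall()

for row in rows:
    print(row)
```

Every product appears exactly once, even though product 1 has two matching suppliers and two matching accounts, which is what a plain LEFT JOIN would have multiplied.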
If it's just showing duplicate records, wouldn't an inelegant solution just be to add distinct in the select line?

Sum Distinct Rows Only In Sql Server

I have four tables, in which First has a one-to-many relation with the other three tables (Second, Third and Fourth). I want to sum only the distinct rows returned by the select query. Here is the query I have tried so far.
select count(distinct First.Order_id) as [No.Of Orders],sum( First.Amount) as [Amount] from First
inner join Second on First.Order_id=Second.Order_id
inner join Third on Third.Order_id=Second.Order_id
inner join Fourth on Fourth.Order_id=Third.Order_id
The outcome of this query is :
No.Of Orders Amount
7 69
But this Amount should be 49, because the sum of the Amount column in First is 49. Due to the inner joins and the one-to-many relationships, the sum also includes the duplicated rows. How can I avoid this? Kindly guide me.
I think the problem is cartesian products in the joins (for a given id). You can solve this using row_number():
select count(t1234.Order_id) as [No.Of Orders], sum(t1234.Amount) as [Amount]
from (select First.*,
row_number() over (partition by First.Order_id order by First.Order_id) as seqnum
from First inner join
Second
on First.Order_id=Second.Order_id inner join
Third
on Third.Order_id=Second.Order_id inner join
Fourth
on Fourth.Order_id=Third.Order_id
) t1234
where seqnum = 1;
By the way, you could also express this using conditions in the where clause, because you appear to be using the joins only for filtering:
select count(First.Order_id) as [No.Of Orders], sum(First.Amount) as [Amount]
from First
where exists (select 1 from second where First.Order_id=Second.Order_id) and
exists (select 1 from third where First.Order_id=third.Order_id) and
exists (select 1 from fourth where First.Order_id=fourth.Order_id);
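Here is a small sketch of the fan-out problem and the EXISTS fix, run against SQLite through Python's sqlite3 module (a stand-in for SQL Server, chosen so the example is executable). The three orders and their amounts are hypothetical; they were picked to total 49 like the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE [First]  (Order_id INTEGER, Amount INTEGER);
    CREATE TABLE [Second] (Order_id INTEGER);
    CREATE TABLE [Third]  (Order_id INTEGER);
    CREATE TABLE [Fourth] (Order_id INTEGER);
    INSERT INTO [First]  VALUES (1, 10), (2, 20), (3, 19);  -- true total: 49
    INSERT INTO [Second] VALUES (1), (1), (2), (3);         -- order 1 twice
    INSERT INTO [Third]  VALUES (1), (2), (3), (3);         -- order 3 twice
    INSERT INTO [Fourth] VALUES (1), (2), (3);
""")

# Joining repeats each First row once per matching child row, so SUM counts
# the amounts of orders 1 and 3 twice.
inflated = conn.execute("""
    SELECT COUNT(DISTINCT f.Order_id), SUM(f.Amount)
    FROM [First] f
    INNER JOIN [Second] s ON f.Order_id = s.Order_id
    INNER JOIN [Third]  t ON t.Order_id = s.Order_id
    INNER JOIN [Fourth] u ON u.Order_id = t.Order_id
""").fetchone()

# EXISTS only filters First; it never multiplies its rows.
filtered = conn.execute("""
    SELECT COUNT(f.Order_id), SUM(f.Amount)
    FROM [First] f
    WHERE EXISTS (SELECT 1 FROM [Second] s WHERE s.Order_id = f.Order_id)
      AND EXISTS (SELECT 1 FROM [Third]  t WHERE t.Order_id = f.Order_id)
      AND EXISTS (SELECT 1 FROM [Fourth] u WHERE u.Order_id = f.Order_id)
""").fetchone()

print(inflated)  # order count is right, amount is inflated
print(filtered)  # both values are right
```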

Find MAX with JOIN where Field also shows up in another Table

I have 3 tables: Master, Paper and iCodes. For a certain set of Master.Ref's, I need to find Max(Paper.Date), where the Paper.Code is also in the iCodes table (i.e., Paper.Code is a type of iCode). Master is joined to Paper by the File field.
EDIT:
I only need the Max(Paper.Date) and its corresponding Code; I do not need all of the Codes.
I wrote the following but it is very slow. I have a few hundred ref #'s to look for. What is a better way to do this?
SELECT Master.Ref,
Paper.Code,
mp.MaxDate
FROM ( SELECT p.File ,
MAX(p.Date) AS MaxDate ,
FROM Paper AS p
LEFT JOIN Master AS m ON p.File = m.File
WHERE m.Ref IN ('ref1', 'ref2', 'ref3', 'ref4', 'ref5', 'ref6'... )
AND p.Code IN ( SELECT DISTINCT i.iCode
FROM iCodes AS i
)
GROUP BY p.File
) AS mp
LEFT JOIN Master ON mp.File = Master.File
LEFT JOIN Paper ON Master.File = Paper.File
AND mp.MaxDate = Paper.Date
WHERE Paper.Code IN ( SELECT DISTINCT iCodes.iCode
FROM iCodes
)
Does this do what you want?
SELECT m.Ref, p.Code, max(p.date)
FROM Master m LEFT JOIN
Paper p
ON m.File = p.File
WHERE p.Code IN (SELECT DISTINCT iCodes.iCode FROM iCodes) and
m.Ref IN ('ref1','ref2','ref3','ref4','ref5','ref6'...)
GROUP BY m.Ref, p.Code;
EDIT:
To get the code on the max date, then use window functions:
select ref, code, date
from (SELECT m.Ref, p.Code, p.date,
row_number() over (partition by m.Ref order by p.date desc) as seqnum
FROM Master m LEFT JOIN
Paper p
ON m.File = p.File
WHERE p.Code IN (SELECT DISTINCT iCodes.iCode FROM iCodes) and
m.Ref IN ('ref1','ref2','ref3','ref4','ref5','ref6'...)
) mp
where seqnum = 1;
The function row_number() assigns a sequential number starting at 1 to a group of rows. The groups are defined by the partition by clause, so in this case everything with the same m.Ref value would be in a single group. Within the group, rows are assigned the number based on the order by clause. So, the one with the biggest date gets the value of 1. That is the row you want.
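The numbering described above can be inspected directly with a sketch like this one, which uses SQLite via Python's sqlite3 module in place of SQL Server (an assumption for runnability; SQLite 3.25 or later is needed for window functions). The refs, files, codes and dates are invented sample rows, and code 'Z' is deliberately absent from iCodes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Master (Ref TEXT, [File] TEXT);
    CREATE TABLE Paper  ([File] TEXT, Code TEXT, [Date] TEXT);
    CREATE TABLE iCodes (iCode TEXT);
    INSERT INTO Master VALUES ('ref1', 'F1'), ('ref2', 'F2');
    INSERT INTO Paper  VALUES ('F1', 'A', '2020-01-01'), ('F1', 'B', '2020-03-01'),
                              ('F2', 'A', '2020-02-01'), ('F2', 'Z', '2020-09-01');
    INSERT INTO iCodes VALUES ('A'), ('B');
""")

numbered_sql = """
    SELECT m.Ref AS Ref, p.Code AS Code, p.[Date] AS PaperDate,
           ROW_NUMBER() OVER (PARTITION BY m.Ref ORDER BY p.[Date] DESC) AS seqnum
    FROM Master m
    INNER JOIN Paper p ON m.[File] = p.[File]
    WHERE p.Code IN (SELECT iCode FROM iCodes)
"""

# Every qualifying row with its sequence number inside its Ref group.
numbered = conn.execute(numbered_sql + " ORDER BY Ref, seqnum").fetchall()

# Keeping seqnum = 1 leaves the newest qualifying paper per Ref.
latest = conn.execute(
    "SELECT Ref, Code, PaperDate FROM (" + numbered_sql + ") mp "
    "WHERE seqnum = 1 ORDER BY Ref"
).fetchall()

print(numbered)
print(latest)
```

Note that the 'Z' row for ref2 has the latest date overall but is filtered out before numbering, so it never competes for seqnum 1.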

Distinct on multi-columns in sql

I have this query in sql
select cartlines.id,cartlines.pageId,cartlines.quantity,cartlines.price
from orders
INNER JOIN
cartlines on(cartlines.orderId=orders.id)where userId=5
I want to get rows distinct by pageId, so that in the end no pageId appears more than once (no duplicates).
Any ideas?
Thanks
Baaroz
Going by what you're expecting in the output and your comment that says "...if there rows in output that contain same pageid only one will be shown...," it sounds like you're trying to get the top record for each page ID. This can be achieved with ROW_NUMBER() and PARTITION BY:
SELECT *
FROM (
SELECT
ROW_NUMBER() OVER(PARTITION BY c.pageId ORDER BY c.pageID) rowNumber,
c.id,
c.pageId,
c.quantity,
c.price
FROM orders o
INNER JOIN cartlines c ON c.orderId = o.id
WHERE userId = 5
) a
WHERE a.rowNumber = 1
You can also use ROW_NUMBER() OVER(PARTITION BY ... along with TOP 1 WITH TIES, but it runs a little slower (despite being WAY cleaner):
SELECT TOP 1 WITH TIES c.id, c.pageId, c.quantity, c.price
FROM orders o
INNER JOIN cartlines c ON c.orderId = o.id
WHERE userId = 5
ORDER BY ROW_NUMBER() OVER(PARTITION BY c.pageId ORDER BY c.pageID)
If you wish to remove rows with all columns duplicated this is solved by simply adding a distinct in your query.
select distinct cartlines.id,cartlines.pageId,cartlines.quantity,cartlines.price
from orders
INNER JOIN
cartlines on(cartlines.orderId=orders.id)where userId=5
If however, this makes no difference, it means the other columns have different values, so the combinations of column values creates distinct (unique) rows.
As Michael Berkowski stated in comments:
DISTINCT - does operate over all columns in the SELECT list, so we
need to understand your special case better.
In the case that simply adding distinct does not cover you, you need to also remove the columns that are different from row to row, or use aggregate functions to get aggregate values per cartlines.
Example - total quantity per distinct pageId:
select cartlines.pageId, sum(cartlines.quantity) as totalQuantity
from orders
INNER JOIN
cartlines on(cartlines.orderId=orders.id)where userId=5
group by cartlines.pageId
If this is still not what you wish, you need to give us data and specify better what it is you want.
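As a quick check of the point that DISTINCT looks at the whole select list, the sketch below compares it with a GROUP BY on pageId. It runs on SQLite through Python's sqlite3 module as a stand-in for SQL Server, and the orders and cart lines are hypothetical sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders    (id INTEGER, userId INTEGER);
    CREATE TABLE cartlines (id INTEGER, orderId INTEGER, pageId INTEGER,
                            quantity INTEGER, price INTEGER);
    INSERT INTO orders    VALUES (1, 5), (2, 5);
    -- pageId 100 appears on two cart lines with different ids and quantities.
    INSERT INTO cartlines VALUES (10, 1, 100, 2, 9), (11, 2, 100, 3, 9), (12, 1, 200, 1, 4);
""")

# DISTINCT compares every selected column, so rows that differ only in id or
# quantity are all kept and pageId 100 still shows up twice.
distinct_rows = conn.execute("""
    SELECT DISTINCT c.id, c.pageId, c.quantity, c.price
    FROM orders o
    INNER JOIN cartlines c ON c.orderId = o.id
    WHERE o.userId = 5
    ORDER BY c.id
""").fetchall()

# Grouping by pageId and aggregating the rest gives one row per page.
per_page = conn.execute("""
    SELECT c.pageId, SUM(c.quantity) AS totalQuantity
    FROM orders o
    INNER JOIN cartlines c ON c.orderId = o.id
    WHERE o.userId = 5
    GROUP BY c.pageId
    ORDER BY c.pageId
""").fetchall()

print(distinct_rows)
print(per_page)
```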