I have a simple data model for users and their requests:
User
- id,
- name
Request
- id,
- createdAt,
- completedAt,
- status
- userId (FK to user)
I'm trying to run a query that collects some stats for every user. The issue is that I have to run the same subquery to fetch the user's requests for every value I select. Instead, I want to run it once and then calculate the stats over its result.
select
    u.id as UserId,
    (select count(*)
     from Requests r
     where userId = u.id
       and timestamp > @dateFrom) as Total,
    (select count(*)
     from Requests r
     where userId = u.id
       and timestamp > @dateFrom
       and status = N'Completed') as Completed,
    (select status
     from Requests r
     where userId = u.id
       and timestamp > @dateFrom
       and status != N'Completed') as ActiveStatus,
    (select datediff(second, createdAt, completedAt)
     from Requests r
     where userId = u.id
       and timestamp > @dateFrom
       and status = N'Completed') as AvgProcessingTime
from [User] u
Obviously, this query is very slow and I need to optimize it. I tried join, apply, and rank, but nothing worked out well for me (read: I wasn't able to complete the query for all the required stats).
What is the best approach here from a performance standpoint?
Try this, using a left join and aggregation.
There could be a couple of issues here, but let me know how you go.
select
    u.id as UserId
    ,count(r.UserId) [Total]
    ,sum(iif(r.status = N'Completed', 1, 0)) [Completed]
    ,sum(iif(r.status <> N'Completed', 1, 0)) [ActiveStatus]
    -- NULL (not 0) for non-completed rows, so they do not drag the average down
    ,avg(iif(r.status = N'Completed', datediff(second, r.createdAt, r.completedAt), null)) [AvgProcessingTime]
from [User] u
left join Request r
    on r.userId = u.id
    -- the date filter belongs in the ON clause so users without requests are kept
    and r.timestamp > @dateFrom
group by
    u.id
I'm not sure about this query because I haven't run it on my machine, but you can give it a try and change it as needed:
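SELECT U.ID AS USERID
    ,COUNT(R.ID) AS TOTAL
    ,SUM(CASE
            WHEN R.[STATUS] = N'COMPLETED'
                THEN 1
            ELSE 0
            END) AS [COMPLETED]
    -- MAX returns a single active status per user; if several are active, one is picked
    ,MAX(CASE
            WHEN R.[STATUS] <> N'COMPLETED'
                THEN R.[STATUS]
            END) AS [ACTIVE STATUS]
    ,AVG(CASE
            WHEN R.[STATUS] = N'COMPLETED'
                THEN DATEDIFF(SECOND, R.CREATEDAT, R.COMPLETEDAT)
            END) AS [AVG PROCESSING TIME]
FROM USERS U
LEFT JOIN REQUESTS R ON U.ID = R.USERID
    -- filtering in the ON clause (not WHERE) keeps users that have no requests
    AND R.TIMESTAMP > @DATEFROM
GROUP BY U.ID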
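For reference, here is a minimal, self-contained sketch of the "read the requests once per user" idea from the question, using OUTER APPLY with conditional aggregation. The table variables, the sample rows and the @dateFrom value are made up for illustration. It filters on createdAt because the data model in the question lists no timestamp column; that choice is an assumption, so swap in whichever column you really filter on.

```sql
-- Hypothetical sample data standing in for the User and Request tables.
DECLARE @User TABLE (id int PRIMARY KEY, name nvarchar(50));
DECLARE @Request TABLE (
    id int PRIMARY KEY,
    userId int,
    createdAt datetime2,
    completedAt datetime2 NULL,
    status nvarchar(20)
);
DECLARE @dateFrom datetime2 = '2024-01-01';  -- assumed cutoff

INSERT INTO @User VALUES (1, N'Ann'), (2, N'Bob'), (3, N'Cid');
INSERT INTO @Request VALUES
    (1, 1, '2024-02-01 10:00:00', '2024-02-01 10:00:30', N'Completed'),
    (2, 1, '2024-02-02 10:00:00', '2024-02-02 10:01:30', N'Completed'),
    (3, 1, '2024-02-03 10:00:00', NULL,                  N'Pending'),
    (4, 2, '2023-12-01 10:00:00', '2023-12-01 10:00:10', N'Completed');  -- before the cutoff

SELECT u.id AS UserId, s.Total, s.Completed, s.ActiveStatus, s.AvgProcessingTime
FROM @User u
OUTER APPLY (
    -- One pass over this user's requests; every statistic is derived from it.
    SELECT
        COUNT(*) AS Total,
        COUNT(CASE WHEN r.status = N'Completed' THEN 1 END) AS Completed,
        -- If several requests are active, MAX picks just one of their statuses.
        MAX(CASE WHEN r.status <> N'Completed' THEN r.status END) AS ActiveStatus,
        -- Non-completed rows yield NULL, which AVG ignores.
        AVG(CASE WHEN r.status = N'Completed'
                 THEN DATEDIFF(second, r.createdAt, r.completedAt) END) AS AvgProcessingTime
    FROM @Request r
    WHERE r.userId = u.id
      AND r.createdAt > @dateFrom
) s
ORDER BY u.id;
```

With this sample data, user 1 comes back with Total 3, Completed 2, ActiveStatus Pending and AvgProcessingTime 60, while users 2 and 3 get zero counts and NULL for the other two columns. Whether this beats the LEFT JOIN plus GROUP BY form depends on your indexes, so compare the execution plans.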
select DAC.LocationCode, DAC.Description, ReqApp.Rank, App.Approver as UserName,
CASE WHEN app.Approver = app.AlternateApprover THEN ''
ELSE AltApp.AlternateApprover END As AltApprover,
ISNULL(CONVERT(Varchar,AltApp.FromDate,101),'')AS FromDate,
ISNULL(CONVERT(Varchar,AltApp.ToDate,101),'')AS ToDate
from tblAPAlternateApprovers App
INNER JOIN tblAPAlternateApprovers AltApp
ON App.ID = AltApp.ID
INNER JOIN tblAPReqLocations DAC
ON App.tblAPReqLocationsID = DAC.ID
INNER JOIN tblAPReqApprover ReqApp
ON App.Approver = ReqApp.Approver AND
App.tblAPReqLocationsID = ReqApp.LocationID
ORDER BY DAC.LocationCode ASC, ReqApp.Rank asc
When SQL adds an 'alternate approver' (for purchase orders), it creates an additional record for the actual approver. So I'm trying to find a way to show only one record for those approvers that also have alternates. For example, 'jlhayes' has 2 records: one with an alternate and one without. For these records, I want to see only the ones that have an alternate. Thank you for your help. I've spent a couple of hours on this and am out of ideas.
You can wrap the AltApprover CASE expression in MAX() and group by DAC.LocationCode, DAC.Description, ReqApp.Rank, and App.Approver, then do the same for FromDate and ToDate:
select DAC.LocationCode, DAC.Description, ReqApp.Rank, App.Approver as UserName,
max(CASE WHEN app.Approver = app.AlternateApprover THEN ''
ELSE AltApp.AlternateApprover END) As AltApprover,
max(ISNULL(CONVERT(Varchar,AltApp.FromDate,101),'')) AS FromDate,
max(ISNULL(CONVERT(Varchar,AltApp.ToDate,101),'')) AS ToDate
from tblAPAlternateApprovers App
INNER JOIN tblAPAlternateApprovers AltApp
ON App.ID = AltApp.ID
INNER JOIN tblAPReqLocations DAC
ON App.tblAPReqLocationsID = DAC.ID
INNER JOIN tblAPReqApprover ReqApp
ON App.Approver = ReqApp.Approver AND
App.tblAPReqLocationsID = ReqApp.LocationID
GROUP BY DAC.LocationCode, DAC.Description, ReqApp.Rank, App.Approver
ORDER BY DAC.LocationCode ASC, ReqApp.Rank asc
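To see why MAX collapses the pair, here is a small self-contained sketch. The table variable and its rows are invented and only mimic the shape of the result described in the question. An empty string sorts before any non-empty string, so MAX keeps the values from the row that carries an alternate.

```sql
-- Hypothetical rows: 'jlhayes' appears twice, once with and once without an alternate.
DECLARE @Approvers TABLE (
    LocationCode varchar(10),
    UserName varchar(50),
    AltApprover varchar(50),
    FromDate varchar(10)
);

INSERT INTO @Approvers VALUES
    ('100', 'jlhayes', '',       ''),
    ('100', 'jlhayes', 'msmith', '01/15/2024'),
    ('100', 'bjones',  '',       '');

SELECT LocationCode,
       UserName,
       MAX(AltApprover) AS AltApprover,  -- '' loses to any non-empty value
       MAX(FromDate) AS FromDate
FROM @Approvers
GROUP BY LocationCode, UserName
ORDER BY LocationCode, UserName;
```

This returns one row each for bjones and jlhayes, and the jlhayes row shows msmith. One caveat: if an approver can have two different alternates, each MAX is evaluated per column, so the result could mix values from different rows.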
I have a SQL query which sometimes takes 15 seconds and sometimes 1 minute.
Please tell me how I can make my query faster.
SELECT TOP 100
u.firstName,
u.id as userID,
ueh.targetID,
ueh.opened,
ueh.emailID,
u.phone,
u.gender
FROM dbo.Students u
INNER JOIN dbo.tblEmailHistory ueh
ON ueh.studentID = u.ID
WHERE (CONVERT(date,DATEADD(day,6,ueh.sDate))=CONVERT(date,getdate()))
AND IsNull(u.firstName, '') != ''
AND IsNull(u.email, '') != ''
AND IsNull(u.phone, '') != ''
AND ueh.status = 'sent'
AND ueh.reject_reason = 'null'
AND ueh.targetID = 28
AND ueh.opened = 0
AND u.deleted = 0
AND NOT EXISTS (SELECT ush.smsSendFullDate, ush.studentID FROM dbo.UsersSmsHistory ush WHERE u.id = ush.studentID AND DATEDIFF(DAY,ush.smsSendFullDate,GETDATE()) = 0)
This is your query greatly simplified:
SELECT TOP 100 u.firstName, u.id as userID,
ueh.targetID, ueh.opened, ueh.emailID,
u.phone, u.gender
FROM dbo.Students u INNER JOIN
dbo.tblEmailHistory ueh
ON ueh.studentID = u.ID
WHERE ueh.sDate >= cast(getdate() - 6 as date) AND -- sDate + 6 days falls on today
ueh.sDate < cast(getdate() - 5 as date) AND
u.firstName <> '' AND
u.email <> '' AND
u.phone <> '' AND
ueh.status = 'sent' AND
ueh.reject_reason = 'null' AND -- sure you mean a string here?
ueh.targetID = 28 AND
ueh.opened = 0 AND
u.deleted = 0 AND
NOT EXISTS (SELECT ush.smsSendFullDate, ush.studentID
FROM dbo.UsersSmsHistory ush
WHERE u.id = ush.studentID AND
convert(date, ush.smsSendFullDate) = convert(date, GETDATE())
);
Note: Almost all comparisons to NULL are never true, so the ISNULL()/COALESCE() wrappers are unnecessary.
Then start adding indexes. I would recommend:
tblEmailHistory(targetID, status, opened, reject_reason, sDate)
UsersSmsHistory(studentID, smsSendFullDate)
I am guessing most students have names and phone numbers, so indexes on those columns would not help.
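As a rough sketch, the two indexes could be created like this. The index names are made up, and the column names are taken from the query in the question rather than from the real schema, so check them (and their data types) against your tables first. The INCLUDE list is my own assumption about what the query needs.

```sql
-- Index names are hypothetical; columns come from the query in the question.
-- Equality-filtered columns go first, the range-filtered date column last.
CREATE NONCLUSTERED INDEX IX_tblEmailHistory_target_status_sDate
    ON dbo.tblEmailHistory (targetID, status, opened, reject_reason, sDate)
    INCLUDE (studentID, emailID);  -- assumed: lets the join and select list be read from the index

CREATE NONCLUSTERED INDEX IX_UsersSmsHistory_student_date
    ON dbo.UsersSmsHistory (studentID, smsSendFullDate);
```

After creating them, compare the execution plan before and after to confirm the optimizer actually uses them.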
Your query looks okay, with no redundant parts. It takes a long time because it reads from three tables that may hold a lot of data. So instead of rewriting the query, try to improve the performance of the tables by adding indexes on columns such as dbo.tblEmailHistory.studentID, dbo.Students.ID, etc.
First, I will explain what is being captured. Users have a member level associated with their accounts (Bronze, Gold, Diamond, etc.). A nightly job needs to run to total each user's orders from today back one year. If the order total for a given user goes over or under a certain amount, their level is upgraded or downgraded. The table where the level information is stored will not change much, but the minimum and maximum amount thresholds may change over time. This is what the table looks like:
CREATE TABLE [dbo].[MemberAdvantageLevels] (
[Id] int NOT NULL IDENTITY(1,1) ,
[Name] varchar(255) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[MinAmount] int NOT NULL ,
[MaxAmount] int NOT NULL ,
CONSTRAINT [PK__MemberAd__3214EC070D9DF1C7] PRIMARY KEY ([Id])
)
ON [PRIMARY]
GO
I wrote a query that will group the orders by user for the year to date. The query includes their current member level.
SELECT
Sum(dbo.tbh_Orders.SubTotal) AS OrderTotals,
Count(dbo.UserProfile.UserId) AS UserOrders,
dbo.UserProfile.UserId,
dbo.UserProfile.UserName,
dbo.UserProfile.Email,
dbo.MemberAdvantageLevels.Name,
dbo.MemberAdvantageLevels.MinAmount,
dbo.MemberAdvantageLevels.MaxAmount,
dbo.UserMemberAdvantageLevels.LevelAchievmentDate,
dbo.UserMemberAdvantageLevels.LevelAchiementAmount,
dbo.UserMemberAdvantageLevels.IsCurrent as IsCurrentLevel,
dbo.MemberAdvantageLevels.Id as MemberLevelId
FROM
dbo.tbh_Orders
INNER JOIN dbo.tbh_OrderStatuses ON dbo.tbh_Orders.StatusID = dbo.tbh_OrderStatuses.OrderStatusID
INNER JOIN dbo.UserProfile ON dbo.tbh_Orders.CustomerID = dbo.UserProfile.UserId
INNER JOIN dbo.UserMemberAdvantageLevels ON dbo.UserProfile.UserId = dbo.UserMemberAdvantageLevels.UserId
INNER JOIN dbo.MemberAdvantageLevels ON dbo.UserMemberAdvantageLevels.MemberAdvantageLevelId = dbo.MemberAdvantageLevels.Id
WHERE
dbo.tbh_OrderStatuses.OrderStatusID = 4 AND
(dbo.tbh_Orders.AddedDate BETWEEN dateadd(year,-1,getdate()) AND GETDATE()) and IsCurrent = 1
GROUP BY
dbo.UserProfile.UserId,
dbo.UserProfile.UserName,
dbo.UserProfile.Email,
dbo.MemberAdvantageLevels.Name,
dbo.MemberAdvantageLevels.MinAmount,
dbo.MemberAdvantageLevels.MaxAmount,
dbo.UserMemberAdvantageLevels.LevelAchievmentDate,
dbo.UserMemberAdvantageLevels.LevelAchiementAmount,
dbo.UserMemberAdvantageLevels.IsCurrent,
dbo.MemberAdvantageLevels.Id
So, I need to check the OrderTotals and, if it exceeds the current level threshold, find the level that fits the user's current order total and create a new record with their new level.
For example, let's say jon@doe.com is currently at Bronze. The MinAmount for Bronze is 0 and the MaxAmount is 999. His orders for the year currently total $2500. I need to find the level that $2500 fits within and upgrade his account. I also need to check the LevelAchievmentDate and, if it is outside of the current year, we may need to demote the user if there has been no activity.
I was thinking I could create a temp table that holds all the levels and then somehow use a CASE expression in the query above to determine the new level. I don't know if that is possible. Or is it better to iterate over my order results and perform additional queries? If I use the iteration pattern, I know I can use a WHILE statement to iterate over the rows.
Update
I updated my query a bit and so far came up with this, but I may need more information than just the Id from the subquery:
Select * into #memLevels from MemberAdvantageLevels
SELECT
Sum(dbo.tbh_Orders.SubTotal) AS OrderTotals,
Count(dbo.AZProfile.UserId) AS UserOrders,
dbo.AZProfile.UserId,
dbo.AZProfile.UserName,
dbo.AZProfile.Email,
dbo.MemberAdvantageLevels.Name,
dbo.MemberAdvantageLevels.MinAmount,
dbo.MemberAdvantageLevels.MaxAmount,
dbo.UserMemberAdvantageLevels.LevelAchievmentDate,
dbo.UserMemberAdvantageLevels.LevelAchiementAmount,
dbo.UserMemberAdvantageLevels.IsCurrent as IsCurrentLevel,
dbo.MemberAdvantageLevels.Id as MemberLevelId,
(Select Id from #memLevels where Sum(dbo.tbh_Orders.SubTotal) >= #memLevels.MinAmount and Sum(dbo.tbh_Orders.SubTotal) <= #memLevels.MaxAmount) as NewLevelId
FROM
dbo.tbh_Orders
INNER JOIN dbo.tbh_OrderStatuses ON dbo.tbh_Orders.StatusID = dbo.tbh_OrderStatuses.OrderStatusID
INNER JOIN dbo.AZProfile ON dbo.tbh_Orders.CustomerID = dbo.AZProfile.UserId
INNER JOIN dbo.UserMemberAdvantageLevels ON dbo.AZProfile.UserId = dbo.UserMemberAdvantageLevels.UserId
INNER JOIN dbo.MemberAdvantageLevels ON dbo.UserMemberAdvantageLevels.MemberAdvantageLevelId = dbo.MemberAdvantageLevels.Id
WHERE
dbo.tbh_OrderStatuses.OrderStatusID = 4 AND
(dbo.tbh_Orders.AddedDate BETWEEN dateadd(year,-1,getdate()) AND GETDATE()) and IsCurrent = 1
GROUP BY
dbo.AZProfile.UserId,
dbo.AZProfile.UserName,
dbo.AzProfile.Email,
dbo.MemberAdvantageLevels.Name,
dbo.MemberAdvantageLevels.MinAmount,
dbo.MemberAdvantageLevels.MaxAmount,
dbo.UserMemberAdvantageLevels.LevelAchievmentDate,
dbo.UserMemberAdvantageLevels.LevelAchiementAmount,
dbo.UserMemberAdvantageLevels.IsCurrent,
dbo.MemberAdvantageLevels.Id
This hasn't been syntax checked or tested, but it should handle the inserts and updates you describe. The insert can be done as a single statement using a derived (virtual) table that contains the orders GROUP BY calculation. Note that both the insert and the update statement should be done within the same transaction to ensure that no two records for the same user can end up with IsCurrent = 1.
INSERT UserMemberAdvantageLevels (UserId, MemberAdvantageLevelId, IsCurrent,
LevelAchiementAmount, LevelAchievmentDate)
SELECT t.UserId, mal.Id, 1, t.OrderTotals, GETDATE()
FROM
(SELECT ulp.UserId, SUM(ord.SubTotal) OrderTotals, COUNT(ulp.UserId) UserOrders
FROM UserLevelProfile ulp
INNER JOIN tbh_Orders ord ON (ord.CustomerId = ulp.UserId)
WHERE ord.StatusID = 4
AND ord.AddedDate BETWEEN DATEADD(year,-1,GETDATE()) AND GETDATE()
GROUP BY ulp.UserId) AS t
INNER JOIN MemberAdvantageLevels mal
ON (t.OrderTotals BETWEEN mal.MinAmount AND mal.MaxAmount)
-- Left join needed on next line in case user doesn't currently have a level;
-- only the current level row is compared, not the user's level history
LEFT JOIN UserMemberAdvantageLevels umal ON (umal.UserId = t.UserId AND umal.IsCurrent = 1)
WHERE umal.MemberAdvantageLevelId IS NULL -- First time user has been awarded a level
OR (mal.Id <> umal.MemberAdvantageLevelId -- Level has changed
AND (t.OrderTotals > umal.LevelAchiementAmount -- Achievement has increased (promotion)
OR t.UserOrders = 0)) -- No. of orders placed is zero (demotion)
/* Reset IsCurrent flag where new record has been added */
UPDATE umal1
SET IsCurrent = 0
FROM UserMemberAdvantageLevels umal1
INNER JOIN UserMemberAdvantageLevels umal2 ON (umal2.UserId = umal1.UserId)
WHERE umal1.IsCurrent = 1
AND umal2.IsCurrent = 1
AND umal1.LevelAchievmentDate < umal2.LevelAchievmentDate
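To illustrate the point about running both statements in one transaction, here is a minimal self-contained sketch. The temp table #UserLevels is a hypothetical, cut-down stand-in for UserMemberAdvantageLevels, and the inserted values are fixed instead of being computed from orders.

```sql
-- Hypothetical stand-in for UserMemberAdvantageLevels, reduced to the relevant columns.
CREATE TABLE #UserLevels (
    UserId int,
    MemberAdvantageLevelId int,
    IsCurrent bit,
    LevelAchievmentDate datetime
);
INSERT INTO #UserLevels VALUES (1, 1, 1, '20230101');  -- user 1 is currently at level 1

SET XACT_ABORT ON;  -- a run-time error rolls the whole transaction back

BEGIN TRANSACTION;

    -- Step 1: add the new level row (a fixed value here; the real insert computes it).
    INSERT INTO #UserLevels VALUES (1, 2, 1, GETDATE());

    -- Step 2: clear the flag on the older row of any user who now has a newer current row.
    UPDATE older
    SET IsCurrent = 0
    FROM #UserLevels older
    INNER JOIN #UserLevels newer
        ON newer.UserId = older.UserId
    WHERE older.IsCurrent = 1
      AND newer.IsCurrent = 1
      AND older.LevelAchievmentDate < newer.LevelAchievmentDate;

COMMIT TRANSACTION;

SELECT UserId, MemberAdvantageLevelId, IsCurrent
FROM #UserLevels
ORDER BY LevelAchievmentDate;

DROP TABLE #UserLevels;
```

After the commit, the level 1 row has IsCurrent = 0 and the level 2 row has IsCurrent = 1. Because both steps commit together, no other session sees two current rows for the same user.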
One approach:
with cte as
(SELECT Sum(o.SubTotal) AS OrderTotals,
Count(p.UserId) AS UserOrders,
p.UserId,
p.UserName,
p.Email,
l.Name,
l.MinAmount,
l.MaxAmount,
ul.LevelAchievmentDate,
ul.LevelAchiementAmount,
ul.IsCurrent as IsCurrentLevel,
l.Id as MemberLevelId
FROM dbo.tbh_Orders o
INNER JOIN dbo.UserProfile p ON o.CustomerID = p.UserId
INNER JOIN dbo.UserMemberAdvantageLevels ul ON p.UserId = ul.UserId
INNER JOIN dbo.MemberAdvantageLevels l ON ul.MemberAdvantageLevelId = l.Id
WHERE o.StatusID = 4 AND
o.AddedDate BETWEEN dateadd(year,-1,getdate()) AND GETDATE() and
IsCurrent = 1
GROUP BY
p.UserId, p.UserName, p.Email, l.Name, l.MinAmount, l.MaxAmount,
ul.LevelAchievmentDate, ul.LevelAchiementAmount, ul.IsCurrent, l.Id)
select cte.*, ml.*
from cte
join #memLevels ml
on cte.OrderTotals >= ml.MinAmount and cte.OrderTotals <= ml.MaxAmount
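The range join at the end is the part that finds the new level. A self-contained sketch of just that step, with invented thresholds and totals in table variables, might look like the following. Comparing MinAmount to decide between upgrade and downgrade is my assumption about how the levels are ranked.

```sql
-- Hypothetical level thresholds and per-user yearly totals.
DECLARE @Levels TABLE (Id int PRIMARY KEY, Name varchar(255), MinAmount int, MaxAmount int);
DECLARE @Totals TABLE (UserId int PRIMARY KEY, CurrentLevelId int, OrderTotals money);

INSERT INTO @Levels VALUES
    (1, 'Bronze',  0,    999),
    (2, 'Gold',    1000, 4999),
    (3, 'Diamond', 5000, 2147483647);
INSERT INTO @Totals VALUES
    (10, 1, 2500),  -- Bronze user who has spent 2500
    (11, 2, 1200),  -- Gold user who stays Gold
    (12, 3, 300);   -- Diamond user whose spending dropped

SELECT t.UserId,
       t.OrderTotals,
       cl.Name AS CurrentLevel,
       nl.Name AS NewLevel,
       CASE
           WHEN nl.MinAmount > cl.MinAmount THEN 'upgrade'
           WHEN nl.MinAmount < cl.MinAmount THEN 'downgrade'
           ELSE 'no change'
       END AS LevelChange
FROM @Totals t
INNER JOIN @Levels cl ON cl.Id = t.CurrentLevelId
-- Range join: exactly one level matches as long as the ranges do not overlap.
INNER JOIN @Levels nl ON t.OrderTotals BETWEEN nl.MinAmount AND nl.MaxAmount
ORDER BY t.UserId;
```

Here user 10 is flagged as an upgrade to Gold, user 11 as no change, and user 12 as a downgrade to Bronze. Note that with integer thresholds a total such as 999.50 falls between two ranges and matches no level, so you may prefer a half-open comparison (>= MinAmount and < the next level's MinAmount).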
I also need help speeding up this query. It takes 25 min on the LIVE db and 1 second on the TEST db.
This is after some modifications. Originally I was querying the db server several hundred times to get the data from a PHP page, using a while loop for each row of the main query. Then I tried making a procedure using a temp table to return the data, and I canceled the execution after 45 min. So then I tried this.
I imagine one could do an inner join on the "select top 1 from NurQueryResults" queries, but I couldn't figure it out. I got results, but not with the most recent at the top. I did an "order by t1.time, t2.time, t3.time...."; it did have the t1 most recent result at the top, but it returned more results than it should have.
SELECT o.VisitID AS VisitID,
(SELECT TOP 1 Response
FROM NurQueryResults
WHERE QueryID = 'OEDTCAT'
AND VisitID = o.VisitID
ORDER BY DateTime DESC) AS PPN,
(SELECT TOP 1 Response
FROM NurQueryResults
WHERE QueryID = 'OEDTMEAT'
AND VisitID = o.VisitID
ORDER BY DateTime DESC) AS MEAT,
(SELECT TOP 1 Response
FROM OeOrderQueries
WHERE QueryID = 'OESPMOD'
AND VisitID = o.VisitID
ORDER BY RowUpdateDateTime DESC) AS SPMOD,
(SELECT TOP 1 Response
FROM OeOrderQueries
WHERE QueryID = 'OERT3'
AND VisitID = o.VisitID
ORDER BY RowUpdateDateTime DESC) AS SPMOD2,
(SELECT TOP 1 Response
FROM NurQueryResults
WHERE QueryID = 'OEDTDECUB'
AND VisitID = o.VisitID
ORDER BY DateTime DESC) AS DECUB,
(SELECT TOP 1 Response
FROM NurQueryResults
WHERE QueryID = 'OEALL2'
AND VisitID = o.VisitID
ORDER BY DateTime DESC) AS FOODALL,
o.OrderedProcedureName,
o.OrderDateTime,
a.RoomID,
a.BedID,
a.Name,
a.Sex,
DATEDIFF(year, a.ComputedBirthDateTime, GETDATE()) AS Age
FROM OeOrders o
INNER JOIN AdmVisits a
ON o.VisitID = a.VisitID
AND o.Category = 'DIET'
AND o.StatusChoice = 'S'
AND a.Status = 'ADM IN'
ORDER BY o.VisitID,
o.OrderDateTime DESC
I'm just wondering, how would the results be when you do something like this:
SELECT
o.VisitID AS VisitID,
PPN.Response
FROM OeOrders o
JOIN AdmVisits a
ON o.VisitID = a.VisitID
AND o.Category = 'DIET'
AND o.StatusChoice = 'S'
AND a.Status = 'ADM IN'
left join -- keeps orders that have no matching response, like the original subquery
(
SELECT
Response,
VisitID,
ROW_NUMBER() OVER(PARTITION BY VisitID ORDER BY DateTime DESC) AS MyRowNumber
FROM NurQueryResults
WHERE QueryID = 'OEDTCAT'
) AS PPN
on o.VisitID = PPN.VisitID
and PPN.MyRowNumber = 1
This is not your entire query, but just the first sub-query which gets column PPN.
If you change all the sub-queries into JOINs like this, do you see any performance gain?
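Going one step further, the NurQueryResults lookups could share a single ranked pass instead of one join per column. The sketch below is self-contained and uses an invented table variable with made-up rows; only the column names and QueryID values come from the question.

```sql
-- Hypothetical sample of NurQueryResults: several responses per visit and query.
DECLARE @NurQueryResults TABLE (
    VisitID varchar(20),
    QueryID varchar(20),
    Response varchar(50),
    [DateTime] datetime2
);

INSERT INTO @NurQueryResults VALUES
    ('V1', 'OEDTCAT',  'Regular',  '2024-03-01 08:00:00'),
    ('V1', 'OEDTCAT',  'Diabetic', '2024-03-02 08:00:00'),  -- latest OEDTCAT for V1
    ('V1', 'OEDTMEAT', 'No pork',  '2024-03-01 09:00:00'),
    ('V2', 'OEDTCAT',  'Soft',     '2024-03-01 10:00:00');

WITH ranked AS (
    SELECT VisitID, QueryID, Response,
           -- Restart the numbering for every visit/query pair, newest row first.
           ROW_NUMBER() OVER (PARTITION BY VisitID, QueryID
                              ORDER BY [DateTime] DESC) AS rn
    FROM @NurQueryResults
    WHERE QueryID IN ('OEDTCAT', 'OEDTMEAT')
)
SELECT VisitID,
       MAX(CASE WHEN QueryID = 'OEDTCAT'  THEN Response END) AS PPN,
       MAX(CASE WHEN QueryID = 'OEDTMEAT' THEN Response END) AS MEAT
FROM ranked
WHERE rn = 1
GROUP BY VisitID
ORDER BY VisitID;
```

For this sample it returns V1 with Diabetic and No pork, and V2 with Soft and NULL. The grouped result could then be LEFT JOINed to OeOrders once on VisitID. The OeOrderQueries columns would need a second pass of the same shape, since they live in a different table.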
Turn on the execution plan and check the results. It may show that you are missing an index or something of that nature.
Speaking of indexes, check the fragmentation of the indexes involved and, if it is high, consider rebuilding them. I have recently had quite a lot of success getting procedures that took 2 minutes to run on production to execute in 1-3 seconds by following this process.
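One way to check fragmentation is the sys.dm_db_index_physical_stats function. This is only a sketch: the page_count cutoff is an arbitrary assumption, and the names in the commented-out rebuild are placeholders.

```sql
-- Lists the indexes in the current database by fragmentation, worst first.
SELECT OBJECT_NAME(ips.object_id) AS TableName,
       i.name AS IndexName,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
INNER JOIN sys.indexes AS i
    ON i.object_id = ips.object_id
   AND i.index_id = ips.index_id
WHERE ips.index_id > 0      -- skip heaps
  AND ips.page_count > 100  -- assumed cutoff: very small indexes rarely matter
ORDER BY ips.avg_fragmentation_in_percent DESC;

-- Rebuild one index once you have picked a candidate (names are placeholders):
-- ALTER INDEX IX_YourIndex ON dbo.YourTable REBUILD;
```

Running this on both the LIVE and TEST databases and comparing the numbers for NurQueryResults and OeOrderQueries may help explain the difference in run time.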
I'm trying to build the following SQL query as a Zend_Db_Select object:
$sql = "
SELECT
u.id,
u.email,
s.nic as nic,
(SELECT COUNT(*) FROM event WHERE user_id = u.id AND event='login') as logins,
(SELECT COUNT(*) FROM event WHERE user_id = u.id AND event='export') as exports,
(SELECT MAX(time) FROM event WHERE user_id = u.id AND event='login') as lastlogin,
(DATEDIFF(u.expire_date, NOW())) as daysleft
FROM
user u,
seller s
WHERE
u.seller_id = s.id";
with no luck. I can't get the subqueries working in the Zend_Db_Select object.
Is it possible to achieve the same result with joins instead of subqueries?
Any hints on how to get this working would be highly appreciated.
Try something like this:
// Assumes $db is your Zend_Db adapter instance.
$select = $db->select();
$select->from(array('u'=>'user'),array('id','email'));
$select->join(array('s'=>'seller'),'s.id = u.seller_id', array('nic'));
$select->columns(array(
'logins' =>"(SELECT COUNT(*) FROM event WHERE user_id = u.id AND event='login')",
'exports' =>"(SELECT COUNT(*) FROM event WHERE user_id = u.id AND event='export')",
'lastLogin' =>"(SELECT MAX(time) FROM event WHERE user_id = u.id AND event='login')",
'daysLeft' =>"(DATEDIFF(u.expire_date, NOW()))",));
$stmt = $select->query();
var_dump($stmt->fetchAll());
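As for the second part of the question: yes, the same result can be produced with a join and conditional aggregation instead of correlated subqueries. Here is a self-contained MySQL sketch. The temporary tables and sample rows are made up, and only the table and column names follow the question.

```sql
-- MySQL sketch with made-up sample data; table and column names follow the question.
CREATE TEMPORARY TABLE seller (id INT PRIMARY KEY, nic VARCHAR(50));
CREATE TEMPORARY TABLE user (id INT PRIMARY KEY, email VARCHAR(100), seller_id INT, expire_date DATE);
CREATE TEMPORARY TABLE event (user_id INT, event VARCHAR(20), time DATETIME);

INSERT INTO seller VALUES (1, 'acme');
INSERT INTO user VALUES
    (1, 'a@example.com', 1, '2030-01-01'),
    (2, 'b@example.com', 1, '2030-01-01');
INSERT INTO event VALUES
    (1, 'login',  '2024-01-01 10:00:00'),
    (1, 'login',  '2024-01-05 10:00:00'),
    (1, 'export', '2024-01-05 11:00:00');

SELECT u.id,
       u.email,
       s.nic,
       COUNT(CASE WHEN e.event = 'login'  THEN 1 END) AS logins,
       COUNT(CASE WHEN e.event = 'export' THEN 1 END) AS exports,
       MAX(CASE WHEN e.event = 'login' THEN e.time END) AS lastlogin,
       DATEDIFF(u.expire_date, NOW()) AS daysleft
FROM user u
INNER JOIN seller s ON s.id = u.seller_id
LEFT JOIN event e ON e.user_id = u.id  -- LEFT JOIN keeps users that have no events
GROUP BY u.id, u.email, s.nic, u.expire_date;
```

User 1 gets 2 logins, 1 export and a lastlogin of 2024-01-05 10:00:00, while user 2 gets zero counts and a NULL lastlogin. In Zend_Db_Select this shape should map onto joinLeft() for the event table and group() for the grouping columns, with the CASE expressions passed as column expressions.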