Nested SELECT Statement - sql

SQL is not my forte, but I'm working on it - thank you for the replies.
I am working on a report that will return the completion percentage of services for individuals in our contracts. There is a master table, "Contracts"; each Contract can have multiple services from the "services" table, and each service has multiple standards from the "standards" table, which records the percent complete for each standard.
I've gotten as far as calculating the total percent complete for each individual service for a specific Contract_ServiceID, but how do I return all the services percentages for all the contracts? Something like this:
Contract       Service      Percent complete
abc Company    service 1    98%
abc Company    service 2    100%
xyz Company    service 1    50%
Here's what I have so far:
SELECT
Contract_ServiceId,
(SUM(CompletionPercentage)/COUNT(CompletionPercentage)) * 100 as "Percent Complete"
FROM dbo.Standard sta WITH (NOLOCK)
INNER JOIN dbo.Contract_Service conSer ON sta.ServiceId = conSer.ServiceId
LEFT OUTER JOIN dbo.StandardResponse standResp ON sta.StandardId = standResp.StandardId
AND conSer.StandardReportId = standResp.StandardReportId
WHERE Contract_ServiceId = '[an id]'
GROUP BY Contract_ServiceID
This gets me to:
Contract_serviceid    Percent Complete
[an id]               100%
EDIT: Tables didn't show up in post.

I'm not sure I understand the problem, but if the result is correct for a single Contract_Service, you can join in the Contract table and group by contract and service:
SELECT con.ContractId,
con.Contract,
conSer.Contract_ServiceID,
conSer.Service,
(SUM(CompletionPercentage)/COUNT(CompletionPercentage)) * 100 as "Percent Complete"
FROM dbo.Standard sta WITH (NOLOCK)
INNER JOIN dbo.Contract_Service conSer ON sta.ServiceId = conSer.ServiceId
INNER JOIN dbo.Contract con ON con.ContractId = conSer.ContractId
LEFT OUTER JOIN dbo.StandardResponse standResp ON sta.StandardId = standResp.StandardId
AND conSer.StandardReportId = standResp.StandardReportId
GROUP BY con.ContractId, con.Contract, conSer.Contract_ServiceID, conSer.Service
Make sure every column you select from the Contract table also appears in the GROUP BY clause.

You should be able to add the company name to your SELECT, group by it and the service ID, and drop the WHERE clause.
Perhaps like this:
SELECT
Contract,
Contract_ServiceId,
(SUM(CompletionPercentage)/COUNT(CompletionPercentage)) * 100 as "Percent Complete"
FROM dbo.Standard sta WITH (NOLOCK)
INNER JOIN dbo.Contract_Service conSer ON sta.ServiceId = conSer.ServiceId
LEFT OUTER JOIN dbo.StandardResponse standResp ON sta.StandardId = standResp.StandardId
AND conSer.StandardReportId = standResp.StandardReportId
GROUP BY Contract, Contract_ServiceID

Assuming your query works for just the one service, it looks like you're most of the way there. Leave off the WHERE clause to obtain all results; your GROUP BY will take care of returning one row per service.
Just join on the Contract table to show the contract related to each service, and you're done.
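To make that concrete, here is a minimal sketch that runs against an in-memory SQLite database from Python. The three tables are hypothetical, cut-down stand-ins for the real schema (the actual Standard and StandardResponse tables are collapsed into one), and CompletionPercentage is assumed to be stored as a fraction between 0 and 1. The point is only the shape of the query: no WHERE clause, a join to Contract, and a GROUP BY on contract and service.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables; the real schema has more columns.
    CREATE TABLE Contract (ContractId INTEGER PRIMARY KEY, Contract TEXT);
    CREATE TABLE Contract_Service (
        Contract_ServiceId INTEGER PRIMARY KEY,
        ContractId INTEGER,
        Service TEXT
    );
    CREATE TABLE StandardResponse (
        Contract_ServiceId INTEGER,
        CompletionPercentage REAL  -- assumed to be a fraction between 0 and 1
    );
    INSERT INTO Contract VALUES (1, 'abc Company'), (2, 'xyz Company');
    INSERT INTO Contract_Service VALUES
        (10, 1, 'service 1'), (11, 1, 'service 2'), (20, 2, 'service 1');
    INSERT INTO StandardResponse VALUES
        (10, 1.0), (10, 0.96), (11, 1.0), (11, 1.0), (20, 0.25), (20, 0.75);
""")

# No WHERE clause: the GROUP BY yields one row per contract/service pair.
rows = conn.execute("""
    SELECT con.Contract,
           cs.Service,
           (SUM(sr.CompletionPercentage) / COUNT(sr.CompletionPercentage)) * 100
               AS PercentComplete
    FROM Contract AS con
    JOIN Contract_Service AS cs ON cs.ContractId = con.ContractId
    JOIN StandardResponse AS sr ON sr.Contract_ServiceId = cs.Contract_ServiceId
    GROUP BY con.ContractId, con.Contract, cs.Contract_ServiceId, cs.Service
    ORDER BY con.Contract, cs.Service
""").fetchall()

# Round for display so floating-point noise does not show up in the report.
results = [(contract, service, round(pct)) for contract, service, pct in rows]
for line in results:
    print(line)
```

SQLite is used here only because it ships with Python; the same SELECT should carry over to SQL Server with the real table and column names substituted.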

In addition to removing the where clause and adding more group conditions, you also will want to watch out for null records in each of your tables. This requires changing an INNER JOIN to a LEFT JOIN (unless you don't want to see those rows) and some ISNULL's to clean up data. I'm not sure where the StandardReportId concept falls in here, but it looks like a filtering mechanism that I won't toy with.
SELECT
con.ContractID,
ISNULL(conSer.Contract_ServiceId, '-1') AS Contract_ServiceId, -- or some other stand-in value
ISNULL((SUM(CompletionPercentage)/COUNT(CompletionPercentage)) * 100, 0) as "Percent Complete"
FROM
Contract AS con
LEFT OUTER JOIN dbo.Contract_Service conSer ON con.ContractID = conSer.ContractID
LEFT OUTER JOIN dbo.Standard sta WITH (NOLOCK) ON conSer.ServiceId = sta.ServiceId
LEFT OUTER JOIN dbo.StandardResponse standResp ON sta.StandardId = standResp.StandardId
AND conSer.StandardReportId = standResp.StandardReportId
GROUP BY
con.ContractID, conSer.Contract_ServiceID

Because you are grouping by the contract service ID, I think you can just remove the WHERE clause and it should calculate the percentage for all contract service IDs.
If there are no records in dbo.Standard for that contract serviceid, you may need to left outer join instead from the contract service table to the dbo.Standard table in order to show contracts without completion records.
I hope that makes sense... My SQL is getting rusty after migrating to a data framework.

(SUM(CompletionPercentage)/COUNT(CompletionPercentage)) * 100
If CompletionPercentage is an int field you will have trouble with integer math. Any time you divide one integer by another, you need to multiply one of them by 1.0 (or cast it) so that the engine treats the value as a decimal. Otherwise 49/100 would = 0.
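A quick way to see the effect is to run the arithmetic directly. This sketch uses SQLite through Python's standard library, which truncates integer division the same way; the exact decimal type you get back in SQL Server will differ, but the truncation problem and the fix are the same idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Both operands are integers, so the division truncates.
int_result = conn.execute("SELECT 49 / 100").fetchone()[0]

# Multiplying by 1.0 first promotes the expression to a non-integer type.
dec_result = conn.execute("SELECT 49 * 1.0 / 100").fetchone()[0]

# An explicit cast has the same effect and states the intent more clearly.
cast_result = conn.execute("SELECT CAST(49 AS REAL) / 100").fetchone()[0]

print(int_result, dec_result, cast_result)
```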

Related

If transaction within date range, then return customer name (and not all the transactions!)

This code is taking a significant amount of time to run. It's returning every single transaction within the date range but I just need to know if the customer has had at least one transaction, then include the CustomerID, CustomerName, Type, Sign, ReportingName.
I think I need to GROUP BY 'CustomerID' but again only if there was a transaction within the date range. And of course, I'm sure there is an optimal way to execute the below TSQL because it's quite slow at present.
Thanks in advance for any help!
SELECT [ABC].[dbo].[vwPrimary].[RelatedNameId] AS CustomerID
,[ABC].[dbo].[vwPrimary].[RelatedName] AS CustomerName
,[AFGPurchase].[IvL].[TaxTreatment].[ParticluarType] AS Type
,[AFGPurchase].[IvL].[Product].[Sign] AS [Sign]
,[AFGPurchase].[IvL].[Product].[ReportingName] AS ReportingName
,[AFGPurchase].[IvL].[Transaction].[EffectiveDate] AS 'Date'
FROM (((([AFGPurchase].[IvL].[Account]
INNER JOIN [AFGPurchase].[IvL].[Position] ON [AFGPurchase].[IvL].[Account].[AccountId] = [AFGPurchase].[IvL].[Position].[AccountId])
INNER JOIN [AFGPurchase].[IvL].[Product] ON [AFGPurchase].[IvL].[Position].[ProductID] = [AFGPurchase].[IvL].[Product].[ProductId])
INNER JOIN [ABC].[dbo].[vwPrimary] ON [AFGPurchase].[IvL].[Account].[ReportingEntityId] = [ABC].[dbo].[vwPrimary].[RelatedNameId])
INNER JOIN [AFGPurchase].[IvL].[TaxTreatment] ON [AFGPurchase].[IvL].[Account].[TaxTreatmentId] = [AFGPurchase].[IvL].[TaxTreatment].[TaxTreatmentId])
INNER JOIN [AFGPurchase].[IvL].[Transaction] ON [AFGPurchase].[IvL].[Position].[PositionId] = [AFGPurchase].[IvL].[Transaction].[PositionId]
WHERE ((([AFGPurchase].[IvL].[TaxTreatment].[RegistrationType]) LIKE 'NON%')
AND (([AFGPurchase].[IvL].[Product].[Sign])='XYZ2')
AND (([AFGPurchase].[IvL].[Position].[Quantity])<>0)
AND (([AFGPurchase].[IvL].[Transaction].[EffectiveDate]) between '2021-12-31' and '2022-12-31'))
Check your indexes for fragmentation to speed up your query, and make sure the columns you join and filter on are actually indexed.
If you just need one result, use TOP 1:
SELECT TOP 1 [ABC].[dbo].[vwPrimary].[RelatedNameId] AS CustomerID
,[ABC].[dbo].[vwPrimary].[RelatedName] AS CustomerName
,[AFGPurchase].[IvL].[TaxTreatment].[ParticluarType] AS Type
,[AFGPurchase].[IvL].[Product].[Sign] AS [Sign]
,[AFGPurchase].[IvL].[Product].[ReportingName] AS ReportingName
,[AFGPurchase].[IvL].[Transaction].[EffectiveDate] AS 'Date'
FROM (((([AFGPurchase].[IvL].[Account]
INNER JOIN [AFGPurchase].[IvL].[Position] ON [AFGPurchase].[IvL].[Account].[AccountId] = [AFGPurchase].[IvL].[Position].[AccountId])
INNER JOIN [AFGPurchase].[IvL].[Product] ON [AFGPurchase].[IvL].[Position].[ProductID] = [AFGPurchase].[IvL].[Product].[ProductId])
INNER JOIN [ABC].[dbo].[vwPrimary] ON [AFGPurchase].[IvL].[Account].[ReportingEntityId] = [ABC].[dbo].[vwPrimary].[RelatedNameId])
INNER JOIN [AFGPurchase].[IvL].[TaxTreatment] ON [AFGPurchase].[IvL].[Account].[TaxTreatmentId] = [AFGPurchase].[IvL].[TaxTreatment].[TaxTreatmentId])
INNER JOIN [AFGPurchase].[IvL].[Transaction] ON [AFGPurchase].[IvL].[Position].[PositionId] = [AFGPurchase].[IvL].[Transaction].[PositionId]
WHERE ((([AFGPurchase].[IvL].[TaxTreatment].[RegistrationType]) LIKE 'NON%')
AND (([AFGPurchase].[IvL].[Product].[Sign])='XYZ2')
AND (([AFGPurchase].[IvL].[Position].[Quantity])<>0)
AND (([AFGPurchase].[IvL].[Transaction].[EffectiveDate]) between '2021-12-31' and '2022-12-31'))
If you only need to check for the existence of a row, and not actually get any data from it then use EXISTS() rather than INNER JOIN, e.g.
SELECT vpr.[RelatedNameId] AS CustomerID
,vpr.[RelatedName] AS CustomerName
,tt.[ParticluarType] AS Type
,prd.[Sign]
,prd.ReportingName
FROM [AFGPurchase].[IvL].[Account] AS acc
INNER JOIN [AFGPurchase].[IvL].[Position] AS pos ON acc.[AccountId] = pos.[AccountId]
INNER JOIN [AFGPurchase].[IvL].[Product] AS prd ON pos.[ProductID] = prd.[ProductId]
INNER JOIN [ABC].[dbo].[vwPrimary] AS vpr ON acc.[ReportingEntityId] = vpr.[RelatedNameId]
INNER JOIN [AFGPurchase].[IvL].[TaxTreatment] AS tt ON acc.[TaxTreatmentId] = tt.[TaxTreatmentId]
WHERE tt.[RegistrationType] LIKE 'NON%'
AND prd.[Sign]='XYZ2'
AND pos.[Quantity]<>0
AND EXISTS
( SELECT 1
FROM [AFGPurchase].[IvL].[Transaction] AS tr
WHERE tr.[PositionId] = pos.[PositionId]
AND tr.[EffectiveDate] BETWEEN '2021-12-31' AND '2022-12-31'
);
N.B. I have added in table aliases and removed all the unnecessary parentheses for readability - you may disagree that it is more readable, but I would expect that most people would agree
This may not offer any performance benefits over simply grouping by the columns you are selecting and keeping your joins as they are. SQL is, after all, a declarative language where you tell the engine what you want, not how to get it, so you may find that the two plans are the same because you are requesting the same result. Using EXISTS does have the advantage of being more semantically tied to what you are trying to do, though, so it gives the optimiser the best chance of getting to the right plan. If you are still having performance issues, then you may need to inspect the execution plan and see if it suggests any indexes.
Finally, if you are really still using SQL Server 2008 then you really need to start thinking about your upgrade path. It has been completely unsupported for over 3 years now.
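As a small illustration of the difference between joining to the transactions and testing for their existence, here is a sketch against an in-memory SQLite database. The customer and txn tables and their rows are hypothetical and much simpler than the schema in the question; only the date range is taken from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the customer view and the Transaction table.
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE txn (customer_id INTEGER, effective_date TEXT);
    INSERT INTO customer VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO txn VALUES
        (1, '2022-03-01'), (1, '2022-06-15'), (1, '2022-09-30'),
        (2, '2020-01-10'),
        (3, '2022-12-31');
""")

# INNER JOIN: one output row per matching transaction.
join_rows = conn.execute("""
    SELECT c.customer_id, c.name
    FROM customer AS c
    JOIN txn AS t ON t.customer_id = c.customer_id
    WHERE t.effective_date BETWEEN '2021-12-31' AND '2022-12-31'
    ORDER BY c.customer_id
""").fetchall()

# EXISTS: one output row per customer that has at least one match.
exists_rows = conn.execute("""
    SELECT c.customer_id, c.name
    FROM customer AS c
    WHERE EXISTS (
        SELECT 1
        FROM txn AS t
        WHERE t.customer_id = c.customer_id
          AND t.effective_date BETWEEN '2021-12-31' AND '2022-12-31'
    )
    ORDER BY c.customer_id
""").fetchall()

print(len(join_rows), "rows from the join")
print(exists_rows)
```

Note that the outer SELECT cannot reference columns of the table inside EXISTS, which is fine here because the transaction date is not needed in the output.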

How to filter where a condition is true at least once

I need to filter down to only service orders that have a "service" work group value in at least one of their tasks. However, I don't want to get rid of the rows that aren't work group = "Service" if at least one of the task rows has that value. The end result would leave out all data from service orders that didn't have at least one BI_WRKFLW_TASK_KEY that was equal to "SERVICE". I know how to do normal filters but getting it to this specificity is beyond my current experience.
I've experimented with normal filters but they leave out rows that are a part of the same Service Order but just don't have that work group.
SELECT W.BI_WRKFLW_KEY,
T.BI_WORK_EVENT_CD,
T.BI_TASK_CD,
T.BI_WORKGRP,
M.BI_SO_NBR,
M.BI_SO_TYPE_CD,
M.BI_CLOSE_DT,
M.BI_OPEN_DT,
M.BI_SO_STAT_CD,
R.BI_WRKFLW_TMPLT_NM,
T.BI_WRKFLW_TASK_SEQ_NBR,
T.BI_WORKGRP,
A.BI_WORK_EVENT_CD,
A.BI_EVENT_DT_TM,
A.SY_JOB_QUEUE_ID,
A.BI_WORKGRP,
A.SY_USER_ID,
A.BI_WRKFLW_TASK_KEY
FROM BI_WRKFLW W
LEFT JOIN BI_WRKFLW_TASKS T ON W.BI_WRKFLW_KEY = T.BI_WRKFLW_KEY
LEFT JOIN BI_SO_DET D ON W.BI_WRKFLW_KEY = D.BI_WRKFLW_KEY
LEFT JOIN BI_SO_MASTER M ON D.BI_SO_NBR = M.BI_SO_NBR
LEFT JOIN BI_WRKFLW_TMPLT_REF R ON W.BI_WRKFLW_TMPLT_ID = R.BI_WRKFLW_TMPLT_ID
LEFT JOIN BI_TASK_ACT A ON T.BI_WRKFLW_TASKS_KEY = A.BI_WRKFLW_TASKS_KEY
WHERE M.BI_OPEN_DT >= ADD_MONTHS(CURRENT_DATE, -'12')
--AND M.BI_SO_TYPE_CD IN ('IVC-NEW1')
--AND M.BI_SO_STAT_CD LIKE 'O'
ORDER BY M.BI_SO_NBR, T.BI_EVENT_DT_TM
Any Service order row where the Service order has at least one BI_WRKFLOW_TASK_CD = "Service" would be kept and all other service orders filtered out.
I tried to map this out; I may not have got it quite right.
I think you are asking for BI_SO_MASTER records that have at least one BI_WRKFLW_TASKS row belonging to a certain group.
Try using a CTE to get the detail rows with a matching task; from those you can find the SO population. After that, I'm not sure what the ultimate result set should be.
;with matchingTasks as ( select D.BI_SO_NBR, D.<id> , W.BI_WRKFLW_KEY , T.<key> , A.Key
from BI_WRKFLW W
LEFT JOIN BI_WRKFLW_TASKS T ON W.BI_WRKFLW_KEY = T.BI_WRKFLW_KEY
LEFT JOIN BI_SO_DET D ON W.BI_WRKFLW_KEY = D.BI_WRKFLW_KEY
LEFT JOIN BI_TASK_ACT A ON T.BI_WRKFLW_TASKS_KEY = A.BI_WRKFLW_TASKS_KEY
Where
<good dates>
and <A.field is what I am looking for>
)
/*Here you have the SO population
as well as the ids that helped this SO qualify.
*/
, My_SO_Population as (select Distinct BI_SO_NBR from matchingTasks )
/*now you can go get what you need.
the challenge of finding SOs w/ >=1 matching task has been solved...
*/
select <necessary fields> from
My_SO_Population
join <whatever you need....this is where i am cloudy>
If I am missing the goal, let me know where.
You can just add this to your WHERE clause:
AND T.BI_WRKFLW_KEY IN (
SELECT BI_WRKFLW_KEY
FROM BI_WRKFLW_TASKS
WHERE BI_WRKFLOW_TASK_CD = 'Service')
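Here is a rough, self-contained sketch of why the IN subquery keeps the sibling rows while a plain filter drops them. It runs on an in-memory SQLite database; the single task table and its columns are hypothetical simplifications of the tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical flattened table: one row per task of a service order.
    CREATE TABLE task (so_nbr INTEGER, task_cd TEXT, workgrp TEXT);
    INSERT INTO task VALUES
        (100, 'T1', 'SERVICE'),
        (100, 'T2', 'BILLING'),
        (200, 'T1', 'BILLING'),
        (200, 'T2', 'ENGINEERING'),
        (300, 'T1', 'SERVICE');
""")

# A plain filter keeps only the matching rows and loses their siblings.
plain = conn.execute("""
    SELECT so_nbr, task_cd, workgrp
    FROM task
    WHERE workgrp = 'SERVICE'
    ORDER BY so_nbr, task_cd
""").fetchall()

# Filtering on membership keeps every row of any order with a match.
with_siblings = conn.execute("""
    SELECT so_nbr, task_cd, workgrp
    FROM task
    WHERE so_nbr IN (SELECT so_nbr FROM task WHERE workgrp = 'SERVICE')
    ORDER BY so_nbr, task_cd
""").fetchall()

print(plain)
print(with_siblings)
```

Order 200 has no SERVICE task, so it disappears entirely, while the BILLING row of order 100 survives.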

Sum up multiple values based off a single column

For context, I work in transportation. Also, I apologize for a poor title - I'm not exactly sure how to summarize my issue.
I am currently editing an existing report which returns a driver's ID, their name, when they were hired, and the total miles they have driven since they started at the company. It was brought to my attention that drivers who move within the company are assigned a different driver ID, and the miles under the old ID are not counted towards their total. Using an example provided to me, I was able to confirm this scenario, as shown below:
DriverCode DriverName
----------- ----------------
WETDE Wethington,Dean
WETDEA Wethington,Dean
This is the query that gets the above (example driver is hardcoded at the moment):
select mpp.mpp_id as DriverCode,
mpp.mpp_lastfirst as DriverName
from manpowerprofile mpp
outer apply (select top 1 mpp_id
from manpowerprofile) as id
where mpp_firstname = 'Dean'
and mpp_lastname = 'Wethington'
This is the current query as it stands:
SELECT lh.lgh_driver1 as DriverCode
,m.mpp_lastfirst as DriverName
,m.mpp_hiredate as HireDate
,SUM(s.stp_lgh_mileage) as TotMiles
FROM stops s (nolock)
INNER JOIN legheader lh (nolock) on lh.lgh_number = s.lgh_number
INNER JOIN manpowerprofile m (nolock) on m.mpp_id = lh.lgh_driver1
/* OUTER APPLY ( SELECT top 1 mpp_id
FROM manpowerprofile) as id */
WHERE m.mpp_terminationdt > GETDATE()
AND m.mpp_id <> 'UNKNOWN'
AND lh.lgh_outstatus = 'CMP'
GROUP BY lh.lgh_driver1, m.mpp_lastfirst, m.mpp_hiredate
HAVING SUM(s.stp_lgh_mileage) > 850000
ORDER BY DriverCode DESC
What I'm looking to do is check to see if a name exists twice, and if it does, add both of those driver code's total miles together to return a single result for that individual driver. I'm a pretty novice SQL Developer still and have only now really started to delve into databases.
My current train of thought was to use an outer apply, but I'm sure there's a better way to do this.
As per your comment, leaving off the driver code and hire date...
(Because they could/would be different for the drivers being combined.)
SELECT
m.mpp_lastfirst as DriverName
,SUM(s.stp_lgh_mileage) as TotMiles
FROM
stops s (nolock)
INNER JOIN
legheader lh (nolock)
on lh.lgh_number = s.lgh_number
INNER JOIN
manpowerprofile m (nolock)
on m.mpp_id = lh.lgh_driver1
WHERE
m.mpp_terminationdt > GETDATE()
AND m.mpp_id <> 'UNKNOWN'
AND lh.lgh_outstatus = 'CMP'
GROUP BY
m.mpp_lastfirst
HAVING
SUM(s.stp_lgh_mileage) > 850000
ORDER BY
m.mpp_lastfirst DESC
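To show the effect of grouping by name instead of driver code, here is a small sketch on an in-memory SQLite database. The driver and trip tables are hypothetical simplifications, the mileage figures are made up, and the second driver (SMIJO) is invented for contrast. One caveat, stated as an assumption: grouping by name will also merge two different people who happen to share a name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical simplified tables and made-up mileage.
    CREATE TABLE driver (mpp_id TEXT PRIMARY KEY, mpp_lastfirst TEXT);
    CREATE TABLE trip (driver_id TEXT, miles INTEGER);
    INSERT INTO driver VALUES
        ('WETDE',  'Wethington,Dean'),
        ('WETDEA', 'Wethington,Dean'),
        ('SMIJO',  'Smith,John');
    INSERT INTO trip VALUES
        ('WETDE', 300000), ('WETDE', 200000),
        ('WETDEA', 400000),
        ('SMIJO', 100000);
""")

# Grouped by driver code: each code is totalled separately.
by_code = conn.execute("""
    SELECT d.mpp_id, SUM(t.miles) AS TotMiles
    FROM trip AS t
    JOIN driver AS d ON d.mpp_id = t.driver_id
    GROUP BY d.mpp_id
    HAVING SUM(t.miles) > 850000
""").fetchall()

# Grouped by name: mileage under both codes is combined before HAVING runs.
by_name = conn.execute("""
    SELECT d.mpp_lastfirst, SUM(t.miles) AS TotMiles
    FROM trip AS t
    JOIN driver AS d ON d.mpp_id = t.driver_id
    GROUP BY d.mpp_lastfirst
    HAVING SUM(t.miles) > 850000
    ORDER BY d.mpp_lastfirst DESC
""").fetchall()

print(by_code)
print(by_name)
```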

Sum(IIF( including results with 0 count

Hey all, I am using SUM(IIF(...)) to return a count based on multiple criteria.
I am basically running a report on calls received per site; however, I need sites with 0 calls included in the result set, with a value of 0 or even Null, if they have no calls for that week.
The only issue is that my WHERE clause only includes sites that have had calls in the week.
Any ideas?
Code:
SELECT
d.sitename,
count(c.Chargeablecalls) AS All_Calls,
SUM(IIf(c.ChargeableCalls Like "Chargeable",1,0)) AS Chargeable_calls,
d.sitetype
FROM
(Callstatus AS s LEFT JOIN statusconversion AS c ON s.description=c.reportheading)
INNER JOIN sitedetails AS d ON s.zone=d.zone
WHERE s.date_loaded BETWEEN
(SELECT reportdate FROM reportMonth) AND (SELECT priorweek FROM reportMonth)
GROUP BY d.sitename, d.sitetype;
You need a RIGHT JOIN for sitedetails in order to get all the sites even those with no calls.
You may need to do the first half of query separately and then use that query in the main query.
Create a new query, qryCallStatus:
SELECT DISTINCT zone, description
FROM Callstatus, reportMonth
WHERE
Callstatus.date_loaded BETWEEN reportMonth.reportdate AND reportMonth.priorweek;
Then change your output query to:
SELECT
d.sitename,
count(c.Chargeablecalls) AS All_Calls,
SUM(IIf(c.ChargeableCalls Like "Chargeable",1,0)) AS Chargeable_calls,
d.sitetype
FROM
(sitedetails AS d LEFT JOIN qryCallStatus AS s ON d.zone=s.zone)
LEFT JOIN statusconversion AS c ON s.description=c.reportheading
GROUP BY d.sitename, d.sitetype;
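The same idea can be sketched outside Access. The example below uses an in-memory SQLite database with hypothetical site and calls tables and made-up dates; CASE stands in for IIf. The two things it tries to show are that the site table has to be the preserved side of the outer join, and that the date restriction belongs in the join condition (or in a pre-filtered query such as qryCallStatus) rather than in a WHERE clause, otherwise sites without calls are filtered out again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical simplified tables and made-up dates.
    CREATE TABLE site (zone INTEGER PRIMARY KEY, sitename TEXT);
    CREATE TABLE calls (zone INTEGER, status TEXT, date_loaded TEXT);
    INSERT INTO site VALUES (1, 'North'), (2, 'South'), (3, 'East');
    INSERT INTO calls VALUES
        (1, 'Chargeable', '2024-01-02'),
        (1, 'Free',       '2024-01-03'),
        (2, 'Chargeable', '2024-01-05'),
        (3, 'Chargeable', '2023-12-01');  -- outside the reporting week
""")

rows = conn.execute("""
    SELECT d.sitename,
           COUNT(c.status) AS All_Calls,  -- COUNT(column) skips NULLs
           SUM(CASE WHEN c.status = 'Chargeable' THEN 1 ELSE 0 END)
               AS Chargeable_calls
    FROM site AS d
    LEFT JOIN calls AS c
           ON c.zone = d.zone
          AND c.date_loaded BETWEEN '2024-01-01' AND '2024-01-07'
    GROUP BY d.sitename
    ORDER BY d.sitename
""").fetchall()

for row in rows:
    print(row)
```

East still appears, with zero for both counts, even though its only call falls outside the week.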

Timeout running SQL query

I'm trying to use the aggregation features of the Django ORM to run a query on a MSSQL 2008R2 database, but I keep getting a timeout error. The query (generated by Django) which fails is below. I've tried running it directly in SQL Server Management Studio and it works, but takes 3.5 minutes.
It does look like it's aggregating over a bunch of fields which it doesn't need to, but I wouldn't have thought that should really cause it to take that long. The database isn't that big either: auth_user has 9 records, ticket_ticket has 1210, and ticket_watchers has 1876. Is there something I'm missing?
SELECT
[auth_user].[id],
[auth_user].[password],
[auth_user].[last_login],
[auth_user].[is_superuser],
[auth_user].[username],
[auth_user].[first_name],
[auth_user].[last_name],
[auth_user].[email],
[auth_user].[is_staff],
[auth_user].[is_active],
[auth_user].[date_joined],
COUNT([tickets_ticket].[id]) AS [tickets_captured__count],
COUNT(T3.[id]) AS [assigned_tickets__count],
COUNT([tickets_ticket_watchers].[ticket_id]) AS [tickets_watched__count]
FROM
[auth_user]
LEFT OUTER JOIN [tickets_ticket] ON ([auth_user].[id] = [tickets_ticket].[capturer_id])
LEFT OUTER JOIN [tickets_ticket] T3 ON ([auth_user].[id] = T3.[responsible_id])
LEFT OUTER JOIN [tickets_ticket_watchers] ON ([auth_user].[id] = [tickets_ticket_watchers].[user_id])
GROUP BY
[auth_user].[id],
[auth_user].[password],
[auth_user].[last_login],
[auth_user].[is_superuser],
[auth_user].[username],
[auth_user].[first_name],
[auth_user].[last_name],
[auth_user].[email],
[auth_user].[is_staff],
[auth_user].[is_active],
[auth_user].[date_joined]
HAVING
(COUNT([tickets_ticket].[id]) > 0 OR COUNT(T3.[id]) > 0 )
EDIT:
Here are the relevant indexes (excluding those not used in the query):
auth_user.id (PK)
auth_user.username (Unique)
tickets_ticket.id (PK)
tickets_ticket.capturer_id
tickets_ticket.responsible_id
tickets_ticket_watchers.id (PK)
tickets_ticket_watchers.user_id
tickets_ticket_watchers.ticket_id
EDIT 2:
After a bit of experimentation, I've found that the following query is the smallest that results in the slow execution:
SELECT
COUNT([tickets_ticket].[id]) AS [tickets_captured__count],
COUNT(T3.[id]) AS [assigned_tickets__count],
COUNT([tickets_ticket_watchers].[ticket_id]) AS [tickets_watched__count]
FROM
[auth_user]
LEFT OUTER JOIN [tickets_ticket] ON ([auth_user].[id] = [tickets_ticket].[capturer_id])
LEFT OUTER JOIN [tickets_ticket] T3 ON ([auth_user].[id] = T3.[responsible_id])
LEFT OUTER JOIN [tickets_ticket_watchers] ON ([auth_user].[id] = [tickets_ticket_watchers].[user_id])
GROUP BY
[auth_user].[id]
The weird thing is that if I comment out any two lines in the above, it runs in less than 1s, but it doesn't seem to matter which lines I remove (although obviously I can't remove a join without also removing the relevant SELECT line).
EDIT 3:
The python code which generated this is:
User.objects.annotate(
Count('tickets_captured'),
Count('assigned_tickets'),
Count('tickets_watched')
)
A look at the execution plan shows that SQL Server is first doing a cross join on all the tables, resulting in about 280 million rows and 6 GB of data. I assume that this is where the problem lies, but why is it happening?
SQL Server is doing exactly what it was asked to do. Unfortunately, Django is not generating the right query for what you want. It looks like you need to count distinct values instead of plain counts; see "Django annotate() multiple times causes wrong answers".
As for why the query behaves that way: it joins the four tables together. Say a user has 2 captured tickets, 3 assigned tickets, and 4 watched tickets; the join returns 2*3*4 = 24 rows for that user, one for each combination of tickets. Counting distinct values removes the duplicates.
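The multiplication is easy to reproduce on a toy data set. The sketch below builds cut-down versions of the three tables in an in-memory SQLite database (only the columns the query touches, with made-up rows) and compares COUNT with COUNT(DISTINCT ...) for one user who has 2 captured, 3 assigned, and 4 watched tickets. On the Django side, Count accepts a distinct=True argument, which should produce the DISTINCT form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Cut-down tables with made-up rows; only the joined columns are kept.
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY);
    CREATE TABLE tickets_ticket (
        id INTEGER PRIMARY KEY, capturer_id INTEGER, responsible_id INTEGER
    );
    CREATE TABLE tickets_ticket_watchers (
        id INTEGER PRIMARY KEY, user_id INTEGER, ticket_id INTEGER
    );
    INSERT INTO auth_user VALUES (1), (2);
    -- User 1 captured tickets 1-2 and is responsible for tickets 3-5.
    INSERT INTO tickets_ticket VALUES
        (1, 1, 2), (2, 1, 2), (3, 2, 1), (4, 2, 1), (5, 2, 1);
    -- User 1 watches four tickets.
    INSERT INTO tickets_ticket_watchers VALUES
        (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 4);
""")

counts = conn.execute("""
    SELECT COUNT(t.id),                  -- inflated by the other joins
           COUNT(t3.id),
           COUNT(w.ticket_id),
           COUNT(DISTINCT t.id),         -- duplicates collapsed
           COUNT(DISTINCT t3.id),
           COUNT(DISTINCT w.ticket_id)
    FROM auth_user AS u
    LEFT JOIN tickets_ticket AS t  ON u.id = t.capturer_id
    LEFT JOIN tickets_ticket AS t3 ON u.id = t3.responsible_id
    LEFT JOIN tickets_ticket_watchers AS w ON u.id = w.user_id
    WHERE u.id = 1
    GROUP BY u.id
""").fetchone()

print(counts)
```

COUNT(DISTINCT ...) fixes the numbers but the engine still has to build the multiplied intermediate rows, so on larger data it may remain slow.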
What about this?
SELECT auth_user.*,
C1.tickets_captured__count,
C2.assigned_tickets__count,
C3.tickets_watched__count
FROM
auth_user
LEFT JOIN
( SELECT capturer_id, COUNT(*) AS tickets_captured__count
FROM tickets_ticket GROUP BY capturer_id ) AS C1 ON auth_user.id = C1.capturer_id
LEFT JOIN
( SELECT responsible_id, COUNT(*) AS assigned_tickets__count
FROM tickets_ticket GROUP BY responsible_id ) AS C2 ON auth_user.id = C2.responsible_id
LEFT JOIN
( SELECT user_id, COUNT(*) AS tickets_watched__count
FROM tickets_ticket_watchers GROUP BY user_id ) AS C3 ON auth_user.id = C3.user_id
WHERE C1.tickets_captured__count > 0 OR C2.assigned_tickets__count > 0
--WHERE C1.tickets_captured__count is not null OR C2.assigned_tickets__count is not null -- also works (I think with better performance)