Drastic CTE query speed degradation on actions in outer select

I have a complex report (similar to the example below). It returns only 52 items in 42 seconds (slow, but it does complex joins). It slows down drastically when I do either of the following:
1. Add a column in the outer select to get a serialized list of items for each item in the result (adds 150 seconds). See the function below.
2. Insert the result into a table variable (adds 120 seconds). I had hoped that flattening it would reduce issue (1), but no cigar.
As I understand the execution, the where is executed first, then the logic in the select (which should only be done for the 52 result items). However, if I run those exact 52 items through scenario (1) on their own, it takes only 7 seconds, versus the added 150 seconds when using it in the outer CTE select.
Why could this be, and how can I add the column without the bloated execution time?
CREATE FUNCTION PriorShippers
(
@customerId nchar(5)
)
RETURNS varchar(500)
AS
BEGIN
DECLARE @return varchar(500);
with data as
(
select distinct S.CompanyName from Customers C
join Orders O on C.CustomerID = O.CustomerID
join Shippers S on O.ShipVia = S.ShipperID
where C.CustomerID = @customerId
) select @return = STUFF((select CompanyName + ' ' from data FOR XML PATH('')),1,0,'')
return @return
END
The query (I used the Northwind database on SQL Server 2012):
DECLARE @categories TABLE
(
Name varchar(100),
SourceCountry varchar(100)
);
insert into @categories VALUES ('Seafood', 'US');
insert into @categories VALUES ('Beverages', 'US');
insert into @categories VALUES ('Condiments', 'US');
insert into @categories VALUES ('Dairy Products', 'India');
insert into @categories VALUES ('Grains/Cereals', 'India');
with data as
(
select C.CustomerID, C.CompanyName,
(CASE WHEN EXISTS(select * from Orders O where O.CustomerID = C.CustomerID) THEN
(select count(distinct CAT.CategoryID) from Orders O
join [Order Details] OD on O.OrderID = OD.OrderID
join Products P on OD.ProductID = P.ProductID
join Categories CAT on P.CategoryID = CAT.CategoryID
where EXISTS(select * from @categories where Name = CAT.CategoryName AND SourceCountry = 'US'))
ELSE 0 END) as 'US Orders',
(CASE WHEN EXISTS(select * from Orders O where O.CustomerID = C.CustomerID) THEN
(select count(distinct CAT.CategoryID) from Orders O
join [Order Details] OD on O.OrderID = OD.OrderID
join Products P on OD.ProductID = P.ProductID
join Categories CAT on P.CategoryID = CAT.CategoryID
where EXISTS(select * from @categories where Name = CAT.CategoryName AND SourceCountry = 'India'))
ELSE 0 END) as 'India Orders'
from Customers C
) select top 10 CompanyName, [US Orders], [India Orders]
-- Below: adding this column causes a significant slowdown
, dbo.PriorShippers(CustomerID)
from data where [US Orders] > 0 Order By [US Orders]

You are executing a user-defined function for each selected row. In short, don't do that.
Move the code in the function into a CTE directly within the report.
Also, I noticed that the joins for your counts are identical. They look like prime candidates for a third CTE that you could simply join to, which should further improve the performance of the overall query.
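A minimal sketch of that suggestion, assuming the Northwind schema from the question. The CTE name shipper_names and the column alias PriorShippers are made up for illustration, and this has not been timed against the original report:

```sql
-- Sketch: build the shipper list inline instead of calling dbo.PriorShippers per row.
WITH shipper_names AS
(
    -- One row per (customer, shipper) pair, computed once for the whole query.
    SELECT DISTINCT O.CustomerID, S.CompanyName
    FROM Orders O
    JOIN Shippers S ON O.ShipVia = S.ShipperID
)
SELECT C.CustomerID,
       C.CompanyName,
       -- Same FOR XML PATH('') concatenation the function used.
       (SELECT SN.CompanyName + ' '
        FROM shipper_names SN
        WHERE SN.CustomerID = C.CustomerID
        FOR XML PATH('')) AS PriorShippers
FROM Customers C;
```

The same correlated subquery could be placed in the outer select of the report in place of the dbo.PriorShippers(CustomerID) call.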

Related

T-SQL SELECT with multiple variables

I am trying to get this output using a SQL statement and the NORTHWIND database:
Employee Name:Nancy Davolio
Number of Sales:345
Total Sales:192107.60
Employee Name:Andrew Fuller
Number of Sales:241
Total Sales:166537.75
Employee Name:Janet Leverling
Number of Sales:321
Total Sales:202812.84
Employee Name:Margaret Peacock
Number of Sales:420
Total Sales:232890.85
Employee Name:Steven Buchanan
Number of Sales:117
Total Sales:68792.28
...and 4 more entries
When I use this statement:
USE Northwind
DECLARE @EmployeeName VARCHAR(40),
@NumberOfSales INT,
@TotalSales DECIMAL(10,2),
@Counter TINYINT = 1,
@NumEmployees INT = IDENT_CURRENT('dbo.Employees');
WHILE @Counter < @NumEmployees
BEGIN
--SELECT @EmployeeName = E.FirstName+' '+E.LastName
--SELECT @NumberOfSales = count(od.OrderID)
SELECT @TotalSales = SUM(unitprice * quantity * (1 - Discount))
FROM Employees E
JOIN Orders AS O ON O.EmployeeID = E.EmployeeID
JOIN [Order Details] AS OD ON OD.OrderID = O.OrderID
WHERE E.EmployeeID = @Counter
PRINT 'Employee Name: '--+ @EmployeeName;
PRINT 'Number of Sales: '--+ LTRIM(STR(@NumberOfSales));
PRINT 'Total Sales: '+CONVERT(varchar(10),@TotalSales);
PRINT '';
SET @Counter += 1;
END
I can get each select to work singly but I cannot figure out the syntax to get a single SELECT statement to do all the work. I should also be able to do this with three SET statements but I've not been able to figure that out either. Pointers to both possibilities would be awesome.
Here's that actual step verbiage:
"Within the loop, use a SELECT statement to retrieve the first and last name of each employee, the number of orders handled by each employee and the total sales amount for each employee (you are processing each employee one by one). You will need to join multiple tables together and use aggregate functions to get a count and a total. Assign the concatenated full name, number of sales and total sales amount to the appropriate variables."
Output should be in Messages tab, no table or format other than the expected output listed above.
There is no need for a loop (the RBAR, Row By Agonizing Row, approach should be avoided if possible):
SELECT E.EmployeeID -- qualified: EmployeeID exists in both Employees and Orders
,[Employee Name] = E.FirstName+' '+E.LastName
,[TotalSales] = SUM(unitprice * quantity * (1-Discount))
,[NumberOfSales] = COUNT(DISTINCT o.OrderID)
FROM Employees E
JOIN Orders AS O ON O.EmployeeID = E.EmployeeID
JOIN [Order Details] AS OD ON OD.OrderID = O.OrderID
GROUP BY E.EmployeeID, E.FirstName+' '+E.LastName
ORDER BY E.EmployeeID;
EDIT:
Loop version - assigning multiple variables at once.
USE Northwind
DECLARE @EmployeeName VARCHAR(40),
@NumberOfSales INT,
@TotalSales DECIMAL(10,2),
@Counter TINYINT = 1,
@NumEmployees INT = IDENT_CURRENT('dbo.Employees');
-- <= so that the last EmployeeID is processed as well
WHILE @Counter <= @NumEmployees
BEGIN
SELECT @EmployeeName = E.FirstName+' '+E.LastName
,@NumberOfSales = COUNT(DISTINCT o.OrderID)
,@TotalSales = SUM(unitprice * quantity * (1 - Discount))
FROM Employees E
JOIN Orders AS O ON O.EmployeeID = E.EmployeeID
JOIN [Order Details] AS OD ON OD.OrderID = O.OrderID
WHERE E.EmployeeID = @Counter
GROUP BY E.FirstName+' '+E.LastName;
PRINT 'Employee Name: '+ @EmployeeName;
PRINT 'Number of Sales: '+ LTRIM(STR(@NumberOfSales));
PRINT 'Total Sales: '+ CONVERT(varchar(10),@TotalSales);
PRINT '';
SET @Counter += 1;
END
Please note that a WHILE loop may be very inefficient when the ids have gaps. You are looping from 1 up to IDENT_CURRENT, so if the ids are, say, 1, 5 and 200671, you end up with a lot of unnecessary iterations.
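One way to avoid walking through the gaps, if the loop really is required, is to jump from one existing id to the next. This is only a sketch: the variable name @CurrentId is made up here, and the PRINT stands in for the per-employee SELECT and PRINT statements above.

```sql
-- Sketch: visit only the EmployeeIDs that actually exist, skipping gaps.
DECLARE @CurrentId INT;

-- Start at the smallest existing id.
SELECT @CurrentId = MIN(EmployeeID) FROM Employees;

WHILE @CurrentId IS NOT NULL
BEGIN
    -- Placeholder for the per-employee work.
    PRINT 'Processing EmployeeID ' + CONVERT(varchar(10), @CurrentId);

    -- Move to the next existing id. MIN over an empty set yields NULL,
    -- which ends the loop after the last employee.
    SELECT @CurrentId = MIN(EmployeeID)
    FROM Employees
    WHERE EmployeeID > @CurrentId;
END
```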
EDIT 2:
"It seems the GROUP BY is required when multiple assigns take place in the select"
I added the GROUP BY because FirstName and LastName were not wrapped in an aggregate function. You could skip that clause, but then you need to wrap them in MIN/MAX:
SELECT @EmployeeName = MIN(E.FirstName)+' '+MIN(E.LastName)
,@NumberOfSales = COUNT(DISTINCT o.OrderID)
,@TotalSales = SUM(unitprice * quantity * (1 - Discount))
FROM Employees E
JOIN Orders AS O ON O.EmployeeID = E.EmployeeID
JOIN [Order Details] AS OD ON OD.OrderID = O.OrderID
WHERE E.EmployeeID = @Counter;
-- and we are sure that all values for First/Last name are the same because of
-- WHERE E.EmployeeID = @Counter
Related: Group by clause
In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause
This should do it. I used CROSS APPLY to unpivot the set and then format it accordingly. You can read more about it in the article "CROSS APPLY an Alternative Method to Unpivot". Since SQL works with sets, the input to and output from SQL should always be a set, in my humble opinion.
I am afraid that formatting the output this way might not be SQL's job, but it is still doable with a "single" select statement as a set operation:
;WITH CTE AS
(
SELECT
EMPLOYEENAME = E.FirstName +' '+ E.LastName,
NUMBEROFORDERS = COUNT(OD.OrderID),
TOTALSALES = SUM(unitprice * quantity * (1-Discount))
FROM Employees E
INNER JOIN Orders AS O ON O.EmployeeID = E.EmployeeID
INNER JOIN [Order Details] AS OD ON OD.OrderID = O.OrderID
GROUP BY E.FirstName + ' ' + E.LastName
)
SELECT COLNAME, ColValue
FROM CTE
CROSS APPLY ( VALUES ('Employee Name:', EMPLOYEENAME),
('Number of Sales:', LTRIM(STR(NUMBEROFORDERS, 25, 5)) ),
('Total Sales:', LTRIM(STR(TOTALSALES, 25, 5)) ),
('','')
) A (COLNAME, ColValue)
The sample output follows:
COLNAME          | ColValue
---------------- | -------------
Employee Name:   | Nancy Davolio
Number of Sales: | 345.00000
Total Sales:     | 192107.60432

Divide results in two columns depending on the input values? SQL Server

I am using the Northwind database with SQL Server 2014. I am trying to write a query that splits the order totals across two different years. The format that I want from my query is
category | year one | year two
where the years may vary. What I have tried so far is
SELECT ca.CategoryName , YEAR(o.OrderDate), SUM(ot.UnitPrice*ot.Quantity) as total
FROM Orders o
INNER JOIN [Order Details] ot ON O.OrderID = ot.OrderID
INNER JOIN Products pro ON ot.ProductID = pro.ProductID
INNER JOIN Categories ca ON pro.CategoryID = ca.CategoryID
GROUP BY ca.CategoryName,YEAR(o.OrderDate)
ORDER BY ca.CategoryName;
This gives me the totals of each category for a different year, 1996-1997-1998 in column YEAR(o.OrderDate)
I want to get for example
CategoryName | 1996 | 1997
Beverages |53879,20 | 110424,00
Condiments |19458,30 | 59679,00
....
Use "conditional aggregates".
SELECT
ca.CategoryName
, SUM(case when year(o.OrderDate) = 1996 then ot.UnitPrice * ot.Quantity end) AS total_1996
, SUM(case when year(o.OrderDate) = 1997 then ot.UnitPrice * ot.Quantity end) AS total_1997
FROM Orders o
INNER JOIN [Order Details] ot ON o.OrderID = ot.OrderID
INNER JOIN Products pro ON ot.ProductID = pro.ProductID
INNER JOIN Categories ca ON pro.CategoryID = ca.CategoryID
where o.OrderDate >= '19960101' and o.OrderDate < '19980101'
GROUP BY
ca.CategoryName
ORDER BY
ca.CategoryName
Basically, that means using a case expression inside the aggregate function.
In case you are wondering why I have not used "between" in the where clause, see
Bad habits to kick : mis-handling date / range queries
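Since the question says the years may vary, the same conditional aggregates can be driven by variables. This is only a sketch: the variable names @YearOne and @YearTwo and the column aliases are made up for illustration, and DATEFROMPARTS assumes SQL Server 2012 or later (the question uses 2014).

```sql
-- Sketch: conditional aggregates with the two years supplied as variables.
DECLARE @YearOne INT = 1996,
        @YearTwo INT = 1997;

SELECT ca.CategoryName
     , SUM(CASE WHEN YEAR(o.OrderDate) = @YearOne THEN ot.UnitPrice * ot.Quantity END) AS total_year_one
     , SUM(CASE WHEN YEAR(o.OrderDate) = @YearTwo THEN ot.UnitPrice * ot.Quantity END) AS total_year_two
FROM Orders o
INNER JOIN [Order Details] ot ON o.OrderID = ot.OrderID
INNER JOIN Products pro ON ot.ProductID = pro.ProductID
INNER JOIN Categories ca ON pro.CategoryID = ca.CategoryID
-- Half-open ranges (>= start, < next start) rather than BETWEEN,
-- so the two years need not be adjacent.
WHERE (o.OrderDate >= DATEFROMPARTS(@YearOne, 1, 1) AND o.OrderDate < DATEFROMPARTS(@YearOne + 1, 1, 1))
   OR (o.OrderDate >= DATEFROMPARTS(@YearTwo, 1, 1) AND o.OrderDate < DATEFROMPARTS(@YearTwo + 1, 1, 1))
GROUP BY ca.CategoryName
ORDER BY ca.CategoryName;
```

The output column names stay fixed (total_year_one, total_year_two); naming the columns after the actual years would need dynamic SQL, which this sketch does not attempt.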
You can use PIVOT to get your desired Output
BEGIN TRAN
CREATE TABLE #Temp(CategoryName NVARCHAR(50),[Year]INT,TOTAL DECIMAL(15,2))
INSERT INTO #Temp
SELECT ca.CategoryName , YEAR(o.OrderDate), SUM(ot.UnitPrice*ot.Quantity) as total
FROM Orders o
INNER JOIN [Order Details] ot ON O.OrderID = ot.OrderID
INNER JOIN Products pro ON ot.ProductID = pro.ProductID
INNER JOIN Categories ca ON pro.CategoryID = ca.CategoryID
GROUP BY ca.CategoryName,YEAR(o.OrderDate)
ORDER BY ca.CategoryName;
SELECT * FROM #Temp
GO
select *
from
(
select CategoryName, [Year], TOTAL
from #Temp
) src
pivot
(
sum(TOTAL)
for YEAR in ([1996], [1997]
)) piv;
ROLLBACK TRAN
You can use PIVOT to get the desired output:
CREATE TABLE #TEMP
(
Category VARCHAR(200),
YEAR1 NUMERIC,
Total MONEY
)
INSERT INTO #TEMP
SELECT 'beverages', 1996, 500
union
SELECT 'beverages', 1997, 750
union
SELECT 'Condiments', 1997, 1000
union
SELECT 'Condiments', 1996, 800
SELECT *
FROM
(
SELECT Category,YEAR1, Total FROM #TEMP
) AS SourceTable
PIVOT
(
AVG(Total) FOR YEAR1 IN ( [1996], [1997])
) AS PivotTable;

Update column with a subquery

ALTER TABLE order_t ADD Totalfixed DECIMAL(7,2);
UPDATE Order_t
SET Totalfixed = (
SELECT orderid, SUM(price * quantity) AS tf
FROM
orderline ol,
product p
WHERE
ol.productid = p.productid
AND ol.orderid = orderid
GROUP BY orderid
);
Everything works fine separately but I get:
operand should contain 1 column
And if I remove orderid from the subquery, I get:
subquery returns more than 1 row
Is there any way to make this work without a join?
Regardless of the database, the context requires a scalar subquery. This means avoid the group by and return only one column:
UPDATE Order_t
SET Totalfixed = (
SELECT SUM(price * quantity) AS tf
FROM orderline ol JOIN
product p
ON ol.productid = p.productid
WHERE ol.orderid = Order_t.orderid
);
I also fixed the JOIN syntax (always use explicit joins) and the correlation clause so it refers to the outer query.
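-- SQL Server syntax; an aggregate cannot appear directly in the SET list,
-- so the totals are computed per order in a derived table first
UPDATE A
SET Totalfixed = T.tf
FROM Order_t A
INNER JOIN (
SELECT ol.orderid, SUM(price * quantity) AS tf
FROM orderline ol
INNER JOIN product p ON ol.productid = p.productid
GROUP BY ol.orderid
) T ON T.orderid = A.orderid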

SQL Change output of column if duplicate

I have a table which has rows for each product that a customer has purchased. I want to output a column from a SELECT query which shows the time it takes to deliver said item based on whether the customer has other items that need to be delivered. The first item takes 5 mins to deliver and all subsequent items take 2 mins to deliver e.g. 3 items would take 5+2+2=9 mins to deliver.
This is what I have at the moment(Using the Northwind sample database on w3schools to test the query):
SELECT orders.customerid,
orders.orderid,
orderdetails.productid,
CASE((SELECT Count(orders.customerid)
FROM orders
GROUP BY orders.customerid))
WHEN 1 THEN '00:05'
ELSE '00:02'
END AS DeliveryTime
FROM orders
LEFT JOIN orderdetails
ON orderdetails.orderid = orders.orderid
This outputs '00:05' for every item due to the COUNT in my subquery(I think?), any ideas on how to fix this?
Try this
SELECT orders.customerid,
orders.orderid,
orderdetails.productid,
numberorders,
2 * ( numberorders - 1 ) + 5 AS deliveryMinutes
FROM orders
INNER JOIN (SELECT orders.customerid AS countId,
Count(1) AS numberOrders
FROM orders
GROUP BY orders.customerid) t1
ON t1.countid = orders.customerid
LEFT JOIN orderdetails
ON orderdetails.orderid = orders.orderid
ORDER BY customerid
Gregory's answer works a treat, and here are my attempts:
-- Without each product line item listed
SELECT O.CustomerId,
O.OrderId,
COUNT(*) AS 'NumberOfProductsOrdered',
CASE COUNT(*)
WHEN 1 THEN 5
ELSE (COUNT(*) * 2) + 3
END AS 'MinutesToDeliverAllProducts'
FROM Orders AS O
INNER JOIN OrderDetails AS D ON D.OrderId = O.OrderId
GROUP BY O.CustomerId, O.OrderId
-- With each product line item listed
SELECT O.CustomerId,
O.OrderId,
D.ProductId,
CASE
WHEN P.ProductsInOrder = 1 THEN 5
ELSE (P.ProductsInOrder * 2) + 3
END AS 'MinutesToDeliverAllProducts'
FROM Orders AS O
INNER JOIN OrderDetails AS D ON D.OrderId = O.OrderId
INNER JOIN (
SELECT OrderId, COUNT(*) AS ProductsInOrder
FROM OrderDetails
GROUP BY OrderId
) AS P ON P.OrderId = O.OrderId
GROUP BY O.CustomerId,
O.OrderId,
D.ProductId,
P.ProductsInOrder
Final code is below for anyone interested:
SELECT O.CustomerId,
O.OrderId,
Group_Concat(D.ProductID) AS ProductID,
CASE COUNT(*)
WHEN 1 THEN 5
ELSE (COUNT(*) * 2) + 3
END AS 'MinutesToDeliverAllProducts'
FROM Orders AS O
INNER JOIN OrderDetails AS D ON D.OrderId = O.OrderId
GROUP BY O.CustomerId
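If the goal is the per-row output from the question ('00:05' for a customer's first item and '00:02' for each subsequent one), a window function is one way to sketch it. This assumes a database that supports ROW_NUMBER() (for example SQL Server, or SQLite 3.25 and later); the w3schools editor may not support it. Ordering by orderid and productid is an arbitrary choice to decide which item counts as the first:

```sql
-- Sketch: number each customer's items and treat item 1 as the 5-minute delivery.
SELECT o.customerid,
       o.orderid,
       d.productid,
       CASE WHEN ROW_NUMBER() OVER (PARTITION BY o.customerid
                                    ORDER BY o.orderid, d.productid) = 1
            THEN '00:05'
            ELSE '00:02'
       END AS DeliveryTime
FROM orders o
LEFT JOIN orderdetails d ON d.orderid = o.orderid;
```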

SQL Server 2008 R2 Invalid object name error

Here's the stored procedure I'm trying to implement:
Alter PROCEDURE pro_worst_supplier_test
@datey nvarchar(4), @datem nvarchar(2)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for procedure here
select y.CompanyName from (SELECT s.CompanyName,(select min(t.b) from (select (round(sum(OD.Quantity * (1-OD.Discount) * OD.UnitPrice),0)) as b from [Order Details] group by ProductID) t) as q
FROM ((((Products p
inner join [Order Details] OD on p.ProductID=OD.ProductID)
inner join Categories c on p.CategoryID=c.CategoryID)
inner join Suppliers s on p.SupplierID=s.SupplierID)
inner join Orders o on o.OrderID=od.OrderID)
where c.CategoryName='Produce' AND SUBSTRING(CONVERT(nvarchar(22), o.OrderDate, 111),1,4)=@datey AND SUBSTRING(CONVERT(nvarchar(22),o.OrderDate, 111),6,2)=@datem
group by s.CompanyName) y
where q=(select MIN(q)from y)
END
GO
When I tried to execute it by
exec proc_worst_supplier_test '1998','04'
I got an error
Invalid object name 'y'
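A derived table alias such as y is only visible as a row source in the query that defines it; it cannot be used again as a table in a subquery's FROM clause, hence the error. A CTE can be referenced more than once, so try this: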
;WITH CTE as (
SELECT s.CompanyName,(select min(t.b) from (select (round(sum(OD.Quantity * (1-OD.Discount) * OD.UnitPrice),0)) as b from [Order Details] group by ProductID) t) as q
FROM ((((Products p
inner join [Order Details] OD on p.ProductID=OD.ProductID)
inner join Categories c on p.CategoryID=c.CategoryID)
inner join Suppliers s on p.SupplierID=s.SupplierID)
inner join Orders o on o.OrderID=od.OrderID)
where c.CategoryName='Produce' AND SUBSTRING(CONVERT(nvarchar(22), o.OrderDate, 111),1,4)=@datey AND SUBSTRING(CONVERT(nvarchar(22),o.OrderDate, 111),6,2)=@datem
group by s.CompanyName
)
select CompanyName from CTE where q=(select MIN(q) from CTE)