More efficient way to write the following SQL query

I am writing a query to return a list of articles for the news portal homepage.
The requirements are as follows.
Each category that appears on the homepage needs to display 5 articles according to the following criteria.
Each category needs one article that is the main news for the category, followed by the 4 most popular articles at the moment.
If no main article is set for the category, then display the 5 most popular instead.
I wrote a SQL function that takes a CategoryID parameter and a stored procedure that calls that function N times.
Is there a more efficient way to write this query?
Function
CREATE FUNCTION [dbo].[Fn_FetchHomepageCategory]
(
    -- Add the parameters for the function here
    @categoryId int
)
RETURNS @ArticlesToReturn TABLE
(
    Id int,
    Title nvarchar(500),
    Slug nvarchar(500),
    Summary nvarchar(1500),
    IsCategoryFirst bit,
    RootCategoryId int,
    RootCategory nvarchar(500),
    OldFacebookCommentsUrl nvarchar(500),
    Icon nvarchar(500),
    TopicName nvarchar(500),
    MainArticlePhoto nvarchar(500),
    FrontPagePhoto nvarchar(500),
    PublishDate datetime
)
AS
BEGIN
    -- select category first news if any
    INSERT INTO @ArticlesToReturn
    SELECT TOP 1
        ART.Id, ART.Title, ART.InitialTitle, ART.Summary, ART.IsCategoryFirst,
        ART.RootCategoryId, CAT.Name, ART.OldFacebookCommentsUrl, ICO.CssClass,
        ART.TopicName, ART.MainArticlePhoto, ART.FrontPagePhoto, ART.PublishDate
    FROM Articles ART WITH (NOLOCK)
    INNER JOIN ArticleViewCountSum AVS WITH (NOLOCK) ON AVS.ArticleId = ART.Id
    INNER JOIN Categories CAT WITH (NOLOCK) ON CAT.Id = ART.RootCategoryId
    LEFT JOIN ArticleIcons ICO WITH (NOLOCK) ON ICO.Id = ART.IconId
    WHERE ART.RootCategoryId = @categoryId
    AND ART.PublishDate < GETDATE()
    AND ART.Active = 1
    AND IsCategoryFirst = 1
    -- select 5 most popular by coefficient
    INSERT INTO @ArticlesToReturn
    SELECT TOP 5
        ART.Id, ART.Title, ART.InitialTitle, ART.Summary, ART.IsCategoryFirst,
        ART.RootCategoryId, CAT.Name, ART.OldFacebookCommentsUrl, ICO.CssClass,
        ART.TopicName, ART.MainArticlePhoto, ART.FrontPagePhoto, ART.PublishDate
    FROM Articles ART WITH (NOLOCK)
    INNER JOIN ArticleViewCountSum AVS WITH (NOLOCK) ON AVS.ArticleId = ART.Id
    INNER JOIN Categories CAT WITH (NOLOCK) ON CAT.Id = ART.RootCategoryId
    LEFT JOIN ArticleIcons ICO WITH (NOLOCK) ON ICO.Id = ART.IconId
    WHERE ART.RootCategoryId = @categoryId
    AND ART.PublishDate < GETDATE()
    AND ART.Active = 1
    ORDER BY ART.Coefficient DESC
    RETURN
END
Stored procedure:
CREATE PROCEDURE [dbo].[Fetch_HomePageArticles]
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    DECLARE @dateNow datetime = GETDATE();
    -- first main news
    SELECT TOP 1 * FROM Articles
    WHERE IsFirst = 1 AND PublishDate < @dateNow
    --TODO: featured
    SELECT TOP 10 * From Featured
    WHERE PublishDate < @dateNow AND Active = 1
    ORDER BY PublishDate DESC
    SELECT TOP 5 * FROM Fn_FetchHomepageCategory(3)
    SELECT TOP 5 * FROM Fn_FetchHomepageCategory(150)
    SELECT TOP 5 * FROM Fn_FetchHomepageCategory(1523)
    SELECT TOP 5 * FROM Fn_FetchHomepageCategory(1509)
    SELECT TOP 5 * FROM Fn_FetchHomepageCategory(1569)
    SELECT TOP 5 * FROM Fn_FetchHomepageCategory(1545)
    SELECT TOP 5 * FROM Fn_FetchHomepageCategory(1548)
    SELECT TOP 5 * FROM Fn_FetchHomepageCategory(67)
END
I tried modifying the function to use a single SELECT with ORDER BY IsFirstCategory DESC, but the query then ran much slower.

One potential improvement is to merge the two SELECT statements in the Fn_FetchHomepageCategory function into a single query that sorts on an artificially boosted Coefficient value:
SELECT
TOP 5 ART.Id,
ART.Title,
ART.InitialTitle,
ART.Summary,
ART.IsCategoryFirst,
ART.RootCategoryId,
CAT.Name,
ART.OldFacebookCommentsUrl,
ICO.CssClass,
ART.TopicName,
ART.MainArticlePhoto,
ART.FrontPagePhoto,
ART.PublishDate
FROM
Articles ART WITH (NOLOCK)
INNER JOIN ArticleViewCountSum AVS WITH (NOLOCK) ON AVS.ArticleId = ART.Id
INNER JOIN Categories CAT WITH (NOLOCK) ON CAT.Id = ART.RootCategoryId
LEFT JOIN ArticleIcons ICO WITH (NOLOCK) ON ICO.Id = ART.IconId
WHERE
ART.RootCategoryId = @categoryId
AND ART.PublishDate < GETDATE()
AND ART.Active = 1
ORDER BY
CASE IsCategoryFirst
WHEN 1 THEN 1000000
ELSE ART.Coefficient
END DESC
You can replace 1000000 with any number larger than every real Coefficient value. Its only purpose is to give the post with IsCategoryFirst = 1 the highest possible score.
Please note that this works as intended only if at most one post per category has IsCategoryFirst = 1.
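If the aim is also to stop calling the function once per category, one option is to rank the articles inside every category in a single pass with ROW_NUMBER(). This is only a minimal sketch: the @Articles table variable and its rows are made-up sample data, the column names are taken from the question, and the joins to Categories and ArticleIcons are left out for brevity. Sorting on IsCategoryFirst first also avoids the magic number.

```sql
-- Hypothetical sample data standing in for the real Articles table
DECLARE @Articles TABLE (
    Id int, RootCategoryId int, IsCategoryFirst bit, Coefficient float
);
INSERT INTO @Articles (Id, RootCategoryId, IsCategoryFirst, Coefficient) VALUES
    (1, 3, 0, 9.5), (2, 3, 1, 1.0), (3, 3, 0, 7.2),
    (4, 150, 0, 4.4), (5, 150, 0, 8.1);

WITH Ranked AS (
    SELECT A.Id, A.RootCategoryId, A.IsCategoryFirst, A.Coefficient,
           ROW_NUMBER() OVER (
               PARTITION BY A.RootCategoryId
               -- the category-first article sorts ahead of the rest,
               -- everything else is ordered by popularity
               ORDER BY A.IsCategoryFirst DESC, A.Coefficient DESC
           ) AS rn
    FROM @Articles A
    WHERE A.RootCategoryId IN (3, 150)   -- the homepage categories
)
SELECT Id, RootCategoryId, IsCategoryFirst, Coefficient
FROM Ranked
WHERE rn <= 5                            -- at most 5 articles per category
ORDER BY RootCategoryId, rn;
```

Whether this beats the per-category calls depends on the available indexes, so it is worth comparing the execution plans on real data.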

Related

Optional Where Clause Based On Another Table

I have a stored proc where a table of integers is passed as a parameter. I'm trying to find a reasonable way of writing "give me all the records, but if the parameter table has values in it then limit my results to those values".
Both approaches in the queries below work, but when I use either approach in my real-world proc (with a substantial number of joins and apply clauses and a ton of data) it's quite a bit slower than I would like even when the number of rows in the variable table is limited to 1 or 2 records.
Is there a better way of doing this?
-- Approach 1 - weird WHERE clause
IF OBJECT_ID('tempdb.dbo.#list') IS NOT NULL DROP TABLE #list
create table #list(Id int)
insert into #list(id) values (726), (712), (725)
declare @listCount int
select @listCount = count(*) from #list
select * from SalesLT.Product p
where 1 = 1
AND
(
    @listCount > 0 AND p.ProductID in (select Id from #list)
    OR
    @listCount = 0
)
and
-- Approach 2 - goofy looking JOIN
IF OBJECT_ID('tempdb.dbo.#list') IS NOT NULL DROP TABLE #list
create table #list(Id int)
insert into #list(id) values (726), (712), (725)
declare @listCount int
select @listCount = count(*) from #list
select * from SalesLT.Product p
inner join #list l on
    case when @listCount > 0 and l.Id = p.ProductID Then 1
    else 0
    end = 1
CASE WHEN expressions in a join's ON clause are generally avoided because they tend to make the query slower.
Use a left join instead:
select * from SalesLT.Product p
Left join #list l on l.Id = p.ProductID
Where ( (@listCount > 0 and l.id is not null)
or @listCount = 0)
Try this
IF @listCount > 0
BEGIN
    SELECT
        *
    FROM SalesLT.Product p
    ------------------------
    INNER JOIN #list l ON
    ------------------------
        l.Id = p.ProductID
    ------------------------
END
-- I assume you want to output everything if @listCount = 0
ELSE IF @listCount = 0
BEGIN
    SELECT
        *
    FROM SalesLT.Product p
END
If many of your joins depend on that filtered output, you can store it in a temp table and use that table in your real join/query.
Example:
IF @listCount > 0
BEGIN
SELECT
*
INTO #TempSalesTbl
FROM SalesLT.Product p
------------------------
INNER JOIN #list l ON
------------------------
l.Id = p.ProductID
------------------------
END
-- In your query
SELECT
*
FROM Table A
INNER JOIN #TempSalesTbl ON
...
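Another option that is sometimes suggested for this kind of "optional filter" is to keep a single query and add OPTION (RECOMPILE), so that the plan is built for the current value of the count variable instead of one generic plan covering both cases. A minimal sketch, assuming the AdventureWorksLT SalesLT.Product table from the question; the PRIMARY KEY on #list is my own assumption (unique ids):

```sql
IF OBJECT_ID('tempdb..#list') IS NOT NULL DROP TABLE #list;
CREATE TABLE #list (Id int PRIMARY KEY);   -- assumes the ids are unique
INSERT INTO #list (Id) VALUES (726), (712), (725);

DECLARE @listCount int = (SELECT COUNT(*) FROM #list);

SELECT p.*
FROM SalesLT.Product p
WHERE @listCount = 0                       -- empty list: no filtering
   OR EXISTS (SELECT 1 FROM #list l WHERE l.Id = p.ProductID)
OPTION (RECOMPILE);                        -- compile for this @listCount value
```

The trade-off is a compilation on every execution, which is usually acceptable for an expensive query but may not be for one that runs many times per second.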

SQL Server query inefficient for table with high I/O operations

I'm trying to write an SQL script that returns an item from a list if that item can be found in the list; if not, it returns the most recent item added to the list. I came up with a solution using COUNT and an IF-ELSE statement. However, my table has very frequent I/O operations and I think this solution is inefficient. Does anyone have a way to optimize this solution, or a better approach?
Here is my solution:
DECLARE @result_set INT
SET @result_set = (
    SELECT COUNT(*) FROM
    ( SELECT *
      FROM notification p
      WHERE p.code = @code
      AND p.reference = @reference
      AND p.response ='00'
    ) x
)
IF(@result_set > 0)
BEGIN
    SELECT *
    FROM notification p
    WHERE p.code = @code
    AND p.reference = @reference
    AND p.response ='00'
END
ELSE
BEGIN
    SELECT
    TOP 1 p.*
    FROM notification p (nolock)
    WHERE p.code = @code
    AND p.reference = @reference
    ORDER BY p.id DESC
END
I also think there should be a way around repeating this select statement:
SELECT *
FROM notification p
WHERE p.code = @code
AND p.reference = @reference
AND p.response ='00'
I'm just not proficient enough in SQL to figure it out.
You can do something like this:
SELECT TOP (1) n.*
FROM notification n
WHERE n.code = @code AND n.reference = @reference
ORDER BY (CASE WHEN n.response = '00' THEN 1 ELSE 2 END), n.id DESC;
This will return a row with a response of '00' first, if there is one, and otherwise any other row. I would expect another column in the ORDER BY to handle recency, but your sample code doesn't provide any clue as to what this might be.
WITH ItemIWant AS (
    SELECT *
    FROM notification p
    WHERE p.code = @code
    AND p.reference = @reference
    AND p.response ='00'
)
SELECT *
FROM ItemIWant
UNION ALL
SELECT *
FROM (
    -- TOP 1 needs its own ORDER BY; an ORDER BY after the UNION ALL
    -- would only sort the combined result
    SELECT TOP 1 *
    FROM notification p
    WHERE p.code = @code
    AND p.reference = @reference
    AND NOT EXISTS (SELECT * FROM ItemIWant)
    ORDER BY p.id DESC
) AS Fallback
This will do that with minimal passes on the table. It will only return the top row if there are no rows returned by ItemIWant. There is no conditional logic so it can be compiled and indexed effectively.
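For reference, here is a self-contained variant that keeps the behaviour of the original IF/ELSE (all '00' rows if any exist, otherwise only the newest row) in a single statement by using TOP (1) WITH TIES. It is a sketch rather than a drop-in replacement: the @notification table variable, its column types and its rows are made-up sample data.

```sql
DECLARE @code varchar(10) = 'A1', @reference varchar(10) = 'R1';

-- Hypothetical stand-in for the notification table
DECLARE @notification TABLE (
    id int, code varchar(10), reference varchar(10), response varchar(2)
);
INSERT INTO @notification (id, code, reference, response) VALUES
    (1, 'A1', 'R1', '05'), (2, 'A1', 'R1', '00'), (3, 'A1', 'R1', '91');

SELECT TOP (1) WITH TIES n.*
FROM @notification n
WHERE n.code = @code AND n.reference = @reference
ORDER BY
    -- '00' rows sort first
    CASE WHEN n.response = '00' THEN 0 ELSE 1 END,
    -- all '00' rows share the same key, so they tie and are all returned;
    -- other rows are ordered by id, so only the newest one can win
    CASE WHEN n.response = '00' THEN 0 ELSE n.id END DESC;
-- With the sample rows above this returns only the row with id = 2.
```

If the '00' row for id 2 is removed from the sample data, the same query falls back to the row with the highest id.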

Counting records in a SQL subquery

I'm having difficulty with a subquery. In plain English, I'm trying to pick a random UserID from the QCUsers table that has fewer than 20 records in the QCTier1_Assignments table. The problem is that my query below only picks users who meet the criteria of the inner query, when I need it to pick any user from the QCUsers table, even one who has no records at all in the QCTier1_Assignments table. I need something like this:
AND (Sub.QCCount < 20 OR Sub.QCCount = 0 )
DECLARE @ReviewPeriodMonth varchar(10) = '10'
DECLARE @ReviewPeriodYear varchar(10) = '2015'
SELECT TOP 1
    E1.UserID
    ,Sub.QCCount --Drawn from the subquery
FROM QCUsers E1
JOIN (SELECT
        QCA.UserID,
        COUNT(*) AS QCCount
      FROM QCTier1_Assignments QCA
      WHERE QCA.ReviewPeriodMonth = @ReviewPeriodMonth
      AND QCA.ReviewPeriodYear = @ReviewPeriodYear
      GROUP BY QCA.UserID
     ) Sub
ON E1.UserID = Sub.UserID
WHERE Active = 1
AND Grade = 12
AND Sub.QCCount < 20
ORDER BY NEWID()
I also tried it this way with no luck
DECLARE @ReviewPeriodMonth varchar(10) = '10'
DECLARE @ReviewPeriodYear varchar(10) = '2015'
SELECT TOP 1
    E1.UserID
    ,Sub.QCCount --Drawn from the subquery
FROM QCUsers E1
RIGHT JOIN (SELECT
        QCA.UserID,
        ReviewPeriodMonth,
        ReviewPeriodYear,
        COUNT(*) AS QCCount
      FROM QCTier1_Assignments QCA
      GROUP BY
        QCA.UserID,
        ReviewPeriodMonth,
        ReviewPeriodYear
     ) Sub
ON E1.UserID = Sub.UserID
WHERE Active = 1
AND Grade = 12
AND Sub.QCCount < 20
AND Sub.ReviewPeriodMonth = @ReviewPeriodMonth
AND Sub.ReviewPeriodYear = @ReviewPeriodYear
ORDER BY NEWID()
Try using your second query, but change the WHERE clause to use COALESCE(Sub.QCCount, 0) instead of just Sub.QCCount.
If the subquery returns no rows then with your RIGHT JOIN you'll at least still get the row, but the QCCount will be NULL which when compared to anything will result in a "false" effectively.
Also, you should look into the HAVING clause. It might allow you to do this without a subquery at all.
Here's an example with the HAVING clause. If it doesn't give the correct results please let me know as I'm not able to test this.
DECLARE
    @ReviewPeriodMonth VARCHAR(10) = '10',
    @ReviewPeriodYear VARCHAR(10) = '2015'
SELECT TOP 1
    E1.UserID,
    COUNT(QCA.UserID) AS QCCount
FROM
    QCUsers E1
LEFT OUTER JOIN QCTier1_Assignments QCA ON
    QCA.UserID = E1.UserID AND
    QCA.ReviewPeriodMonth = @ReviewPeriodMonth AND
    QCA.ReviewPeriodYear = @ReviewPeriodYear
WHERE
    E1.Active = 1 AND
    E1.Grade = 12
GROUP BY
    E1.UserID
HAVING
    COUNT(QCA.UserID) < 20
ORDER BY
    NEWID()
You should use LEFT JOIN instead of JOIN (INNER JOIN). You could keep the predicate in the outer query as you do now, but I recommend the following:
SELECT TOP 1 ABC.UserID, ABC.QCCount
FROM
(
    SELECT E1.UserID, COUNT(QCA.UserID) as QCCount
    FROM QCUsers as E1
    LEFT JOIN QCTier1_Assignments as QCA
    ON QCA.UserID = E1.UserID
    -- period filters belong in ON; in WHERE they would discard
    -- users without assignments and undo the LEFT JOIN
    AND QCA.ReviewPeriodMonth = @ReviewPeriodMonth
    AND QCA.ReviewPeriodYear = @ReviewPeriodYear
    WHERE E1.Active = 1
    AND E1.Grade = 12
    GROUP BY E1.UserID
) as ABC
WHERE ABC.QCCount < 20
ORDER BY NEWID()
I was able to work it out through a combination of the responses here:
DECLARE @ReviewPeriodMonth varchar(10) = '10'
DECLARE @ReviewPeriodYear varchar(10) = '2015'
SELECT TOP 1
    QCUsers.UserID,
    COUNT(QCTier1_Assignments.ReviewID) AS ReviewCount
FROM
    QCTier1_Assignments RIGHT OUTER JOIN
    QCUsers ON QCTier1_Assignments.UserID = QCUsers.UserID
WHERE
    QCUsers.Active = 1
    AND QCUsers.Grade = '12'
    AND (ReviewPeriodMonth = @ReviewPeriodMonth OR ReviewPeriodMonth IS NULL)
    AND (ReviewPeriodYear = @ReviewPeriodYear OR ReviewPeriodYear IS NULL)
GROUP BY
    QCUsers.UserID
HAVING
    (COALESCE(COUNT(QCTier1_Assignments.ReviewID),0) < 4)
ORDER BY NEWID()
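A detail that is easy to miss in the outer-join versions is the difference between COUNT(*) and COUNT(column): a user without assignments still produces one NULL-extended row, which COUNT(*) counts and COUNT(column) does not. The sketch below illustrates this with made-up table variables (@QCUsers and @Assignments are hypothetical stand-ins for the real tables):

```sql
-- Hypothetical sample data: user 1 has two assignments, user 2 has none
DECLARE @QCUsers TABLE (UserID int);
DECLARE @Assignments TABLE (ReviewID int, UserID int);
INSERT INTO @QCUsers (UserID) VALUES (1), (2);
INSERT INTO @Assignments (ReviewID, UserID) VALUES (100, 1), (101, 1);

SELECT U.UserID,
       COUNT(*)          AS CountStar,    -- also counts the NULL-extended row
       COUNT(A.ReviewID) AS CountReviews  -- skips rows where ReviewID is NULL
FROM @QCUsers U
LEFT JOIN @Assignments A ON A.UserID = U.UserID
GROUP BY U.UserID
ORDER BY U.UserID;
-- UserID 1: CountStar = 2, CountReviews = 2
-- UserID 2: CountStar = 1, CountReviews = 0
```

For a "fewer than N assignments" check, counting a column from the outer-joined table is therefore the safer choice.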

Query, subquery and using as variables from subquery

Is it not possible to define a column with "as [item]" and then use that item alias elsewhere in the same query?
For example:
select c.category as [category],c.orderby as [CatOrder], m.masterno, m.master
,-- select OUT (select count(*) from rentalitem ri with (nolock),
rentalitemstatus ris with (nolock),
rentalstatus rs with (nolock)
where ri.rentalitemid = ris.rentalitemid
and ris.rentalstatusid = rs.rentalstatusid
and ri.masterid = m.masterid
and rs.statustype in ('OUT', 'INTRANSIT', 'ONTRUCK')) as [qtyout]
,-- select OWNED owned=
(select top 1 mwq.qty
from masterwhqty mwq
where mwq.masterid = m.masterid)
, -([owned]-[qtyout]) as [Variance]
from master m
inner join category c on c.categoryid=m.categoryid and c.categoryid=@category
inner join inventorydepartment d on c.inventorydepartment=@department
I cannot seem to use qtyout or owned when calculating variance. How can I do that?
You can also use a table variable and then reference that table variable, like you are trying to do above. Here's an example from MSDN:
USE AdventureWorks2012;
GO
DECLARE @MyTableVar table(
    EmpID int NOT NULL,
    OldVacationHours int,
    NewVacationHours int,
    ModifiedDate datetime);
UPDATE TOP (10) HumanResources.Employee
SET VacationHours = VacationHours * 1.25,
    ModifiedDate = GETDATE()
OUTPUT inserted.BusinessEntityID,
       deleted.VacationHours,
       inserted.VacationHours,
       inserted.ModifiedDate
INTO @MyTableVar;
--Display the result set of the table variable.
SELECT EmpID, OldVacationHours, NewVacationHours, ModifiedDate
FROM @MyTableVar;
GO
--Display the result set of the table.
SELECT TOP (10) BusinessEntityID, VacationHours, ModifiedDate
FROM HumanResources.Employee;
GO
You need to move your calculated fields into a subquery, and then use them by their alias in the outer query.
select subquery.*, -([owned]-[qtyout]) as [Variance]
from
(
select c.category as [category],c.orderby as [CatOrder], m.masterno, m.master
,-- select OUT (select count(*) from rentalitem ri with (nolock),
rentalitemstatus ris with (nolock),
rentalstatus rs with (nolock)
where ri.rentalitemid = ris.rentalitemid
and ris.rentalstatusid = rs.rentalstatusid
and ri.masterid = m.masterid
and rs.statustype in ('OUT', 'INTRANSIT', 'ONTRUCK')) as [qtyout]
,-- select OWNED owned=
(select top 1 mwq.qty
from masterwhqty mwq
where mwq.masterid = m.masterid) as [owned]
from master m
inner join category c on c.categoryid=m.categoryid and c.categoryid=@category
inner join inventorydepartment d on c.inventorydepartment=@department
) as subquery
You need to use a subquery:
select t.*,
([owned]-[qtyout]) as [Variance]
from (<something like your query here>
) t
Your query, even without the comments, doesn't quite make sense (select OUT (select . . . for instance). But the answer to your question is to define the base variables in a subquery or CTE and then use them afterwards.
And, you are calling the difference "variance". Just so you know, you are redefining the statistical meaning of the term (http://en.wikipedia.org/wiki/Variance), which is based on the squares of the differences.
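Besides a subquery or CTE, T-SQL's APPLY operator is another way to give a computed value a name that the SELECT list can reuse. The following is only a sketch: the three table variables and their rows are made-up, simplified stand-ins for master, masterwhqty and the rental status tables in the question.

```sql
-- Hypothetical, simplified sample data
DECLARE @master TABLE (masterid int, masterno varchar(20));
DECLARE @masterwhqty TABLE (masterid int, qty int);
DECLARE @rentalstatus TABLE (masterid int, statustype varchar(20));

INSERT INTO @master (masterid, masterno) VALUES (1, 'M-001'), (2, 'M-002');
INSERT INTO @masterwhqty (masterid, qty) VALUES (1, 10), (2, 4);
INSERT INTO @rentalstatus (masterid, statustype) VALUES
    (1, 'OUT'), (1, 'ONTRUCK'), (1, 'IN'), (2, 'INTRANSIT');

SELECT m.masterno,
       o.owned,
       q.qtyout,
       -(o.owned - q.qtyout) AS [Variance]   -- aliases are usable here
FROM @master m
-- an aggregate without GROUP BY always returns exactly one row
CROSS APPLY (
    SELECT COUNT(*) AS qtyout
    FROM @rentalstatus rs
    WHERE rs.masterid = m.masterid
      AND rs.statustype IN ('OUT', 'INTRANSIT', 'ONTRUCK')
) AS q
-- OUTER APPLY keeps masters that have no quantity row (owned is NULL then)
OUTER APPLY (
    SELECT TOP (1) w.qty AS owned
    FROM @masterwhqty w
    WHERE w.masterid = m.masterid
) AS o;
```

As in the original query, TOP (1) without an ORDER BY picks an arbitrary row when a master has several quantity rows.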

SQL get single value inside existing query?

I have a query that returns a bunch of rows.
But using the same query I would like to:
1. get the total row count in the table
2. get the row number where a certain username is located
Right now I'm doing it like so:
BEGIN
DECLARE @startRowIndex INT;
DECLARE @PageIndex INT;
DECLARE @RowsPerPage INT;
SET @PageIndex = 0;
SET @RowsPerPage = 15;
SET @startRowIndex = (@PageIndex * @RowsPerPage) + 1;
WITH messageentries
AS (SELECT Row_number()
           OVER(ORDER BY score DESC) AS row,
           Count(DISTINCT town.townid) AS towns,
           user_details.username,
           user_score.score,
           allience.alliencename,
           allience.allienceid,
           allience.alliencetagname,
           (SELECT Count(* ) FROM user_details) AS numberofrows
    FROM user_details
    INNER JOIN user_score
        ON user_details.username = user_score.username
    INNER JOIN town
        ON user_details.username = town.townownername
    LEFT OUTER JOIN allience_roles
        ON user_details.useralliencerole = allience_roles.roleid
    LEFT OUTER JOIN allience
        ON allience_roles.allienceid = allience.allienceid
    GROUP BY user_details.username,
             user_score.score,
             allience.alliencename,
             allience.allienceid,
             allience.alliencetagname)
SELECT *, (SELECT row FROM messageentries WHERE username = 'myUsername') AS myself
FROM messageentries
WHERE row BETWEEN @startRowIndex AND @startRowIndex + @RowsPerPage - 1
END
That works, but aren't the two nested selects going to run once for every row in the table? :/
...
(SELECT Count(* ) FROM user_details) AS numberofrows
...
(SELECT row FROM messageentries WHERE username = 'myUsername') AS myself
So my question is: how can I get the values I want as cheaply as possible, and preferably in the same query?
Thanks in advance :)
Try this...
DECLARE @NumberOfRows INT
SELECT @NumberOfRows = Count(* ) FROM user_details;
-- the statement before WITH must end with a semicolon
WITH messageentries
AS (SELECT Row_number()
           OVER(ORDER BY score DESC) AS row,
           Count(DISTINCT town.townid) AS towns,
           user_details.username,
           user_score.score,
           allience.alliencename,
           allience.allienceid,
           allience.alliencetagname,
           @NumberOfRows AS numberofrows
    FROM user_details
    INNER JOIN user_score
        ON user_details.username = user_score.username
    INNER JOIN town
        ON user_details.username = town.townownername
    LEFT OUTER JOIN allience_roles
        ON user_details.useralliencerole = allience_roles.roleid
    LEFT OUTER JOIN allience
        ON allience_roles.allienceid = allience.allienceid
    GROUP BY user_details.username,
             user_score.score,
             allience.alliencename,
             allience.allienceid,
             allience.alliencetagname)
SELECT messageentries.*, MyRowNumber.row AS myself
FROM messageentries,
(SELECT row FROM messageentries WHERE username = 'myUsername') MyRowNumber
-- row is qualified because both sources expose a column named row
WHERE messageentries.row BETWEEN @startRowIndex AND @startRowIndex + @RowsPerPage - 1
(SELECT Count(* ) FROM user_details)
This one will be cached (most probably materialized in a Worktable).
(SELECT row FROM messageentries WHERE username = 'myUsername')
For this one, most probably a Lazy Spool (or Eager Spool) will be built, which will be used to pull this value.
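Both values can also be expressed as window aggregates, which avoids the scalar subqueries altogether. A minimal sketch under simplifying assumptions: @scores is a made-up table variable replacing the joined tables, the row number column is called row_num here, and numberofrows counts the rows of the ranked set rather than the rows of user_details.

```sql
-- Hypothetical sample data instead of user_details/user_score
DECLARE @scores TABLE (username varchar(50), score int);
INSERT INTO @scores (username, score) VALUES
    ('alice', 90), ('myUsername', 70), ('bob', 80);

WITH ranked AS (
    SELECT username, score,
           ROW_NUMBER() OVER (ORDER BY score DESC) AS row_num,
           COUNT(*) OVER ()                        AS numberofrows
    FROM @scores
), withself AS (
    SELECT ranked.*,
           -- only the matching user's row contributes a non-NULL value
           MAX(CASE WHEN username = 'myUsername' THEN row_num END)
               OVER () AS myself
    FROM ranked
)
SELECT username, score, row_num, numberofrows, myself
FROM withself
WHERE row_num BETWEEN 1 AND 15   -- paging filter applied after ranking
ORDER BY row_num;
-- With the sample rows, every returned row has numberofrows = 3 and myself = 3.
```

The myself column is computed in its own CTE before the paging filter, so it stays correct even when the user's own row is on a different page.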