How to Speed up SQL query for Date - sql

I have a query, I'm using an inner join from 2 tables that have about a million rows. I'm trying to run the query so it only gets data from last month. However, it takes a really long time when using the getDate() function. But when I enter the date in this format '2016-12-01' and '2017-01-01' - it's really quick. How can I modify the query so it runs faster? I read that I might have to create a non-clustered index but I'm not really good with those yet.
select
custKey,
sum(salesAmt) as Sales,
sum(returnAmt) as Credit,
(sum(salesAmt) - sum(returnAmt)) as CONNET
from
[SpotFireStaging].[dbo].[tsoSalesAnalysis]
inner join
[SpotFireStaging].[dbo].OOGPLensDesc as o on tsoSalesAnalysis.ItemKey = O.ItemKey
where
PostDate between --DATEADD(MONTH, DATEDIFF(MONTH,0, GETDATE())-1,0 )
--AND DATEADD(MS, -3,DATEADD(MM, DATEDIFF(M,-1, GETDATE()) -1, 0))
'2016-12-01' and '2017-01-01'
group by
custkey

declare @startDate DateTime = DATEADD(MONTH, DATEDIFF(MONTH,0, GETDATE())-1,0 )
declare @endDate DateTime = DATEADD(MS, -3,DATEADD(MM, DATEDIFF(M,-1, GETDATE()) -1, 0))
select
custKey,
sum(salesAmt) as Sales,
sum(returnAmt) as Credit,
(sum(salesAmt) - sum(returnAmt)) as CONNET
from
[SpotFireStaging].[dbo].[tsoSalesAnalysis]
inner join
[SpotFireStaging].[dbo].OOGPLensDesc as o on tsoSalesAnalysis.ItemKey = O.ItemKey
where
PostDate between @startDate AND @endDate
group by
custkey
As another alternative, check out the selected answer here:
When using GETDATE() in many places, is it better to use a variable?
If GETDATE() is evaluated separately for each row, then presumably so are DATEDIFF() and DATEADD(), so it is safer to compute the bounds once and store them in local variables.
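The same bounds can also be computed once on the client and passed in as parameters. The following is a minimal sketch in Python, not part of the original answer; the function name `last_month_bounds` is hypothetical, and it returns a half-open range (start inclusive, end exclusive) instead of the "minus 3 milliseconds" end value used above.

```python
from datetime import date


def last_month_bounds(today):
    """Return (first day of previous month, first day of current month)."""
    first_of_this_month = today.replace(day=1)
    if first_of_this_month.month == 1:
        # January: the previous month is December of the previous year.
        first_of_last_month = first_of_this_month.replace(
            year=first_of_this_month.year - 1, month=12
        )
    else:
        first_of_last_month = first_of_this_month.replace(
            month=first_of_this_month.month - 1
        )
    return first_of_last_month, first_of_this_month


# Example: any day in January 2017 yields December 2016 as "last month".
start, end_exclusive = last_month_bounds(date(2017, 1, 15))
print(start, end_exclusive)
```

With a range like this, the filter would be written as `PostDate >= start AND PostDate < end_exclusive`, which does not depend on the precision of the column's data type.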

Related

SQL Server : GETDATE not returning today's date

Below is my query. Everything is working except for the GETDATE function in the WHERE clause. It won't return today's date if I put the date in there like this: 7/12/22. It is a DATETIME column in the backend. Thanks in advance.
SELECT
acsMFG.dbo.production_posting_trans.item_no,
SUM(acsMFG.dbo.production_posting_trans.good_quantity) AS [Good Qty],
SUM(acsMFG.dbo.production_posting_trans.scrap_quantity) AS [Scrap Qty],
acsAUTOSYS.dbo.inventory_master.selling_price
FROM
acsAUTOSYS.dbo.inventory_master
FULL OUTER JOIN
acsMFG.dbo.production_posting_trans ON acsMFG.dbo.production_posting_trans.item_no = acsAUTOSYS.dbo.inventory_master.item_no
AND acsAUTOSYS.dbo.inventory_master.company_code = acsMFG.dbo.production_posting_trans.company_code
WHERE
acsMFG.dbo.production_posting_trans.company_code = '10'
AND acsMFG.dbo.production_posting_trans.production_date = GETDATE()
AND acsMFG.dbo.production_posting_trans.posting_type = 'MMQ'
OR acsMFG.dbo.production_posting_trans.posting_type = 'IRS'
OR acsMFG.dbo.production_posting_trans.posting_type = 'PME'
GROUP BY
acsMFG.dbo.production_posting_trans.item_no,
acsAUTOSYS.dbo.inventory_master.selling_price
Well, when you say SELECT GETDATE(); what do you see? There is a time component there too, so if the data in the table is 2022-07-12 15:12 and you run the query at 2022-07-12 15:13, that's not a match.
If you want data from today, you need a range query:
WHERE col >= CONVERT(date, GETDATE())
AND col < DATEADD(DAY, 1, CONVERT(date, GETDATE()));
It is cleaner to use variables, e.g.
DECLARE @today date = GETDATE();
DECLARE @tomorrow date = DATEADD(DAY, 1, @today);
...
WHERE col >= @today
AND col < @tomorrow;
Don't get tempted into doing this:
WHERE CONVERT(date, col) = CONVERT(date, GETDATE());
It will work, but it's not fantastic.
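To see the difference between the equality test and the range test, here is a small runnable sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, so the table name `posting`, the sample rows, and the text-based timestamps are all assumptions made for illustration only.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posting (item_no TEXT, production_date TEXT)")
conn.executemany(
    "INSERT INTO posting VALUES (?, ?)",
    [
        ("A", "2022-07-12 15:12:00"),
        ("B", "2022-07-12 23:59:59"),
        ("C", "2022-07-13 00:00:00"),
    ],
)

# Pretend "now" is one minute after row A was written.
now = datetime(2022, 7, 12, 15, 13, 0)


def rows_equal_to_now(conn, now):
    """Equality against the current timestamp: the time part must match too."""
    return conn.execute(
        "SELECT item_no FROM posting WHERE production_date = ?",
        (now.isoformat(sep=" "),),
    ).fetchall()


def rows_for_day(conn, now):
    """Half-open range: midnight today (inclusive) to midnight tomorrow (exclusive)."""
    today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    tomorrow = today + timedelta(days=1)
    return conn.execute(
        "SELECT item_no FROM posting "
        "WHERE production_date >= ? AND production_date < ? "
        "ORDER BY item_no",
        (today.isoformat(sep=" "), tomorrow.isoformat(sep=" ")),
    ).fetchall()


print(rows_equal_to_now(conn, now))  # no row has exactly this timestamp
print(rows_for_day(conn, now))       # rows A and B fall on 2022-07-12
```

Row C sits exactly at midnight of the next day and is excluded, which is why the upper bound uses `<` rather than `<=`.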
For the actual problem with OR logic, you have:
... date clauses with AND ...
AND acsMFG.dbo.production_posting_trans.posting_type='MMQ'
Or acsMFG.dbo.production_posting_trans.posting_type ='IRS'
Or acsMFG.dbo.production_posting_trans.posting_type ='PME'
I think you want:
AND
(
acsMFG.dbo.production_posting_trans.posting_type='MMQ'
Or acsMFG.dbo.production_posting_trans.posting_type ='IRS'
Or acsMFG.dbo.production_posting_trans.posting_type ='PME'
)
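The effect of the missing parentheses can be checked on a few sample rows. This is a sketch using Python's sqlite3 module as a stand-in for SQL Server (AND binds more tightly than OR in both); the table `ppt` and its rows are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ppt (item_no TEXT, company_code TEXT, posting_type TEXT)"
)
conn.executemany(
    "INSERT INTO ppt VALUES (?, ?, ?)",
    [
        ("A", "10", "MMQ"),
        ("B", "10", "IRS"),
        ("C", "99", "IRS"),  # wrong company, should be filtered out
        ("D", "99", "PME"),  # wrong company, should be filtered out
        ("E", "10", "XYZ"),  # wrong posting type
    ],
)


def items(where_clause):
    """Run the filter and return the matching item numbers in order."""
    sql = f"SELECT item_no FROM ppt WHERE {where_clause} ORDER BY item_no"
    return [row[0] for row in conn.execute(sql)]


# Parsed as (company AND MMQ) OR IRS OR PME: the company filter only guards MMQ.
unparenthesized = items(
    "company_code = '10' AND posting_type = 'MMQ' "
    "OR posting_type = 'IRS' OR posting_type = 'PME'"
)

# Parentheses make the company filter apply to all three posting types.
parenthesized = items(
    "company_code = '10' AND "
    "(posting_type = 'MMQ' OR posting_type = 'IRS' OR posting_type = 'PME')"
)

print(unparenthesized)  # rows from company 99 leak through
print(parenthesized)
```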
As for aliases:
FROM
acsAUTOSYS.dbo.inventory_master AS im
FULL OUTER JOIN
acsMFG.dbo.production_posting_trans AS ppt
Now all your references can be:
AND
(
ppt.posting_type='MMQ'
Or ppt.posting_type ='IRS'
Or ppt.posting_type ='PME'
)
GROUP BY
ppt.item_no, im.selling_price;
Or better:
AND
(
ppt.posting_type IN ('MMQ', 'IRS', 'PME')
)
GROUP BY
ppt.item_no, im.selling_price;
...so much more readable.
Since GETDATE() returns the time too, you will never match. You need something like:
CAST(acsMFG.dbo.production_posting_trans.production_date AS Date)
= CAST(GETDATE() AS Date)

SQL Count to include zero values

I have created the following stored procedure that is used to count the number of records per day between a specific range for a selected location:
[dbo].[getRecordsCount]
@LOCATION as INT,
@BEGIN as datetime,
@END as datetime
SELECT
ISNULL(COUNT(*), 0) AS counted_leads,
CONVERT(VARCHAR, DATEADD(dd, 0, DATEDIFF(dd, 0, Time_Stamp)), 3) as TIME_STAMP
FROM HL_Logs
WHERE Time_Stamp between @BEGIN and @END and ID_Location = @LOCATION
GROUP BY DATEADD(dd, 0, DATEDIFF(dd, 0, Time_Stamp))
but the problem is that the result does not show the days where there are zero records. I'm pretty sure that it has something to do with my WHERE clause not allowing the zero values to be shown, but I do not know how to overcome this issue.
Thanks in advance
Neil
Not so much the WHERE clause, but the GROUP BY. The query will only return data for rows that exist. That means when you're grouping by the date of the timestamp, only days for which there are rows will be returned. SQL Server can't know from context that you want to "fill in the blanks", and it wouldn't know what with.
The normal answer is a CTE that produces all the days you want to see, thus filling in the blanks. This one's a little tricky because it requires a recursive SQL statement, but it's a well-known trick:
WITH CTE_Dates AS
(
SELECT @BEGIN AS cte_date
UNION ALL
SELECT DATEADD(DAY, 1, cte_date)
FROM CTE_Dates
WHERE DATEADD(DAY, 1, cte_date) <= @END
)
SELECT
cte_date as TIME_STAMP,
ISNULL(COUNT(HL_Logs.Time_Stamp), 0) AS counted_leads
FROM CTE_Dates
LEFT JOIN HL_Logs ON DATEADD(dd, 0, DATEDIFF(dd, 0, Time_Stamp)) = cte_date
WHERE Time_Stamp between @BEGIN and @END and ID_Location = @LOCATION
GROUP BY cte_date
Breaking it down, the CTE uses a union that references itself to recursively add one day at a time to the previous date and remember that date as part of the table. If you ran a simple statement that used the CTE and just selected * from it, you'd see a list of dates between start and end. Then, the statement joins this list of dates to the log table based on the log timestamp date, while preserving dates that have no log entries using the left join (takes all rows from the "left" side whether they have matching rows on the "right" side or not). Finally, we group by date and count instead and we should get the answer you want.
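The calendar-plus-left-join idea can be tried end to end with the sketch below. It runs on Python's sqlite3 module rather than SQL Server, so the date functions differ (`date(x, '+1 day')` instead of DATEADD), and the table contents are invented. One detail worth watching in any dialect: a condition on the log table placed in WHERE discards the NULL-extended rows that the left join produced, so this sketch compares a filter in the ON clause with the same filter in WHERE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hl_logs (id_location INTEGER, time_stamp TEXT)")
conn.executemany(
    "INSERT INTO hl_logs VALUES (?, ?)",
    [
        (1, "2024-01-01 09:00:00"),
        (1, "2024-01-01 17:30:00"),
        (1, "2024-01-03 08:15:00"),
        (2, "2024-01-02 10:00:00"),  # other location
    ],
)

# Recursive CTE that yields one row per day from :start_day to :end_day.
CALENDAR = """
WITH RECURSIVE cte_dates(cte_date) AS (
    SELECT :start_day
    UNION ALL
    SELECT date(cte_date, '+1 day') FROM cte_dates
    WHERE date(cte_date, '+1 day') <= :end_day
)
"""

# Location filter inside ON: days without matching logs keep a NULL-extended row.
FILTER_IN_ON = CALENDAR + """
SELECT cte_date, COUNT(l.time_stamp)
FROM cte_dates
LEFT JOIN hl_logs AS l
  ON l.time_stamp >= cte_date
 AND l.time_stamp < date(cte_date, '+1 day')
 AND l.id_location = :location
GROUP BY cte_date
ORDER BY cte_date
"""

# Location filter inside WHERE: the NULL-extended rows are filtered away again.
FILTER_IN_WHERE = CALENDAR + """
SELECT cte_date, COUNT(l.time_stamp)
FROM cte_dates
LEFT JOIN hl_logs AS l
  ON l.time_stamp >= cte_date
 AND l.time_stamp < date(cte_date, '+1 day')
WHERE l.id_location = :location
GROUP BY cte_date
ORDER BY cte_date
"""

params = {"start_day": "2024-01-01", "end_day": "2024-01-04", "location": 1}
with_zero_days = conn.execute(FILTER_IN_ON, params).fetchall()
without_zero_days = conn.execute(FILTER_IN_WHERE, params).fetchall()

print(with_zero_days)
print(without_zero_days)
```

COUNT is applied to a column from the log table, not `*`, so that a day with no logs counts as 0 rather than 1.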
When there is no data to count, there is no row to return.
If you want to include empty days as a 0, you need to create a table (or temporary table, or subquery) to store the days, and left join to your query from that.
eg: something like
SELECT
COUNT(HL_Logs.Time_Stamp) AS counted_leads,
CONVERT(VARCHAR, TableOfDays.Date, 3) as TIME_STAMP
FROM
TableOfDays
left join
HL_Logs
on TableOfDays.Date = convert(date,HL_Logs.Time_Stamp)
and ID_Location = @LOCATION
WHERE TableOfDays.Date between @BEGIN and @END
GROUP BY TableOfDays.Date
Use a left outer join. Such as
select count(stuff_ID), extra_NAME
from dbo.EXTRAS
left outer join dbo.STUFF on stuff_EXTRA = extra_STUFF
group by extra_NAME
I recently had a similar task and used this as a starting point for my work. However, like robwilliams, I couldn't get KeithS's solution to work either. My task was slightly different (I was grouping by hours rather than days), but I think the solution to neilrudds's question would be:
DECLARE @Start as DATETIME
,@End as DATETIME
,@LOCATION AS INT;
WITH CTE_Dates AS
(
SELECT @Start AS cte_date, 0 as 'counted_leads'
UNION ALL
SELECT DATEADD(DAY, 1, cte_date) as cte_date, 0 AS 'counted_leads'
FROM CTE_Dates
WHERE DATEADD(DAY, 1, cte_date) <= @End
)
SELECT cte_date AS 'TIME_STAMP'
,COUNT(HL.ID_Location) AS 'counted_leads'
FROM CTE_Dates
LEFT JOIN HL_Logs AS HL ON CAST(HL.Time_Stamp as date) = CAST(cte_date as date)
AND DATEPART(day, HL.Time_Stamp) = DATEPART(day,cte_date)
AND HL.ID_Location = @LOCATION
group by cte_date
OPTION (MAXRECURSION 0)

How to count databases elements in a range of date?

In a SQL Server procedure, I need to get all rows matching some constraints (simple WHERE conditions), and then group them by month.
The goal is to create a graph (in SQL Server Reporting Services) that displays all the data.
I already have something like this:
Select Count(*) AS Count, Month(a.issueDate) AS Month, Year(a.issueDate) AS Year
FROM MyTable a
WHERE
....
GROUP BY YEAR(a.issueDate), MONTH(a.issueDate)
I get my data and my graph, but the problem is that if no rows in "MyTable" match my WHERE conditions for a given month, I get no row for that month.
The result is a graph that starts with January, skips February, and then displays March.
I cannot post-process the data since the query is directly connected to the SQL Server Reporting Services report.
Since I have this problem in ~20 stored procedures, I would appreciate the simplest way of doing it.
Thank you very much for your advice.
Let's say you want a specific year:
DECLARE @year INT;
SET @year = 2012;
DECLARE @start SMALLDATETIME;
SET @start = DATEADD(YEAR, @year-1900, 0);
;WITH y AS (SELECT TOP (12) rn = ROW_NUMBER() OVER (ORDER BY [object_id])-1
FROM sys.all_objects ORDER BY [object_id])
SELECT DATEADD(MONTH, y.rn, @start), COUNT(t.issueDate)
FROM y
LEFT OUTER JOIN dbo.MyTable AS t
ON t.issueDate >= DATEADD(MONTH, y.rn, @start)
AND t.issueDate < DATEADD(MONTH, y.rn + 1, @start)
GROUP BY DATEADD(MONTH, y.rn, @start);
If it's not a specific year, then you can do it slightly differently to cover any date range, as long as you provide the 1st day of the 1st month and the 1st day of the last month (or pass 4 integers and construct the dates manually):
DECLARE @startdate SMALLDATETIME, @enddate SMALLDATETIME;
SELECT @startdate = '20111201', @enddate = '20120201';
;WITH y AS (SELECT TOP (DATEDIFF(MONTH, @startdate, @enddate)+1)
rn = ROW_NUMBER() OVER (ORDER BY [object_id])-1
FROM sys.all_objects ORDER BY [object_id]
)
SELECT DATEADD(MONTH, y.rn, @startdate), COUNT(t.issueDate)
FROM y
LEFT OUTER JOIN dbo.MyTable AS t
ON t.issueDate >= DATEADD(MONTH, y.rn, @startdate)
AND t.issueDate < DATEADD(MONTH, y.rn + 1, @startdate)
GROUP BY DATEADD(MONTH, y.rn, @startdate);
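The month-bucket pattern (a small sequence of month offsets, left joined to the data on a half-open range per month) can be tried with the sketch below. It uses Python's sqlite3 module as a stand-in: the sequence comes from a recursive CTE instead of `sys.all_objects`, `date(x, '+N months')` replaces DATEADD, and the table `my_table` with its rows is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (issue_date TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?)",
    [("2012-01-15",), ("2012-01-20",), ("2012-03-02",)],  # nothing in February
)

MONTHLY_COUNTS = """
WITH RECURSIVE y(rn) AS (
    -- rn = 0, 1, ..., :months - 1 (one row per month bucket)
    SELECT 0
    UNION ALL
    SELECT rn + 1 FROM y WHERE rn + 1 < :months
)
SELECT date(:start, '+' || rn || ' months') AS month_start,
       COUNT(t.issue_date) AS issue_count
FROM y
LEFT JOIN my_table AS t
  ON t.issue_date >= date(:start, '+' || rn || ' months')
 AND t.issue_date <  date(:start, '+' || (rn + 1) || ' months')
GROUP BY rn
ORDER BY rn
"""

monthly = conn.execute(
    MONTHLY_COUNTS, {"start": "2012-01-01", "months": 3}
).fetchall()
print(monthly)  # February appears with a count of 0
```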
In Report Builder, right-click the date axis, select Properties, and set the axis up as a date range. It will add the empty columns for you, and you won't have to change your SQL.
You need to build a table (a Table variable would work best here) that contains all year/month combinations from your minimum to maximum.
You then need to left join from this table to your main query to get results for all year/months ready for the graph.

SQL Query With Nested Sum

I am having to write quite a complicated query at the moment but I am getting stuck. The table structure is as follows
Inquiry is linked to Timelog by a field called Inquiry_ID. My current code brings back total minutes, but for the entire table rather than per company. What I am basically after:
Two Columns one for company name (dbo.inquiry.concom) and another for total minutes. The table INQUIRY holds say 100 entries for the same company, I want a row to return the company name once and the total amount of minutes counted for that company name from TIMELOG.LOGMINS
So for example there are 50 entries in dbo.inquiry that have the same company name, I want it to display a distinct company but I need it to total the amount of minutes that is in another table. I am completely lost!
DECLARE @StartDate DATETIME, @EndDate DATETIME
SET @StartDate = dateadd(mm, - 1, getdate())
SET @StartDate = dateadd(dd, datepart(dd, getdate()) * - 1, @StartDate)
SET @EndDate = dateadd(mm, 1, @StartDate)
SELECT DISTINCT TOP 100 PERCENT dbo.INQUIRY.CONCOM, TIMELOG_1.LOGMINS, dbo.INQUIRY.ESCDATE, dbo.INQUIRY.INQUIRY_ID,
(SELECT SUM(LOGMINS) AS Expr1
FROM dbo.TIMELOG
WHERE dbo.INQUIRY.ESCDATE BETWEEN @Startdate AND @EndDate) AS TOTALMINUTES
FROM dbo.INQUIRY INNER JOIN
dbo.TIMELOG AS TIMELOG_1 ON dbo.INQUIRY.INQUIRY_ID = TIMELOG_1.INQUIRY_ID INNER JOIN
dbo.PROD ON dbo.INQUIRY.PROD_ID = dbo.PROD.PROD_ID INNER JOIN
dbo.CATEGORY ON dbo.PROD.CATEGORY_ID = dbo.CATEGORY.CATEGORY_ID
WHERE dbo.INQUIRY.ESCDATE BETWEEN @Startdate AND @EndDate
ORDER BY dbo.INQUIRY.CONCOM
EDIT: The reason the category and product tables are there is because I will need to exclude the count based on whether a product is in a certain category.
SELECT i.concom, COALESCE(SUM(t.logmins), 0)
FROM inquiry i
LEFT JOIN
timelog t
ON t.inquiry_id = i.inquiry_id
GROUP BY
i.concom
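To illustrate what the grouped query returns, here is a small sketch run through Python's sqlite3 module (a stand-in for SQL Server; this particular statement is the same in both). The company names and minute values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inquiry (inquiry_id INTEGER, concom TEXT)")
conn.execute("CREATE TABLE timelog (inquiry_id INTEGER, logmins INTEGER)")
conn.executemany(
    "INSERT INTO inquiry VALUES (?, ?)",
    [(1, "Acme"), (2, "Acme"), (3, "Beta")],
)
conn.executemany(
    "INSERT INTO timelog VALUES (?, ?)",
    [(1, 30), (1, 15), (2, 20)],  # inquiry 3 (Beta) has no time logged
)

# One row per company; the LEFT JOIN keeps companies without any timelog rows,
# and COALESCE turns their NULL sum into 0.
totals = conn.execute(
    """
    SELECT i.concom, COALESCE(SUM(t.logmins), 0) AS total_minutes
    FROM inquiry AS i
    LEFT JOIN timelog AS t ON t.inquiry_id = i.inquiry_id
    GROUP BY i.concom
    ORDER BY i.concom
    """
).fetchall()
print(totals)
```

GROUP BY already yields each company once, so DISTINCT is not needed.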

SQL Checking for NULL and incrementals

I'd like to check if there is anything to return given a number to check against, and if that query returns no entries, increase the number until an entry is reached and display that entry. Currently, the code looks like this :
SELECT *
FROM news
WHERE DATEDIFF(day, date, getdate() ) <= #url.d#
ORDER BY date desc
where #url.d# is an integer being passed through (say 31). If that returns no results, I'd like to increase the number stored in #url.d# by 1 until an entry is found.
This kind of incremental querying is just not efficient. You'll get better results by saying - "I'll never need more than 100 results so give me these" :
SELECT top 100 *
FROM news
ORDER BY date desc
Then filter further on the client side if you want only a particular day's items (such as the items that share a date with the first item in the result).
Or, you could transform your multiple query request into a two query request:
DECLARE
@theDate datetime,
@theDate2 datetime
SET @theDate = (SELECT Max(date) FROM news)
--trim the time off of @theDate
SET @theDate = DateAdd(dd, DateDiff(dd, 0, @theDate), 0)
SET @theDate2 = DateAdd(dd, 1, @theDate)
SELECT *
FROM news
WHERE @theDate <= date AND date < @theDate2
ORDER BY date desc
In MySQL:
SELECT news.*,
(
SELECT COUNT(*)
FROM news
WHERE date < NOW() - INTERVAL #url.d# DAY
)
FROM news
WHERE date >= NOW() - INTERVAL #url.d# DAY
ORDER BY
date DESC
LIMIT 1
In SQL Server:
SELECT TOP 1
news.*,
(
SELECT COUNT(*)
FROM news
WHERE date < DATEADD(day, -#url.d#, GETDATE())
)
FROM news
WHERE date >= DATEADD(day, -#url.d#, GETDATE())
ORDER BY
date DESC
Note that using this syntax makes your query sargable; that is, an index can be used to filter on date efficiently.
First, I think you will probably want to avoid using the DateDiff function in your where clause. Instead, compute the desired cutoff date and do not use any computations on the date column within the where clause. This will be more efficient, so rather than
WHERE DATEDIFF(day, date, getdate() ) <= #url.d#
you would have something like
WHERE date >= @cutoffDate
where @cutoffDate is a computed date based on #url.d#
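The rewrite can be checked on sample data: both predicates should select the same rows, but only the second compares the bare column to a precomputed value. This sketch uses Python's sqlite3 module as a stand-in for SQL Server, with `julianday(date(...))` differences playing the role of DATEDIFF(day, ...); the column name `published` (in place of `date`) and the rows are assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE news (id INTEGER PRIMARY KEY, published TEXT)")
conn.execute("CREATE INDEX ix_news_published ON news (published)")
conn.executemany(
    "INSERT INTO news VALUES (?, ?)",
    [
        (1, "2023-12-31 23:59:59"),  # 32 calendar days back: excluded
        (2, "2024-01-01 00:00:00"),  # exactly 31 calendar days back: included
        (3, "2024-01-20 08:00:00"),
    ],
)

now = datetime(2024, 2, 1, 12, 0, 0)
days_back = 31

# Function applied to the column: every row has to be evaluated.
wrapped = conn.execute(
    "SELECT id FROM news "
    "WHERE julianday(date(:now)) - julianday(date(published)) <= :d "
    "ORDER BY published DESC",
    {"now": now.isoformat(sep=" "), "d": days_back},
).fetchall()

# Cutoff computed once up front; the column is compared as-is, so an index
# on it can be used for a range seek.
cutoff = (now.date() - timedelta(days=days_back)).isoformat()
sargable = conn.execute(
    "SELECT id FROM news WHERE published >= :cutoff ORDER BY published DESC",
    {"cutoff": cutoff},
).fetchall()

print(cutoff)
print(wrapped, sargable)
```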
Now, as for grabbing the correct cutoff date: my assumption is that under normal circumstances the request will return articles; otherwise you would just grab articles from the most recent date. So the approach I would take is to use the OLDEST of the computed cutoff date (based on #url.d#) and the MOST RECENT article date. Something like:
-- @urld == #url.d#
-- compute the cutoff date as the OLDEST of the most recent article and
-- the date based on #url.d#
declare @cutoff datetime
select @cutoff = DateAdd(dd,-1*@urld,GetDate())
select @cutoff
select @cutoff = min(cutoffDate)
from
(SELECT Max(date) as cutoffDate from News
UNION
select @cutoff) Cutoff
-- grab the articles with dates that are more recent than the cutoff date
select *
from News
WHERE date >= @cutoff
I'm also guessing that you would probably want to round to midnight for the dates (which I didn't do here). This is a multi-query approach and should probably be implemented in a single stored procedure ... if this is what you are looking for.
Good luck with the project!
If you wanted the one row:
SELECT t.*
FROM NEWS t
WHERE t.id = (SELECT MAX(n.id)
FROM NEWS n
WHERE n.date BETWEEN DATEADD(day, -:url.d, getDate()) AND getDate())
It might not be obvious that DATEADD is given a negative number in order to go back the desired number of days.
If you wanted all the rows in that date:
SELECT t.*
FROM NEWS t
WHERE t.date BETWEEN DATEADD(day, -:url.d, getDate()) AND getDate()