Dividing by number of days not referencing correct month/number of days - sql

I have a table in which users enter a daily population: how many people are in a particular facility that day. The table looks similar to this:
select * from stat_summary where MONTH(report_date) = 9
results:
stat_summary_id | report_date | facility | adp
----------------------------------------------------
29 |2015-09-01 | YORK | 1855
30 |2015-09-02 | YORK | 1750
31 |2015-09-04 | YORK | 1655
32 |2015-09-04 | YORK | 1699
What I want to do is calculate the average daily population grouped by month. For each report_date I want to use the row with the MAX(stat_summary_id), in case a corrected value has been re-entered. My query looks like:
SELECT
MONTH(t.report_date) as 'report_month',
SUM(ss1.adp)/DAY(DATEADD(DD,-1,DATEADD(MM,DATEDIFF(MM,-1,MONTH(t.report_date)),0))),
DAY(DATEADD(DD,-1,DATEADD(MM,DATEDIFF(MM,-1,MONTH(t.report_date)),0)))
FROM
stat_summary ss1
INNER JOIN
(SELECT MAX(stat_summary_id) as 'stat_summary_id', report_date
FROM stat_summary
GROUP BY report_date
) t ON t.stat_summary_id = ss1.stat_summary_id
WHERE
ss1.facility_id = 'YORK'
AND MONTH(t.report_date) = 9
GROUP BY
MONTH(t.report_date)
ORDER BY
MONTH(t.report_date)
I've referenced this thread:
Dividing a value by number of days in a month in a date field in SQL table
And I was able to see how to dynamically divide by the number of days in the month, but it looks like it is dividing by the current month (October) which has 31 days, when I need the query to divide by the referenced month of September which has 30 days.
Currently my results look like:
The adp value should be 176.8 since there are 30 days in September, not 31.

A quick check shows that the formula, as written, returns 31 for every month. The proper formula can be found here: How to determine the number of days in a month in SQL Server?
datediff(day, dateadd(day, 1-day(@date), @date),
dateadd(month, 1, dateadd(day, 1-day(@date), @date)))
More precisely use:
datediff(day, dateadd(day, 1-day(MIN(t.report_date)), MIN(t.report_date)),
dateadd(month, 1, dateadd(day, 1-day(MIN(t.report_date)), MIN(t.report_date))))
EDIT: Note the original formula was in fact correct; the problem was that you were passing in a month number instead of a date. Months are numbers from 1 to 12, and SQL Server treats such a number as a day offset from 1900-01-01, so all of your dates were in January 1900.
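To see the difference concretely, here is a minimal sketch that evaluates the question's expression once with a month number and once with a full date. The variable @d and the column aliases are made up for illustration.

```sql
-- Hypothetical sample date; any September date behaves the same way.
DECLARE @d datetime = '2015-09-04';

SELECT
    -- MONTH(@d) is the integer 9, which is implicitly treated as 1900-01-10,
    -- so this expression measures January 1900.
    DAY(DATEADD(DD, -1, DATEADD(MM, DATEDIFF(MM, -1, MONTH(@d)), 0))) AS days_from_month_number,
    -- Passing the date itself measures September 2015.
    DAY(DATEADD(DD, -1, DATEADD(MM, DATEDIFF(MM, -1, @d), 0)))        AS days_from_date;
```

The first column should come back as 31 and the second as 30.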

I would use
day(DateAdd(day, -1, DateAdd(month, MONTH(t.report_date), DateAdd(year, YEAR(t.report_date)-1900, 0)))) as monthDays
Also, it seems to me that dividing by that number is the wrong way to obtain an average. It is only correct if the number of records matches the number of days in the month; otherwise, dividing by the count is enough:
SUM(ss1.adp)/count(ss1.adp) as average
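The two divisors can be compared side by side. This is only a sketch under a few assumptions: the table variable @stat_summary stands in for the real table, the facility column is named facility as in the sample output (the question's query uses facility_id), adp is an integer, and EOMONTH is available (SQL Server 2012 or later).

```sql
-- Stand-in for the real table, loaded with the rows shown in the question.
DECLARE @stat_summary TABLE (
    stat_summary_id int,
    report_date     date,
    facility        varchar(10),
    adp             int
);
INSERT INTO @stat_summary VALUES
    (29, '2015-09-01', 'YORK', 1855),
    (30, '2015-09-02', 'YORK', 1750),
    (31, '2015-09-04', 'YORK', 1655),
    (32, '2015-09-04', 'YORK', 1699);   -- correction for 2015-09-04

SELECT
    MONTH(ss1.report_date) AS report_month,
    -- Divide by the calendar days of the month; * 1.0 avoids integer division.
    SUM(ss1.adp) * 1.0 / DAY(EOMONTH(MIN(ss1.report_date))) AS adp_per_calendar_day,
    -- Divide by the number of days that actually have a report.
    AVG(ss1.adp * 1.0) AS adp_per_reported_day
FROM @stat_summary ss1
INNER JOIN (
    -- Keep only the latest entry for each report_date.
    SELECT MAX(stat_summary_id) AS stat_summary_id
    FROM @stat_summary
    GROUP BY report_date
) t ON t.stat_summary_id = ss1.stat_summary_id
WHERE ss1.facility = 'YORK'
GROUP BY YEAR(ss1.report_date), MONTH(ss1.report_date);
```

For the sample rows this should give 176.8 per calendar day and 1768 per reported day, which shows how far apart the two definitions are when most days have no entry.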

Related

Custom month numbers that take last 30 days instead of Number of month (SQL Server)

I am trying to create a lag function to return current month and last month streams for an artist.
Instead of returning streams for Feb vs Jan, I want the function to use the last 30 days as the period for the current month, and the previous 30 days as the previous month.
The query that I am currently using is this:
SELECT
DATEPART(month, date) AS month,
artist,
SUM([Streams]) AS streams,
LAG(SUM([Streams])) OVER (PARTITION BY artist ORDER BY DATEPART(month, date)) AS previous_month_streams
FROM combined_artist
WHERE date > DATEADD(m, -2, DATEADD(DAY, 2 - DATEPART(WEEKDAY, GETDATE()-7), CAST(GETDATE()-7 AS DATE)))
GROUP BY DATEPART(month, date), artist;
While this works, it is not giving me the data I need. It returns the sum of streams for February vs. the streams for January. February seems very low because we only have one week's worth of data in February.
My goal is to get the last 30 days from the max date in the table using a lag function. So if the max date is Feb. 7 2023, I want the current month to include data from Jan. 7 2023 - Feb. 7 2023, and the previous month to include data from Dec. 7 2022 - Jan. 7 2023. I am thinking of creating a custom month date part that starts from the max date and assigns a month number to each 30-day window (2 for Jan 7 - Feb 7, 1 for Dec 7 - Jan 7, ...). I am not sure how to go about this. This is in SQL Server, and I am looking to use the lag function for performance reasons.
I think you could probably use something like datediff(d, date_you_care_about, max_date)/30 in your group by and partition by clauses.
The basic idea is that integer division rounds down, so if the difference between the dates is < 30, dividing it by 30 is 0. If the difference is >=30 but less than 60, dividing it by 30 is 1. And so forth.
You can see a proof of concept in this Fiddle.
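As a rough sketch of the integer-division idea (not the linked Fiddle), the query below buckets rows into 30-day periods counted back from the latest date and then applies LAG across those buckets. The table variable, sample rows, and the names period_no and previous_period_streams are assumptions for illustration; LAG needs SQL Server 2012 or later.

```sql
-- Stand-in for combined_artist with a few made-up rows.
DECLARE @combined_artist TABLE ([date] date, artist varchar(50), Streams int);
INSERT INTO @combined_artist VALUES
    ('2023-02-07', 'A', 100),
    ('2023-01-20', 'A', 50),
    ('2023-01-05', 'A', 70),
    ('2022-12-15', 'A', 30);

WITH bucketed AS (
    SELECT artist,
           Streams,
           -- 0 = the most recent 30 days, 1 = the 30 days before that, and so on.
           DATEDIFF(day, [date], MAX([date]) OVER ()) / 30 AS period_no
    FROM @combined_artist
)
SELECT artist,
       period_no,
       SUM(Streams) AS streams,
       -- Ordering by period_no DESC makes the "previous" row the older period.
       LAG(SUM(Streams)) OVER (PARTITION BY artist ORDER BY period_no DESC) AS previous_period_streams
FROM bucketed
WHERE period_no < 2
GROUP BY artist, period_no;
```

With these rows, period 0 should total 150 streams with 100 as the previous period's value, and period 1 should total 100 with no previous value.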

How to combine two SQL date functions into one to get output

I am trying to populate a column with the date that is the first of the month before a given date
i.e. If the date in column one is 25/01/2017, I want the date in column two to read 01/12/2016
So far, I have managed to take a month off, or reset it to the first of the month, but I can't work out a way to combine the two. Is it possible, or do I have to do it in two different statements? Here is what I have so far:
SELECT
PAY_START_DATE,
DATEADD(MONTH, -1, CONVERT(DATE,PAY_START_DATE)) AS AMMENDONE,
DATEADD(m, DATEDIFF(m, 0, PAY_START_DATE), 0) AS AMMENDTWO
FROM [hronline_iTrent].[iTrent].[Payroll]
And this gives the output:
PAY_START_DATE | AMMENDONE | AMMENDTWO
2016-04-06 |06/03/2016 | 01/04/2016 00:00
2016-06-12 |12/05/2016 | 01/06/2016 00:00
Any chance I can combine the two so I can get an output like:
PAY_START_DATE | AMMEND
2016-04-06 | 2016-03-01
2016-06-12 | 2016-05-01
(The format of the output is not important)
Try the below query,
SELECT PAY_START_DATE
-- EOMONTH's month offset lands on the true month end (e.g. July 31 for a September date, not July 30)
,DATEADD(DAY, 1, EOMONTH(PAY_START_DATE, -2))
FROM [hronline_iTrent].[iTrent].[Payroll]
SELECT DATEADD(month,-1,DATEADD(month, DATEDIFF(month, 0, GETDATE()), 0)) AS StartOfMonth
Here is one method:
select dateadd(month, -1,
dateadd(day, 1 - day(PAY_START_DATE), PAY_START_DATE)
)
The inner dateadd() gets the first day of the month. The outer one subtracts one month.
You can do it with just one DATEADD/DATEDIFF pair, so long as you don't mind that it's somewhat cute:
SELECT DATEADD(month,DATEDIFF(month,'20010101',PAY_START_DATE),'20001201')
It employs the fact that we pick two arbitrary dates that already have the required relationship between them that you want to achieve - I.e. 20001201 is the first of the month previous to 20010101.
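The approaches can be checked against each other on a few sample dates. This is a small sketch; the table variable @Payroll and the column aliases are invented for the comparison.

```sql
-- Stand-in for the Payroll table, including the 25/01/2017 example from the question.
DECLARE @Payroll TABLE (PAY_START_DATE date);
INSERT INTO @Payroll VALUES ('2016-04-06'), ('2016-06-12'), ('2017-01-25');

SELECT PAY_START_DATE,
       -- Count months since 1900-01-01, step back one, and rebuild the date.
       DATEADD(month, DATEDIFF(month, 0, PAY_START_DATE) - 1, 0)                 AS via_month_count,
       -- Move to the first of the month, then subtract one month.
       DATEADD(month, -1, DATEADD(day, 1 - DAY(PAY_START_DATE), PAY_START_DATE)) AS via_first_of_month,
       -- Two anchor dates that already sit one month apart.
       DATEADD(month, DATEDIFF(month, '20010101', PAY_START_DATE), '20001201')   AS via_anchor_dates
FROM @Payroll;
```

All three columns should agree: 2016-03-01, 2016-05-01, and 2016-12-01 for the three rows.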

Select Date Between Just Day and Month Excluding Year

The following is the pseudo code for what I want to do:
When Date is Between 04-01 and 03-31 of the following year then output as Q1.
I know how to do this with the year but not excluding the year.
I have no idea what you mean by output "Q1". However, if you want your years to start on April 1st (which seems like a reasonable interpretation of what you are asking), the easiest way is to subtract a number of days. For most years you will deal with, you can do:
select year(dateadd(day, -(31 + 28 + 31), date)) as theyear
Of course, this only works three years out of four, because of leap years. One way to fix this is with explicit logic -- but that gets messy. Another way is to add the remaining months and subtract one year:
select year(dateadd(day, (30 + 31 + 30 + 31 + 31 + 30 + 31 + 30 + 31), date)) - 1 as theyear
It's unclear exactly what you're trying to do. Q1 usually indicates a quarter, a three-month period. A quarter running from 1 April to 31 March of the following year isn't much of a quarter :)
However, assuming you're trying to select stuff within a certain span of time starting from a particular date, you might try a little date/time arithmetic. First, a few notes:
datetime values have a nominal precision of 1 millisecond (and an actual precision of approximately 3 ms). That means that something like '31 March 2014 23:59:59.999' is rounded up to '1 April 2014 00:00:00.000'. The largest time value for a given day is '23:59:59.997'. This can have...deleterious effects on your queries if you're not cognizant of it. Don't ask me how I know this.
datetime literals without a time component, such as '1 April 2013', are interpreted as start-of-day ('1 April 2013 00:00:00.000').
So, something like this:
declare
@dtFrom datetime ,
@dtThru datetime
set @dtFrom = '1 April 2013'
set @dtThru = dateAdd(year,1,@dtFrom)
select *
from foo t
where t.someDateTimeValue >= @dtFrom
and t.someDateTimeValue < @dtThru
should probably do you.
You might want to adjust the setting of @dtThru to suit your requirements: if you're actually looking for the end of a quarter, you might change it to something like
set @dtThru = dateAdd(month,3,@dtFrom)
If you have a fiscal year that runs from 1 April through 31 March and want to figure out, say, what fiscal year and quarter your data represents, you might do something like this:
select FiscalYear = datepart(year,t.someDateTimeValue)
- case (datepart(month,t.someDateTimeValue) - 1) / 3
when 0 then 1 -- jan/feb/mar is quarter 4 of the prev FY
else 0 -- everything else is this FY
end ,
-- (month - 1) / 3 maps months 1-3, 4-6, 7-9, 10-12 to 0, 1, 2, 3
FiscalQuarter = case (datepart(month,t.someDateTimeValue) - 1) / 3
when 0 then 4 -- jan/feb/mar is Q4 of the prev FY
when 1 then 1 -- apr/may/jun is Q1 of the curr FY
when 2 then 2 -- jul/aug/sep is Q2 of the curr FY
when 3 then 3 -- oct/nov/dec is Q3 of the curr FY
end ,
*
from foo t
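One way to sanity-check a fiscal year and quarter calculation is to compare it with a different technique: shift each date back three months and read the ordinary calendar year and quarter. The sketch below does only that; it is not the CASE expression above, and the table variable @foo and its sample dates are assumptions.

```sql
-- Stand-in for foo with dates on both sides of the 1 April boundary.
DECLARE @foo TABLE (someDateTimeValue datetime);
INSERT INTO @foo VALUES
    ('20140331'), ('20140401'), ('20140715'), ('20141231'), ('20150101');

SELECT t.someDateTimeValue,
       -- After the shift, 1 April looks like 1 January, so calendar functions apply.
       YEAR(DATEADD(month, -3, t.someDateTimeValue))              AS FiscalYear,
       DATEPART(quarter, DATEADD(month, -3, t.someDateTimeValue)) AS FiscalQuarter
FROM @foo t;
```

For these dates the result should be FY 2013 Q4, then FY 2014 Q1, Q2, Q3, and Q4 in that order.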
I think what you want is the following:
SELECT year(dateadd(q, -1, mydate)) AS yearEndingQ1
FROM mytable
This would give the year as 2014 for all dates between 04/01/2014 and 03/31/2015. Of course it's possible you want a result of 2015 instead in which case you want:
SELECT year(dateadd(q, 3, mydate)) AS yearEndingQ1
FROM mytable
Hope this helps.
UPDATE per OP's comment: "I am tracking data for a year ending Quarter x. Our fiscal year is a bit weird around here. So basically it would be fiscal year ending Q1, fiscal year ending Q2, etc. Perhaps I could have provided more clarity in my question."
This would give results in three separate columns for fiscal year ending Q1, fiscal year ending Q2, and fiscal year ending Q3. (I assume you don't need anything for fiscal year ending Q4!!)
SELECT year(dateadd(q, -1, mydate)) AS yearEndingQ1
, year(dateadd(q, -2, mydate)) AS yearEndingQ2
, year(dateadd(q, -3, mydate)) AS yearEndingQ3
FROM mytable

Number of specific one-hour periods between two date/times

I have a table of game records, call it "game".
It has an id and timestamp.
What I need to know is unrelated to the table specifically. In order to know the average number of games played per hour, I need to know:
Total games played for each hour over the date range
Number of hourly periods within the date range.
Finding the first is a matter of extracting the hour from the timestamp and grouping by it.
For the second, if the date range was rounded to the nearest day, finding this value would be easy (totalgames/numdays).
Unfortunately I can't assume this. What I need help with is finding the number of specific hour periods existing within a time range.
Example:
If the range is 5 PM today to 8 PM tomorrow, there is one "00" hour (midnight to 1 AM), but two 17, 18, 19 hours (5-6, 6-7, 7-8)
Thanks for the help
Edit: for clarity, consider the following query:
I have table game:
id, daytime
select EXTRACT(hour from daytime) as hour_period, count (*)
from game
where daytime > dateFrom and daytime < dayTo
group by hour_period
This will give me the number of games played broken down into hourly chunks for the time period.
In order to find the average games played per hour, I need to know exactly how many specific hour durations are between two timestamps. Simply dividing by the number of days is not accurate.
Edit: The ideal output will look something like this:
00 275
01 300
02 255
...
Consider the following: How many times does midnight occur between date 1 and date 2 ? If you have 1.5 days, that doesn't guarantee that midnight will occur twice. 6 AM today to 6 PM tomorrow night, for example, has 1 midnight, but 9PM tonight to 9 AM two days from now has 2 midnights.
What I'm trying to find is how many of the EXACT HOUR occurs between two timestamps, so I can use it to average the number of games played at THAT HOUR over a time period.
EDIT:
The following query gets the days, hours, and # of games, giving an output as below:
29 23 100
29 00 130
30 22 140
30 23 150
Then, the outer query adds up the number of games for each distinct hour and divides by the number of hours, as follows
22 140
23 125
00 130
The modified query is below:
SELECT
hour_period,
sum(hourly_no_of_games) / count(hour_period)
FROM
(
SELECT
EXTRACT(DAY from daytime) as day_period,
EXTRACT(HOUR from daytime) as hour_period,
count (*) hourly_no_of_games
from game
where daytime > dateFrom and daytime < dayTo
group by EXTRACT(DAY from daytime), EXTRACT(HOUR from daytime)
) hourly_data
GROUP BY hour_period
ORDER BY hour_period;
SQL Fiddle demo
If you need something to GROUP BY, you can truncate the timestamp to the level of hour, as in the following:
DECLARE @Date DATETIME
SET @Date = GETDATE()
SELECT @Date, DATEADD(Hour, DATEDIFF(Hour, 0, @Date), 0) AS RoundedDate
If you just need to find the total hours, you can just select the DATEDIFF in hours, such as with
SELECT DATEDIFF(Hour, '5/29/2014 20:01:32.999', GETDATE())
Extract not only the hour of the day but the day of the year (1-366). Then group on those. If there is the possibility the interval could span a year, then add the year itself and group by all three.
year dy hr games
2013 365 23 115
2014 1 00 103
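To count how many times each clock hour occurs between two timestamps (the divisor the question asks for), one option is to generate the hourly slots and group them by hour. The sketch below is written in T-SQL to match the other snippets here; the question's own dialect may differ, and the variable names and sample range (5 PM one day to 8 PM the next) are assumptions. It also assumes the range starts on an hour boundary.

```sql
DECLARE @dateFrom datetime = '2014-05-29T17:00:00';
DECLARE @dateTo   datetime = '2014-05-30T20:00:00';

WITH hours AS (
    -- First one-hour slot starts at the beginning of the range.
    SELECT @dateFrom AS slot_start
    UNION ALL
    -- Keep adding slots while the next slot still starts before the end.
    SELECT DATEADD(hour, 1, slot_start)
    FROM hours
    WHERE DATEADD(hour, 1, slot_start) < @dateTo
)
SELECT DATEPART(hour, slot_start) AS hour_period,
       COUNT(*)                   AS occurrences
FROM hours
GROUP BY DATEPART(hour, slot_start)
ORDER BY hour_period
OPTION (MAXRECURSION 0);   -- lift the default limit of 100 recursions for long ranges
```

For this range, hours 17, 18, and 19 should each occur twice and every other hour once. Joining these counts to the per-hour game totals gives the per-hour average.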

Sales Grouped by Week of the year

I have a requirement to output the number of sales in the year to date in weekly format, where Monday is the first day of the week and Sunday is the last.
The table structure is as follows.
SalesId | Representative | DateOfSale.
Below is what I have tried but it doesn't seem to give me the correct result. The counts don't seem to add up for a given week. The Sunday results are not included in the correct week. I am thinking it has something to do with the date not including 11:59:59.999 for the last day of the week.
SELECT DATEADD(wk, DATEDIFF(wk, 6, Sales.DateOfSale), 6) as [Week Ending], count(SalesID) as Sales,
count(distinct(representative)) as Agents, count(SalesID) / count(distinct(representative)) as SPA
FROM Sales
where DateOfSale >= DATEADD(yy, DATEDIFF(yy,0,getdate()), 0)
GROUP BY DATEADD(wk, DATEDIFF(wk, 6, Sales.DateOfSale), 6)
ORDER BY DATEADD(wk, DATEDIFF(wk, 6, Sales.DateOfSale), 6)
I am hoping to have something like this:
Week Ending | Sales
01/05/2014 | 5
01/12/2014 | 8
01/19/2014 | 11
01/26/2014 | 14
Please excuse the formatting of the table above. I couldn't seem to figure out how to create a pipe/newline based table using the editor.
~Nick
I suggest creating a table or table parameter that has all of your calendar information. In this case, it would need at minimum the column WeekEnding.
For example
DECLARE @MyCalendar TABLE
(
WeekEnding date
);
Populate this with your valid WeekEnding dates. I might also make parameters to limit the amount of sales data, e.g. @BeginDate and @EndDate.
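As one possible way to populate it, the sketch below fills the table variable with every Sunday of 2014 using a simple loop. The start and end dates and the variable @d are assumptions chosen to match the sample output in the question; a permanent calendar table would normally be loaded once rather than rebuilt per query.

```sql
DECLARE @MyCalendar TABLE
(
    WeekEnding date
);

-- 5 January 2014 is the first Sunday of the year; step forward a week at a time.
DECLARE @d date = '20140105';
WHILE @d <= '20141228'
BEGIN
    INSERT INTO @MyCalendar (WeekEnding) VALUES (@d);
    SET @d = DATEADD(day, 7, @d);
END;

SELECT WeekEnding
FROM @MyCalendar
ORDER BY WeekEnding;
```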
If you join using "<=" on the week ending date, then I believe you will get the return you want:
SELECT
MyCalendar.WeekEnding,
COUNT(Sales.SalesId) Sales,
COUNT(DISTINCT Sales.Representative) Agents,
CAST(COUNT(Sales.SalesId) AS float) / CAST(COUNT(DISTINCT Sales.Representative) AS float) Spa
FROM
Sales
INNER JOIN
@MyCalendar MyCalendar
ON
Sales.DateOfSale <= MyCalendar.WeekEnding
WHERE
Sales.DateOfSale BETWEEN @BeginDate AND @EndDate
GROUP BY
MyCalendar.WeekEnding;
I am assuming you are using SQL Server 2012, but I believe this will work in 2008 too. I might point out two other things. First, consider your data type when dividing the COUNT of SalesId by the distinct count of Representative. Dividing one integer count by another truncates the result, which may not be what you expect, and that is why I cast as float. Second, you apply count distinct slightly differently than I do; the extra parentheses are not needed.
I have a simplified version in SQL Fiddle.