Access SQL - Count days in each month between two dates - sql

I have two tables: [Students], which holds a list of students (and the name of their teacher), and [Absence Extract], which holds a record of each instance of absence for any student (along with the start and end dates and total days).
I am trying to write a query that groups the days of absence by teacher and shows how many days of absence they have had in each month. I started with the query below, which shows the calculation I've come up with so far for days lost in January and February (I would then add the calculations for the other months). However, it isn't working: if they had, e.g., 15 days of absence spread across January and February, it returns 15 days for both months.
Could someone point me in the right direction with this please?
SELECT
[Students].[Teacher Name] AS [Teacher],
SUM(IIF(ae.[Absence End Date] >= #1/1/18# AND ae.[Absence Start Date] <= #1/31/18#,[Total Days],0)) AS [Jan Days],
SUM(IIF(ae.[Absence End Date] >= #2/1/18# AND ae.[Absence Start Date] <= #2/28/18#,[Total Days],0)) AS [Feb Days]
FROM
[Students]
INNER JOIN
[Absence Extract] ae ON [Students].[ID] = [ae].[Student ID]
GROUP BY [Students].[Teacher Name];

Do not use the calculated column, Total Days, as it sums the difference between start and end dates across months and even years. Consider calculating separate durations:
absence duration from start date to end of start date month
absence duration from first of end date month to end date
Then join the aggregations together. So for example, the absence range Jan 30, 2018 - Feb 2, 2018:
first query calculates days from Jan 30 to Jan 31 (end of month), grouped to January month
second query calculates from Feb 1 (start of month) to Feb 2, grouped to February month
join query aligns on Teacher, Year, and Month and adds the two durations with +, so 2 days result for Jan and 2 days result for Feb, on separate rows
Start Date Query
SELECT
s.[Teacher Name] AS [Teacher],
Year(ae.[Absence Start Date]) As Year_Absence,
MonthName(Month(ae.[Absence Start Date]), TRUE) As Month_Absence,
SUM(DateDiff('d', ae.[Absence Start Date],
DateSerial(Year(ae.[Absence Start Date]),
Month(ae.[Absence Start Date]) + 1, 0)) + 1) As StartDuration
FROM
[Students] s
INNER JOIN
[Absence Extract] ae ON s.[ID] = ae.[Student ID]
GROUP BY s.[Teacher Name],
Year(ae.[Absence Start Date]),
MonthName(Month(ae.[Absence Start Date]), TRUE);
End Date Query
SELECT
s.[Teacher Name] AS [Teacher],
Year(ae.[Absence End Date]) As Year_Absence,
MonthName(Month(ae.[Absence End Date]), TRUE) As Month_Absence,
SUM(DateDiff('d', ae.[Absence End Date] -
(Day(ae.[Absence End Date]) - 1),
ae.[Absence End Date]) + 1) As EndDateDuration
FROM
[Students] s
INNER JOIN
[Absence Extract] ae ON s.[ID] = ae.[Student ID]
GROUP BY s.[Teacher Name],
Year(ae.[Absence End Date]),
MonthName(Month(ae.[Absence End Date]), TRUE);
Join Query (Long Format)
SELECT s.Teacher,
s.Year_Absence,
s.Month_Absence,
NZ(s.StartDuration) + NZ(e.EndDateDuration) As TotalDuration
FROM startdate_query s
LEFT JOIN enddate_query e
ON s.Teacher = e.Teacher
AND s.Year_Absence = e.Year_Absence
AND s.Month_Absence = e.Month_Absence;
And since you are looking for a wide report with month columns maintaining aggregate sums of absence duration, consider MS Access's own crosstab query.
Crosstab Query (Wide Format)
TRANSFORM SUM(NZ(s.StartDuration) + NZ(e.EndDateDuration)) AS [SumDays]
SELECT s.Teacher,
s.Year_Absence
FROM startdate_query s
LEFT JOIN enddate_query e
ON s.Teacher = e.Teacher
AND s.Year_Absence = e.Year_Absence
AND s.Month_Absence = e.Month_Absence
GROUP BY s.Teacher,
s.Year_Absence
PIVOT s.Month_Absence IN ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')
Of course, all this is untested without actual data. So, various adjustments may be needed.
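To sanity-check the month-splitting idea outside Access, here is a minimal Python sketch (not Access SQL, and not part of the queries above). It assumes both dates are inclusive, so Jan 30 to Feb 2 gives 2 days in each month as in the example. The function name days_per_month is made up for illustration. Unlike the two-query approach, it also walks through any full months in between and handles an absence that starts and ends in the same month.

```python
import calendar
from datetime import date, timedelta

def days_per_month(start, end):
    """Split the inclusive range [start, end] into day counts keyed by (year, month)."""
    if end < start:
        raise ValueError("end must not precede start")
    counts = {}
    cursor = start
    while cursor <= end:
        # Last calendar day of the month the cursor is currently in.
        last_day = calendar.monthrange(cursor.year, cursor.month)[1]
        month_end = date(cursor.year, cursor.month, last_day)
        # The segment stops at the month end or the absence end, whichever is first.
        segment_end = min(month_end, end)
        counts[(cursor.year, cursor.month)] = (segment_end - cursor).days + 1
        cursor = segment_end + timedelta(days=1)
    return counts

# The example range from the answer: Jan 30, 2018 - Feb 2, 2018
example = days_per_month(date(2018, 1, 30), date(2018, 2, 2))
# example == {(2018, 1): 2, (2018, 2): 2}
```

Comparing a few absences by hand against this function is a quick way to check whatever the Access queries return once real data is available.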

Related

How to select max date over the year function

I am trying to select the max date over the year, but it is not working. Any ideas on what to do?
SELECT a.tkinit [TK ID],
YEAR(a.tkeffdate) [Rate Year],
max(a.tkeffdate) [Max Date],
tkrt03 [Standard Rate]
FROM stageElite.dbo.timerate a
join stageElite.dbo.timekeep b ON b.tkinit = a.tkinit
WHERE a.tkinit = '02672'
and tkeffdate BETWEEN '2014-01-01' and '12-31-2014'
GROUP BY a.tkinit,
tkrt03,
a.tkeffdate
Perhaps you only want it by year and not rolled up by calendar date. For SQL Server you can try this:
SELECT
…
MaxDate = MAX(a.tkeffdate) OVER (PARTITION BY a.tkinit, YEAR(a.tkeffdate))
…
Or you could modify the query above to group by the year instead of the date:
GROUP BY a.tkinit,
tkrt03,
YEAR(a.tkeffdate)
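As a rough illustration of what "max per year" means, here is a small Python sketch (not T-SQL). It keeps the latest effective date for each (timekeeper, year) pair, which corresponds to the PARTITION BY a.tkinit, YEAR(a.tkeffdate) window above. The sample rows are invented, and the rate column is left out, which is an assumption: the GROUP BY above still includes tkrt03 and so can return one row per rate.

```python
from datetime import date

def max_date_per_year(rows):
    """Return {(timekeeper_id, year): latest effective date} for (id, date) pairs."""
    latest = {}
    for tk_id, eff_date in rows:
        key = (tk_id, eff_date.year)
        if key not in latest or eff_date > latest[key]:
            latest[key] = eff_date
    return latest

# Hypothetical sample data, not taken from the real timerate table
rows = [
    ("02672", date(2014, 1, 1)),
    ("02672", date(2014, 7, 1)),
    ("02672", date(2013, 7, 1)),
]
result = max_date_per_year(rows)
# result[("02672", 2014)] == date(2014, 7, 1)
```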
You seem to want only one row and all the columns. Use ORDER BY and TOP:
SELECT TOP (1) tr.tkinit as [TK ID],
YEAR(tr.tkeffdate) as [Rate Year],
tr.tkeffdate as [Max Date],
tkrt03 as [Standard Rate]
FROM stageElite.dbo.timerate tr JOIN
stageElite.dbo.timekeep tk
ON tk.tkinit = tr.tkinit
WHERE tr.tkinit = '02672' AND
tr.tkeffdate >= '2014-01-01' AND
tr.tkeffdate < '2015-01-01'
ORDER BY tr.tkeffdate DESC;
Note that I also fixed your date comparisons and table aliases.

SQL: Max of One Column and Corresponding Other Columns, Joined Tables

I need to find the max demand for each store for every day and the corresponding hour in which the max demand happens. I have one table with the store number, date and time, and demand value, and another table with the date and time broken out by date and hour.
I need my results to look like this essentially:
Max Demand  Date      Hour  Store Number
420         7/1/2019  19    516
415         7/1/2019  20    6228
390         7/1/2019  17    520
402         7/1/2019  20    1363
357         7/1/2019  22    8949
I can basically get this without the hour column, but the hour is really important.
This is on Access, so my actual SQL knowledge is very limited. I'm trying to use a subquery but I can't figure it out.
SELECT [Time and Demand].Demand, [Time and Demand].[Store Number], Dates.Hour, Dates.ThisDate
FROM Dates INNER JOIN [Time and Demand] ON Dates.[Date and Time] = [Time and Demand].[Date and Time]
WHERE Demand IN
(SELECT Max([Time and Demand].Demand) AS MaxOfDemand, [Time and Demand].[Store Number], Dates.ThisDate
FROM Dates INNER JOIN [Time and Demand] ON Dates.[Date and Time] = [Time and Demand].[Date and Time])
GROUP BY [Time and Demand].[Store Number], Dates.ThisDate;
This gets an error "You have written a subquery that can return more than one field without using the EXISTS reserved word in the main query's from clause. Revise the SELECT statement of the subquery to request only one field"
Join to the grouped subquery instead of using IN. That way the subquery can return several columns, and you match on all of them:
SELECT td.Demand, td.[Store Number], d.Hour, d.ThisDate
FROM (Dates d INNER JOIN [Time and Demand] td ON d.[Date and Time] = td.[Date and Time])
INNER JOIN (
SELECT MAX(td.Demand) AS maxdemand, td.[Store Number], d.ThisDate
FROM Dates d INNER JOIN [Time and Demand] td
ON d.[Date and Time] = td.[Date and Time]
GROUP BY td.[Store Number], d.ThisDate
) g ON g.ThisDate = d.ThisDate AND g.[Store Number] = td.[Store Number] AND g.maxdemand = td.Demand
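The same "max per group, plus the row it came from" idea can be sketched in Python (not Access SQL) to make the logic concrete. The function name and the readings are hypothetical. One difference to keep in mind: when two hours tie for the maximum, this sketch keeps only the first one it sees, whereas the join above would return both rows.

```python
from datetime import datetime

def peak_demand(readings):
    """readings: iterable of (store_number, timestamp, demand).

    Returns {(store_number, date): (max_demand, hour)}.
    """
    peaks = {}
    for store, stamp, demand in readings:
        key = (store, stamp.date())
        # Keep the reading only if it beats the best seen so far for this store and day.
        if key not in peaks or demand > peaks[key][0]:
            peaks[key] = (demand, stamp.hour)
    return peaks

# Hypothetical readings, loosely based on the desired output in the question
readings = [
    (516, datetime(2019, 7, 1, 18), 400),
    (516, datetime(2019, 7, 1, 19), 420),
    (516, datetime(2019, 7, 1, 20), 410),
    (6228, datetime(2019, 7, 1, 20), 415),
]
peaks = peak_demand(readings)
# peaks[(516, datetime(2019, 7, 1).date())] == (420, 19)
```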

New to SQL. Would like to convert an IF(COUNTIFS()) Excel formula to SQL code and have SQL calculate it instead of Excel

I am running SQL Server 2008 R2 (RTM).
I have a SQL query that pulls Dates, Products, Customers and Units:
select
[Transaction Date] as Date,
[SKU] as Product,
[Customer Name] as Customer,
sum(Qty) as Units
from dataset
where [Transaction Date] < '2019-03-01' and [Transaction Date] >= '2016-01-01'
group by [Transaction Date], [SKU], [Customer Name]
order by [Transaction Date]
This pulls hundreds of thousands of records and I wanted to determine if a certain transaction was a new order or reorder based on the following logic:
Reorder: That specific Customer has ordered that specific product in the last 6 months
New Order: That specific Customer hasn’t ordered that specific product in the last 6 months
For that I have this formula in Excel that seems to be working:
=IF(COUNTIFS(A$1:A1,">="&DATE(YEAR(A2),MONTH(A2)-6,DAY(A2)),C$1:C1,C2,B$1:B1,B2),"Reorder","New Order")
The formula works when I paste it individually or in a smaller dataset, but when I try to copy paste it to all 500K+ rows, Excel gives up because it loops for each calculation.
This could probably be done in SQL, but I don’t have the knowledge on how to convert this excel formula to SQL, I just started studying it.
You're doing pretty well with the start of your query there. There are three additional functions you're looking to add to your query.
The first thing you'll need is the easiest. GETDATE() simply returns the current date and time. You'll need that when you're comparing the current date to the transaction date.
The second function is DATEDIFF, which will give you the number of units of time between two dates (months, days, years, quarters, etc.). Using DATEDIFF, you can say "is this date within the last 6 months". The format for this is pretty easy. It's DATEDIFF(interval, date1, date2).
The third piece you're looking for is CASE, which allows you to tell SQL to give you one answer if one condition is met, but a different answer if a different condition is met. For your example, you can say "if the difference in months is <= 6, return 'Reorder'; if not, give me 'New Order'".
Putting it all together:
SELECT CASE
WHEN DATEDIFF(MONTH, [Transaction Date], GETDATE()) <= 6
THEN 'Reorder'
ELSE 'New Order'
END as ORDER_TYPE
,[Transaction Date] AS DATE
,[SKU] AS PRODUCT
,[Customer Name] AS CUSTOMER
,Qty AS UNITS
FROM DATASET
For additional examples on CASE, take a look at this site: https://www.w3schools.com/sql/sql_ref_case.asp
For additional examples on DATEDIFF, and a chance to try it out, see: https://www.w3schools.com/sql/func_sqlserver_datediff.asp
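One detail worth knowing when choosing the interval: in SQL Server, DATEDIFF counts how many datepart boundaries are crossed, not how many full units have elapsed. So a month-based test like the one above and a 180-day test are not interchangeable. The Python sketch below (not T-SQL) mimics that behavior for months and days. The helper names are made up for illustration.

```python
from datetime import date

def datediff_month(start, end):
    """Mimic DATEDIFF(MONTH, start, end): number of month boundaries crossed."""
    return (end.year - start.year) * 12 + (end.month - start.month)

def datediff_day(start, end):
    """Mimic DATEDIFF(DAY, start, end) for plain dates."""
    return (end - start).days

# One day apart, but one month boundary crossed.
one_day = (datediff_month(date(2019, 1, 31), date(2019, 2, 1)),
           datediff_day(date(2019, 1, 31), date(2019, 2, 1)))   # (1, 1)

# Six month boundaries crossed, yet 181 days apart.
half_year = (datediff_month(date(2018, 8, 31), date(2019, 2, 28)),
             datediff_day(date(2018, 8, 31), date(2019, 2, 28)))  # (6, 181)
```

In the second case a "<= 6 months" condition is true while a "<= 180 days" condition is false, so it is worth deciding which definition of "6 months" you actually want.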
SELECT CASE
WHEN Datediff(day, [transaction date], Getdate()) <= 180 THEN 'reorder'
ELSE 'Neworder'
END,
[transaction date] AS Date,
[sku] AS Product,
[customer name] AS Customer,
qty AS Units
FROM dataset
If I understand correctly, you want to peek at the previous date and make a comparison. This suggests lag():
select (case when lag([Transaction Date]) over (partition by SKU, [Customer Name] order by [Transaction Date]) >
dateadd(month, -6, [Transaction Date])
then 'Reorder'
else 'New Order'
end) as Order_Type,
[Transaction Date] as Date,
[SKU] as Product,
[Customer Name] as Customer,
sum(Qty) as Units
from dataset d
group by [Transaction Date], [SKU], [Customer Name];
EDIT:
In SQL Server 2008, you can emulate the LAG() using OUTER APPLY:
select (case when dprev.[Transaction Date] >
dateadd(month, -6, d.[Transaction Date])
then 'Reorder'
else 'New Order'
end) as Order_Type,
d.[Transaction Date] as Date,
d.[SKU] as Product,
d.[Customer Name] as Customer,
sum(d.Qty) as Units
from dataset d outer apply
(select top (1) dprev.*
from dataset dprev
where dprev.SKU = d.SKU and
dprev.[Customer Name] = d.[Customer Name] and
dprev.[Transaction Date] < d.[Transaction Date]
order by dprev.[Transaction Date] desc
) dprev
group by d.[Transaction Date], d.[SKU], d.[Customer Name], dprev.[Transaction Date];
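The "look at the previous order for the same customer and product" logic can be sketched in Python (not T-SQL) to check it on a small sample before running it over 500K rows. This assumes one row per (date, product, customer), as the GROUP BY produces. The add_months helper is an illustrative stand-in for dateadd(month, -6, ...) that clamps the day to the length of the target month, and the sample orders are invented.

```python
import calendar
from datetime import date

def add_months(d, months):
    """Shift d by whole months, clamping the day to the target month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def classify_orders(orders):
    """orders: iterable of (order_date, product, customer) tuples.

    Returns (order_date, product, customer, label) tuples in date order.
    """
    last_seen = {}  # (product, customer) -> date of the most recent earlier order
    labelled = []
    for order_date, product, customer in sorted(orders):
        key = (product, customer)
        previous = last_seen.get(key)
        if previous is not None and previous > add_months(order_date, -6):
            label = "Reorder"
        else:
            label = "New Order"
        labelled.append((order_date, product, customer, label))
        last_seen[key] = order_date
    return labelled

# Hypothetical orders
orders = [
    (date(2018, 1, 10), "A", "X"),
    (date(2018, 3, 1), "A", "X"),
    (date(2018, 12, 1), "A", "X"),
    (date(2018, 3, 1), "B", "X"),
]
labels = [row[3] for row in classify_orders(orders)]
# labels == ["New Order", "Reorder", "New Order", "New Order"]
```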

SQL WHERE date partially field and partially predefined string

For my project we are using MS SQL 2008. In a subquery, I would like the WHERE clause to match the last day of the year of the current row in the parent query. It should be something like:
WHERE datefieldsubquery = currentyearparentquery+'-12-31 23:59:00'
But there is no output in the subquery while using this. When the query is for example:
WHERE datefieldsubquery = '2014-12-31 23:59:00'
the query returns a result. I would like '2014' to be inherited dynamically from the parent query. Is that possible in SQL?
-- edit --
Both fields are datetime.
Real complete query:
SELECT factuurregel.[Internal costkind] AS code,
COUNT(factuurregel.[Internal costkind]) AS total,
YEAR(factuurregel.[Invoice date lease company]) AS maand,
(
SELECT COUNT(leasecontract.[Contract Ending Date]) AS autos
FROM [test$Lease Contract] AS leasecontract
WHERE (leasecontract.[Status] = '0' OR leasecontract.[Status] = '1')
AND YEAR(leasecontract.[Contract Activation Date]) <= YEAR(factuurregel.[Invoice date lease company])
AND (YEAR(leasecontract.[Contract Ending Date]) > YEAR(factuurregel.[Invoice date lease company])
OR leasecontract.[Contract Ending Date] = '1753-01-01 00:00:00')
) AS autos
FROM [test$Invoice line] AS factuurregel
LEFT JOIN [test$Lease Car] AS leasecar ON factuurregel.[License No_] = leasecar.[License No_]
WHERE factuurregel.[Internal costkind] >= '200'
AND factuurregel.[Internal costkind] < '300'
AND (leasecar.[Licence Type] = 1 OR leasecar.[Licence Type] = 2)
GROUP BY YEAR(factuurregel.[Invoice date lease company]),factuurregel.[Internal costkind]
ORDER BY factuurregel.[Internal costkind]
Try it this way in your WHERE clause:
WHERE datefieldsubquery = Convert(varchar(50),YEAR(GETDATE()))+'-12-31 23:59:00'
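The underlying idea is to build the year-end timestamp from a year number rather than hard-coding it. A minimal Python sketch of that step (not T-SQL) follows. The function name is hypothetical, and feeding it the parent row's year instead of the current year is an assumption about what the question wants.

```python
from datetime import datetime

def year_end(year):
    """Return 31 December 23:59:00 of the given year, matching the literal in the question."""
    return datetime(year, 12, 31, 23, 59, 0)

stamp = year_end(2014)
text = stamp.strftime("%Y-%m-%d %H:%M:%S")
# text == "2014-12-31 23:59:00"
```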

Displaying different date periods on income data

I have the below query which displays data like so:
Income Type  This Month   Last Month   This Year    Last Year
1            179640.00    179640.00    179640.00    179640.00
2            12424440.00  12424440.00  12424440.00  12424440.00
Select
Income_Type As [Income Type],
Sum(Income_Amount) As [This Month],
Sum(Income_Amount) As [Last Month],
Sum(Income_Amount) As [This Year],
Sum(Income_Amount) As [Last Year]
From Income I
Left Join Finance_Types FT On I.Income_Type = FT.Type_ID
Group By
Income_Type
The Income table has a Income_Date which is a datetime column.
I'm struggling to get my head around how I would pull out the data for 'This Month', 'Last Month', 'This Year', 'Last Year' with the correct Sums in one query if possible?
Use date functions:
SUM(CASE WHEN YEAR(yourdatefield) = YEAR(GetDate()) - 1 THEN Income_Amount ELSE 0 END) AS 'Last Year'
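That CASE expression returns the Income_Amount only if the row falls in the last year (and 0 otherwise), so you would be summing up only those amounts. The same pattern with a different condition gives the other three columns.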
If you're not using SQL Server, the syntax might be a bit different.
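To show how the four conditional sums fit together, here is a small Python sketch (not SQL) of the same idea: each row's amount is added to every period it belongs to, relative to a reference date. The function name, the sample rows, and passing today in explicitly (instead of calling something like GetDate()) are all choices made for illustration.

```python
from datetime import date

def period_sums(rows, today):
    """Sum amounts into four buckets relative to `today`.

    rows: iterable of (income_date, amount) pairs.
    """
    # Step back one calendar month, wrapping January to the previous December.
    if today.month == 1:
        last_month = (today.year - 1, 12)
    else:
        last_month = (today.year, today.month - 1)

    totals = {"This Month": 0, "Last Month": 0, "This Year": 0, "Last Year": 0}
    for income_date, amount in rows:
        year_month = (income_date.year, income_date.month)
        if year_month == (today.year, today.month):
            totals["This Month"] += amount
        if year_month == last_month:
            totals["Last Month"] += amount
        if income_date.year == today.year:
            totals["This Year"] += amount
        if income_date.year == today.year - 1:
            totals["Last Year"] += amount
    return totals

# Hypothetical sample rows: (Income_Date, Income_Amount)
sample = [
    (date(2019, 1, 3), 100),
    (date(2018, 12, 20), 50),
    (date(2018, 3, 1), 25),
    (date(2017, 6, 1), 10),
]
totals = period_sums(sample, today=date(2019, 1, 15))
# totals == {"This Month": 100, "Last Month": 50, "This Year": 100, "Last Year": 75}
```

Note the January case: "last month" then belongs to the previous year, which a month comparison that ignores the year would get wrong.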