Calculating Open incidents per month - sql

We have incidents in our system, each with a start time, a finish time, and a project name (plus other info).
We would like a report showing how many incidents have 'open' status per month, per project.
Open status means: not finished.
If an incident is created in December 2009 and closed in March 2010, then it should be counted in December 2009, January 2010, and February 2010.
The required structure is like this:
Project  Year  Month     Count
-------  ----  --------  -----
Test     2009  December  2
Test     2010  January   10
Test     2010  February  12
....

In SQL Server:
SELECT
  Project,
  Year  = YEAR(TimeWhenStillOpen),
  -- DATENAME needs a date, not the integer returned by MONTH()
  Month = DATENAME(month, MIN(TimeWhenStillOpen)),
  Count = COUNT(*)
FROM (
  SELECT
    i.Project,
    i.Incident,
    TimeWhenStillOpen = DATEADD(month, v.number, i.StartTime)
  FROM (
    SELECT
      Project,
      Incident,
      StartTime,
      FinishTime = ISNULL(FinishTime, GETDATE()),
      MonthDiff  = DATEDIFF(month, StartTime, ISNULL(FinishTime, GETDATE()))
    FROM Incidents
  ) i
  INNER JOIN master..spt_values v ON v.type = 'P'
    AND v.number BETWEEN 0 AND MonthDiff - 1
) s
GROUP BY Project, YEAR(TimeWhenStillOpen), MONTH(TimeWhenStillOpen)
ORDER BY Project, YEAR(TimeWhenStillOpen), MONTH(TimeWhenStillOpen)
Briefly, how it works:
The innermost subselect, which works directly on the Incidents table, simply 'normalises' the table (replaces NULL finish times with the current time) and adds a month difference column, MonthDiff. If there can be no NULLs in your case, just remove the ISNULL expressions.
The outer subselect uses MonthDiff to break up the time range into a series of timestamps corresponding to the months where the incident was still open, i.e. the FinishTime month is not included. A system table called master..spt_values is also employed there as a ready-made numbers table.
Lastly, the main select is only left with the task of grouping the data.
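For readers who find the nested selects hard to follow, here is a minimal Python sketch of the same month-expansion idea. The sample incidents and the function names (`month_diff`, `open_months`, `open_counts`) are made up for illustration; only the logic (expand each incident into the months before its finish month, then count per project and month) follows the explanation above.

```python
from collections import Counter
from datetime import date

def month_diff(start, finish):
    """Calendar-month boundaries crossed, like DATEDIFF(month, start, finish)."""
    return (finish.year - start.year) * 12 + (finish.month - start.month)

def open_months(start, finish):
    """Yield (year, month) for each month in which the incident was still open.

    The finish month itself is excluded, mirroring BETWEEN 0 AND MonthDiff - 1.
    """
    for offset in range(month_diff(start, finish)):
        index = start.year * 12 + (start.month - 1) + offset
        yield index // 12, index % 12 + 1

def open_counts(incidents, today):
    """Count open incidents per (project, year, month); a None finish means still open."""
    counts = Counter()
    for project, start, finish in incidents:
        for year, month in open_months(start, finish or today):
            counts[(project, year, month)] += 1
    return counts

# Hypothetical sample data: (project, start, finish)
incidents = [
    ("Test", date(2009, 12, 15), date(2010, 3, 2)),
    ("Test", date(2010, 1, 20), date(2010, 2, 10)),
    ("Test", date(2010, 2, 5), None),
]
counts = open_counts(incidents, today=date(2010, 4, 1))
for key in sorted(counts):
    print(key, counts[key])
```

Note that, as in the SQL, an incident opened and closed within the same month has a month difference of 0 and is therefore not counted anywhere.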

A useful technique here is to create either a table of "all" dates (clearly that would be infinite so I mean a sufficiently large range for your purposes) OR create two tables: one of all the months (12 rows) and another of "all" years.
Let's assume you go for the 1st of these:
create table all_dates (d date)
and populate as appropriate. I'm going to define your incident table as follows
create table incident
(
incident_id int not null,
project_id int not null,
start_date date not null,
end_date date null
)
I'm not sure what RDBMS you are using and date functions vary a lot between them so the next bit may need adjusting for your needs.
select
  project_id,
  datepart(yy, all_dates.d) as "year",
  datepart(mm, all_dates.d) as "month",
  count(*) as "count"
from
  incident,
  all_dates
where
  incident.start_date <= all_dates.d and
  (incident.end_date >= all_dates.d or incident.end_date is null)
group by
  project_id,
  datepart(yy, all_dates.d),
  datepart(mm, all_dates.d)
That is not going to work quite as we want, because the counts will be for every day that the incident was open in each month. To fix this we need either a subquery or a temporary table, and that really depends on the RDBMS...
Another problem is that, for open incidents, it will show them against all future months in your all_dates table. Adding an all_dates.d <= today condition solves that. Again, different RDBMSs have different ways of returning now/today/system time...
Another approach is to have an all_months rather than all_dates table that just has the date of first of the month in it:
create table all_months (first_of_month date)
select
  project_id,
  datepart(yy, all_months.first_of_month) as "year",
  datepart(mm, all_months.first_of_month) as "month",
  count(*) as "count"
from
  incident,
  all_months
where
  incident.start_date <= dateadd(day, -1, dateadd(month, 1, first_of_month)) and
  (incident.end_date >= first_of_month or incident.end_date is null)
group by
  project_id,
  datepart(yy, all_months.first_of_month),
  datepart(mm, all_months.first_of_month)
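Since the date functions differ between systems, the following is only a sketch of the all_months join, run against an in-memory SQLite database from Python so it can be tried without a server. The sample rows, the `today` parameter, and the use of SQLite's `date(..., '+1 month')` and `strftime` in place of `dateadd`/`datepart` are assumptions made for the demo. Note that this join also counts the month in which an incident was closed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE incident (
        incident_id INTEGER NOT NULL,
        project_id  INTEGER NOT NULL,
        start_date  TEXT NOT NULL,   -- ISO yyyy-mm-dd
        end_date    TEXT             -- NULL while still open
    );
    CREATE TABLE all_months (first_of_month TEXT NOT NULL);
""")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO incident VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2009-12-15", "2010-03-02"),
        (2, 1, "2010-02-05", None),
    ],
)
months = [f"{y:04d}-{m:02d}-01" for y in (2009, 2010) for m in range(1, 13)]
conn.executemany("INSERT INTO all_months VALUES (?)", [(m,) for m in months])

sql = """
    SELECT i.project_id,
           strftime('%Y', m.first_of_month) AS year,
           strftime('%m', m.first_of_month) AS month,
           COUNT(*) AS open_count
    FROM incident AS i
    JOIN all_months AS m
      ON i.start_date < date(m.first_of_month, '+1 month')
     AND (i.end_date >= m.first_of_month OR i.end_date IS NULL)
    WHERE m.first_of_month <= :today      -- do not report future months
    GROUP BY i.project_id, m.first_of_month
    ORDER BY i.project_id, m.first_of_month
"""
rows = conn.execute(sql, {"today": "2010-04-15"}).fetchall()
for row in rows:
    print(row)
```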

Related

Can you pull from before a casted time

SQL newbie here. I'm trying to build a report for tracking software where assets go into the database with a "Status" and a timestamp whenever the status changes.
What I'm trying to do is write a report that tells me when something hits the status 'complete' but its origin status was a month prior. How would I go about getting this to work properly with declaring a start and end date?
Declare @StartDate datetime
Declare @EndDate datetime
-- Set the start and end dates here yyyy-mm-dd --
set @StartDate = '2021-08-01'
set @EndDate = '2021-08-31'
Select *
from assetcurrent.dbo ac
inner join (select assetid, min(hdate) as [hdate] from history where hstatus = "created" group by assetid, hdate) ah
Where a.status in ('complete')
and cast(ac.hdate as date) <= cast(@EndDate as date)
and cast(ac.hdate as date) >= cast(@StartDate as date)
and ah.hdate < dateadd(month, -1, @StartDate)
order by [Report number]
Am I properly using dateadd here in the where clauses? I just get 0 results when I know there should be at least a few.
Thanks!
--and ah.hdate < dateadd(month, -1, @StartDate)
is what I was trying, and it was giving me wrong results.
and MONTH(ac.hdate) != month(ah.hdate)
is what I should have used instead. Much simpler.
Correct me if I'm wrong.
Please clarify what you mean by "a month prior". For example, something ordered on Sept 30 and completed on Oct 1 took only 1 day, yet was created in the prior month. Is your question really which projects took longer than, say, 30 days to complete? That is a much more specific condition. Also, are you looking for tasks completed within, for example, September 2021 that took more than 30 days from the order/project start date?
select
    AC.*,
    PQ.DateCreated,
    PQ.DateCompleted
  from
    ( select
        h.assetid,
        min(h2.hdate) as DateCreated,
        min(h.hdate) as DateCompleted
      from
        history h
          JOIN history h2
            ON h.assetid = h2.assetid
            AND h2.hstatus = 'created'
            AND h2.hdate < dateadd(month, -1, h.hdate)
      where
        h.hstatus = 'complete'
        AND h.hdate >= '2021-09-01'
        AND h.hdate < '2021-10-01'
      group by
        h.assetid ) PQ
    JOIN AssetCurrent AC
      ON PQ.assetid = AC.assetid
  order by
    AC.[Report Number]
So, in the query above, I start by looking for all orders COMPLETED within the given time period (changed to Sept 1 through Sept 30). From that, I join to the same history table a second time, with alias h2, on the same asset ID, but this time for the row recording when the asset was originally created. Now I have both the complete and the created records to compare. From that, I can require h2.hdate to be more than a month before the completion date. So, for example, if an order was completed on Sept 27, the asset would only be returned if it was created before Aug 27, and thus took more than a month to complete.
Once I had that part resolved, I could then join to your AssetCurrent table based on an implied key of assetid to get those details. While at it, I grabbed both context of Created and Completed Dates in the return set.
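To see the self-join behave on concrete rows, here is a small sketch against an in-memory SQLite database driven from Python. The three assets and their dates are invented, and SQLite's `date(h.hdate, '-1 month')` stands in for `dateadd(month, -1, h.hdate)`; the join to AssetCurrent is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (assetid INTEGER, hstatus TEXT, hdate TEXT)")

# Hypothetical history rows (ISO dates).
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?)",
    [
        (1, "created",  "2021-07-10"),
        (1, "complete", "2021-09-27"),   # took more than a month: expected in the result
        (2, "created",  "2021-09-20"),
        (2, "complete", "2021-09-30"),   # took ten days: filtered out
        (3, "created",  "2021-06-01"),
        (3, "complete", "2021-10-05"),   # completed outside the reporting window
    ],
)

sql = """
    SELECT h.assetid,
           MIN(h2.hdate) AS date_created,
           MIN(h.hdate)  AS date_completed
    FROM history AS h
    JOIN history AS h2
      ON h2.assetid = h.assetid
     AND h2.hstatus = 'created'
     AND h2.hdate < date(h.hdate, '-1 month')
    WHERE h.hstatus = 'complete'
      AND h.hdate >= :start
      AND h.hdate <  :end
    GROUP BY h.assetid
    ORDER BY h.assetid
"""
rows = conn.execute(sql, {"start": "2021-09-01", "end": "2021-10-01"}).fetchall()
print(rows)
```

The half-open range (`>= :start` and `< :end`) avoids missing rows that carry a time of day on the last day of the month.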

how do I create a calculated field that returns days remaining till end of FISCAL_QUARTER?

I am new to using Snowflake and was tasked with creating a Calendar Dimension table to support weekly, monthly, and quarterly reports. I am confused about how to return the days remaining in the FISCAL_QUARTER. Q1 spans Feb - Apr.
Attached below is the code I have been writing to generate the dates projecting 14 years in the future.
--Set the start date and number of years to produce
SET START_DATE = '2012-01-01';
SET NUMBER_DAYS = (SELECT TRUNC(14 * 365));
--Set parameters to force ISO
ALTER SESSION SET WEEK_START = 1, WEEK_OF_YEAR_POLICY = 1;
WITH CTE_MY_DATE AS (
SELECT DATEADD(DAY, SEQ4(), $START_DATE) AS MY_DATE
FROM TABLE(GENERATOR(ROWCOUNT=>$NUMBER_DAYS)) -- Number of days after reference date in previous line
)
SELECT
MY_DATE::date
,YEAR(MY_DATE) AS YEAR
,MONTH(MY_DATE) AS MONTH
,MONTHNAME(MY_DATE) AS MONTH_ABBREVIATION
,DAY(MY_DATE)
,DAYOFWEEK(MY_DATE)
,WEEKOFYEAR(MY_DATE)
,DAYOFYEAR(MY_DATE)
,YEAR(ADD_MONTHS(DATE_TRUNC('month', MY_DATE),11)) AS FISCAL_YEAR
,CONCAT('Q', QUARTER(ADD_MONTHS(DATE_TRUNC('month', MY_DATE),11))) AS FISCAL_QUARTER
,MONTH(ADD_MONTHS(DATE_TRUNC('month', MY_DATE),11)) AS FISCAL_MONTH
FROM CTE_MY_DATE
;
Firstly, your generator can produce gaps, because the SEQx() functions are allowed to have gaps, so you need to use SEQx() in the ORDER BY of a ROW_NUMBER() window, like so:
WITH cte_my_date AS (
    -- ROW_NUMBER() starts at 1, so subtract 1 to keep $START_DATE as the first row
    SELECT DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY SEQ4()) - 1, $START_DATE) AS my_date
    FROM TABLE(GENERATOR(ROWCOUNT=>$NUMBER_DAYS)) -- Number of days after reference date in previous line
)
The days left in the quarter are the date truncated to the quarter, plus one quarter, then the difference in days from the date:
,DATEDIFF('days', my_date, DATEADD('quarter', 1, DATE_TRUNC('quarter', my_date))) AS days_left_in_quarter
How's this? You can copy/paste the code straight into snowflake to test.
Using last_day() tends to make it look a little tidier :-)
WITH CTE_MY_DATE AS (
    SELECT DATEADD(DAY, SEQ4(), current_date()) AS MY_DATE
    FROM TABLE(GENERATOR(ROWCOUNT=>300)))
SELECT
    MY_DATE::date
    ,YEAR(last_day(my_date,year)) AS FISCAL_YEAR
    ,concat('Q',quarter(my_date)) AS FISCAL_QUARTER
    ,datediff(d, my_date, last_day(my_date,quarter)) AS DAYS_LEFT_IN_QUARTER
FROM CTE_MY_DATE
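Both snippets above work on calendar quarters, whereas the question's fiscal Q1 runs Feb - Apr. As a way to sanity-check the arithmetic outside Snowflake, here is a small Python sketch that shifts the quarter boundaries to a fiscal year starting in February. The constant `FISCAL_START_MONTH` and the function names are assumptions for illustration; the result follows the last_day() convention, so the last day of a quarter returns 0.

```python
from datetime import date, timedelta

FISCAL_START_MONTH = 2  # assumption taken from the question: Q1 spans Feb - Apr

def fiscal_quarter(d):
    """Return the fiscal quarter number (1-4) for date d."""
    return ((d.month - FISCAL_START_MONTH) % 12) // 3 + 1

def fiscal_quarter_end(d):
    """Return the last calendar day of the fiscal quarter containing d."""
    months_into_quarter = (d.month - FISCAL_START_MONTH) % 3
    # 0-based month index of the first month after the quarter.
    index = d.year * 12 + (d.month - 1) + (3 - months_into_quarter)
    first_of_next_quarter = date(index // 12, index % 12 + 1, 1)
    return first_of_next_quarter - timedelta(days=1)

def days_left_in_fiscal_quarter(d):
    """Days from d to the last day of its fiscal quarter (0 on the last day)."""
    return (fiscal_quarter_end(d) - d).days

for sample in (date(2012, 1, 15), date(2012, 2, 1), date(2012, 12, 10)):
    print(sample, f"Q{fiscal_quarter(sample)}", days_left_in_fiscal_quarter(sample))
```

In Snowflake the same shift could presumably be expressed by moving the date back one month before truncating to the quarter, but that is not tested here.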

Running Total - Create row for months that don't have any sales in the region (1 row for each region in each month)

I am working on the below query that I will use inside Tableau to create a line chart that will be color-coded by year and will use the region as a filter for the user. The query works, but I found there are months in regions that don't have any sales. These sections break up the line chart and I am not able to fill in the missing spaces (I am using a non-date dimension on the X-Axis - Number of months until the end of its fiscal year).
I am looking for some help to alter my query to create a row for every month and every region in my dataset, so that my running total will have a value to display in the line chart. If there are no values in my table for a month, the value should be 0 and the running total for the region should still be updated.
I have a dimDate table and also a Regions table I can use in the query.
My query results now (sorted in Excel for easier viewing): [image: Results Table Now]
What I want to do (new rows highlighted in yellow): [image: What I want to do]
My Code using SQL Server:
SELECT b.gy,
       b.sales_month,
       b.region,
       b.gs_year_total,
       b.months_away,
       Sum(b.gs_year_total)
         OVER (
           partition BY b.gy, b.region
           ORDER BY b.months_away DESC) RT_by_Region_GY
FROM   (SELECT a.gy,
               a.region,
               a.sales_month,
               Sum(a.gy_total) Gs_Year_Total,
               a.months_away
        FROM   (SELECT g.val_id,
                       g.[gs year] AS GY,
                       g.sales_month AS Sales_Month,
                       g.gy_total,
                       Datediff(month, g.sales_month, dt.lastdayofyear) AS months_away,
                       g.value_type,
                       val.region
                FROM   uv_sales g
                       JOIN dbo.dimdate AS dt
                         ON g.[gs year] = dt.gsyear
                       JOIN dimvalsummary val
                         ON g.val_id = val.val_id
                WHERE  g.[gs year] IN ( 2017, 2018, 2019, 2020, 2021 )
                GROUP  BY g.valuation_id,
                          g.[gs year],
                          val.region,
                          g.sales_month,
                          dt.lastdayofyear,
                          g.gy_total,
                          g.value_type) a
        WHERE  a.months_away >= 0
               AND sales_month < Dateadd(month, -1, Getdate())
        GROUP  BY a.gy,
                  a.region,
                  a.sales_month,
                  a.months_away) b
It's tough to envision the best method to solve without data and the meaning of all those fields. Here's a rough sketch of how one might attempt to solve it. This is not complete or tested, sorry, but I'm not sure the meaning of all those fields and don't have data to test.
Create a table called all_months and insert all the months from oldest to whatever date in the future you need.
01/01/2017
02/01/2017
...
12/01/2049
You may need one query per region, unioned together. Select the year and month from the all_months table, and left join to your other table on month. Coalesce your dollar values.
select 'East' as region,
       year(m.month) as gy_year,
       m.month as sales_month,
       coalesce(g.gy_total, 0) as gy_total,
       datediff(month, m.month, dt.lastdayofyear) as months_away
from all_months m
left join uv_sales g on g.sales_month = m.month
--and so on
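Because the question mentions that a Regions table is available, a cross join between months and regions can replace the one-query-per-region union. The sketch below shows that variant on an in-memory SQLite database from Python; the table and column names (`all_months`, `regions`, `sales`, `total`) and the rows are simplified stand-ins, not the asker's schema, and the running total is ordered by month ascending rather than by months_away.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE all_months (month TEXT NOT NULL);   -- first day of each month
    CREATE TABLE regions (region TEXT NOT NULL);
    CREATE TABLE sales (region TEXT, sales_month TEXT, total INTEGER);
""")

# Hypothetical sample data; note the months with no sales at all.
conn.executemany("INSERT INTO all_months VALUES (?)",
                 [("2021-01-01",), ("2021-02-01",), ("2021-03-01",), ("2021-04-01",)])
conn.executemany("INSERT INTO regions VALUES (?)", [("East",), ("West",)])
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("East", "2021-01-01", 100),
    ("East", "2021-03-01", 50),
    ("West", "2021-02-01", 70),
    ("West", "2021-02-01", 30),
])

# One row per (region, month), with 0 where no sales exist.
rows = conn.execute("""
    SELECT r.region,
           m.month,
           COALESCE(SUM(s.total), 0) AS month_total
    FROM all_months AS m
    CROSS JOIN regions AS r
    LEFT JOIN sales AS s
           ON s.sales_month = m.month
          AND s.region = r.region
    GROUP BY r.region, m.month
    ORDER BY r.region, m.month
""").fetchall()

# Running total per region, accumulated over the zero-filled rows.
running = {}
result = []
for region, month, total in rows:
    running[region] = running.get(region, 0) + total
    result.append((region, month, total, running[region]))

for row in result:
    print(row)
```

In SQL Server the accumulation step would normally stay in the query as a SUM(...) OVER (PARTITION BY ... ORDER BY ...) window over the zero-filled rows, as in the original query.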

Modification to date dimension in SQL Server

I need a suggestion about one of the columns I'm creating in the date dimension in SQL Server: basically, rolling weeks.
I have a table dimDate in my data warehouse.
I want to create a column in the dimDate table that holds the week number within the year, where every week has 7 days.
For example, in 2015 there are 53 weeks, but the 53rd week has only 5 days (because the week starts on Sunday in SQL Server, I guess).
I want to include 2 more days from 2016 (1st and 2nd Jan 2016) to complete the 53rd week with 7 days, and the 1st week of 2016 should then start on 3rd Jan 2016, and so on.
Any suggestions to get started would be great.
Assuming that you already have weeks populated (but not extended into the next year), and making some assumptions about column names:
This query finds the last week in each year (which would almost always be 53, but don't count on it :) and the date that it ends on:
SELECT YearNo, MAX(Week) As Week, MAX(DateKey) As DateKey
FROM dimDate
GROUP BY YearNo
This query finds all weeks that are shorter than 7 days, and how many extra days are required to make them 7 days.
SELECT
YearNo,
Week,
7-COUNT(DISTINCT DateKey) As ExtraDaysRequired
FROM dimDate
GROUP BY YearNo, Week
HAVING COUNT(DISTINCT DateKey) < 7
This might always be the last week of the year, but let's not make assumptions.
Let's combine these to find all final weeks that have fewer than 7 days, and add the number of days required:
SELECT
    Under7Days.YearNo, Under7Days.Week, Under7Days.ExtraDaysRequired,
    FinalWeeks.DateKey StartDate,
    DATEADD(d,Under7Days.ExtraDaysRequired,FinalWeeks.DateKey) EndDate
FROM
(
    SELECT YearNo, MAX(Week) As Week, MAX(DateKey) As DateKey
    FROM dimDate
    GROUP BY YearNo
) As FinalWeeks
INNER JOIN
(
    SELECT YearNo, Week, 7-COUNT(DISTINCT DateKey) As ExtraDaysRequired
    FROM dimDate
    GROUP BY YearNo, Week
    HAVING COUNT(DISTINCT DateKey) < 7
) As Under7Days
ON FinalWeeks.Week = Under7Days.Week
AND FinalWeeks.YearNo = Under7Days.YearNo
So we have a query that identifies the start date and end date and week number that it needs to be updated to. So now we run an update:
UPDATE TGT
SET Week = SRC.Week
FROM dimDate TGT
INNER JOIN
(
    SELECT
        Under7Days.YearNo, Under7Days.Week, Under7Days.ExtraDaysRequired,
        FinalWeeks.DateKey StartDate,
        DATEADD(d,Under7Days.ExtraDaysRequired,FinalWeeks.DateKey) EndDate
    FROM
    (
        SELECT YearNo, MAX(Week) As Week, MAX(DateKey) As DateKey
        FROM dimDate
        GROUP BY YearNo
    ) As FinalWeeks
    INNER JOIN
    (
        SELECT YearNo, Week, 7-COUNT(DISTINCT DateKey) As ExtraDaysRequired
        FROM dimDate
        GROUP BY YearNo, Week
        HAVING COUNT(DISTINCT DateKey) < 7
    ) As Under7Days
    ON FinalWeeks.Week = Under7Days.Week
    AND FinalWeeks.YearNo = Under7Days.YearNo
) SRC
-- DateKey is the same date column used to build StartDate and EndDate
ON TGT.DateKey BETWEEN SRC.StartDate AND SRC.EndDate
Looks complicated? There's half a dozen ways to write the same thing but this approach is step-by-step. You could probably write a windowing function to do the same thing but I leave that as an exercise for someone else.

Query to check number of records created in a month.

My table gets a new record with a timestamp each day that an integration succeeds. I am trying to write a query (preferably automated) that compares the number of days in a month with the number of records in the table within that time frame.
For example, January has 31 days, so I would like to know on how many days in January my process was not successful. If the number of records is less than 31, then I know the job failed 31 - x times.
I tried the following but was not getting very far:
SELECT COUNT (DISTINCT CompleteDate)
FROM table
WHERE CompleteDate BETWEEN '01/01/2015' AND '01/31/2015'
Every 7 days the system executes the job twice, so I get two records on the same day, but I am trying to determine the number of days on which nothing happened (failures), so I assume some truncation of the date field is needed?
One way to do this is to use a calendar/date table as the main source of dates in the range and left join with that and count the number of null values.
In absence of a proper date table you can generate a range of dates using a number sequence like the one found in the master..spt_values table:
select count(*) failed
from (
    select dateadd(day, number, '2015-01-01') date
    from master..spt_values where type='P' and number < 365
) a
-- compare on the date part only, since CompleteDate is a timestamp
left join your_table b on a.date = cast(b.CompleteDate as date)
where b.CompleteDate is null
and a.date BETWEEN '01/01/2015' AND '01/31/2015'
Sample SQL Fiddle (with count grouped by month)
Assuming you have an Integers table, this query will pull all dates where no record is found in the target table:
declare @StartDate datetime = '01/01/2013',
        @EndDate datetime = '12/31/2013'
;with d as (
    select *, date = dateadd(d, i - 1 , @StartDate)
    from dbo.Integers
    where i <= datediff(d, @StartDate, @EndDate) + 1
)
select d.date
from d
where not exists (
    select 1 from <target> t
    where DATEADD(dd, DATEDIFF(dd, 0, t.<timestamp>), 0) = DATEADD(dd, DATEDIFF(dd, 0, d.date), 0)
)
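BETWEEN is not safe here, because CompleteDate is a timestamp: rows stamped after midnight on the last day would be missed. Use a half-open range instead: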
SELECT 31 - count(distinct(convert(date, CompleteDate)))
FROM table
WHERE CompleteDate >= '01/01/2015' AND CompleteDate < '02/01/2015'
You can use the following query:
SELECT DATEDIFF(day, t.d, dateadd(month, 1, t.d)) - COUNT(DISTINCT CompleteDate)
FROM mytable
CROSS APPLY (SELECT CAST(YEAR(CompleteDate) AS VARCHAR(4)) +
RIGHT('0' + CAST(MONTH(CompleteDate) AS VARCHAR(2)), 2) +
'01') t(d)
GROUP BY t.d
SQL Fiddle Demo
Explanation:
1. The value CROSS APPLY-ied, i.e. t.d, is the ANSI string of the first day of the month of CompleteDate, e.g. '20150101' for 12/01/2015 or 18/01/2015.
2. DATEDIFF uses the above-mentioned value, i.e. t.d, to calculate the number of days in the month that CompleteDate belongs to.
3. GROUP BY essentially groups by (Year, Month), hence COUNT(DISTINCT CompleteDate) returns the number of distinct records per month.
4. The values returned by the query are the differences [2] - [3], i.e. the number of failures per month, for each (Year, Month) of your initial data.
If you want to query a specific Year, Month then just simply add a WHERE clause to the above:
WHERE YEAR(CompleteDate) = 2015 AND MONTH(CompleteDate) = 1
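The arithmetic behind all of these answers (days in the month minus distinct days that have a record) can be checked with a few lines of Python. This is only a sketch with invented data; `failed_days` is a hypothetical helper, and it truncates each timestamp to its date so that two runs on the same day count once.

```python
import calendar
from datetime import datetime

def failed_days(timestamps, year, month):
    """Number of days in the given month that have no record at all."""
    days_in_month = calendar.monthrange(year, month)[1]
    # Truncate each timestamp to its date so two runs on one day count once.
    days_with_record = {
        ts.date() for ts in timestamps
        if ts.year == year and ts.month == month
    }
    return days_in_month - len(days_with_record)

# Hypothetical data: a run every day in January 2015 except the 5th and 17th,
# plus a second run on the 7th.
runs = [datetime(2015, 1, day, 2, 30) for day in range(1, 32) if day not in (5, 17)]
runs.append(datetime(2015, 1, 7, 14, 0))

print(failed_days(runs, 2015, 1))  # prints 2
```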