SQL Server group by overlapping 10 day intervals


I have a table which logs each individual piece produced across several production machines, going back a number of years. Periodically (e.g. once per week) I want to check this table to establish "best performance" records for each machine and product combination, storing them in a new table according to the following rules:
The machine must have produced a minimum of 10,000 parts over a 10 day period. If only 9,000 parts were produced over 10 days, this is an invalid record.
The machine must have been running the same product without changing over for the entire period, i.e. if the product changed on day 5, this is an invalid record.
The performance data table [VisionMachineResults] looks like this:
ID  MCSAP   DateTime                 ProductName  InspectionResult
------------------------------------------------------------------
1   123456  2020-01-01 08:29:34:456  Product A    0
2   123456  2020-01-01 08:45:50:456  Product B    1
3   844214  2020-01-01 08:34:48:456  Product A    2
4   978415  2020-01-02 09:29:26:456  Product C    0
5   985633  2020-01-04 23:29:11:456  Product A    2
I am able to produce a result which gives a list of individual days performance per SAP / Product Combination, but I then need to process the data in a complex loop outside of SQL to establish the 10 day groups.
My current query is:
SELECT CAST(DateTime AS date) AS InputDate,
MCSAP,
ZAssetRegister.LocalName,
ProductName,
SUM(CASE WHEN InspectionResult = 0 THEN 1 END) AS OKParts,
COUNT(CASE WHEN InspectionResult > 0 THEN 1 END) AS NGParts
FROM [VisionMachineResults]
INNER JOIN ZAssetRegister ON VisionMachineResults.MCSAP = ZAssetRegister.SAP_Number
GROUP BY CAST(DateTime AS date),
MCSAP,
ProductName,
ZAssetRegister.LocalName
ORDER BY InputDate,
ZAssetRegister.LocalName;
Would it be possible to have the SQL query give the result in 10 day groups, instead of per individual day i.e.
01-01-2021 to 11-01-2021 | Machine 1 | Product 1 | 20,000 | 5,000
02-01-2021 to 12-01-2021 | Machine 1 | Product 1 | 22,000 | 1,000
03-01-2021 to 13-01-2021 | Machine 1 | Product 1 | 18,000 | 4,000
etc...
I would then iterate through the rows to find the one with the best percentage of OK parts. Any ideas appreciated!

This needs to be considered on several levels. First, you mention 10 consecutive days. We don't know whether those days include weekends, whether the machines run 24/7, or whether the running dates can skip over holidays. Ten calendar days could be Jan 1 to Jan 10, but if you skip weekends that span can contain as few as 6 actual WEEKDAYS.
Next, consider a machine working on more than one product, switching between dates or even within a single day.
As a commenter indicated, naming a column after a reserved word or data type (such as DateTime) is bad practice. Check whether any new columns share a name with a common keyword and avoid those to prevent confusion.
You also mention that you had to do complex looping checks, and ask how to handle joining out to 10 days, the splits, etc. I think I have a somewhat elegant approach that should prove rather simple in the scheme of things.
You are using SQL Server, so I will do this using TEMP tables via "#" table names. This way, when you are done with a connection, or if you turn this into a stored procedure, you don't have to keep deleting and recreating them. That said, let me take you through it one step at a time.
First, I'm creating a simple table matching your structure, even with the DateTime context.
CREATE TABLE VisionMachineResults
(
ID int IDENTITY(1,1) NOT NULL,
MCSAP nvarchar(6) NOT NULL,
DateTime datetime NOT NULL,
ProductName nvarchar(10) NOT NULL,
InspectionResult int NOT NULL,
CONSTRAINT ID PRIMARY KEY CLUSTERED
(
[ID] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
Now I'm inserting data similar to what you have, but not millions of rows. You mention you are looking 10 days out, so I padded the end with several extra days to simulate that. I also explicitly forced a second product on the one machine on Jan 5th. Additionally, I added a product change on Jan 7th to trigger a "break" within your 10-day consideration. You'll see the results later.
insert into VisionMachineResults
(MCSAP, [DateTime], ProductName, InspectionResult )
values
( '123456', '2020-01-01 08:29:34.456', 'Product A', 0 ),
( '123456', '2020-01-01 08:29:34.456', 'Product B', 1 ),
( '844214', '2020-01-01 08:29:34.456', 'Product A', 2 ),
( '978415', '2020-01-02 08:29:34.456', 'Product C', 0 ),
( '985633', '2020-01-04 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-05 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-05 08:29:34.456', 'Product B', 0 ),
( '985633', '2020-01-06 08:29:34.456', 'Product A', 2 ),
( '985633', '2020-01-07 08:29:34.456', 'Product B', 0 ),
( '985633', '2020-01-08 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-09 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-10 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-11 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-12 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-13 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-14 08:29:34.456', 'Product A', 1 ),
( '985633', '2020-01-15 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-16 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-17 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-18 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-19 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-20 08:29:34.456', 'Product A', 0 )
go
So now, consider this the baseline of YOUR production data. My first query does a bunch of things, but stores the pre-aggregations INTO the #tmpPartDailyCounts result table. This way you can look at the data at the different stages to sanity-check my approach.
Here, per machine (MCSAP) and date (without the time portion), I am grabbing certain aggregates and keeping them grouped by machine and date.
select
VMR.MCSAP,
cast(VMR.DateTime as Date) as InputDate,
min( VMR.ProductName ) ProductName,
max( VMR.ProductName ) LastProductName,
count( distinct VMR.ProductName ) as MultipleProductsSameDay,
sum( case when VMR.InspectionResult = 0 then 1 else 0 end ) OKParts,
sum( case when NOT VMR.InspectionResult = 0 then 1 else 0 end ) BadParts,
count(*) TotalParts
into
#tmpPartDailyCounts
from
VisionMachineResults VMR
group by
VMR.MCSAP,
cast(VMR.DateTime as Date)
You were joining to an asset table, and I don't think you really need that. If the machine made the product, does it matter if a final assembly is complete? I don't know; you would know better.
Now, the aggregates and why. The min( VMR.ProductName ) ProductName and max( VMR.ProductName ) LastProductName are there to carry forward the product name made on the date in question for any final output. If only one product was made on a given day, both values are the same anyhow, so just pick one. However, if multiple products were made on any day, MIN() and MAX() will return different values -- ON ANY SINGLE GIVEN DATE.
The rest are simple aggregates of OK parts and BAD parts (something was wrong), plus the TOTAL parts created regardless of any inspection failure. The total is the primary qualifier for hitting your 10,000, but if you want to require 10,000 GOOD parts instead, change it accordingly.
At this point, I have a pre-aggregation on a per machine and date basis. Now I want a counter applied sequentially to each date on which a machine produced something. I will pull this result into a temp table, #tmpPartDays. The OVER/PARTITION clause puts the records in order by MCSAP, then by date, and outputs the ROW_NUMBER() for each. So, if there is no activity for a given machine, such as over a weekend or holiday when the machine is not running, the counter still runs 1 through however many production days there are, with no gaps. Again, query the result of this table and you'll see it.
Because this runs against the pre-aggregated table, which might reduce, say, 500k raw records to 450 machine/day rows, this query only touches those 450 rows and will be very quick.
SELECT
PDC.MCSAP,
PDC.InputDate,
MultipleProductsSameDay,
ROW_NUMBER() OVER(PARTITION BY MCSAP
ORDER BY [InputDate] )
AS CapDay
into
#tmpPartDays
FROM
#tmpPartDailyCounts PDC
ORDER BY
PDC.MCSAP;
Now for the kicker, tying this all together. I'm starting with #tmpPartDays JOINED to itself on the same MCSAP AND a MUST-HAVE matching record 10 production days out. This resolves the weekend / holiday issue, since the counter is serially consecutive.
This gives me the begin/end date ranges, such as 1-10, 2-11, 3-12, 4-13, etc.
I then join to the #tmpPartDailyCounts result on the same machine where the date falls between the respective begin (PD.InputDate) and END (PD2.InputDate). I re-apply the same aggregates to get the total counts WITHIN EACH machine + 10-day period. Run this query WITHOUT the "HAVING" clause to see what is coming out.
select
PD.MCSAP,
PD.InputDate BeginDate,
PD2.InputDate EndDate,
SUM( PDC.MultipleProductsSameDay ) as TotalProductsMade,
sum( PDC.OKParts ) OKParts,
sum( PDC.BadParts ) BadParts,
sum( PDC.TotalParts ) TotalParts,
min( PDC.ProductName ) ProductName,
max( PDC.LastProductName ) LastProductName
from
#tmpPartDays PD
-- join again to get 10 days out for the END cycle
JOIN #tmpPartDays PD2
on PD.MCSAP = PD2.MCSAP
AND PD.CapDay +9 = PD2.CapDay
-- Now join to daily counts for same machine and within the 10 day period
JOIN #tmpPartDailyCounts PDC
on PD.MCSAP = PDC.MCSAP
AND PDC.InputDate >= PD.InputDate
AND PDC.InputDate <= PD2.InputDate
group by
PD.MCSAP,
PD.InputDate,
PD2.InputDate
having
SUM( PDC.MultipleProductsSameDay ) = 10
AND min( PDC.ProductName ) = max( PDC.LastProductName )
AND SUM( PDC.TotalParts ) >= 10
Finally, the elimination of the records you DON'T want. Since I don't have millions of records to simulate, just follow along. I am doing a HAVING on three conditions:
1. SUM( PDC.MultipleProductsSameDay ) = 10
If on ANY day MORE than 1 product was created, the sum would be 11 or more, indicating that it was not the same product throughout, so that causes an exclusion. Also, at the tail end of the data, with for example only 7 days of production, the sum would never HIT 10, which was your 10-day qualifier.
2. AND min( PDC.ProductName ) = max( PDC.LastProductName )
Here, since we are spanning back to the DAILY context, if the product changes on ANY date, the ProductName (via min) and LastProductName (via max) will differ, regardless of the day and regardless of the names involved. So, by making sure the min() and max() are the same, you know it is the same product across the entire span.
3. AND SUM( PDC.TotalParts ) >= 10
Finally, the count of things made. In this case, I used >= 10 because I was only testing with 1 item per day, thus 10 days = 10 items. In your scenario, you may have 987 in one day but 1100 in another, balancing low and high production days to get to that 10,000. For your real data, just change the threshold to the 10,000 minimum.
This SQLFiddle shows the results as it gets down to the per machine/day level, with the sequential activity counter. The last MCSAP machine starts on Jan 4th, but its sequential day numbering starts at 1, giving proper context to the 1-10, 2-11, etc.
First SQL Fiddle showing machine/day
The second fiddle shows the final query WITHOUT the HAVING clause. The first couple of rows have a TotalProductsMade of 11, which means that on some day within the span more than one product was made, so they are excluded from the final result. For the begin and end dates of Jan 6-15 and Jan 7-16, the MIN/MAX products show Product A and Product B, indicating that SOMEWHERE within the 10-day span the product switched. These too are excluded.
The FINAL query shows the results with the HAVING clause applied.
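For anyone who wants to sanity-check the SQL against a reference, here is a minimal Python sketch of the same sliding-window idea. It is not the query above: it assumes the daily rows for ONE machine are already aggregated and sorted, and the function name, tuple layout and sample numbers are all made up for illustration.

```python
from datetime import date, timedelta

def qualifying_windows(daily_rows, window=10, min_parts=10_000):
    """Return (begin, end, product, ok, bad) for every run of `window`
    consecutive production days with one product and enough parts.

    daily_rows: list of (day, product_names, ok_parts, bad_parts) for a
    single machine, sorted by day; product_names is the set of products
    seen on that day.
    """
    results = []
    for start in range(len(daily_rows) - window + 1):
        chunk = daily_rows[start:start + window]
        products = set().union(*(row[1] for row in chunk))
        ok = sum(row[2] for row in chunk)
        bad = sum(row[3] for row in chunk)
        # Same product on every day of the window, and enough parts overall.
        if len(products) == 1 and ok + bad >= min_parts:
            results.append((chunk[0][0], chunk[-1][0], next(iter(products)), ok, bad))
    return results

# Hypothetical data: 12 production days, 1,100 OK and 100 bad parts per day.
# On the second day the machine also ran Product B.
first_day = date(2020, 1, 1)
daily_rows = [(first_day + timedelta(days=i), {"Product A"}, 1100, 100) for i in range(12)]
daily_rows[1] = (daily_rows[1][0], {"Product A", "Product B"}, 1100, 100)

windows = qualifying_windows(daily_rows)
# Pick the window with the best share of OK parts.
best = max(windows, key=lambda w: w[3] / (w[3] + w[4]))
```

With this sample, only the window starting on Jan 3rd survives, because the two earlier windows include the day on which two products were made.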

One option that comes to mind is the use of a numbers table (search for Jeff Moden on SQL Server Central for more background).
Starting from a start date (taken from the range of dates to investigate), the numbers table generates a date to link to and also a "bucket" by which to group afterwards.
Similar to:
-- generate date frame from and to
DECLARE
@date_start date = Convert( date, '20211110', 112 ),
@date_end date = Convert( date, '20220110', 112 )
;
WITH
cteN
(
Number
)
AS
( -- build a list of 10 single digit numbers
SELECT Cast( 0 AS int ) AS Number UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
)
,
cteNumbers
(
Number
)
AS
( -- splice single digit numbers to list from 0 to 99999
SELECT
cN10000.Number * 10000 + cN1000.Number * 1000 + cN100.Number * 100 + cN10.Number * 10 + cN1.Number
FROM
cteN AS cN10000
CROSS JOIN cteN AS cN1000
CROSS JOIN cteN AS cN100
CROSS JOIN cteN AS cN10
CROSS JOIN cteN AS cN1
)
,
cteBucketOffset
(
DatediffNum,
Offset
)
AS
( -- determine the offset in datediffs to number buckets later correctly
SELECT
Cast( Datediff( dd, @date_start, @date_end ) AS int ) - 1 AS DatediffNum,
Cast( Datediff( dd, @date_start, @date_end ) % 10 AS tinyint ) - 1 AS Offset
)
,
cteDates
(
Dated,
Bucket,
BucketNumber,
BucketOffset,
DatediffNum
)
AS
( -- generate list of dates with bucket batches and numbers
SELECT
Dateadd( dd, cN.Number * -1, @date_end ) AS Dated,
Cast( ( cBO.Offset + cN.Number ) / 10 AS int ) AS Bucket,
Cast( ( cBO.Offset + cN.Number ) % 10 AS tinyint ) AS BucketNumber,
cBO.Offset,
cBO.DatediffNum
FROM
cteNumbers AS cN
CROSS JOIN cteBucketOffset AS cBO
WHERE
cN.Number <= Datediff( dd, @date_start, @date_end )
)
SELECT
*
FROM
cteDates AS cD
ORDER BY
cD.Dated ASC
;
Long winded due to showing each step. The result is a table-on-the-fly usable to join back to the raw data. "Bucket" can then be used instead of the date itself to group raw data.
Once this data is built then decisions can be made on the grouped conditions like having a minimum number of rows.

Seems just a matter of grouping on the year and the day of the year divided by 10.
SELECT
CONCAT(CONVERT(VARCHAR(10),MIN([DateTime]),105), ' to ', CONVERT(VARCHAR(10), MAX([DateTime]), 105)) AS InputDateRange
, MCSAP
, MAX(ZAssetRegister.LocalName) AS LocalName
, ProductName
, SUM(CASE WHEN InspectionResult = 0 THEN 1 END) AS OKParts
, COUNT(CASE WHEN InspectionResult > 0 THEN 1 END) AS NGParts
, COUNT(DISTINCT CAST([Datetime] AS DATE)) AS total_days
FROM VisionMachineResults
JOIN ZAssetRegister
ON VisionMachineResults.MCSAP = ZAssetRegister.SAP_Number
GROUP BY
DATEPART(YEAR, [DateTime]),
CEILING(DATEPART(DAYOFYEAR, [DateTime])/10.0),
MCSAP,
ProductName
ORDER BY
MIN([DateTime]),
MAX(ZAssetRegister.LocalName);
Simplified test on db<>fiddle here
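To see what this grouping key does to individual dates, here is a small Python sketch that mirrors the CEILING(DATEPART(DAYOFYEAR, ...)/10.0) expression. The function name is made up for illustration. Note that these buckets are fixed and do not overlap, so a qualifying run from, say, Jan 5 to Jan 14 would be split across two buckets, and the last bucket of each year is shorter than 10 days.

```python
import math
from datetime import date

def ten_day_bucket(d):
    """Return (year, bucket), where bucket mirrors CEILING(dayofyear / 10.0)."""
    return d.year, math.ceil(d.timetuple().tm_yday / 10)

# Days 1-10 of the year share bucket 1, days 11-20 share bucket 2, and so on.
first = ten_day_bucket(date(2021, 1, 1))
tenth = ten_day_bucket(date(2021, 1, 10))
eleventh = ten_day_bucket(date(2021, 1, 11))
year_end = ten_day_bucket(date(2021, 12, 31))  # falls in a short final bucket
```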

Related

Proportional distribution of a given value between two dates in SQL Server

There's a table with three columns: start date, end date and task duration in hours. For example, something like that:
Id  StartDate   EndDate     Duration
------------------------------------
1   07-11-2022  15-11-2022  40
2   02-09-2022  02-11-2022  122
3   10-10-2022  05-11-2022  52
And I want to get a table like that:
Id  Month  HoursPerMonth
------------------------
1   11     40
2   09     56
2   10     62
2   11     4
3   10     42
3   11     10
Briefly, I want to know how many working hours fall in each month between the start and end dates, proportionally. How can I achieve that with a MS SQL query? The data is quite big, so query speed is important. Thanks in advance!
I've tried DATEDIFF and EOMONTH, but that solution doesn't work for tasks longer than 2 months, and I'm sure it is a bad approach anyway. I hope it can be done in a more elegant way.

Here is an option using an ad-hoc tally/calendar table.
I'm not sure I agree with your desired results.
Select ID
,Month = month(D)
,HoursPerMonth = (sum(1.0) / (1+max(datediff(DAY,StartDate,EndDate)))) * max(Duration)
From YourTable A
Join (
Select Top 75000 D=dateadd(day,Row_Number() Over (Order By (Select NULL)),0)
From master..spt_values n1, master..spt_values n2
) B on D between StartDate and EndDate
Group By ID,month(D)
Order by ID,Month
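As a cross-check of the day-based proportion, here is a minimal Python sketch of the same arithmetic. It assumes both the start and end dates count as full days, and the function name is made up. Unlike the query above, it keys the result by (year, month), so a task spanning more than twelve months would not merge the same month of different years.

```python
from collections import Counter
from datetime import date, timedelta

def hours_per_month(start, end, duration):
    """Split `duration` across months in proportion to calendar days,
    counting both `start` and `end` as full days."""
    total_days = (end - start).days + 1
    days_in_month = Counter()
    day = start
    while day <= end:
        days_in_month[(day.year, day.month)] += 1
        day += timedelta(days=1)
    return {month: duration * n / total_days for month, n in days_in_month.items()}

# Task 2 from the question: 02-09-2022 to 02-11-2022, 122 hours.
task2 = hours_per_month(date(2022, 9, 2), date(2022, 11, 2), 122)
```

For task 2 this gives roughly 57.06, 61 and 3.94 hours for September, October and November, which differs from the 56 / 62 / 4 in the question, so the asker may be using a different definition of a day.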

This answer uses CTE recursion.
This part just sets up a table variable with the OP's example data.
DECLARE @source
TABLE (
SOURCE_ID INT
,STARTDATE DATE
,ENDDATE DATE
,DURATION INT
)
;
INSERT
INTO
@source
VALUES
(1, '20221107', '20221115', 40 )
,(2, '20220902', '20221102', 122 )
,(3, '20221010', '20221105', 52 )
;
This part is the query based on the above data. The recursive CTE breaks the time period into months. The second CTE does the math. The final selection does some more math and presents the results the way you want to see them.
WITH CTE AS (
SELECT
SRC.SOURCE_ID
,SRC.STARTDATE
,SRC.ENDDATE
,SRC.STARTDATE AS 'INTERIM_START_DATE'
,CASE WHEN EOMONTH(SRC.STARTDATE) < SRC.ENDDATE
THEN EOMONTH(SRC.STARTDATE)
ELSE SRC.ENDDATE
END AS 'INTERIM_END_DATE'
,SRC.DURATION
FROM
@source SRC
UNION ALL
SELECT
CTE.SOURCE_ID
,CTE.STARTDATE
,CTE.ENDDATE
,CASE WHEN EOMONTH(CTE.INTERIM_START_DATE) < CTE.ENDDATE
THEN DATEADD( DAY, 1, EOMONTH(CTE.INTERIM_START_DATE) )
ELSE CTE.STARTDATE
END
,CASE WHEN EOMONTH(CTE.INTERIM_START_DATE, 1) < CTE.ENDDATE
THEN EOMONTH(CTE.INTERIM_START_DATE, 1)
ELSE CTE.ENDDATE
END
,CTE.DURATION
FROM
CTE
WHERE
CTE.INTERIM_END_DATE < CTE.ENDDATE
)
, CTE2 AS (
SELECT
CTE.SOURCE_ID
,CTE.STARTDATE
,CTE.ENDDATE
,CTE.INTERIM_START_DATE
,CTE.INTERIM_END_DATE
,CAST( DATEDIFF( DAY, CTE.INTERIM_START_DATE, CTE.INTERIM_END_DATE ) + 1 AS FLOAT ) AS 'MNTH_DAYS'
,CAST( DATEDIFF( DAY, CTE.STARTDATE, CTE.ENDDATE ) + 1 AS FLOAT ) AS 'TTL_DAYS'
,CAST( CTE.DURATION AS FLOAT ) AS 'DURATION'
FROM
CTE
)
SELECT
CTE2.SOURCE_ID AS 'Id'
,MONTH( CTE2.INTERIM_START_DATE ) AS 'Month'
,ROUND( CTE2.MNTH_DAYS/CTE2.TTL_DAYS * CTE2.DURATION, 0 ) AS 'HoursPerMonth'
FROM
CTE2
ORDER BY
CTE2.SOURCE_ID
,CTE2.INTERIM_END_DATE
;
My results agree with Mr. Cappelletti's, not the OP's. Perhaps some tweaking regarding the definition of a "Day" is needed. I don't know.
If time between start and end date is large (more than 100 months) you may want to specify OPTION (MAXRECURSION 0) at the end.

Splitting out a cost dynamically across weeks

I’m creating an interim table in SQL Server for use with PowerBI to query financial data.
I have a finance transactions table tblfinance with
CREATE TABLE TBLFinance
(ID int,
Value float,
EntryDate date,
ClientName varchar (250)
)
INSERT INTO TBLFinance(ID ,Value ,EntryDate ,ClientName)
VALUES(1,'1783.26','2018-10-31 00:00:00.000','Alpha')
, (2,'675.3','2018-11-30 00:00:00.000','Alpha')
, (3,'243.6','2018-12-31 00:00:00.000','Alpha')
, (4,'8.17','2019-01-31 00:00:00.000','Alpha')
, (5,'257.23','2019-01-31 00:00:00.000','Alpha')
, (6,'28','2019-02-28 00:00:00.000','Alpha')
, (7,'1470.61','2019-03-31 00:00:00.000','Bravo')
, (8,'1062.86','2019-04-30 00:00:00.000','Bravo')
, (9,'886.65','2019-05-31 00:00:00.000','Bravo')
, (10,'153.31','2019-05-31 00:00:00.000','Bravo')
, (11,'150.24','2019-06-30 00:00:00.000','Bravo')
, (12,'690.14','2019-07-31 00:00:00.000','Charlie')
, (13,'21.67','2019-08-31 00:00:00.000','Charlie')
, (14,'339.29','2018-10-31 00:00:00.000','Charlie')
, (15,'807.96','2018-11-30 00:00:00.000','Delta')
, (16,'48.94','2018-12-31 00:00:00.000','Delta')
I’m calculating transaction values that fall within a week. My week ends on a Sunday, so I have the following query:
INSERT INTO tblAnalysis
(WeekTotal
, WeekEnd
, Client
)
SELECT SUM (VALUE) AS WeekTotal
, dateadd (day, case when datepart (WEEKDAY, EntryDate) = 1 then 0 else 8 - datepart (WEEKDAY, EntryDate) end, EntryDate) AS WeekEnd
, ClientName as Client
FROM dbo.tblFinance
GROUP BY dateadd (day, case when datepart (WEEKDAY, EntryDate) = 1 then 0 else 8 - datepart (WEEKDAY, EntryDate) end, EntryDate), CLIENTNAME
I’ve now been informed that some of the costs incurred within a given week may be monthly, and therefore need to be split across 4 weeks, or annual, and so split across 52 weeks. I will write a case statement to update the costs based on ClientName, so assume there is an additional field called ‘Payfrequency’.
I want to avoid having to pull the values affected into a temp table, and effectively write this – because there’ll be different sums applied depending on frequency.
SELECT *
INTO #MonthlyCosts
FROM
(
SELECT
client
, VALUE / 4 AS VALUE
, WEEKENDING
FROM tblAnalysis
UNION
SELECT
client
, VALUE / 4 AS VALUE
, DATEADD(WEEK,1,WEEKENDING) AS WEEKENDING
FROM tblAnalysis
UNION
SELECT
client
, VALUE / 4 AS VALUE
, DATEADD(WEEK,2,WEEKENDING) AS WEEKENDING
FROM tblAnalysis
UNION
SELECT
client
, VALUE / 4 AS VALUE
, DATEADD(WEEK,3,WEEKENDING) AS WEEKENDING
FROM tblAnalysis
) AS A
I know I need a stored procedure to hold variables so the calculations can be carried out dynamically, but have no idea where to start.

You can use recursive CTEs to split the data:
with cte as (
select ID, Value, EntryDate, ClientName, payfrequency, 1 as n
from TBLFinance f
union all
select ID, Value, EntryDate, ClientName, payfrequency, n + 1
from cte
where n < payfrequency
)
select *
from cte;
Note that by default this is limited to 100 recursion levels. You can add option (maxrecursion 0) to remove that limit.

The best solution would be to make use of a numbers table, if you can create a table on your server with one column holding a sequence of integers.
You can then use it like this for your weekly values:
SELECT
client
, VALUE / 52 AS VALUE
, DATEADD(WEEK,N.Number,WEEKENDING) AS WEEKENDING
FROM tblAnalysis AS A
CROSS JOIN tblNumbers AS N
WHERE N.Number BETWEEN 0 AND 51 -- assumes tblNumbers starts at 0; yields 52 rows per cost
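To make the intended output of either approach concrete, here is a small Python sketch of the split itself. It assumes a monthly cost is spread over 4 weeks and an annual one over 52, as described in the question. The function name and the sample cost are made up for illustration.

```python
from datetime import date, timedelta

def spread_weekly(value, week_ending, weeks):
    """Return one (week_ending, share) row per week, starting at `week_ending`."""
    return [(week_ending + timedelta(weeks=i), value / weeks) for i in range(weeks)]

# Hypothetical monthly cost of 1040.00 booked in the week ending Sunday 2019-01-06.
monthly_rows = spread_weekly(1040.0, date(2019, 1, 6), 4)
annual_rows = spread_weekly(1040.0, date(2019, 1, 6), 52)
```

Each row of monthly_rows carries 260.0, with week-ending dates running from 2019-01-06 to 2019-01-27. The same rows could then be summed per week and client just like the unsplit values.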

Querying the same column for 3 different values

I'm trying hard to extract the data in the format I need, but have been unsuccessful so far.
I have the following table
id_ticket, date_ticket, office_ticket, status_ticket
I need the query to return me, for EVERY MONTH, and always for the same OFFICE:
the number of tickets (COUNT) with any status
the number of tickets (COUNT) with status = 5
the number of tickets (COUNT) with status = 6
Month
Year
The query I made to return ONLY the total amount of tickets with any status was this. It worked!
SELECT
COUNT (id_ticket) as TotalTicketsPerMonth,
'sYear' = YEAR (date_ticket),
'sMonth' = MONTH (date_ticket)
FROM crm_vw_Tickets
WHERE office_ticket = 1
GROUP BY
YEAR (date_ticket), MONTH (date_ticket)
ORDER BY sYear ASC, sMonth ASC
Returning the total amount of ticket with status=5
SELECT
COUNT (id_ticket) as TotalTicketsPerMonth,
'sYear' = YEAR (date_ticket),
'sMonth' = MONTH (date_ticket)
FROM crm_vw_Tickets
WHERE office_ticket = 1 AND status_ticket = 5
GROUP BY
YEAR (date_ticket), MONTH (date_ticket)
ORDER BY sYear ASC, sMonth ASC
But I need the return to be something like:
Year Month Total Status5 Status6
2018 1 15 5 3
2018 2 14 4 5
2018 3 19 2 8
Thank you for your help.

You are close. You can use a CASE Expression to get what you need:
SELECT
COUNT (id_ticket) as TotalTicketsPerMonth,
SUM(CASE WHEN status_ticket = 5 THEN 1 END) as Status5,
SUM(CASE WHEN status_ticket = 6 THEN 1 END) as Status6,
'sYear' = YEAR (date_ticket),
'sMonth' = MONTH (date_ticket)
FROM crm_vw_Tickets
WHERE office_ticket = 1
GROUP BY YEAR (date_ticket), MONTH (date_ticket)
ORDER BY sYear ASC, sMonth ASC
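If you want to try the conditional aggregation without touching the real view, here is a self-contained sketch that runs the same pattern on an in-memory SQLite database from Python. The table name and sample rows are made up, and SQLite's strftime stands in for YEAR() and MONTH(). It also uses ELSE 0, so a month with no matching status shows 0 rather than NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (id_ticket INTEGER, date_ticket TEXT, "
    "office_ticket INTEGER, status_ticket INTEGER)"
)
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?, ?)",
    [
        (1, "2018-01-05", 1, 5),
        (2, "2018-01-09", 1, 6),
        (3, "2018-01-20", 1, 2),
        (4, "2018-02-03", 1, 5),
        (5, "2018-02-11", 2, 6),  # different office, filtered out
    ],
)

# One pass over the table: each CASE counts only the rows with that status.
rows = conn.execute(
    """
    SELECT CAST(strftime('%Y', date_ticket) AS INTEGER) AS sYear,
           CAST(strftime('%m', date_ticket) AS INTEGER) AS sMonth,
           COUNT(id_ticket) AS Total,
           SUM(CASE WHEN status_ticket = 5 THEN 1 ELSE 0 END) AS Status5,
           SUM(CASE WHEN status_ticket = 6 THEN 1 ELSE 0 END) AS Status6
    FROM tickets
    WHERE office_ticket = 1
    GROUP BY sYear, sMonth
    ORDER BY sYear, sMonth
    """
).fetchall()
```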

The following code builds off JNevill's answer to include summary rows for "missing" months, i.e. those with no tickets, as well as months with tickets. The basic idea is to create a table of all of the months from the first to the last ticket, outer join the ticket data with the months and then summarize the data. (Tally table, numbers table and calendar table are more or less applicable terms.)
It is a Common Table Expression (CTE) that contains several queries that work step-by-step toward the result. You can see the results of the intermediate steps by replacing the final select statement with one of the ones commented out above it.
-- Sample data.
declare @crm_vw_Tickets as Table ( id_ticket Int Identity, date_ticket Date, office_ticket Int, status_ticket Int );
insert into @crm_vw_Tickets ( date_ticket, office_ticket, status_ticket ) values
( '20190305', 1, 6 ), -- Shrove Tuesday.
( '20190501', 1, 5 ), -- May Day.
( '20190525', 1, 5 ); -- Towel Day.
select * from @crm_vw_Tickets;
-- Summarize the data.
with
-- Get the minimum and maximum ticket dates for office_ticket 1.
Limits as (
select Min( date_ticket ) as MinDateTicket, Max( date_ticket ) as MaxDateTicket
from @crm_vw_Tickets
where office_ticket = 1 ),
-- 0 to 9.
Ten ( Number ) as ( select * from ( values (0), (1), (2), (3), (4), (5), (6), (7), (8), (9) ) as Digits( Number ) ),
-- 100 rows.
TenUp2 ( Number ) as ( select 42 from Ten as L cross join Ten as R ),
-- 10000 rows. We'll assume that 10,000 months should cover the reporting range.
TenUp4 ( Number ) as ( select 42 from TenUp2 as L cross join TenUp2 as R ),
-- 1 to the number of months to summarize.
Numbers ( Number ) as ( select top ( select DateDiff( month, MinDateTicket, MaxDateTicket ) + 1 from Limits ) Row_Number() over ( order by ( select NULL ) ) from TenUp4 ),
-- Starting date of each month to summarize.
Months as (
select DateAdd( month, N.Number - 1, DateAdd( day, 1 - Day( L.MinDateTicket ), L.MinDateTicket ) ) as StartOfMonth
from Limits as L cross join
Numbers as N ),
-- All tickets assigned to the appropriate month and a row with NULL ticket data
-- for each month without tickets.
MonthsAndTickets as (
select M.StartOfMonth, T.*
from Months as M left outer join
@crm_vw_Tickets as T on M.StartOfMonth <= T.date_ticket and T.date_ticket < DateAdd( month, 1, M.StartOfMonth ) )
-- Use one of the following select statements to see the intermediate or final results:
--select * from Limits;
--select * from Ten;
--select * from TenUp2;
--select * from TenUp4;
--select * from Numbers;
--select * from Months;
--select * from MonthsAndTickets;
select Year( StartOfMonth ) as SummaryYear, Month( StartOfMonth ) as SummaryMonth,
Count( id_ticket ) as TotalTickets,
Coalesce( Sum( case when status_ticket = 5 then 1 end ), 0 ) as Status5Tickets,
Coalesce( Sum( case when status_ticket = 6 then 1 end ), 0 ) as Status6Tickets
from MonthsAndTickets
where office_ticket = 1 or office_ticket is NULL -- Handle months with no tickets.
group by StartOfMonth
order by StartOfMonth;
Note that the final select uses Count( id_ticket ), Coalesce and an explicit check for NULL to produce appropriate output values (0) for months with no tickets.

SQL Server 2012 - Running Total With Backlog & Carry Forward

Good afternoon,
Hope that you're all well and wish you a happy new year.
I'm experiencing some curious behaviour with a query that I've written in that the LAG function is inconsistent.
Essentially, I have a dataset (made up of 2 CTEs) which each contain the month (in MMM-YYYY format) and then one holds a count of tickets opened, and the other contains the same but for tickets closed.
What I am then doing is adding in a 'Backlog' column (which will be 0 for the first month in all cases) and a 'Carried Forward' column. The Carried Forward amount will be the balance of that month ( Created + Backlog ) and will be reflected as the Backlog for the following month.
I had this ticking over quite nicely until I realised that negative backlogs were fudging the numbers a bit. What I mean is, for example:
10 Tickets Created
12 Tickets Resolved
0 Ticket Backlog
-2 Tickets Carried Forward
In this circumstance, I've had to zero any negative backlog for our reporting purposes.
This is seemingly where the problems come into play. For the first few months, everything will be fine - the values will be right, carrying forward the correct numbers and factoring them into the calculations accordingly. But then it will carry over a number of (seemingly) indeterminable origin which of course, has a knock-on effect on the accuracy past this point.
With the Window Functions introduced with SQL Server 2012, this should be quite basic - but evidently not!
Whilst I'm quite happy to post code (I have tried a fair few ways of skinning this cat), I feel as though if someone is able to give a high-level overview of how it should be written, I'll see where I went wrong immediately. In doing so, I'll then respond accordingly with my attempt/s for completeness.
Thank you very much in advance!
Picture of result error:
, OpenClosed AS
(
SELECT
c.[Created Month] 'Month'
, c.Tickets 'Created'
, r.Tickets 'Resolved'
, IIF( ( c.Tickets - r.Tickets ) < 0, 0, ( c.Tickets - r.Tickets ) ) 'Balance'
FROM
Created c
JOIN Resolved r ON
c.[Created Month] = r.[Resolved Month]
)
, CarryForward AS
(
SELECT
ROW_NUMBER() OVER( ORDER BY CAST( '1.' + Month AS DATETIME ) ) 'Row No'
, Month 'Month'
, Created 'Created'
, Resolved 'Resolved'
, LAG( Balance, 1, 0 ) OVER( ORDER BY CAST( '1.' + Month AS DATETIME ) ) 'Backlog'
, IIF( ( ( Created + LAG( Balance, 1, 0 ) OVER( ORDER BY CAST( '1.' + Month AS DATETIME ) ) ) - Resolved ) < 0
, 0
, ( ( Created + LAG( Balance, 1, 0 ) OVER( ORDER BY CAST( '1.' + Month AS DATETIME ) ) ) - Resolved )
) 'Carry Forward'
FROM
OpenClosed
)
SELECT
c1.Month 'Month'
, c1.Created 'Created'
, c1.Resolved 'Resolved'
, c2.[Carry Forward] 'Backlog'
, IIF( ( c1.Created + c2.[Carry Forward] ) - c1.Resolved < 0
, 0
, ( c1.Created + c2.[Carry Forward] ) - c1.Resolved
) 'Carried Forward'
FROM
CarryForward c1
JOIN CarryForward c2 ON
c2.[Row No] = c1.[Row No]-1

From comments on question. Incidentally, the Created Month column should be redone somehow so that the year is placed before the month - like 2015-01. This will ensure correct ordering by default sort algorithms.
If the date must be presented as Jan-2015 in the final report, do that presentational work as the very final step in the query.
WITH ticket_account AS
(
SELECT
c.[Created Month] AS Month
,c.Tickets AS Created
,r.Tickets AS Resolved
FROM
Created AS c
INNER JOIN
Resolved AS r
ON c.[Created Month] = r.[Resolved Month]
)
SELECT
*
,(SUM(Created) OVER (ORDER BY Month ASC) - SUM(Resolved) OVER (ORDER BY Month ASC)) AS Balance
FROM
ticket_account
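One caveat, as far as I can tell: a plain running sum lets a negative month offset later months, whereas the question wants a negative balance zeroed before it is carried forward. That clamp makes each month depend on the previous month's clamped value. Here is a minimal Python sketch of the intended arithmetic, useful as a reference for checking query output. The function name and the sample months are made up.

```python
def carry_forward(months):
    """months: list of (label, created, resolved) in chronological order.
    Returns (label, created, resolved, backlog, carried_forward) rows."""
    backlog = 0
    report = []
    for label, created, resolved in months:
        # A negative balance is zeroed before it becomes next month's backlog.
        carried = max(0, backlog + created - resolved)
        report.append((label, created, resolved, backlog, carried))
        backlog = carried
    return report

# Hypothetical months; the first one resolves more tickets than it creates.
report = carry_forward([("2015-01", 10, 12), ("2015-02", 8, 5), ("2015-03", 4, 6)])
```

Here the carried-forward values are 0, 3 and 1, while an unclamped running sum of Created minus Resolved would give -2, 1 and -1.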

Query to show stock based on previous transactions

Please, I need your help with the following objective:
Match SO (Sales Order) quantity to PO (Purchase Order) quantity based on FIFO (First In, First Out), where the first stock items purchased must be the first items sold.
I have a table Stock which is used to track the movement of stock in and out of an imaginary stock warehouse. The warehouse is initially empty; stock moves into the warehouse as a result of a stock purchase (‘IN’) and moves out of the warehouse when it is sold (‘OUT’). Each type of stock item is identified by an ItemID. Each movement of stock in or out of the warehouse, due to a purchase or sale of a given item, results in a row being added to the Stock table, uniquely identified by the value in the StockID identity column, and describing how many items were added or removed and the date of the transaction.
Table stock :
StockId DocumentID ItemID TranDate TranCode Quantity
------------------------------------------------------------
1 PO001 A021 2016.01.01 IN 3
4 SO010 A021 2016.01.02 OUT 2
2 PO002 A021 2016.01.10 IN 7
3 PO003 A021 2016.02.01 IN 9
5 SO011 A021 2016.02.11 OUT 8
6 SO012 A023 2016.02.12 OUT 6
How could I write a query to give output like the table below?
SOID POID Quantity
------------------------
SO010 PO001 2
SO011 PO001 1
SO011 PO002 7
SO012 PO003 6

So, seeing as no one else has given this a go, I figure I'll post something that resembles an answer (I believe).
Essentially, what you want to do is keep track of the number of things you have in stock and the number of things that have gone out, based on the date (I haven't accounted for multiple things coming in or going out on the same date, though).
DECLARE @Table TABLE
(
DocumentID VARCHAR(10) NOT NULL,
TranCode VARCHAR(3) NOT NULL,
TranDate DATE NOT NULL,
Quantity INT NOT NULL
); -- I'm ignoring the other columns here because they don't seem important to your overall needs.
INSERT @Table (DocumentID, TranCode, TranDate, Quantity)
VALUES
('PO001', 'IN', '2016-01-01', 3),
('SO010', 'OUT', '2016-01-02', 2),
('PO002', 'IN', '2016-01-10', 7),
('PO003', 'IN', '2016-02-01', 9),
('SO011', 'OUT', '2016-02-11', 8),
('SO012', 'OUT', '2016-02-12', 6);
WITH CTE AS
(
SELECT DocumentID,
TranCode,
TranDate,
Quantity,
RunningQuantity = -- Determine the current IN/OUT totals.
(
SELECT SUM(Quantity)
FROM @Table
WHERE TranCode = T.TranCode
AND TranDate <= T.TranDate
),
PrevQuantity = -- Keep track of the previous IN/OUT totals.
(
SELECT ISNULL(SUM(Quantity), 0)
FROM @Table
WHERE TranCode = T.TranCode
AND TranDate < T.TranDate
)
FROM @Table T
)
)
SELECT Outgoing.DocumentID,
Incoming.DocumentID,
Quantity =
CASE WHEN Outgoing.RunningQuantity <= Incoming.RunningQuantity AND Outgoing.PrevQuantity >= Incoming.PrevQuantity
THEN Outgoing.RunningQuantity - Outgoing.PrevQuantity
WHEN Outgoing.RunningQuantity <= Incoming.RunningQuantity AND Outgoing.PrevQuantity < Incoming.PrevQuantity
THEN Outgoing.RunningQuantity - Incoming.PrevQuantity
ELSE Incoming.RunningQuantity - Outgoing.PrevQuantity
END
FROM CTE Outgoing
JOIN CTE Incoming ON
Incoming.TranCode = 'IN'
AND Incoming.RunningQuantity > Outgoing.PrevQuantity
AND Incoming.PrevQuantity < Outgoing.RunningQuantity
WHERE Outgoing.TranCode = 'OUT'
ORDER BY Outgoing.TranDate;
Note: I would highly recommend you keep track of the information in a better way. For example, create a table that actually details which orders took what from which other orders (an order transaction table or something), because while it's not impossible to achieve what you want with the way your data is structured, it's much less complicated if you just store more helpful data.
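For comparison, here is a minimal Python sketch of the same FIFO matching, which may help when checking the query output by hand. Like the query above, it ignores ItemID and assumes the rows are already in date order. The function name is made up for illustration.

```python
def fifo_match(movements):
    """movements: list of (document_id, tran_code, quantity) in date order.
    Returns (sales_order, purchase_order, quantity) rows, oldest stock first."""
    purchases = [[doc, qty] for doc, code, qty in movements if code == "IN"]
    matches = []
    cursor = 0  # index of the oldest purchase that still has stock left
    for doc, code, qty in movements:
        if code != "OUT":
            continue
        remaining = qty
        while remaining > 0 and cursor < len(purchases):
            purchase = purchases[cursor]
            taken = min(remaining, purchase[1])
            if taken:
                matches.append((doc, purchase[0], taken))
            purchase[1] -= taken
            remaining -= taken
            if purchase[1] == 0:
                cursor += 1  # this purchase is used up, move to the next one
    return matches

# The sample rows from the question, in TranDate order.
matches = fifo_match([
    ("PO001", "IN", 3),
    ("SO010", "OUT", 2),
    ("PO002", "IN", 7),
    ("PO003", "IN", 9),
    ("SO011", "OUT", 8),
    ("SO012", "OUT", 6),
])
```

On the question's data this reproduces the four rows of the desired output. Note that, like the cumulative-total approach, it can match a sale against a purchase dated after the sale if earlier stock has run out.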
