I’m creating an interim table in SQL Server for use with PowerBI to query financial data.
I have a finance transactions table tblfinance with
CREATE TABLE TBLFinance
(ID int,
Value float,
EntryDate date,
ClientName varchar (250)
)
INSERT INTO TBLFinance(ID ,Value ,EntryDate ,ClientName)
VALUES(1,'1783.26','2018-10-31 00:00:00.000','Alpha')
, (2,'675.3','2018-11-30 00:00:00.000','Alpha')
, (3,'243.6','2018-12-31 00:00:00.000','Alpha')
, (4,'8.17','2019-01-31 00:00:00.000','Alpha')
, (5,'257.23','2019-01-31 00:00:00.000','Alpha')
, (6,'28','2019-02-28 00:00:00.000','Alpha')
, (7,'1470.61','2019-03-31 00:00:00.000','Bravo')
, (8,'1062.86','2019-04-30 00:00:00.000','Bravo')
, (9,'886.65','2019-05-31 00:00:00.000','Bravo')
, (10,'153.31','2019-05-31 00:00:00.000','Bravo')
, (11,'150.24','2019-06-30 00:00:00.000','Bravo')
, (12,'690.14','2019-07-31 00:00:00.000','Charlie')
, (13,'21.67','2019-08-31 00:00:00.000','Charlie')
, (14,'339.29','2018-10-31 00:00:00.000','Charlie')
, (15,'807.96','2018-11-30 00:00:00.000','Delta')
, (16,'48.94','2018-12-31 00:00:00.000','Delta')
I’m calculating transaction values that fall within a week. My week ends on a Sunday, so I have the following query:
INSERT INTO tblAnalysis
(WeekTotal
, WeekEnd
, Client
)
SELECT SUM (VALUE) AS WeekTotal
, dateadd (day, case when datepart (WEEKDAY, EntryDate) = 1 then 0 else 8 - datepart (WEEKDAY, EntryDate) end, EntryDate) AS WeekEnd
, ClientName as Client
FROM dbo.tblFinance
GROUP BY dateadd (day, case when datepart (WEEKDAY, EntryDate) = 1 then 0 else 8 - datepart (WEEKDAY, EntryDate) end, EntryDate), CLIENTNAME
I've now been informed that some of the costs incurred within a given week may be monthly, and therefore need to be split into 4 weeks, or annual, so split into 52 weeks. I will write a case statement to update the costs based on ClientName, so assume there is an additional field called 'Payfrequency'.
I want to avoid having to pull the affected values into a temp table and effectively write this, because there'll be different sums applied depending on frequency.
SELECT *
INTO #MonthlyCosts
FROM
(
SELECT
client
, VALUE / 4 AS VALUE
, WEEKENDING
FROM tblAnalysis
UNION
SELECT
client
, VALUE / 4 AS VALUE
, DATEADD(WEEK,1,WEEKENDING) AS WEEKENDING
FROM tblAnalysis
UNION
SELECT
client
, VALUE / 4 AS VALUE
, DATEADD(WEEK,2,WEEKENDING) AS WEEKENDING
FROM tblAnalysis
UNION
SELECT
client
, VALUE / 4 AS VALUE
, DATEADD(WEEK,3,WEEKENDING) AS WEEKENDING
FROM tblAnalysis
) AS A
I know I need a stored procedure to hold variables so the calculations can be carried out dynamically, but have no idea where to start.
You can use recursive CTEs to split the data:
with cte as (
select ID, Value, EntryDate, ClientName, payfrequency, 1 as n
from TBLFinance f
union all
select ID, Value, EntryDate, ClientName, payfrequency, n + 1
from cte
where n < payfrequency
)
select *
from cte;
Note that by default this is limited to 100 recursion steps. You can add OPTION (MAXRECURSION 0) for an unlimited number of recursion steps.
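Applied to your weekly split, a rough sketch could look like the following. This assumes TBLFinance has gained the payfrequency column you mentioned (1, 4 or 52); it is untested against your data:

```
-- Sketch only: splits each row into payfrequency equal parts, one week apart,
-- assuming TBLFinance has the payfrequency column described in the question.
WITH cte AS (
    SELECT ID, Value, EntryDate, ClientName, payfrequency, 1 AS n
    FROM TBLFinance
    UNION ALL
    SELECT ID, Value, EntryDate, ClientName, payfrequency, n + 1
    FROM cte
    WHERE n < payfrequency
)
SELECT ClientName AS Client,
       Value / payfrequency AS Value,
       DATEADD(WEEK, n - 1, EntryDate) AS WeekEnding
FROM cte
OPTION (MAXRECURSION 0);
```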
The best solution would be to make use of a numbers table: create a table on your server with one column holding a sequence of integers.
You can then use it like this for your weekly values:
SELECT
client
, VALUE / 52 AS VALUE
, DATEADD(WEEK,N.Number,WEEKENDING) AS WEEKENDING
FROM tblAnalysis AS A
CROSS JOIN tblNumbers AS N
WHERE N.Number <= 52
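If such a table doesn't exist yet, a minimal sketch for creating and filling it could be (the name tblNumbers matches the query above; the 0-51 range is an assumption sized for the 52-week split):

```
-- Sketch: a small numbers table with values 0 to 51,
-- enough for the weekly split of an annual cost.
CREATE TABLE tblNumbers (Number int NOT NULL PRIMARY KEY);

INSERT INTO tblNumbers (Number)
SELECT TOP (52) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1
FROM sys.all_columns;
```

With a 0-based table like this, the offsets run 0-51, so the filter in the query above covers exactly 52 weeks.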
Related
I have a table which logs each individual piece produced across several production machines, going back a number of years. Periodically (e.g. once per week) I want to check this table to establish "best performance" records for each machine and product combination, storing them in a new table according to the following rules;
The machine must have produced a minimum of 10,000 parts over a 10 day period - if only 9000 parts were produced over 10 days, this is an invalid record
The machine must have been running the same product without changing over for the entire period i.e. if on day 5 the product changed, this is an invalid record
The Performance data table looks like below [VisionMachineResults]
| ID | MCSAP  | DateTime                | ProductName | InspectionResult |
|----|--------|-------------------------|-------------|------------------|
| 1  | 123456 | 2020-01-01 08:29:34:456 | Product A   | 0                |
| 2  | 123456 | 2020-01-01 08:45:50:456 | Product B   | 1                |
| 3  | 844214 | 2020-01-01 08:34:48:456 | Product A   | 2                |
| 4  | 978415 | 2020-01-02 09:29:26:456 | Product C   | 0                |
| 5  | 985633 | 2020-01-04 23:29:11:456 | Product A   | 2                |
I am able to produce a result which gives a list of individual days performance per SAP / Product Combination, but I then need to process the data in a complex loop outside of SQL to establish the 10 day groups.
My current query is:
SELECT CAST(DateTime AS date) AS InputDate,
MCSAP,
ZAssetRegister.LocalName,
ProductName,
SUM(CASE WHEN InspectionResult = 0 THEN 1 END) AS OKParts,
COUNT(CASE WHEN InspectionResult > 0 THEN 1 END) AS NGParts
FROM [VisionMachineResults]
INNER JOIN ZAssetRegister ON VisionMachineResults.MCSAP = ZAssetRegister.SAP_Number
GROUP BY CAST(DateTime AS date),
MCSAP,
ProductName,
ZAssetRegister.LocalName
ORDER BY InputDate,
ZAssetRegister.LocalName;
Would it be possible to have the SQL query give the result in 10 day groups, instead of per individual day i.e.
01-01-2021 to 11-01-2021 | Machine 1 | Product 1 | 20,000 | 5,000
02-01-2021 to 12-01-2021 | Machine 1 | Product 1 | 22,000 | 1,000
03-01-2021 to 13-01-2021 | Machine 1 | Product 1 | 18,000 | 4,000
etc...
I would then iterate through the rows to find the one with the best percentage of OK parts. Any ideas appreciated!
This process needs to be considered on many levels. First, you mention 10 consecutive days. We don't know if those days include weekends, if the machines are running 24/7, or if the date range can skip over holidays as well. So, 10 days could be Jan 1 to Jan 10, but if you skip weekends, you only have 6 actual WEEKDAYS.
Next, consider a machine working on more than one product, whether switching between dates or even within a single day.
As a commenter indicated, having column names that are the same as a reserved word (such as DateTime) is bad practice; check whether any new columns would be common keywords that may cause confusion, and avoid them.
You also mention that you had to do complex looping checks, and how to handle joining out to 10 days, the splits, etc. I think I have a somewhat elegant approach to doing this that should prove to be rather simple in the scheme of things.
You are using SQL Server, so I will do this using TEMP tables via "#" table names. This way, when you are done with a connection, or a call to making this a stored procedure, you don't have to keep deleting and recreating them. That said, let me take you through it one step at a time.
First, I'm creating a simple table matching your structure, even keeping the DateTime column name.
CREATE TABLE VisionMachineResults
(
ID int IDENTITY(1,1) NOT NULL,
MCSAP nvarchar(6) NOT NULL,
DateTime datetime NOT NULL,
ProductName nvarchar(10) NOT NULL,
InspectionResult int NOT NULL,
CONSTRAINT ID PRIMARY KEY CLUSTERED
(
[ID] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
Now, I'm inserting the data, similar to what you have, but not millions of rows. You mention you are looking 10 days out, so I just padded the end with several extras to simulate that. I also explicitly forced a product change by the one machine on Jan 5th. Additionally, I added a product change on Jan 7th to trigger a "break" within your 10-day consideration. You'll see the results later.
insert into VisionMachineResults
(MCSAP, [DateTime], ProductName, InspectionResult )
values
( '123456', '2020-01-01 08:29:34.456', 'Product A', 0 ),
( '123456', '2020-01-01 08:29:34.456', 'Product B', 1 ),
( '844214', '2020-01-01 08:29:34.456', 'Product A', 2 ),
( '978415', '2020-01-02 08:29:34.456', 'Product C', 0 ),
( '985633', '2020-01-04 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-05 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-05 08:29:34.456', 'Product B', 0 ),
( '985633', '2020-01-06 08:29:34.456', 'Product A', 2 ),
( '985633', '2020-01-07 08:29:34.456', 'Product B', 0 ),
( '985633', '2020-01-08 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-09 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-10 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-11 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-12 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-13 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-14 08:29:34.456', 'Product A', 1 ),
( '985633', '2020-01-15 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-16 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-17 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-18 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-19 08:29:34.456', 'Product A', 0 ),
( '985633', '2020-01-20 08:29:34.456', 'Product A', 0 )
go
So now, consider this the baseline of YOUR production data. My first query will be doing a bunch of things, storing the pre-aggregations INTO a #tmpPartDailyCounts result table. This way you can look at them at the different stages and apply a sanity check to my approach.
Here, on a per machine (MCSAP) and date (without the time portion) basis, I am grabbing certain aggregates and keeping them grouped by machine and date.
select
VMR.MCSAP,
cast(VMR.DateTime as Date) as InputDate,
min( VMR.ProductName ) ProductName,
max( VMR.ProductName ) LastProductName,
count( distinct VMR.ProductName ) as MultipleProductsSameDay,
sum( case when VMR.InspectionResult = 0 then 1 else 0 end ) OKParts,
sum( case when NOT VMR.InspectionResult = 0 then 1 else 0 end ) BadParts,
count(*) TotalParts
into
#tmpPartDailyCounts
from
VisionMachineResults VMR
group by
VMR.MCSAP,
cast(VMR.DateTime as Date)
You were joining to an asset table and I don't think you really need that. If the machine made the product, does it matter if a final assembly is complete? Don't know; you would know better.
Now, the aggregates and why. The min( VMR.ProductName ) ProductName and max( VMR.ProductName ) LastProductName are just to carry forward the product name created on the date in question for any final output. If on a given day only one product was made, they would be the same anyhow, so just pick one. However, if on any day there are multiple products, the MIN() and MAX() will hold different values. If the same product is built across the whole day, then both values will be the same -- ON ANY SINGLE GIVEN DATE.
The rest are simple aggregates of OK parts, BAD parts (something was wrong), and also the TOTAL parts created regardless of any inspection failure. The total is the primary qualifier for you to hit your 10,000, but if you want to change it to 10,000 GOOD parts, change accordingly.
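As a quick, self-contained illustration of how the MIN/MAX pair flags a day with more than one product (made-up values):

```
-- Two rows for one machine/day: MIN and MAX differ, so more than one product ran that day.
SELECT MIN(ProductName)            AS ProductName,
       MAX(ProductName)            AS LastProductName,
       COUNT(DISTINCT ProductName) AS MultipleProductsSameDay
FROM (VALUES ('Product A'), ('Product B')) AS v (ProductName);
-- Returns: Product A | Product B | 2
```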
Now, at this point, I have a pre-aggregation done on a per machine and date basis. Next, I want a counter that is applied sequentially per date on which a machine produced anything. I will pull this result into a temp table #tmpPartDays. By using OVER/PARTITION, this creates a result that first puts the records in order of MCSAP, then by date, and outputs whatever the ROW_NUMBER() is for that row. So, if there is no activity for a given machine, such as over a weekend or a holiday when the machine is not running, the sequential counter via OVER/PARTITION still numbers the active days 1 through however many days there are... Again, query the result of this table and you'll see it.
By querying against the pre-aggregated table, the maybe 500k raw records have already been reduced down to, say, 450 machine/day rows, so this query only runs against those 450 and will be very quick.
SELECT
PDC.MCSAP,
PDC.InputDate,
MultipleProductsSameDay,
ROW_NUMBER() OVER(PARTITION BY MCSAP
ORDER BY [InputDate] )
AS CapDay
into
#tmpPartDays
FROM
#tmpPartDailyCounts PDC
ORDER BY
PDC.MCSAP;
Now for the kicker, tying this all together. I'm starting with just the #tmpPartDays JOINED to itself on the same MCSAP AND a must-have matching record 10 days out... This resolves the weekend/holiday issue, since the day counter is serially consecutive.
This now gives me the begin/end date ranges such as 1-10, 2-11, 3-12, 4-13, etc.
I then join to the #tmpPartDailyCounts result on the same machine AND a date between the respective begin (PD.InputDate) and end (PD2.InputDate) dates. I re-apply the same aggregates to get the total counts WITHIN EACH machine + 10-day period. Run this query WITHOUT the "HAVING" clause to see what is coming out.
select
PD.MCSAP,
PD.InputDate BeginDate,
PD2.InputDate EndDate,
SUM( PDC.MultipleProductsSameDay ) as TotalProductsMade,
sum( PDC.OKParts ) OKParts,
sum( PDC.BadParts ) BadParts,
sum( PDC.TotalParts ) TotalParts,
min( PDC.ProductName ) ProductName,
max( PDC.LastProductName ) LastProductName
from
#tmpPartDays PD
-- join again to get 10 days out for the END cycle
JOIN #tmpPartDays PD2
on PD.MCSAP = PD2.MCSAP
AND PD.CapDay +9 = PD2.CapDay
-- Now join to daily counts for same machine and within the 10 day period
JOIN #tmpPartDailyCounts PDC
on PD.MCSAP = PDC.MCSAP
AND PDC.InputDate >= PD.InputDate
AND PDC.InputDate <= PD2.InputDate
group by
PD.MCSAP,
PD.InputDate,
PD2.InputDate
having
SUM( PDC.MultipleProductsSameDay ) = 10
AND min( PDC.ProductName ) = max( PDC.LastProductName )
AND SUM( PDC.TotalParts ) >= 10
Finally, the elimination of the records you DON'T want. Since I don't have millions of records to simulate, just follow along. I am doing a HAVING on:
1. SUM( PDC.MultipleProductsSameDay ) = 10
If on ANY day there is more than one product created, the sum would be 11 or more, thus indicating not the same product, so that would cause an exclusion. But also, at the tail end of the data, with only say 7 days of production, it would never hit 10, which was your 10-day qualifier as well.
2. AND min( PDC.ProductName ) = max( PDC.LastProductName )
Here, since we are spanning back to the DAILY context, if ANY product changes on any date, the Product Name (via min) and LastProductName (via max) will change, regardless of the day, and regardless of the name context. So, by making sure both the min() and max() are the same, you know it is the same product across the entire span.
3. AND SUM( PDC.TotalParts ) >= 10
Finally, the count of things made. In this case, I did >= 10 because I was only testing with 1 item per day, thus 10 days = 10 items. In your scenario, you may have 987 parts on one day but 1,100 on another, balancing low and high production days to get to that 10,000; just change the condition to YOUR 10,000 minimum.
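So against real volumes, the HAVING clause of the final query above would presumably end up something like this (a sketch of a fragment, not a standalone query, with your 10,000 minimum substituted in):

```
-- Drop-in replacement for the HAVING clause of the final query above.
having
        SUM( PDC.MultipleProductsSameDay ) = 10              -- one product per day, all 10 days present
    AND min( PDC.ProductName ) = max( PDC.LastProductName )  -- same product across the whole span
    AND SUM( PDC.TotalParts ) >= 10000                       -- your minimum parts over the 10 days
```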
This SQL Fiddle shows the results as it gets down to the per machine/day level, showing the sequential activity. The last MCSAP machine starts on Jan 4th, but has a sequential day row assignment starting at 1 to give proper context to the 1-10, 2-11, etc.
First SQL Fiddle showing machine/day
The second fiddle shows the final query WITHOUT the HAVING clause; you can see the first couple of rows have TotalProductsMade = 11, which means that SOMEWHERE in the day-span in question different products were created, so they would be excluded from the final result. For the begin and end dates of Jan 6-15 and Jan 7-16, you will see the MIN/MAX products showing Product A and Product B, thus indicating that SOMEWHERE within the 10-day span a product switched... These too will be excluded.
The FINAL query shows the results with the HAVING clause applied.
One option that comes to my mind is the use of a numbers table (google Jeff Moden on SQL Server Central for more background).
The numbers table is then combined with a start date (from the range of dates to investigate) to generate, in addition to a date to link to, a "bucket" by which to group afterwards.
Similar to:
-- generate date frame from and to
DECLARE
@date_start date = Convert( date, '20211110', 112 ),
@date_end date = Convert( date, '20220110', 112 )
;
WITH
cteN
(
Number
)
AS
( -- build a list of 10 single digit numbers
SELECT Cast( 0 AS int ) AS Number UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
)
,
cteNumbers
(
Number
)
AS
( -- splice single digit numbers to list from 0 to 99999
SELECT
cN10000.Number * 10000 + cN1000.Number * 1000 + cN100.Number * 100 + cN10.Number * 10 + cN1.Number
FROM
cteN AS cN10000
CROSS JOIN cteN AS cN1000
CROSS JOIN cteN AS cN100
CROSS JOIN cteN AS cN10
CROSS JOIN cteN AS cN1
)
,
cteBucketOffset
(
DatediffNum,
Offset
)
AS
( -- determine the offset in datediffs to number buckets later correctly
SELECT
Cast( Datediff( dd, @date_start, @date_end ) AS int ) - 1 AS DatediffNum,
Cast( Datediff( dd, @date_start, @date_end ) % 10 AS tinyint ) - 1 AS Offset
)
,
cteDates
(
Dated,
Bucket,
BucketNumber,
BucketOffset,
DatediffNum
)
AS
( -- generate list of dates with bucket batches and numbers
SELECT
Dateadd( dd, cN.Number * -1, @date_end ) AS Dated,
Cast( ( cBO.Offset + cN.Number ) / 10 AS int ) AS Bucket,
Cast( ( cBO.Offset + cN.Number ) % 10 AS tinyint ) AS BucketNumber,
cBO.Offset,
cBO.DatediffNum
FROM
cteNumbers AS cN
CROSS JOIN cteBucketOffset AS cBO
WHERE
cN.Number <= Datediff( dd, @date_start, @date_end )
)
SELECT
*
FROM
cteDates AS cD
ORDER BY
cD.Dated ASC
;
Long-winded due to showing each step. The result is a table-on-the-fly that can be joined back to the raw data; "Bucket" can then be used instead of the date itself to group the raw data.
Once this data is built, decisions can be made on the grouped conditions, like requiring a minimum number of rows.
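To make that concrete, the closing SELECT * above could be swapped for a join back to the raw data, grouped on the bucket. A rough sketch (column names follow the VisionMachineResults example earlier; the HAVING condition is a placeholder for whatever minimum you settle on):

```
-- Sketch: replaces the final "SELECT * FROM cteDates AS cD ..." of the query above.
SELECT
    cD.Bucket,
    Min( cD.Dated )                                              AS BucketStart,
    Max( cD.Dated )                                              AS BucketEnd,
    VMR.MCSAP,
    Sum( CASE WHEN VMR.InspectionResult = 0 THEN 1 ELSE 0 END )  AS OKParts,
    Count( * )                                                   AS TotalParts
FROM
    cteDates AS cD
    INNER JOIN VisionMachineResults AS VMR
        ON Cast( VMR.[DateTime] AS date ) = cD.Dated
GROUP BY
    cD.Bucket,
    VMR.MCSAP
HAVING
    Count( * ) >= 10000 -- placeholder for the minimum-parts rule
;
```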
Seems just a matter of grouping on the year and the day of the year divided by 10.
SELECT
CONCAT(CONVERT(VARCHAR(10),MIN([DateTime]),105), ' to ', CONVERT(VARCHAR(10), MAX([DateTime]), 105)) AS InputDateRange
, MCSAP
, MAX(ZAssetRegister.LocalName) AS LocalName
, ProductName
, SUM(CASE WHEN InspectionResult = 0 THEN 1 END) AS OKParts
, COUNT(CASE WHEN InspectionResult > 0 THEN 1 END) AS NGParts
, COUNT(DISTINCT CAST([Datetime] AS DATE)) AS total_days
FROM VisionMachineResults
JOIN ZAssetRegister
ON VisionMachineResults.MCSAP = ZAssetRegister.SAP_Number
GROUP BY
DATEPART(YEAR, [DateTime]),
CEILING(DATEPART(DAYOFYEAR, [DateTime])/10.0),
MCSAP,
ProductName
ORDER BY
MIN([DateTime]),
MAX(ZAssetRegister.LocalName);
Simplified test on db<>fiddle here
I have a database that has customer, product, date and volume/revenue data. I'd like to create two NEW columns to show the previous year volume and revenue based on the date/customer/product.
I've tried unioning two views: one that has the dates unchanged, and a second view built on a CTE where I select the dates minus one year, with another select statement off of that in which VOL and REV are renamed VOL_PY and REV_PY, but the data is incomplete. Basically what's happening is the PY data only pulls volume and revenue if there is data in the prior year (for example, if a customer didn't sell a product in 2021 but DID in 2020, it wouldn't pull the VOL_PY for 2020, because it didn't sell in 2021). How do I get my code to include matches on dates but also the instances where there isn't data in the "current" year?
Here's what I'm going for:
[EXAMPLE DATA WITH NEW COLUMNS]
CURRENT YEAR VIEW:
SELECT
CUSTOMER
,PRODUCT
,DATE
,VOL
,REV
,0 AS VOL_HL_PY
,0 AS REV_DOLS_PY
,DATEADD(YEAR, -1, DATE) AS DATE_PY
FROM dbo.vwReporting
PREVIOUS YEAR VIEW:
WITH CTE_PYFIGURES
([AUTONUMBER]
,CUSTOMER
,PRODUCT
,DATE
,VOL
,REV
,DATE_PY
) AS
(
SELECT b.*
, DATEADD(YEAR, 1, DATE) AS DATE_PY
FROM dbo.basetable b
)
SELECT
v.CUSTOMER
,v.PRODUCT
,v.DATE
,0 AS v.VOL
,0 AS v.REV
,CTE.VOL_HL AS VOL_HL_PY
,CTE.REV_DOLS AS REV_DOLS_PY
,DATEADD(YEAR,-1,CTE.PERIOD_DATE_PY) AS PERIOD_DATE_PY
FROM dbo.vwReporting AS v
FULL OUTER JOIN CTE_PYFIGURES AS CTE ON CTE.CUSTOMER=V.CUSTOMER AND CTE.PRODUCT=V.PRODUCT AND CTE.DATE_PY=V.DATE
You need to offset your current year's data to one year forward and then union it with the current data, placing zeroes for "other" measures (VOL and REV for previous year and VOL_PY and REV_PY for current year). Then do aggregation. This way you'll have all the dimensions' values that were in current or previous year.
with a as (
select
CUSTOMER
, PRODUCT
, [DATE]
, VOL
, REV
, 0 as vol_py
, 0 as rev_py
from dbo.vwReporting
union all
select
CUSTOMER
, PRODUCT
, dateadd(year, 1, [DATE]) as [DATE]
, 0 as VOL
, 0 as REV
, vol as vol_py
, rev as rev_py
from dbo.vwReporting
)
select
CUSTOMER
, PRODUCT
, [DATE]
, sum(vol) as vol
, sum(rev) as rev
, sum(vol_py) as vol_py
, sum(rev_py) as rev_py
from a
group by
CUSTOMER
, PRODUCT
, [DATE]
Would you please help me solve the task below in SQL (MS SQL Server 2017)? It is simple in Excel, but seems very complicated in SQL.
There is a table with clients and their activities split by days:
| client  | 1may | 2may | 3may | 4may | 5may | other days |
|---------|------|------|------|------|------|------------|
| client1 | 0    | 0    | 0    | 0    | 0    | ...        |
| client2 | 0    | 0    | 0    | 0    | 0    | ...        |
| client3 | 0    | 0    | 0    | 0    | 0    | ...        |
| client4 | 1    | 1    | 1    | 1    | 1    | ...        |
| client5 | 1    | 1    | 1    | 0    | 0    | ...        |
It is necessary to create the same table (the same quantity of rows and columns), but to turn the values into new ones according to the rule:
Current day value =
A) If all daily values during the week before the day, including the current one, = 1, then 1
B) If all daily values during the week before the day, including the current one, = 0, then 0
C) If the values are mixed, then we keep the status of the previous day (if the status of the previous day is not known, for example the client is new, then 0)
In Excel, I do this using the formula: = IF (AND (AF2 = AE2; AE2 = AD2; AD2 = AC2; AC2 = AB2; AB2 = AA2; AA2 = Z2); current_day_value; IF (previous_day_value = ""; 0; previous_day_value )).
The example with excel file is attached.
Thank you very much.
First thing, it's NEVER a good idea to have dates as columns.
So step #1: transpose your columns to rows. In other words, build a table with three columns (a sketch with UNPIVOT follows the example below):
```
client date Value
client1 May1 0
client1 May2 0
client1 May3 0
.... ... ..
client4 May1 1
client4 May2 1
client4 May3 1
.... ... ..
```
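A minimal sketch of that transposition using UNPIVOT (all names here, such as ClientActivityWide and the [1may]-style day columns, are assumptions about your actual table):

```
-- Sketch only: turn the per-day columns back into rows of (client, date, Value).
SELECT client, [date], [Value]
FROM ClientActivityWide
UNPIVOT ([Value] FOR [date] IN ([1may], [2may], [3may], [4may], [5may])) AS u;
```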
Step #2: perform all the calculations you need by using the date field.
Basically you always carry forward the status of the previous day, in any case (except null).
So, I would do something like this (Oracle syntax, working in SQL Server too), supposing the first column is 1may:
INSERT INTO newTable (client, 1may, 2may, ....) SELECT client, 0, COALESCE(1may, 0), COALESCE(2may, 0), .... FROM oldTable;
Anyway, I too believe it is not good practice to put the days as columns of a relational table.
You're going to struggle with this because most brands of SQL don't allow "arbitrary pivoting"; that is, you need to specify the columns you want displayed in a pivot, whereas Excel will just do this for you. SQL can do it, but it requires dynamic SQL, which can get pretty complicated and annoying pretty fast.
I would suggest you use SQL just to construct the data, and then Excel or SSRS (as you're in T-SQL) to actually do the visualization.
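For completeness, a rough sketch of that dynamic-SQL pivot, assuming the data has already been reshaped into a long-format table called ClientActivity (Client, RowDate, Value); every name here is hypothetical:

```
-- Sketch: build the column list from the distinct dates, then pivot dynamically.
DECLARE @cols nvarchar(max), @sql nvarchar(max);

SELECT @cols = STRING_AGG(QUOTENAME(d.RowDate), ', ') WITHIN GROUP (ORDER BY d.RowDate)
FROM (SELECT DISTINCT CONVERT(varchar(10), RowDate, 120) AS RowDate FROM ClientActivity) AS d;

SET @sql = N'SELECT Client, ' + @cols + N'
FROM (SELECT Client, CONVERT(varchar(10), RowDate, 120) AS RowDate, Value FROM ClientActivity) AS src
PIVOT (MAX(Value) FOR RowDate IN (' + @cols + N')) AS p;';

EXEC sys.sp_executesql @sql;
```

STRING_AGG needs SQL Server 2017, which is what the question states; on older versions the usual STUFF/FOR XML PATH trick would be needed instead.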
Anyway. I think this does what you want:
WITH Data AS (
SELECT * FROM (VALUES
('Client 1',CONVERT(DATE, '2020-05-04'),1)
, ('Client 1',CONVERT(DATE, '2020-05-05'),1)
, ('Client 1',CONVERT(DATE, '2020-05-06'),1)
, ('Client 1',CONVERT(DATE, '2020-05-07'),0)
, ('Client 1',CONVERT(DATE, '2020-05-08'),0)
, ('Client 1',CONVERT(DATE, '2020-05-09'),0)
, ('Client 1',CONVERT(DATE, '2020-05-10'),1)
, ('Client 1',CONVERT(DATE, '2020-05-11'),1)
, ('Client 1',CONVERT(DATE, '2020-05-12'),1)
, ('Client 2',CONVERT(DATE, '2020-05-04'),1)
, ('Client 2',CONVERT(DATE, '2020-05-05'),0)
, ('Client 2',CONVERT(DATE, '2020-05-06'),0)
, ('Client 2',CONVERT(DATE, '2020-05-07'),1)
, ('Client 2',CONVERT(DATE, '2020-05-08'),0)
, ('Client 2',CONVERT(DATE, '2020-05-09'),1)
, ('Client 2',CONVERT(DATE, '2020-05-10'),0)
, ('Client 2',CONVERT(DATE, '2020-05-11'),1)
) x (Client, RowDate, Value)
)
SELECT
Client
, RowDate
, Value
, CASE
WHEN OnesBefore = DaysInWeek THEN 1
WHEN ZerosBefore = DaysInWeek THEN 0
ELSE PreviousDayValue
END As FinalCalculation
FROM (
-- This set uses windowing to calculate the intermediate values
SELECT
*
-- The count of the days present in the data, as part of the week may be missing we can't assume 7
-- We only count up to this day, so its in line with the other parts of the calculation
, COUNT(RowDate) OVER (PARTITION BY Client, WeekCommencing ORDER BY RowDate) AS DaysInWeek
-- Count up the 1's for this client and week, in date order, up to (and including) this date
, COUNT(IIF(Value = 1, 1, NULL)) OVER (PARTITION BY Client, WeekCommencing ORDER BY RowDate) AS OnesBefore
-- Count up the 0's for this client and week, in date order, up to (and including) this date
, COUNT(IIF(Value = 0, 1, NULL)) OVER (PARTITION BY Client, WeekCommencing ORDER BY RowDate) AS ZerosBefore
-- get the previous days value, or 0 if there isnt one
, COALESCE(LAG(Value) OVER (PARTITION BY Client, WeekCommencing ORDER BY RowDate), 0) AS PreviousDayValue
FROM (
-- This set adds a few simple values in that we can leverage later
SELECT
*
, DATEADD(DAY, -DATEPART(DW, RowDate) + 1, RowDate) As WeekCommencing
FROM Data
) AS DataWithExtras
) AS DataWithCalculations
As you haven't specified your table layout, I don't know what table and field names to use in my example. Hopefully if this is correct you can figure out how to click it in place with what you have - If not, leave a comment
I will note as well, I've made this purposely verbose. If you don't know what the "OVER" clause is, you'll need to do some reading: https://www.sqlshack.com/use-window-functions-sql-server/. The gist is they do aggregations without actually crunching the rows together.
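A tiny illustration of the difference, with throwaway data: GROUP BY would collapse this to one row per client, while the windowed SUM keeps every row and just attaches the client total to each of them.

```
SELECT Client, RowDate, Value,
       SUM(Value) OVER (PARTITION BY Client) AS ClientTotal
FROM (VALUES ('Client 1', CAST('2020-05-04' AS date), 1),
             ('Client 1', CAST('2020-05-05' AS date), 0),
             ('Client 2', CAST('2020-05-04' AS date), 1)) AS x (Client, RowDate, Value);
```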
Edit: Adjusted the calculation to be able to account for an arbitrary number of days in the week
Thank you so much to everyone, especially David and Massimo, who prompted me to restructure the data.
--we cross join clients and dates and label each client/date combination as 'active' or 'inactive'
with a as (
select client, dates
from (select distinct client from dbo.clients) a
cross join (select dates from dates) b
)
, b as (
select date
,1 end active
,client
from clients a
join dbo.dates b on a.id = b.id
)
select a.client
,a.dates
,isnull(b.active, 0) active
into #tmp2
from a
left join b on a.client= b.client and a.dates = b.dates
--declare variables - for date start and for loop
declare @min_date date = (select min(dates) from #tmp2);
declare @n int = 1
declare @row int = (select count(distinct dates) from #tmp2) --number of the loop iterations
--delete data from the final results
delete from final_results
--fill the table with final results
--run the loop (each iteration = analyse of each 1-week range)
while @n<=@row
begin
with a as (
--run the loop
select client
,max(dates) dates
,sum (case when active = 1 then 1 else null end) sum_active
,sum (case when active = 0 then 1 else null end) sum_inactive
from #tmp2
where dates between dateadd(day, -7 + @n, @min_date) and dateadd(day, -1 + @n, @min_date)
group by client
)
INSERT INTO [dbo].[final_results]
(client
,[dates]
,[final_result])
select client
,dates
,case when sum_active = 7 then 1 --rule A
when sum_inactive = 7 then 0 -- rule B
else
(case when isnull(sum_active, 0) + isnull(sum_inactive, 0) < 7 then 0
else
(select final_result
from final_results b
where b.dates = dateadd(day, -1, a.dates)
and a.client= b.client) end
) end
from a
set @n=@n+1
end
if object_id(N'tempdb..#tmp2', 'U') is not null drop table #tmp2
Good afternoon,
Hope that you're all well and wish you a happy new year.
I'm experiencing some curious behaviour with a query that I've written in that the LAG function is inconsistent.
Essentially, I have a dataset (made up of 2 CTEs) which each contain the month (in MMM-YYYY format) and then one holds a count of tickets opened, and the other contains the same but for tickets closed.
What I am then doing is adding in a 'Backlog' column (which will be 0 for the first month in all cases) and a 'Carried Forward' column. The Carried Forward amount will be the balance of that month ( Created + Backlog - Resolved ) and will be reflected as the Backlog for the following month.
I had this ticking over quite nicely until I realised that negative backlogs were fudging the numbers a bit. What I mean is, for example:
10 Tickets Created
12 Tickets Resolved
0 Ticket Backlog
-2 Tickets Carried Forward
In this circumstance, I've had to zero any negative backlog for our reporting purposes.
This is seemingly where the problems come into play. For the first few months, everything will be fine - the values will be right, carrying forward the correct numbers and factoring them into the calculations accordingly. But then it will carry over a number of (seemingly) indeterminable origin which of course, has a knock-on effect on the accuracy past this point.
With the Window Functions introduced with SQL Server 2012, this should be quite basic - but evidently not!
Whilst I'm quite happy to post code (I have tried a fair few ways of skinning this cat), I feel as though if someone is able to give a high-level overview of how it should be written, I'll see where I went wrong immediately. In doing so, I'll then respond accordingly with my attempt/s for completeness.
Thank you very much in advance!
Picture of result error:
, OpenClosed AS
(
SELECT
c.[Created Month] 'Month'
, c.Tickets 'Created'
, r.Tickets 'Resolved'
, IIF( ( c.Tickets - r.Tickets ) < 0, 0, ( c.Tickets - r.Tickets ) ) 'Balance'
FROM
Created c
JOIN Resolved r ON
c.[Created Month] = r.[Resolved Month]
)
, CarryForward AS
(
SELECT
ROW_NUMBER() OVER( ORDER BY CAST( '1.' + Month AS DATETIME ) ) 'Row No'
, Month 'Month'
, Created 'Created'
, Resolved 'Resolved'
, LAG( Balance, 1, 0 ) OVER( ORDER BY CAST( '1.' + Month AS DATETIME ) ) 'Backlog'
, IIF( ( ( Created + LAG( Balance, 1, 0 ) OVER( ORDER BY CAST( '1.' + Month AS DATETIME ) ) ) - Resolved ) < 0
, 0
, ( ( Created + LAG( Balance, 1, 0 ) OVER( ORDER BY CAST( '1.' + Month AS DATETIME ) ) ) - Resolved )
) 'Carry Forward'
FROM
OpenClosed
)
SELECT
c1.Month 'Month'
, c1.Created 'Created'
, c1.Resolved 'Resolved'
, c2.[Carry Forward] 'Backlog'
, IIF( ( c1.Created + c2.[Carry Forward] ) - c1.Resolved < 0
, 0
, ( c1.Created + c2.[Carry Forward] ) - c1.Resolved
) 'Carried Forward'
FROM
CarryForward c1
JOIN CarryForward c2 ON
c2.[Row No] = c1.[Row No]-1
From comments on the question. Incidentally, the Created Month column should be redone somehow so that the year is placed before the month, like 2015-01. This will ensure correct ordering by default sort algorithms.
If the date must be presented as Jan-2015 in the final report, do that presentational work as the very final step in the query.
WITH ticket_account AS
(
SELECT
c.[Created Month] AS Month
,c.Tickets AS Created
,r.Tickets AS Resolved
FROM
Created AS c
INNER JOIN
Resolved AS r
ON c.[Created Month] = r.[Resolved Month]
)
SELECT
*
,(SUM(Created) OVER (ORDER BY Month ASC) - SUM(Resolved) OVER (ORDER BY Month ASC)) AS Balance
FROM
ticket_account
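If the month really does need to be displayed as Jan-2015, that conversion can stay in the outermost step only. A sketch, assuming the reworked [Month] column has become an actual date (e.g. the first of each month):

```
-- Sketch: replaces the final SELECT of the query above; formatting happens only here.
SELECT
    FORMAT(ta.[Month], 'MMM-yyyy') AS [Display Month]   -- presentation only
   ,ta.Created
   ,ta.Resolved
   ,(SUM(ta.Created)  OVER (ORDER BY ta.[Month] ASC)
   - SUM(ta.Resolved) OVER (ORDER BY ta.[Month] ASC))   AS Balance
FROM
    ticket_account AS ta
ORDER BY
    ta.[Month] ASC;
```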
I am currently working on SSIS. I have a table with two columns Start and End dates. I need to calculate the days in between (including the start date and end date) and generate a row for each day with the other data repeating. The resulting dates should be stored in a new column.
The trick to making this work is to have a table that contains a list of all the days that are in the possible range.
In the following query, I fake it out by using the smallest date in our set, MIN(Start).
I then generate a sequence of numbers (0 to N-1) based on the rows in the sys.all_columns view. That might be sufficient, it might not, but given the paucity of data it works for now. If you need more dates generated, CROSS APPLY the sys.all_columns view against itself (a sketch of that follows the sample results at the end).
I then use the numbers generated to build a list of dates via DATEADD.
I then take my ALLDATES derived table and perform an INNER JOIN to the original table, pinning the date generated in ALLDATES between the Start and End columns (end points inclusive).
CREATE TABLE dbo.so_36392684
(
WeekNo int NOT NULL
, Start datetime NOT NULL
, [End] datetime NOT NULL
, SpecialEvents varchar(20) NULL
);
INSERT INTO
dbo.so_36392684
(WeekNo, Start, [End], SpecialEvents)
VALUES
(
1
, '1989-09-14'
, '1989-09-20'
, NULL
);
SELECT
S.WeekNo
, S.Start
, S.[End]
, S.SpecialEvents
, ALLDATES.ConsecutiveDays
FROM
(
SELECT
DATEADD(DAY, D.rn, S.Start) AS ConsecutiveDays
FROM
(
-- Find the first date in our table
SELECT
MIN(S.Start) AS Start
FROM
dbo.so_36392684 AS S
) AS S
CROSS APPLY
(
-- Generate a (hopefully) sufficiently large enough set of dates
SELECT
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS rn
FROM
sys.all_columns AS AC
) D
) AS ALLDATES
INNER JOIN
dbo.so_36392684 AS S
ON ALLDATES.ConsecutiveDays >= S.Start
AND ALLDATES.ConsecutiveDays <= S.[End];
Result should look something like this
WeekNo Start End SpecialEvents ConsecutiveDays
1 1989-09-14 1989-09-20 NULL 1989-09-14
1 1989-09-14 1989-09-20 NULL 1989-09-15
1 1989-09-14 1989-09-20 NULL 1989-09-16
1 1989-09-14 1989-09-20 NULL 1989-09-17
1 1989-09-14 1989-09-20 NULL 1989-09-18
1 1989-09-14 1989-09-20 NULL 1989-09-19
1 1989-09-14 1989-09-20 NULL 1989-09-20
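If sys.all_columns alone doesn't produce enough rows to cover the full date span, the D derived table in the query above can be widened by crossing the view against itself (a plain CROSS JOIN behaves the same as the CROSS APPLY mentioned earlier when the two sides are independent). A sketch of the replacement:

```
-- Sketch: drop-in replacement for the D derived table above, yielding a much larger number set.
SELECT
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS rn
FROM
    sys.all_columns AS AC1
    CROSS JOIN sys.all_columns AS AC2
```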