Stretch the table of balances, for all dates of the calendar - sql

I have a table of stock balances (Remains), with one row for each date on which a balance changed.
SQLFiddle
I need to stretch these balances across the missing dates, including dates with a zero balance.
I created a calendar:
CREATE TABLE Calendar
("Date" DATE)
GO
DECLARE @start datetime
DECLARE @end datetime
SET @start = (SELECT MIN(a.Date) FROM Remains a)
SET @end = GETDATE();
WITH cte AS
(
SELECT @start "Date"
UNION ALL
SELECT "Date" + 1
FROM cte
WHERE "Date" < @end
)
INSERT INTO Calendar
SELECT cast("Date" AS date) AS "Date"
FROM cte
WHERE "Date" < GETDATE()
option(MAXRECURSION 0)
Here I take the minimum date from the Remains table and extend the calendar up to today.
Next, I join the Calendar to the Remains table using OUTER APPLY:
SELECT
b.Date
,x.W_Code
,x.Prod_Code
,x.Quality
,x.Count
FROM Calendar b
OUTER APPLY (
SELECT
a.Date
,a.W_Code
,a.Prod_Code
,a.Quality
,a.Count
FROM Remains a
WHERE a.Date = b.Date AND a.Prod_Code = N'00005026957' AND a.W_Code = N'000000017' ) x
If my query is restricted to one specific warehouse, product and quality, I get the desired result.
But if I remove the condition on the warehouse and the product, the table is built incorrectly.
I need each group of W_Code, Prod_Code, Quality to have every date from the Calendar.
Please help me find a way to implement this.
Maybe I don't need a Calendar at all.
I read about recursive CTEs but did not understand how to apply them here.
I know how to fill in the values afterwards; the problem is joining the table to the dates correctly.
Thanks
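One possible way to get every Calendar date for each group is to cross join the Calendar with the distinct (W_Code, Prod_Code, Quality) combinations and then left join Remains onto that grid. The following is only a sketch: it assumes the Calendar and Remains tables exactly as described in the question, and it leaves "Count" as NULL on dates without a change so that the values can be filled in afterwards.

```sql
-- Sketch only: assumes the Calendar and Remains tables from the question.
SELECT
    c."Date"
    ,g.W_Code
    ,g.Prod_Code
    ,g.Quality
    ,r."Count"              -- NULL where no balance change was recorded that day
FROM Calendar c
CROSS JOIN (
    -- one row per warehouse / product / quality combination
    SELECT DISTINCT W_Code, Prod_Code, Quality
    FROM Remains
) g
LEFT JOIN Remains r
    ON  r."Date"    = c."Date"
    AND r.W_Code    = g.W_Code
    AND r.Prod_Code = g.Prod_Code
    AND r.Quality   = g.Quality
ORDER BY g.W_Code, g.Prod_Code, g.Quality, c."Date";
```

The cross join builds the full date-by-group grid first, so the left join can only add data to existing rows and never removes a date for any group.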

Related

Include zero counts for grouping date ranged based SQL query

I'm trying to group and order the number of sales made on each day from a single 'sales' table with a created_at column and an id column. Records may be created at any time throughout the day. I've managed to do this with the following query:
SELECT date_trunc('day', created_at::date) AS period, COUNT(id)
FROM sales
GROUP BY period
ORDER BY period
However, the days with 0 sales do not show up. Any ideas?
Most of the answers I've found use LEFT JOIN, but I can't get it to work, so I may be misunderstanding how to use it. :(
Thank you in advance!
Create a date range that returns the required dates (here with a recursive CTE) and then join to it:
DECLARE @StartDateTime DATETIME
DECLARE @EndDateTime DATETIME
SET @StartDateTime = '2015-01-01'
SET @EndDateTime = '2015-01-12';
WITH DateRange(DateData) AS
(
SELECT @StartDateTime as Date
UNION ALL
SELECT DATEADD(d,1,DateData)
FROM DateRange
WHERE DateData < @EndDateTime
)
SELECT DateRange.DateData, Count(sales.id)
FROM sales
right join DateRange on sales.date = DateRange.DateData
group by DateRange.DateData
OPTION (MAXRECURSION 0)
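Since the query in the question uses PostgreSQL syntax (date_trunc, ::date), the same idea could also be sketched there with generate_series. This is only an illustration: the sales table with its id and created_at columns comes from the question, while the date bounds are made up to match the example above.

```sql
-- PostgreSQL sketch; the date bounds are illustrative, not from the question.
SELECT d.period::date AS period,
       COUNT(s.id)    AS sales_count   -- COUNT of a NULL id gives 0 on empty days
FROM generate_series('2015-01-01'::date,
                     '2015-01-12'::date,
                     interval '1 day') AS d(period)
LEFT JOIN sales s
       ON s.created_at >= d.period
      AND s.created_at <  d.period + interval '1 day'
GROUP BY d.period
ORDER BY d.period;
```

Starting from the generated dates and left joining sales onto them is what keeps the zero-sale days in the result.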

How to add a set of dates for each category in a dimension?

I have data with a monthly count of a particular animal for each month. By default, it only has rows for the months where there is data.
However, I would like to have a default set of dates for each animal up to the current month, with 0 where there is no data.
Is there a way to handle this in SQL Server and not in Excel?
Much appreciated in advance.
You can generate the months you want using a numbers table or recursive CTE (or calendar table). Then cross join with the animals to generate the rows and use left join to bring in the existing data:
with dates as (
select min(date) as dte
from t
union all
select dateadd(month, 1, dte)
from dates
where dateadd(month, 1, dte) <= getdate() -- stop at the current month
)
select a.animal, d.dte, coalesce(t.monthly_count, 0) as monthly_count
from dates d cross join
(select distinct animal from t) a left join
t
on t.date = d.dte and t.animal = a.animal
order by a.animal, d.dte;

adding a row for missing data

For a date range of 2017-02-01 to 2017-02-10, I'm calculating a running balance.
There are days with missing data. How would I include these missing dates with the previous day's balance?
Example data:
We are missing data for 2017-02-04, 2017-02-05 and 2017-02-06. How would I add rows in the query with the previous balance?
The date range is a parameter, so it could change.
Can I use something like the LAG function?
I would be inclined to use a recursive CTE and then fill in the values. Here is one approach using outer apply:
with dates as (
select mind as dte, mind, maxd
from (select min(date) as mind, max(date) as maxd from t) t
union all
select dateadd(day, 1, dte), mind, maxd
from dates
where dte < maxd
)
select d.dte, t.balance
from dates d outer apply
(select top 1 t.*
from t
where t.date <= d.dte
order by t.date desc
) t;
You can generate dates using tally table as below:
Declare @d1 date ='2017-02-01'
Declare @d2 date ='2017-02-10'
;with cte_dates as (
Select top (datediff(D, @d1, @d2)+1) Dates = Dateadd(day, Row_Number() over (order by (Select NULL))-1, @d1) from
master..spt_values s1, master..spt_values s2
)
Select * from cte_dates left join ....
Then left join to your table and compute the running total.
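For illustration, the tally-generated dates could be combined with the OUTER APPLY lookup from the previous answer to carry the last known balance forward. This is a sketch under assumptions: the table t with date and balance columns is borrowed from that answer, and OUTER APPLY is used here in place of the left join mentioned above.

```sql
-- Sketch only: t(date, balance) is assumed, as in the previous answer.
DECLARE @d1 date = '2017-02-01';
DECLARE @d2 date = '2017-02-10';

WITH cte_dates AS (
    -- one row per day between @d1 and @d2, inclusive
    SELECT TOP (DATEDIFF(DAY, @d1, @d2) + 1)
           Dates = DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1, @d1)
    FROM master..spt_values s1
    CROSS JOIN master..spt_values s2
)
SELECT d.Dates, b.balance
FROM cte_dates d
OUTER APPLY (
    -- most recent balance on or before this day
    SELECT TOP (1) t.balance
    FROM t
    WHERE t.date <= d.Dates
    ORDER BY t.date DESC
) b
ORDER BY d.Dates;
```

Days before the first recorded balance come back with a NULL balance, because OUTER APPLY keeps the date row even when the lookup finds nothing.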
Adding to the date range & CTE solutions, I have created Date Dimension tables in numerous databases where I just left join to them.
There are free scripts online to create date dimension tables for SQL Server. I highly recommend them. A date dimension also makes aggregation by other time periods (quarter, month, year, etc.) much more efficient.

I have 3 tables with dates to which I want to join to a date dimension table but it is returning many duplicates with left joins

I have 4 tables with dates to which I want to join to a date dimension table but it is returning many duplicates with left joins.
Tables are basically a date field which I want to count.
mdate datetime, mordate varchar(10),fteam varchar(20)
sdate datetime,fteam varchar(20)
bdate datetime,fteam varchar(20)
These are actually one table with the separate dates, which I am joining 3 times to the dimension table to get one dataset. There is also this table:
compdate datetime, fteam varchar(20)
and the date dimension table, which has a date in yyyymmdd format that I join on, as follows:
select cp.fteam,md.mdate,sd.sdate,bd.bdate,cp.cpdate,d.date
into #resultstable
from datedimension d
left join mdate md
on d.date = convert(date,md.mdate,103)
left join sentdate sd
on d.date = convert(date,sd.sdate,103)
left join bacdate bd
on d.date = convert(date,bd.bdate,103)
left join compdate cp
on d.date = convert(date,cp.cdate,103)
With this, I want the date dimension to give me one date that I can filter on in a WHERE clause, to get counts for each date from the 4 different tables for a report.
However, it gives me many repeats: each time there is a matching date, the same line is repeated for every match on that date across all tables.
This gives many counts that are wrong.
For example, if the md table has 2 records for 2016/06/29, cp has 3 and bd has 6,
the result for that date will show 36 rows for md when it should only show 2, i.e. 6 x 3 x 2.
How can I join these tables without causing repeats and incorrect results?
I thought there would be a standard way to join fact tables to a dimension table that gives accurate results without duplicates, since you are joining sets together.
I have tried selecting only the dates from each table, but it still gives repeats.
I cannot show a schema because of company details, but you can put together a hypothetical one from the tables shown.
What you are seeing is that because none of your fact tables relate to each other, you are essentially creating a Cartesian product for the fact tables--where they only relate to each other by date.
Consider this simplified version of your example, where I also include some sample data for "today":
CREATE TABLE #fact1 (id int identity, dt datetime, val varchar(5));
CREATE TABLE #fact2 (id int identity, dt datetime, val varchar(5));
CREATE TABLE #fact3 (id int identity, dt datetime, val varchar(5));
CREATE TABLE #fact4 (id int identity, dt datetime, val varchar(5));
CREATE TABLE #date (dt datetime, val varchar(5));
GO
INSERT INTO #fact1 (dt, val) VALUES (GETDATE(),'fact1');
INSERT INTO #fact2 (dt, val) VALUES (GETDATE(),'fact2');
INSERT INTO #fact3 (dt, val) VALUES (GETDATE(),'fact3');
INSERT INTO #fact4 (dt, val) VALUES (GETDATE(),'fact4');
WAITFOR DELAY '00:00:01';
GO 5
INSERT INTO #date (dt, val) VALUES (CAST(GETDATE() AS date),'Today');
GO
SELECT *
FROM #date d
JOIN #fact1 AS f1 ON d.dt = CAST(f1.dt AS date)
JOIN #fact2 AS f2 ON d.dt = CAST(f2.dt AS date)
JOIN #fact3 AS f3 ON d.dt = CAST(f3.dt AS date)
JOIN #fact4 AS f4 ON d.dt = CAST(f4.dt AS date);
GO
DROP TABLE #fact1;
DROP TABLE #fact2;
DROP TABLE #fact3;
DROP TABLE #fact4;
DROP TABLE #date;
GO
Note that 625 rows are returned. This is the Cartesian product of the four fact tables, which is then joined to the dimension table. This happens because there is no relation between the fact tables other than the date. As a result, any one row for "today" in one fact table is joined to every row for "today" in every other fact table.
Instead, consider how your four fact tables relate WITHOUT the join to the date dimension table. Rewrite your query so that the data makes sense before joining to the date dimension. Do the tables relate on something like an order_id or any other aspect?
If the fact tables only relate insomuch as you are aggregating them by date--then yes, you'll need to take another approach:
a) Aggregate by date first, then join the aggregated sets together. This option makes the most sense if you only need the aggregated values, and don't need the full details for your report.
SELECT *
FROM #date d
JOIN (SELECT CAST(dt AS date) AS dt, count(*) AS dt_count
FROM #fact1 GROUP BY CAST(dt AS date)) AS f1 ON d.dt = f1.dt
JOIN (SELECT CAST(dt AS date) AS dt, count(*) AS dt_count
FROM #fact2 GROUP BY CAST(dt AS date)) AS f2 ON d.dt = f2.dt
JOIN (SELECT CAST(dt AS date) AS dt, count(*) AS dt_count
FROM #fact3 GROUP BY CAST(dt AS date)) AS f3 ON d.dt = f3.dt
JOIN (SELECT CAST(dt AS date) AS dt, count(*) AS dt_count
FROM #fact4 GROUP BY CAST(dt AS date)) AS f4 ON d.dt = f4.dt
b) Assign an arbitrary row_number() for each calendar day, then use that as a secondary join criterion. If the data doesn't actually relate, this option might work, but the detailed result set is largely meaningless when all data in a single row doesn't refer to a single entity. This might give you the right numbers, but is logically a useless result set.
SELECT *
FROM #date d
JOIN (SELECT *,
ROW_NUMBER() OVER(PARTITION BY CAST(dt AS date) ORDER BY dt) AS row_num
FROM #fact1 ) AS f1 ON d.dt = CAST(f1.dt AS date)
JOIN (SELECT *,
ROW_NUMBER() OVER(PARTITION BY CAST(dt AS date) ORDER BY dt) AS row_num
FROM #fact2 ) AS f2 ON d.dt = CAST(f2.dt AS date) AND f1.row_num = f2.row_num
JOIN (SELECT *,
ROW_NUMBER() OVER(PARTITION BY CAST(dt AS date) ORDER BY dt) AS row_num
FROM #fact3 ) AS f3 ON d.dt = CAST(f3.dt AS date) AND f1.row_num = f3.row_num
JOIN (SELECT *,
ROW_NUMBER() OVER(PARTITION BY CAST(dt AS date) ORDER BY dt) AS row_num
FROM #fact4 ) AS f4 ON d.dt = CAST(f4.dt AS date) AND f1.row_num = f4.row_num
c) Break this up into separate statements: one for each fact table. Optionally UNION those results into a single result set. This result set could then be further aggregated/grouped to give you the results you want.
SELECT *, 'Fact 1' AS SourceTable
FROM #date d
JOIN #fact1 AS f1 ON d.dt = CAST(f1.dt AS date)
UNION ALL
SELECT *, 'Fact 2' AS SourceTable
FROM #date d
JOIN #fact2 AS f2 ON d.dt = CAST(f2.dt AS date)
UNION ALL
SELECT *, 'Fact 3' AS SourceTable
FROM #date d
JOIN #fact3 AS f3 ON d.dt = CAST(f3.dt AS date)
UNION ALL
SELECT *, 'Fact 4' AS SourceTable
FROM #date d
JOIN #fact4 AS f4 ON d.dt = CAST(f4.dt AS date);
In my opinion, options a & c offer the best solutions when the fact tables don't otherwise relate to each other. Option b might work, but you would need to be very careful that your data is meaningful & doesn't create confusing or erroneous results.
Additionally, while it is orthogonal to the question asked, keep in mind that applying a function to a column in a join criterion (in this case, CONVERTing the date column) can prevent the optimizer from using an index seek on that column, resulting in a scan.
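As a variation on option a, here is a sketch that uses LEFT JOIN instead of JOIN, so that a date with no rows in one of the fact tables still appears with a count of 0. It assumes the #date, #fact1 and #fact2 temp tables from the example above still exist, and it shows only two of the four fact tables to keep it short; the output column names are made up for illustration.

```sql
-- Sketch only: reuses #date, #fact1 and #fact2 from the example above.
SELECT d.dt,
       COALESCE(f1.dt_count, 0) AS fact1_count,   -- 0 when #fact1 has no rows that day
       COALESCE(f2.dt_count, 0) AS fact2_count
FROM #date d
LEFT JOIN (SELECT CAST(dt AS date) AS dt, COUNT(*) AS dt_count
           FROM #fact1
           GROUP BY CAST(dt AS date)) AS f1 ON d.dt = f1.dt
LEFT JOIN (SELECT CAST(dt AS date) AS dt, COUNT(*) AS dt_count
           FROM #fact2
           GROUP BY CAST(dt AS date)) AS f2 ON d.dt = f2.dt
ORDER BY d.dt;
```

Because each derived table is already reduced to one row per day, joining them on the date cannot multiply rows.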
You should have a single date table that you join to from the fact table multiple times. This is called a role playing dimension. Your query will look like this:
SELECT fact.*
,COALESCE(moc.datekey, @unknownDateKey)
,COALESCE(sent.datekey, @unknownDateKey)
FROM factTable fact
LEFT OUTER JOIN date moc
ON fact.mocdate = moc.date
LEFT OUTER JOIN date sent
ON fact.sentdate = sent.date
You would declare @unknownDateKey as a variable above the query, set to the key of the unknown member of the dimension.

Spread a table in a date time interval

Hello everyone. I have been using SQL for analysis for a few days now, and I have solved all kinds of problems thanks to your forum.
Now I'd like to create a view that takes a time interval and lists each date in that interval.
I have the following table:
And I want to create the view that displays the result:
For example, in MyTable, player1 plays five days, from 01/01/2012 to 05/01/2012.
So the view displays 5 lines for player1, with the dates 01/01/2012, 02/01/2012, 03/01/2012, 04/01/2012, 05/01/2012.
Thank you in advance for your help.
You have to create a common table expression that gives you the date range (I have created a range covering one month, but you can choose another range):
WITH DateRange(dt) AS
(
SELECT CONVERT(datetime, '2012-01-01') dt
UNION ALL
SELECT DATEADD(dd,1,dt) dt FROM DateRange WHERE dt < CONVERT(datetime, '2012-01-31')
)
SELECT dates.dt AS DatePlaying, PlayerName
FROM MyTable t
JOIN DateRange dates ON dt BETWEEN t.BeginDate AND t.DateEnd
ORDER BY PlayerName, DatePlaying
Another approach to this is simply to create an enumeration table to add values to dates:
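with enumt as (select row_number() over (order by (select NULL)) as seqnum
from mytable -- yields at most as many numbers as mytable has rows
)
-- seqnum starts at 1, so subtract 1 to include the first day itself
select dateadd(d, e.seqnum - 1, mt.DateBegin) as DatePlaying, mt.PlayerName
from MyTable mt join
enumt e
on e.seqnum <= mt.NumberOfPlayingDay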
The only purpose of the "with" clause is to generate a sequence of integers starting at 1.
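If MyTable might have fewer rows than the longest playing period, the integers could instead be built from a small VALUES list. This is a sketch under assumptions: the column names DateBegin, NumberOfPlayingDay and PlayerName are taken from the query above, and the upper limit of 1000 days is arbitrary.

```sql
-- Sketch only: column names come from the query above; 1000 is an arbitrary cap.
WITH digits AS (
    SELECT n FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(n)
),
enumt AS (
    -- integers 1 to 1000, independent of the number of rows in MyTable
    SELECT ones.n + 10 * tens.n + 100 * hundreds.n + 1 AS seqnum
    FROM digits ones
    CROSS JOIN digits tens
    CROSS JOIN digits hundreds
)
SELECT DATEADD(DAY, e.seqnum - 1, mt.DateBegin) AS DatePlaying, mt.PlayerName
FROM MyTable mt
JOIN enumt e
    ON e.seqnum <= mt.NumberOfPlayingDay
ORDER BY mt.PlayerName, DatePlaying;
```

Inside a view definition the final ORDER BY would have to be dropped, since SQL Server does not allow it there without TOP.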