Check if a month is skipped then add values dynamically? - sql

I have a set of data from a table that would only be populated if a user has data for a certain month just like this:
Month | MonthName | Value
3 | March | 136.00
4 | April | 306.00
7 | July | 476.00
12 | December | 510.48
But what I need is to check if a month is skipped then adding the value the month before so the end result would be like this:
Month | MonthName | Value
3 | March | 136.00
4 | April | 306.00
5 | May | 306.00 -- added data
6 | June | 306.00 -- added data
7 | July | 476.00
8 | August | 476.00 -- added data
9 | September | 476.00 -- added data
10 | October | 476.00 -- added data
11 | November | 476.00 -- added data
12 | December | 510.48
How can I do this dynamically on SQL Server?

One method is a recursive CTE:
with cte as (
select month, value, lead(month) over (order by month) as next_month
from t
union all
select month + 1, value, next_month
from cte
where month + 1 < next_month
)
select month, datename(month, datefromparts(2020, month, 1)) as monthname, value
from cte
order by month;
Here is a db<>fiddle.

you can use spt_values to get continuous number 1-12, and then left join your table by max(month)
select t1.month
,datename(month,datefromparts(2020, t1.month, 1)) monthname
,t2.value
from (
select top 12 number + 1 as month from master..spt_values
where type = 'p'
) t1
left join t t2 on t2.month = (select max(month) from t tmp where tmp.month < = t1.month)
where t2.month is not null
CREATE TABLE T
([Month] int, [MonthName] varchar(8), [Value] numeric)
;
INSERT INTO T
([Month], [MonthName], [Value])
VALUES
(3, 'March', 136.00),
(4, 'April', 306.00),
(7, 'July', 476.00),
(12, 'December', 510.48)
;
Demo Link SQL Server 2012 | db<>fiddle
note
if you have year column then you need to fix the script.

Related

Variable value as column name in Snowflake

can I obtain in a query variable value as column name in Snowflake?
SET "CURRENT_YEAR"=YEAR(CURRENT_DATE());
SELECT SUM("AMOUNT") AS "$CURRENT_YEAR" (here I want the value 2021)
FROM "DB"."SCHEMA"."TABLE"
WHERE YEAR("DATE") = $CURRENT_YEAR;
Please try below:
create or replace table test (
date date,
amount int
);
insert into test values
('2021-01-01', 100),
('2022-01-01', 56),
('2022-02-01', 67),
('2021-05-01', 38),
('2023-01-01', 150),
('2021-01-06', 400),
('2021-07-11', 120)
;
SET "CURRENT_YEAR"=YEAR(CURRENT_DATE());
with year_tbl as (
select year(date) as year, amount from test
where year = $CURRENT_YEAR
)
select *
from year_tbl
pivot(sum(amount) for year in ($CURRENT_YEAR)) as yr
;
+------+
| 2021 |
|------|
| 658 |
+------+
If you want different years:
with year_tbl as (
select year(date) as year, amount from test
)
select *
from year_tbl
pivot(sum(amount) for year in (2020, 2021, 2022, 2023)) as yr
;
+------+------+------+------+
| 2020 | 2021 | 2022 | 2023 |
|------+------+------+------|
| NULL | 658 | 123 | 150 |
+------+------+------+------+
did you mean something like this
create or replace table fld_year as
(SELECT current_date() dt, 2021 as fld_year, 1 as AMT UNION ALL
SELECT current_date(),2021 as fld_year, 2 as r_num UNION ALL
SELECT current_date()- 900,2019 as fld_year, 3 as r_num UNION ALL
SELECT current_date()-400,2020 as fld_year, 4 as r_num );
SET "CURRENT_YEAR"=YEAR(CURRENT_DATE());
SELECT SUM(AMT) FROM fld_year WHERE YEAR(dt) = $CURRENT_YEAR;
SELECT * FROM fld_year WHERE YEAR(dt) = $CURRENT_YEAR;

How to return same row multiple times with multiple conditions

My knowledge is pretty basic so your help would be highly appreciated.
I'm trying to return the same row multiple times when it meets the condition (I only have access to select query).
I have a table of more than 500000 records with Customer ID, Start Date and End Date, where end date could be null.
I am trying to add a new column called Week_No and list all rows accordingly. For example if the date range is more than one week, then the row must be returned multiple times with corresponding week number. Also I would like to count overlapping days, which will never be more than 7 (week) per row and then count unavailable days using second table.
Sample data below
t1
ID | Start_Date | End_Date
000001 | 12/12/2017 | 03/01/2018
000002 | 13/01/2018 |
000003 | 02/01/2018 | 11/01/2018
...
t2
ID | Unavailable
000002 | 14/01/2018
000003 | 03/01/2018
000003 | 04/01/2018
000003 | 08/01/2018
...
I cannot pass the stage of adding week no. I have tried using CASE and UNION ALL but keep getting errors.
declare #week01start datetime = '2018-01-01 00:00:00'
declare #week01end datetime = '2018-01-07 00:00:00'
declare #week02start datetime = '2018-01-08 00:00:00'
declare #week02end datetime = '2018-01-14 00:00:00'
...
SELECT
ID,
'01' as Week_No,
'2018' as YEAR,
Start_Date,
End_Date
FROM t1
WHERE (Start_Date <= #week01end and End_Date >= #week01start)
or (Start_Date <= #week01end and End_Date is null)
UNION ALL
SELECT
ID,
'02' as Week_No,
'2018' as YEAR,
Start_Date,
End_Date
FROM t1
WHERE (Start_Date <= #week02end and End_Date >= #week02start)
or (Start_Date <= #week02end and End_Date is null)
...
The new table should look like this
ID | Week_No | Year | Start_Date | End_Date | Overlap | Unavail_Days
000001 | 01 | 2018 | 12/12/2017 | 03/01/2018 | 3 |
000002 | 02 | 2018 | 13/01/2018 | | 2 | 1
000003 | 01 | 2018 | 02/01/2018 | 11/01/2018 | 6 | 2
000003 | 02 | 2018 | 02/01/2018 | 11/01/2018 | 4 | 1
...
business wise i cannot understand what you are trying to achieve. You can use the following code though to calculate your overlapping days etc. I did it the way you asked, but i would recommend a separate table, like a Time dimension to produce a "cleaner" solution
/*sample data set in temp table*/
select '000001' as id, '2017-12-12'as start_dt, ' 2018-01-03' as end_dt into #tmp union
select '000002' as id, '2018-01-13 'as start_dt, null as end_dt union
select '000003' as id, '2018-01-02' as start_dt, '2018-01-11' as end_dt
/*calculate week numbers and week diff according to dates*/
select *,
DATEPART(WK,start_dt) as start_weekNumber,
DATEPART(WK,end_dt) as end_weekNumber,
case
when DATEPART(WK,end_dt) - DATEPART(WK,start_dt) > 0 then (DATEPART(WK,end_dt) - DATEPART(WK,start_dt)) +1
else (52 - DATEPART(WK,start_dt)) + DATEPART(WK,end_dt)
end as WeekDiff
into #tmp1
from
(
SELECT *,DATEADD(DAY, 2 - DATEPART(WEEKDAY, start_dt), CAST(start_dt AS DATE)) [start_dt_Week_Start_Date],
DATEADD(DAY, 8 - DATEPART(WEEKDAY, start_dt), CAST(start_dt AS DATE)) [startdt_Week_End_Date],
DATEADD(DAY, 2 - DATEPART(WEEKDAY, end_dt), CAST(end_dt AS DATE)) [end_dt_Week_Start_Date],
DATEADD(DAY, 8 - DATEPART(WEEKDAY, end_dt), CAST(end_dt AS DATE)) [end_dt_Week_End_Date]
from #tmp
) s
/*cte used to create duplicates when week diff is over 1*/
;with x as
(
SELECT TOP (10) rn = ROW_NUMBER() --modify the max you want
OVER (ORDER BY [object_id])
FROM sys.all_columns
ORDER BY [object_id]
)
/*final query*/
select --*
ID,
start_weekNumber+ (r-1) as Week,
DATEPART(YY,start_dt) as [YEAR],
start_dt,
end_dt,
null as Overlap,
null as unavailable_days
from
(
select *,
ROW_NUMBER() over (partition by id order by id) r
from
(
select d.* from x
CROSS JOIN #tmp1 AS d
WHERE x.rn <= d.WeekDiff
union all
select * from #tmp1
where WeekDiff is null
) a
)a_ext
order by id,start_weekNumber
--drop table #tmp1,#tmp
The above will produce the results you want except the overlap and unavailable columns. Instead of just counting weeks, i added the number of week in the year using start_dt, but you can change that if you don't like it:
ID Week YEAR start_dt end_dt Overlap unavailable_days
000001 50 2017 2017-12-12 2018-01-03 NULL NULL
000001 51 2017 2017-12-12 2018-01-03 NULL NULL
000001 52 2017 2017-12-12 2018-01-03 NULL NULL
000002 2 2018 2018-01-13 NULL NULL NULL
000003 1 2018 2018-01-02 2018-01-11 NULL NULL
000003 2 2018 2018-01-02 2018-01-11 NULL NULL

How to make a query of the max month/year of a group of columns

I don't really know how to make a MySQL query where I get all the data(Columns) of the most present month + year.
Being a little bit more specific I need to group the columns (description, data1, data2, data3) and only get the one that has the highest month + year.
The table contains data like this one:
description data1 data2 data3 year month adddate
desc1 0 7 1 2019 5 2019-05-23
desc2 0 7 1 2019 5 2019-05-23
desc3 1 7 1 2019 5 2019-05-23
desc4 0 2 1 2018 12 2019-05-23
I've tried using max on the month, year and adddate.
select description, data1, data2, data3, max(year) as year, max(month) as month, max(adddate) as adddate
from tabledata
group by description, data1, data2, data3
But with this I'm getting that the max register is desc2 with the month 12 which is not correct since the month is desc2 is 6.
You need to get the max value of an expression like:
100 * year + month
or
12 * year + month
So do this:
select *
from tabledata
where 100 * year + month = (
select max(100 * year + month)
from tabledata
)
See the demo.
Results:
> description | data1 | data2 | data3 | year | month | adddate
> :---------- | ----: | ----: | ----: | ---: | ----: | :---------
> desc1 | 0 | 7 | 1 | 2019 | 5 | 23/05/2019
> desc2 | 0 | 7 | 1 | 2019 | 5 | 23/05/2019
> desc3 | 1 | 7 | 1 | 2019 | 5 | 23/05/2019
You need to find the max combination of year and month (MaxYearMonth) and then join to your table again to find all the rows that match that MaxYearMonth. Here is the solution for SQL Server...
IF OBJECT_ID('tempdb.dbo.#tabledata', 'U') IS NOT NULL DROP TABLE #tabledata;
CREATE TABLE #tabledata
(
description VARCHAR(10)
, data1 INT
, data2 INT
, data3 INT
, year INT
, month INT
, adddate DATE
);
INSERT INTO #tabledata VALUES ('desc1', 0, 7, 1, 2019, 5, '2019-05-23')
INSERT INTO #tabledata VALUES ('desc2', 0, 7, 1, 2019, 5, '2019-05-23')
INSERT INTO #tabledata VALUES ('desc3', 1, 7, 1, 2019, 5, '2019-05-23')
INSERT INTO #tabledata VALUES ('desc4', 0, 2, 1, 2018, 12, '2019-05-23')
select a.description, a.data1, a.data2, a.data3
from #tabledata a
inner join
(
select max(convert(char(4), year) + right('0' + convert(varchar(2), month), 2)) as [MaxYearMonth]
from #tabledata
) b on convert(char(4), a.year) + right('0' + convert(varchar(2), a.month), 2) = b.MaxYearMonth
Yields this...
description data1 data2 data3
----------- ----------- ----------- -----------
desc1 0 7 1
desc2 0 7 1
desc3 1 7 1
Click here to see it in action.
You mention "mysql" so if you are really using mySQL then the syntax will likely need to change a bit.

Postgresql - How to get value from last record of each month

I have a view like this:
Year | Month | Week | Category | Value |
2017 | 1 | 1 | A | 1
2017 | 1 | 1 | B | 2
2017 | 1 | 1 | C | 3
2017 | 1 | 2 | A | 4
2017 | 1 | 2 | B | 5
2017 | 1 | 2 | C | 6
2017 | 1 | 3 | A | 7
2017 | 1 | 3 | B | 8
2017 | 1 | 3 | C | 9
2017 | 1 | 4 | A | 10
2017 | 1 | 4 | B | 11
2017 | 1 | 4 | C | 12
2017 | 2 | 5 | A | 1
2017 | 2 | 5 | B | 2
2017 | 2 | 5 | C | 3
2017 | 2 | 6 | A | 4
2017 | 2 | 6 | B | 5
2017 | 2 | 6 | C | 6
2017 | 2 | 7 | A | 7
2017 | 2 | 7 | B | 8
2017 | 2 | 7 | C | 9
2017 | 2 | 8 | A | 10
2017 | 2 | 8 | B | 11
2017 | 2 | 8 | C | 12
And I need to make a new view which needs to show average of value column (let's call it avg_val) and the value from the max week of the month (max_val_of_month). Ex: max week of january is 4, so the value of category A is 10. Or something like this to be clear:
Year | Month | Category | avg_val | max_val_of_month
2017 | 1 | A | 5.5 | 10
2017 | 1 | B | 6.5 | 11
2017 | 1 | C | 7.5 | 12
2017 | 2 | A | 5.5 | 10
2017 | 2 | B | 6.5 | 11
2017 | 2 | C | 7.5 | 12
I have use window function, over partition by year, month, category to get the avg value. But how can I get the value of the max week of each month?
Assuming that you need a month average and a value for the max week not the max value per month
SELECT year, month, category, avg_val, value max_week_val
FROM (
SELECT *,
AVG(value) OVER (PARTITION BY year, month, category) avg_val,
ROW_NUMBER() OVER (PARTITION BY year, month, category ORDER BY week DESC) rn
FROM view1
) q
WHERE rn = 1
ORDER BY year, month, category
or more verbose version without window functions
SELECT q.year, q.month, q.category, q.avg_val, v.value max_week_val
FROM (
SELECT year, month, category, avg(value) avg_val, MAX(week) max_week
FROM view1
GROUP BY year, month, category
) q JOIN view1 v
ON q.year = v.year
AND q.month = v.month
AND q.category = v.category
AND q.max_week = v.week
ORDER BY year, month, category
Here is a dbfiddle demo for both queries
And here is my NEW version.
My thanks to #peterm for pointing me about the prior false value of val_from_max_week_of_month. So, I corrected it:
SELECT
a.Year,
a.Month,
a.Category,
max(a.Week) AS max_week,
AVG(a.Value) AS avg_val,
(
SELECT b.Value
FROM decades AS b
WHERE
b.Year = a.Year AND
b.Month = a.Month AND
b.Week = max(a.Week) AND
b.Category = a.Category
) AS val_from_max_week_of_month
FROM decades AS a
GROUP BY
a.Year,
a.Month,
a.Category
;
The new results:
First, you might need to check, how do you handle the first week in January. If 1st of January are not a Monday, there are several interpretations & not every one of them will fit the solutions here. You'll either need to use:
the ISO week concept, ie. the week column should hold the ISO week & the year column should hold the ISO year (week-year, rather). Note: in this concept, 1st of January actually sometimes belongs to the previous year
use your own concept, where the first week of the year is "split" into two if 1st of January is not a Monday.
Note: the solutions below will not work if (in your table) the first week of January can be 52 or 53.
Given that avg_val is just a simple aggregation, while max_val_of_month can be calculated with typical greatest-n-per-group queries. It has a lot of possible solutions in PostgreSQL, with varying performance. Fortunately, your query will naturally have an easily determined selectivity: you'll always need (approx.) a quarter of your data.
Usual winners (in performance) are:
(These are not surprise though, as these 2 should perform more and more as you need more portion of the original data.)
array_agg() with order by variant:
select year, month, category, avg(value) avg_val,
(array_agg(value order by week desc))[1] max_val_of_month
from table_name
group by year, month, category;
distinct on variant:
select distinct on (year, month, category) year, month, category,
avg(value) over (partition by year, month, category) avg_val,
value max_val_of_month
from table_name
order by year, month, category, week desc;
The pure window function variant is not that bad either:
row_number() variant:
select year, month, category, avg_val, max_val_of_month
from (select year, month, category, value max_val_of_month,
avg(value) over (partition by year, month, category) avg_val,
row_number() over (partition by year, month, category order by week desc) rn
from table_name) w
where rn = 1;
But the LATERAL variant is only viable with an index:
LATERAL variant:
create index idx_table_name_year_month_category_week_desc
on table_name(year, month, category, week desc);
select year, month, category,
avg(value) avg_val,
max_val_of_month
from table_name t
cross join lateral (select value max_val_of_month
from table_name
where (year, month, category) = (t.year, t.month, t.category)
order by week desc
limit 1) m
group by year, month, category, max_val_of_month;
But most of the solutions above can actually utilize this index, not just this last one.
Without the index: http://rextester.com/WNEL86809
With the index: http://rextester.com/TYUA52054
with data (yr, mnth, wk, cat, val) as
(
-- begin test data
select 2017 , 1 , 1 , 'A' , 1 from dual union all
select 2017 , 1 , 1 , 'B' , 2 from dual union all
select 2017 , 1 , 1 , 'C' , 3 from dual union all
select 2017 , 1 , 2 , 'A' , 4 from dual union all
select 2017 , 1 , 2 , 'B' , 5 from dual union all
select 2017 , 1 , 2 , 'C' , 6 from dual union all
select 2017 , 1 , 3 , 'A' , 7 from dual union all
select 2017 , 1 , 3 , 'B' , 8 from dual union all
select 2017 , 1 , 3 , 'C' , 9 from dual union all
select 2017 , 1 , 4 , 'A' , 10 from dual union all
select 2017 , 1 , 4 , 'B' , 11 from dual union all
select 2017 , 1 , 4 , 'C' , 12 from dual union all
select 2017 , 2 , 5 , 'A' , 1 from dual union all
select 2017 , 2 , 5 , 'B' , 2 from dual union all
select 2017 , 2 , 5 , 'C' , 3 from dual union all
select 2017 , 2 , 6 , 'A' , 4 from dual union all
select 2017 , 2 , 6 , 'B' , 5 from dual union all
select 2017 , 2 , 6 , 'C' , 6 from dual union all
select 2017 , 2 , 7 , 'A' , 7 from dual union all
select 2017 , 2 , 8 , 'A' , 10 from dual union all
select 2017 , 2 , 8 , 'B' , 11 from dual union all
select 2017 , 2 , 7 , 'B' , 8 from dual union all
select 2017 , 2 , 7 , 'C' , 9 from dual union all
select 2018 , 2 , 7 , 'C' , 9 from dual union all
select 2017 , 2 , 8 , 'C' , 12 from dual
-- end test data
)
select * from
(
select
-- data.*: all columns of the data table
data.*,
-- avrg: partition by a combination of year,month and category to work out -
-- the avg for each category in a month of a year
avg(val) over (partition by yr, mnth, cat) avrg,
-- mwk: partition by year and month to work out -
-- the max week of a month in a year
max(wk) over (partition by yr, mnth) mwk
from
data
)
-- as OP's interest is in the max week of each month of a year, -
-- "wk" column value is matched against
-- the derived column "mwk"
where wk = mwk
order by yr,mnth,cat;

List of years between two dates

I have a table with columns for a start- and enddate.
My goal is to get a list of each year in that timespan for each row, so
+-------------------------+
| startdate | enddate |
+------------+------------+
| 2004-08-01 | 2007-01-08 |
| 2005-06-02 | 2007-05-08 |
+------------+------------+
should output this:
+-------+
| years |
+-------+
| 2004 |
| 2005 |
| 2006 |
| 2007 |
| 2005 |
| 2006 |
| 2007 |
+-------+
I have problems now to create the years in between the two dates. My first approach was to use a UNION (order of dates is irrelevant), but the years in between are missing in this case...
Select
Extract(Year From startdate)
From
table1
Union
Select
Extract(Year From enddate)
From
table1
Thanks for any advises!
Row Generator technique
SQL> WITH DATA1 AS(
2 SELECT TO_DATE('2004-08-01','YYYY-MM-DD') STARTDATE, TO_DATE('2007-01-08','YYYY-MM-DD') ENDDATE FROM DUAL UNION ALL
3 SELECT TO_DATE('2005-06-02','YYYY-MM-DD') STARTDATE, TO_DATE('2007-05-08','YYYY-MM-DD') ENDDATE FROM DUAL
4 ),
5 DATA2 AS(
6 SELECT EXTRACT(YEAR FROM STARTDATE) ST, EXTRACT(YEAR FROM ENDDATE) ED FROM DATA1
7 ),
8 data3
9 AS
10 (SELECT level-1 line
11 FROM DUAL
12 CONNECT BY level <=
13 (SELECT MAX(ed-st) FROM data2
14 )
15 )
16 SELECT ST+LINE FROM
17 DATA2, DATA3
18 WHERE LINE <= ED-ST
19 ORDER BY 1
20 /
ST+LINE
----------
2004
2005
2005
2006
2006
2007
6 rows selected.
SQL>
Try this Query
; with CTE as
(
select datepart(year, '2005-12-25') as yr
union all
select yr + 1
from CTE
where yr < datepart(year, '2013-11-14')
)
select yr
from CTE
Try this:
Create a table with years as follow:
CREATE TABLE tblyears(y int)
INSERT INTO tblyears VALUES (1900);
INSERT INTO tblyears VALUES (1901);
INSERT INTO tblyears VALUES (1902);
and so on until
INSERT INTO tblyears VALUES (2100)
So, you'll write this query:
SELECT y.y
FROM tblyears y
JOIN table1 t
ON y.y >= EXTRACT(year from startdate)
AND y.y <= EXTRACT(year from enddate)
ORDER BY y.y
Show SqlFiddle