Creating a fixed table with a common time period across a 1-year period - sql

I need to create a table with a common time interval across a year, with a percentage-completion column as well as a serial number, using a SQL query:
S/N Percentage Month
1 8% June
2 17% July
3 25% August
...
...
12 100% May
I would like to ask if there is a cleaner or more efficient way of doing it.
My original approach is to first create the time interval and the serial number using a recursive CTE, followed by creating the percentage attribute.
Thank you!

Alternate solution for DB2 Z/OS
SELECT
rownum "S/N" ,
100*( DAYS(stdt + (rownum-1) MONTH ) - DAYS(stdt -1 MONTH ) ) /365 "Percentage" ,
VARCHAR_FORMAT(stdt + (rownum-1) MONTH,'Month') "Month"
FROM (
SELECT ROW_NUMBER() OVER() , DATE('2018-06-01')
FROM SYSIBM.SYSCOLUMNS
) T(rownum,stdt)
WHERE rownum <=12

You can use the SYSCAT.COLUMNS table to generate the months and the ROW_NUMBER() function to compute the percentage.
Here is the query:
SELECT rn "S/N",
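-- multiply by 100.0 before dividing to avoid integer division (which would give 0 for rn < 12)
CAST(ROUND(rn * 100.0 / COUNT(1) OVER (), 0) AS INTEGER) || '%' "Percentage",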
my_month "Month" from (
-- #startdate '04/01/2017' (MM/DD/YYYY) format
SELECT VARCHAR_FORMAT(DATE(#startdate) + (ROW_NUMBER()OVER() - 1) MONTH,'MON') my_month,
ROW_NUMBER()OVER() rn
FROM SYSCAT.COLUMNS FETCH FIRST 12 ROWS ONLY
)
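For comparison, the recursive-CTE idea from the question can be tried out locally. This is only a minimal sketch using Python's built-in sqlite3 module rather than DB2; the function name build_month_table, the default start date and the fixed 12-row count are assumptions made for illustration, and the month name is looked up in Python to keep the SQL portable.

```python
import calendar
import sqlite3

QUERY = """
WITH RECURSIVE months(sn, month_start) AS (
    SELECT 1, date(:start)
    UNION ALL
    SELECT sn + 1, date(month_start, '+1 month')
    FROM months
    WHERE sn < 12
)
SELECT sn,
       CAST(ROUND(sn * 100.0 / 12) AS INTEGER) || '%' AS percentage,
       month_start
FROM months
ORDER BY sn
"""


def build_month_table(start="2018-06-01"):
    """Return (S/N, percentage, month name) rows for 12 consecutive months."""
    conn = sqlite3.connect(":memory:")
    try:
        rows = conn.execute(QUERY, {"start": start}).fetchall()
    finally:
        conn.close()
    # month_start is an ISO date string, so characters 5..6 hold the month number
    return [(sn, pct, calendar.month_name[int(month_start[5:7])])
            for sn, pct, month_start in rows]


if __name__ == "__main__":
    for row in build_month_table():
        print(row)
```

The percentage is derived from the serial number alone, so the CTE only has to carry the counter and the month start.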

Finding IDs available in previous weeks but not in current week

How can I find IDs that were present in previous weeks but are not present in the current week, on a rolling basis? For example:
Week1 has id 1,2,3,4,5
Week2 has id 3,4,5,7,8
Week3 has id 1,3,5,10,11
So I found out that id 1 and 2 are missing in week 2, and id 2, 4, 7 and 8 are missing in week 3 compared with the previous 2 weeks. But how can I do this on a rolling window for a large amount of data distributed over a period of 20+ years?
Please find the sample dataset and expected output below. I expect the output to be partitioned by the WEEK_END date.
Dataset
ID|WEEK_START|WEEK_END|APPEARING_DATE
7152|2015-12-27|2016-01-02|2015-12-27
8350|2015-12-27|2016-01-02|2015-12-27
7152|2015-12-27|2016-01-02|2015-12-29
4697|2015-12-27|2016-01-02|2015-12-30
7187|2015-12-27|2016-01-02|2015-01-01
8005|2015-12-27|2016-01-02|2015-12-27
8005|2015-12-27|2016-01-02|2015-12-29
6254|2016-01-03|2016-01-09|2016-01-03
7962|2016-01-03|2016-01-09|2016-01-04
3339|2016-01-03|2016-01-09|2016-01-06
7834|2016-01-03|2016-01-09|2016-01-03
7962|2016-01-03|2016-01-09|2016-01-05
7152|2016-01-03|2016-01-09|2016-01-07
8350|2016-01-03|2016-01-09|2016-01-09
2403|2016-01-10|2016-01-16|2016-01-10
0157|2016-01-10|2016-01-16|2016-01-11
2228|2016-01-10|2016-01-16|2016-01-14
4697|2016-01-10|2016-01-16|2016-01-14
Expected Output
Partition1: WEEK_END=2016-01-02
ID|MAX(LAST_APPEARING_DATE)
7152|2015-12-29
8350|2015-12-27
4697|2015-12-30
7187|2015-01-01
8005|2015-12-29
Partition2: WEEK_END=2016-01-09
ID|MAX(LAST_APPEARING_DATE)
7152|2016-01-07
8350|2016-01-09
4697|2015-12-30
7187|2015-01-01
8005|2015-12-29
6254|2016-01-03
7962|2016-01-05
3339|2016-01-06
7834|2016-01-03
Partition3: WEEK_END=2016-01-16
ID|MAX(LAST_APPEARING_DATE)
7152|2016-01-07
8350|2016-01-09
4697|2016-01-14
7187|2015-01-01
8005|2015-12-29
6254|2016-01-03
7962|2016-01-05
3339|2016-01-06
7834|2016-01-03
2403|2016-01-10
0157|2016-01-11
2228|2016-01-14
Please use the query below:
select ID, MAX(APPEARING_DATE) from table_name
group by ID, WEEK_END;
Or, including WEEK_END:
select ID, WEEK_END, MAX(APPEARING_DATE) from table_name
group by ID, WEEK_END;
You can use aggregation:
select t.id, max(week_end)
from t
group by id
having max(week_end) < '2016-01-02';
Adjust the date in the having clause for the week end that you want.
Actually, your question is a bit unclear. I'm not sure if a later week end would keep the row or not. If you want "as of" data, then include a where clause:
select t.id, max(week_end)
from t
where week_end < '2016-01-02'
group by id
having max(week_end) < '2016-01-02';
If you want this for a range of dates, then you can use a derived table:
select we.the_week_end, t.id, max(week_end)
from (select '2016-01-02' as the_week_end union all
select '2016-01-09' as the_week_end
) we cross join
t
where t.week_end < we.the_week_end
group by id, we.the_week_end
having max(t.week_end) < we.the_week_end;
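One reading of the expected output in the question is "for every week end, the latest appearing date of each id seen up to and including that week". Here is a minimal sketch of that reading, run against an in-memory SQLite database from Python. The table name t follows the queries above; the function name latest_per_week, the lowercase column names and the inclusive <= comparison are assumptions, and the rows are a small subset of the question's sample data.

```python
import sqlite3

ROWS = [  # (id, week_end, appearing_date), subset of the question's sample data
    ("7152", "2016-01-02", "2015-12-27"),
    ("7152", "2016-01-02", "2015-12-29"),
    ("4697", "2016-01-02", "2015-12-30"),
    ("7152", "2016-01-09", "2016-01-07"),
    ("6254", "2016-01-09", "2016-01-03"),
    ("4697", "2016-01-16", "2016-01-14"),
]

QUERY = """
SELECT we.week_end, t.id, MAX(t.appearing_date) AS last_appearing_date
FROM (SELECT DISTINCT week_end FROM t) AS we
JOIN t ON t.week_end <= we.week_end   -- every row up to and including that week
GROUP BY we.week_end, t.id
ORDER BY we.week_end, t.id
"""


def latest_per_week(rows):
    """Return (week_end, id, last_appearing_date) for every week/id pair."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (id TEXT, week_end TEXT, appearing_date TEXT)")
        conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in latest_per_week(ROWS):
        print(row)
```

The join compares every row with every later week end, so on 20+ years of data it may be worth restricting the list of week ends or materialising a per-id running maximum instead.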

Rolling 12 month filter criteria in SQL

I'm having an issue in a SQL script where I'm trying to apply a rolling 12-month filter on a day column that is stored as text on the server.
The goal is to count sizes for a product at a retail store location over the last 12 months from the current day. Currently my query filters on the year 2019, which only counts the sizes for that year and not for a rolling 12 months from the current date.
The CALENDARDAY column is a text field in the data set, and the data is stored in yyyymmdd format.
When I try to run the script below in Tableau with the GETDATE and DATEADD functions, it gives me a function error. I am accessing a SAP HANA server with the query below.
Any help would be appreciated.
Select
SKU, STYLE_ID, Base_Style_ID, COLOR, SIZEKEY, STORE, Year,
count(SIZEKEY)over(partition by STYLE_ID,COLOR,STORE,Year) as SZ_CNT
from
(
select
a."RAW" As SKU,
a."STYLENUM" As STYLE_ID,
mat."BASENUM" AS Base_Style_ID,
a."COLORNUM" AS COLOR,
a."SIZE" AS SIZEKEY,
a."STORENUM" AS STORE,
substring(a."CALENDARDAY",1,4) As year
from PRTRPT_XRE as a
JOIN ZAT_SKU As mat On a."RAW" = mat."SKU"
where a."ORGANIZATION" = 'M20'
and a."COLORNUM" is not null
and substring(a."CALENDARDAY",1,4) = '2019'
Group BY
a."RAW",
a."STYLENUM",
mat."BASENUM",
a."ZCOLORCD",
a."SIZE",
a."STORENUM",
substring(a."CALENDARDAY",1,4)
)
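I have never worked on that DB / server, so I don't have a way to test this.
But hopefully this will work (assuming you want exactly 12 months before today's date). CALENDARDAY is stored as yyyymmdd text, so the format string has to match that:
AND ADD_MONTHS (TO_DATE (a."CALENDARDAY", 'YYYYMMDD'), 12) > CURRENT_DATE
or, relying on implicit conversion of the text to a date,
AND ADD_MONTHS (a."CALENDARDAY", 12) > CURRENT_DATE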
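Because CALENDARDAY is yyyymmdd text, string order matches date order, so the rolling window can also be written as a plain text comparison against a cutoff. This is only a sketch in Python with the built-in sqlite3 module (not SAP HANA, whose date functions differ); the table name sales, the function name rows_in_last_12_months and the sample values are hypothetical.

```python
import sqlite3

QUERY = """
SELECT calendarday
FROM sales
WHERE calendarday >  strftime('%Y%m%d', date(:today, '-12 months'))
  AND calendarday <= strftime('%Y%m%d', date(:today))
ORDER BY calendarday
"""


def rows_in_last_12_months(days, today):
    """Return the yyyymmdd strings that fall in the 12 months up to `today`."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (calendarday TEXT)")
        conn.executemany("INSERT INTO sales VALUES (?)", [(d,) for d in days])
        return [row[0] for row in conn.execute(QUERY, {"today": today})]
    finally:
        conn.close()


if __name__ == "__main__":
    sample = ["20190314", "20190315", "20190316", "20200315", "20200316"]
    print(rows_in_last_12_months(sample, "2020-03-15"))
```

Comparing the raw text column against a computed cutoff leaves the column itself untouched, which may also let the database use an index on it.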
The condition below, based on one of our calendar tables, also worked the same way as the ADD_MONTHS approach in the answer above:
select distinct CALENDARDAY
from
(
select FISCALWEEK, CALENDARDAY, CNST, row_number()over(partition by CNST order by FISCALWEEK desc) as rnum
from
(
select distinct FISCALWEEK, CALENDARDAY, 'A' as CNST
from CALENDARTABLE
where CALENDARDAY < current_date
order by 1,2
)
) where rnum < 366

Find Distinct IDs when the due date is always on the last day of each month

I have to find distinct IDs throughout the whole history of each ID whose due dates are always on the last day of each month.
Suppose I have the following dataset:
ID DUE_DT
1 1/31/2014
1 2/28/2014
1 3/31/2014
1 6/30/2014
2 1/30/2014
2 2/28/2014
3 1/29/2016
3 2/29/2016
I want to write SQL code that returns ID = 1, because for this specific ID the due date is always on the last day of the given month.
What would be the easiest way to approach it?
You can do:
select id
from t
group by id
having sum(case when extract(day from due_dt + interval '1 day') = 1 then 1 else 0 end) = count(*);
This uses ANSI/ISO standard functions for date arithmetic. These tend to vary by database, but the idea is the same in all databases -- add one day and see if the day of the month is 1 for all the rows.
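To see the "add one day and check for day 1" idea on the question's sample data, here is a minimal sketch using Python's built-in sqlite3 module. SQLite's date() and strftime() stand in for the ANSI interval arithmetic, the due dates are rewritten as ISO strings, and the function name ids_always_on_month_end is an assumption.

```python
import sqlite3

ROWS = [  # (id, due_dt) from the question, rewritten as ISO dates
    (1, "2014-01-31"), (1, "2014-02-28"), (1, "2014-03-31"), (1, "2014-06-30"),
    (2, "2014-01-30"), (2, "2014-02-28"),
    (3, "2016-01-29"), (3, "2016-02-29"),
]

QUERY = """
SELECT id
FROM t
GROUP BY id
HAVING SUM(CASE WHEN strftime('%d', date(due_dt, '+1 day')) = '01'
                THEN 1 ELSE 0 END) = COUNT(*)
ORDER BY id
"""


def ids_always_on_month_end(rows):
    """Return the ids whose due dates all fall on the last day of a month."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (id INTEGER, due_dt TEXT)")
        conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
        return [row[0] for row in conn.execute(QUERY)]
    finally:
        conn.close()


if __name__ == "__main__":
    print(ids_always_on_month_end(ROWS))
```

IDs 2 and 3 each have one month-end date, but they are dropped because 2014-01-30 and 2016-01-29 are not month ends.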
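If you're using SQL Server 2012+ you can use the EOMONTH() function to achieve this. A plain WHERE DUE_DT = EOMONTH(DUE_DT) would return every ID with at least one month-end due date, so the check goes in a HAVING clause:
SELECT ID FROM [table]
GROUP BY ID
HAVING COUNT(*) = SUM(CASE WHEN DUE_DT = EOMONTH(DUE_DT) THEN 1 ELSE 0 END)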
http://rextester.com/VSPQR78701
The idea is quite simple:
you are on the last day of the month if (the month of due date) is not the same as (the month of due date + 1 day). This covers all cases, including year boundaries and leap years.
From there, if (the count of rows for one id) is the same as (the count of rows for this id which fall on the last day of the month), you have a winner.
I tried to write an example (not tested). You do not specify which DB, so I will assume that CTEs (common table expressions) are available. If not, just put the CTE in as a subquery.
Likewise, I am not sure that dateadd and interval work the same way in all dialects.
with addlastdayofmonth as (
select
id
-- adding a 'virtual column': 1 if last day of month, 0 otherwise
-- (MySQL syntax; adapt date_add and if() for other dialects)
, if(month(date_add(due_dt, interval 1 day)) != month(due_dt), 1, 0) as onlastday
from
t
)
select
id
-- number of rows for this id that are NOT on the last day of a month
, count(*) - sum(onlastday) as alwayslastday
from
addlastdayofmonth
group by
id
having
-- if count(rows) == count(rows with last day) we have a winner
count(*) - sum(onlastday) = 0
MySQL-Version (credits to #Gordon Linoff)
SELECT
ID
FROM
<table>
GROUP BY
ID
HAVING
SUM(IF(day(DUE_DT + interval 1 Day) = 1, 1, 0)) = COUNT(ID);
Original Answer:
SELECT MAX(DUE_DT) FROM <table> WHERE ID = <the desired ID>
or if you want all MAX(DUE_DT) for each unique ID
SELECT ID, MAX(DUE_DT) FROM <table> GROUP BY ID

sum based on max production date and min production date MTD,WTD, YTD SQL Server

Hello, I am trying to create an automated query that displays month to date, year to date, and week to date, with a column for each. I need to take the sum of the balance amount on the maximum production date minus the sum of deposits on the minimum production date. This will give me a YTD column. I also need to do month to date and week to date, if anyone has any ideas. Any help with this would be appreciated. Thanks!
P.S. I am using Microsoft SQL Server Management Studio.
Here is what I have so far:
select SUM([curr_bal_amt]) as total_amt , [prod_dt] as date123
from [dbo].[DEPOSIT_TEST]
group by [prod_dt];
this results in a chart like:
Overall, I need to calculate year to date by subtracting the value for the min date I have from the value for the max date I have. Later on, when I import more data, I need to do MTD and WTD. Thanks
Edit: I am looking to use my current table, so maybe it would help to edit this, as I forgot to mention that I have 3-day gaps in the data.
Also, for my prod_dt column I have multiple balances that I must sum if the prod_dt is the same. Is there a simple query to just subtract the sum of curr_bal_amt on the first date of the last month from the sum of curr_bal_amt on the most recent date? Thanks for your help Shawn, it is greatly appreciated!
This is an example of one of my data imports for one of my days:
If you could use the names of my columns, it would be very beneficial so that I could learn better. Thank you! The name of my table is Deposit_Test and the column names are just like the ones in the picture. Thank you again.
This should give you a good idea of how to get at those totals. I don't know what other data you're after in your tables, but you should be able to modify the below query to get at it.
SQL Fiddle
MS SQL Server 2017 Schema Setup:
/********************************CALENDAR********************************/
/*
My original answer made use of a Calendar Table, but I realized it
was overkill for this situation. I still think every database should
have both a Calendar Table and a Numbers Table. They are both very
useful. I use the ct here just to populate my test table, but I've
left some very basic creation to show you how it can be done. Calcs
done here allow your final query to JOIN to it and avoid RBAR to be
more set-based, and save a lot of processing for large tables.
NOTE: This original date table concept is from Aaron Bertrand.
*/
CREATE TABLE datedim (
theDate date PRIMARY KEY
, theDay AS DATEPART(day, theDate) --int
, theWeek AS DATEPART(week, theDate) --int
, theMonth AS DATEPART(month, theDate) --int
, theYear AS DATEPART(year, theDate) --int
, yyyymmdd AS CONVERT(char(8), theDate, 112) /* yyyymmdd */
);
/************************************************************************/
/*
Use the catalog views to generate as many rows as we need. This example
creates a date dimension for all of 2018.
*/
INSERT INTO datedim ( theDate )
SELECT d
FROM (
SELECT d = DATEADD(day, rn - 1, '20180101')
FROM
(
SELECT TOP (DATEDIFF(day, '20180101', '20190101'))
rn = ROW_NUMBER() OVER (ORDER BY s1.object_id)
FROM sys.all_objects AS s1
CROSS JOIN sys.all_objects AS s2
ORDER BY s1.object_id
) AS x
) AS y;
/************************************************************************/
/***** TEST TABLE SETUP *****/
CREATE TABLE t1 ( id int identity, entryDate date, cnt int) ;
INSERT INTO t1 (entryDate, cnt)
SELECT theDate, 2
FROM datedim
;
/* Remove a few "random" records to test our counts. */
DELETE FROM t1
WHERE datePart(day,entryDate) IN (10,6,14,22) OR datepart(month,entryDate) = 6
;
Main Query:
/* Make sure the first day of our week is consistent. */
SET DATEFIRST 7 ; /* SUNDAY */
/* Then build out our query needs with CTEs. */
; WITH theDate AS (
SELECT d.dt FROM ( VALUES ( '2018-05-17' ) ) d(dt)
)
, base AS (
SELECT t1.entryDate
, t1.cnt
, theDate.dt
, datepart(year,theDate.dt) AS theYear
, datepart(month,theDate.dt) AS theMonth
, datepart(week,theDate.dt) AS theWeek
FROM t1
CROSS APPLY theDate
WHERE t1.EntryDate <= theDate.dt
AND datePart(year,t1.EntryDate) = datePart(year,theDate.dt)
)
/* Year-to-date totals */
, ytd AS (
SELECT b.theYear, sum(cnt) AS s
FROM base b
GROUP BY b.theYear
)
/* Month-to-date totals */
, mtd AS (
SELECT b2.theYear, b2.theMonth, sum(cnt) AS s
FROM base b2
WHERE b2.theMonth = datePart(month,b2.EntryDate)
GROUP BY b2.theYear, b2.theMonth
)
/* Week-to-date totals */
, wtd AS (
SELECT b3.theYear, b3.theMonth, sum(cnt) AS s
FROM base b3
WHERE b3.theWeek = datePart(week,b3.EntryDate)
GROUP BY b3.theYear, b3.theMonth
)
SELECT blah = 'CountRow'
, ytd.s AS ytdAmt
, mtd.s AS mtdAmt
, wtd.s AS wtdAmt
FROM ytd
CROSS APPLY mtd
CROSS APPLY wtd
Results:
| blah | ytdAmt | mtdAmt | wtdAmt |
|----------|--------|--------|--------|
| CountRow | 236 | 28 | 8 |
Again, the data that you need to get will likely change the overall query, but this should point in the right direction. You can use each CTE to verify the YTD, MTD and WTD totals.
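If only the three running totals are needed, the same result can also be had from a single pass with conditional aggregation. Below is a minimal sketch of that alternative using the asker's table and column names (Deposit_Test, prod_dt, curr_bal_amt), run on SQLite from Python rather than SQL Server. The function name to_date_totals, the sample rows and the Sunday-based week start (mirroring SET DATEFIRST 7) are assumptions, and it sums balances like the query above instead of subtracting a first date from a last date.

```python
import sqlite3
from datetime import date, timedelta

QUERY = """
SELECT COALESCE(SUM(curr_bal_amt), 0) AS ytd,
       COALESCE(SUM(CASE WHEN prod_dt >= :month_start THEN curr_bal_amt END), 0) AS mtd,
       COALESCE(SUM(CASE WHEN prod_dt >= :week_start THEN curr_bal_amt END), 0) AS wtd
FROM deposit_test
WHERE prod_dt BETWEEN :year_start AND :as_of
"""


def to_date_totals(rows, as_of):
    """Return (ytd, mtd, wtd) sums of curr_bal_amt as of the given date."""
    # Python's weekday() is Monday=0, so this steps back to the most recent Sunday
    week_start = as_of - timedelta(days=(as_of.weekday() + 1) % 7)
    params = {
        "as_of": as_of.isoformat(),
        "year_start": as_of.replace(month=1, day=1).isoformat(),
        "month_start": as_of.replace(day=1).isoformat(),
        "week_start": week_start.isoformat(),
    }
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE deposit_test (prod_dt TEXT, curr_bal_amt INTEGER)")
        conn.executemany("INSERT INTO deposit_test VALUES (?, ?)", rows)
        return conn.execute(QUERY, params).fetchone()
    finally:
        conn.close()


SAMPLE = [  # hypothetical balances, several per day allowed
    ("2017-12-31", 50), ("2018-01-02", 10), ("2018-05-01", 20),
    ("2018-05-12", 5), ("2018-05-14", 7), ("2018-05-17", 3), ("2018-05-18", 100),
]

if __name__ == "__main__":
    print(to_date_totals(SAMPLE, date(2018, 5, 17)))
```

Rows outside the current year or after the as-of date are filtered out once in the WHERE clause, so gaps in the data need no special handling.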

Last day of the month with a twist in SQLPLUS

I would appreciate a little expert help please.
In a SQL SELECT statement, I am trying to get the last day with data per month for the last year.
For example, I am easily able to get the last day of each month and join that to my data table, but the problem is that if the last day of the month does not have data, then no data is returned. What I need is for the SELECT to return the last day with data for the month.
This is probably easy to do, but to be honest, my brain fart is starting to hurt.
I've attached the select below that works for returning the data for only the last day of the month for the last 12 months.
Thanks in advance for your help!
SELECT fd.cust_id,fd.server_name,fd.instance_name,
TRUNC(fd.coll_date) AS coll_date,fd.column_name
FROM super_table fd,
(SELECT TRUNC(daterange,'MM')-1 first_of_month
FROM (
select TRUNC(sysdate-365,'MM') + level as DateRange
from dual
connect by level<=365)
GROUP BY TRUNC(daterange,'MM')) fom
WHERE fd.cust_id = :CUST_ID
AND fd.coll_date > SYSDATE-400
AND TRUNC(fd.coll_date) = fom.first_of_month
GROUP BY fd.cust_id,fd.server_name,fd.instance_name,
TRUNC(fd.coll_date),fd.column_name
ORDER BY fd.server_name,fd.instance_name,TRUNC(fd.coll_date)
You probably need to group your data so that each month's data is in the group, and then within the group select the maximum date present. The sub-query might be:
SELECT MAX(coll_date) AS last_day_of_month
FROM Super_Table AS fd
GROUP BY YEAR(coll_date) * 100 + MONTH(coll_date);
This presumes that the functions YEAR() and MONTH() exist to extract the year and month from a date as an integer value. Clearly, this doesn't constrain the range of dates - you can do that, too. If you don't have the functions in Oracle, then you do some sort of manipulation to get the equivalent result.
Using information from Rhose (thanks):
SELECT MAX(coll_date) AS last_day_of_month
FROM Super_Table fd
GROUP BY TO_CHAR(coll_date, 'YYYYMM');
This achieves the same net result, putting all dates from the same calendar month into a group and then determining the maximum value present within that group.
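The grouping is easy to check on a handful of rows. Here is a minimal sketch with Python's built-in sqlite3 module standing in for Oracle, with strftime('%Y%m', ...) playing the role of TO_CHAR(coll_date, 'YYYYMM'); the function name last_day_with_data and the sample timestamps are hypothetical.

```python
import sqlite3

QUERY = """
SELECT MAX(date(coll_date)) AS last_day_with_data
FROM super_table
GROUP BY strftime('%Y%m', coll_date)   -- one group per calendar month
ORDER BY last_day_with_data
"""


def last_day_with_data(timestamps):
    """Return, per calendar month, the latest day that has at least one row."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE super_table (coll_date TEXT)")
        conn.executemany("INSERT INTO super_table VALUES (?)",
                         [(ts,) for ts in timestamps])
        return [row[0] for row in conn.execute(QUERY)]
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [
        "2020-01-15 08:00:00", "2020-01-29 23:10:00",  # no data on 2020-01-31
        "2020-02-10 00:00:00", "2020-02-28 12:00:00",  # no data on 2020-02-29
    ]
    print(last_day_with_data(sample))
```

Neither month in the sample has data on its calendar month end, yet each still yields a row, which is the behaviour the question asks for.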
Here's another approach, if ANSI row_number() is supported:
with RevDayRanked(itemDate,rn) as (
select
cast(coll_date as date),
row_number() over (
partition by datediff(month,coll_date,'2000-01-01') -- rewrite datediff as needed for your platform
order by coll_date desc
)
from super_table
)
select itemDate
from RevDayRanked
where rn = 1;
Among the rows on the last active date of each month, the row numbered 1 is chosen nondeterministically, so you don't need DISTINCT. If you want information from the table for all rows on these dates, use rank() ordered by day instead of row_number() ordered by the coll_date values, so that a value of 1 appears for every row on the last active date of the month, and select the additional columns you need:
with RevDayRanked(cust_id, server_name, coll_date, rk) as (
select
cust_id, server_name, coll_date,
rank() over (
partition by datediff(month,coll_date,'2000-01-01')
order by cast(coll_date as date) desc
)
from super_table
)
select cust_id, server_name, coll_date
from RevDayRanked
where rk = 1;
If row_number() and rank() aren't supported, another approach is this (for the second query above). Select all rows from your table for which there's no row in the table from a later day in the same month.
select
cust_id, server_name, coll_date
from super_table as ST1
where not exists (
select *
from super_table as ST2
where datediff(month,ST1.coll_date,ST2.coll_date) = 0
and cast(ST2.coll_date as date) > cast(ST1.coll_date as date)
)
If you have to do this kind of thing a lot, see if you can create an index over computed columns that hold cast(coll_date as date) and a month indicator like datediff(month,'2001-01-01',coll_date). That'll make more of the predicates SARGs.
Putting the above pieces together, would something like this work for you?
SELECT fd.cust_id,
fd.server_name,
fd.instance_name,
TRUNC(fd.coll_date) AS coll_date,
fd.column_name
FROM super_table fd
WHERE fd.cust_id = :CUST_ID
AND TRUNC(fd.coll_date) IN (
SELECT MAX(TRUNC(coll_date))
FROM super_table
WHERE coll_date > SYSDATE - 400
AND cust_id = :CUST_ID
GROUP BY TO_CHAR(coll_date,'YYYYMM')
)
GROUP BY fd.cust_id,fd.server_name,fd.instance_name,TRUNC(fd.coll_date),fd.column_name
ORDER BY fd.server_name,fd.instance_name,TRUNC(fd.coll_date)