Tracking Changes in Retail Store Ownership Over Time - sql

Sample Data Set and Desired Query Result
I have a list of retail stores that have changed ownership over time (from company owned, to licensed or franchised and back to company owned). I'm trying to track the count of stores open each month under each ownership type. For each change in ownership I have a new line item in the data set, each of which has a corresponding Start and End date for the change.
I'm trying to count the stores by ownership type for each month of the data and I'm having trouble figuring out how to count the stores for the all the months IN BETWEEN the dates of the ownership start and end dates. Hopefully the picture makes it perfectly clear what I'm looking to do.
select
b.fscl_yr_num
,b.fscl_per_in_yr_num
,a.ownr_type_cd
,sum(case when a.line_start_dt < b.end_dt and a.line_end_dt <= b.End_Dt then 1 else 0 end)
from
(
(
select *
from
(select
store_num
,ownr_type_cd
,case when store_term_dt is not null then 'Closed' else 'Open' end as Status
,case when to_date(trim(store_open_dt),'DD-MON-YY') > to_date(trim(eff_from_dt),'DD-MON-YY') then to_date(trim(store_open_dt),'DD-MON-YY') else to_date(trim(eff_from_dt),'DD-MON-YY') end as Line_Start_Dt
,case when store_term_dt is null then eff_to_dt
when to_date(trim(store_term_dt),'DD-MON-YY') < to_date(trim(eff_to_dt),'DD-MON-YY') then to_date(trim(store_term_dt),'DD-MON-YY') else to_date(trim(eff_to_dt),'DD-MON-YY') end as Line_End_Dt
from
(select
store_num
,store_open_dt
,store_term_dt
,eff_from_dt
,eff_to_dt
,ownr_type_cd
from
appca.d_store_vers
where
upper(cntry_cd_2_dgt_iso) = 'GB'
and postal_cd not like ('BT%')
and store_open_dt is not null
group by
store_num
,store_open_dt
,store_term_dt
,eff_from_dt
,eff_to_dt
,ownr_type_cd
order by
store_num
,eff_from_dt)
group by
store_num
,ownr_type_cd
,case when store_term_dt is not null then 'Closed' else 'Open' end
,case when to_date(trim(store_open_dt),'DD-MON-YY') > to_date(trim(eff_from_dt),'DD-MON-YY') then to_date(trim(store_open_dt),'DD-MON-YY') else to_date(trim(eff_from_dt),'DD-MON-YY') end
,case when store_term_dt is null then eff_to_dt
when to_date(trim(store_term_dt),'DD-MON-YY') < to_date(trim(eff_to_dt),'DD-MON-YY') then to_date(trim(store_term_dt),'DD-MON-YY') else to_date(trim(eff_to_dt),'DD-MON-YY') end
order by
1 asc
,2 asc
,3 asc)
where
to_date(trim(line_start_dt),'DD-MON-YY') < to_date(trim(line_end_dt),'DD-MON-YY')
) A
right join
--Calendar Table--
(
select
fscl_yr_num, fscl_per_in_yr_num, Cal_dt min(to_date(trim(cal_dt),'DD-MON-YY')) as Start_Dt, max(to_date(trim(cal_dt),'DD-MON-YY')) as End_Dt
from
appca.d_cal
where
fscl_yr_num is between 1990 and 2018
group by
fscl_yr_num, fscl_per_in_yr_num
order by
1 asc, 2 asc
) B
on A.line_end_dt = B.cal_dt
)
group by
b.fscl_yr_num
,b.fscl_per_in_yr_num
,a.ownr_type_cd
order by
b.fscl_yr_num
,b.fscl_per_in_yr_num
;

Try this case:
sum(case when createTime >(cast(year(createTime) as varchar) +'-'+cast( MONTH(createTime) as varchar)+'-1') and createTime <(cast(year(createTime) as varchar) +'-'+cast( MONTH(createTime) as varchar)+'-31')
then 1 else 0 end) 'Company'

Related

How to nest multiple case when expressions and add a condition

I am trying to divide customers (contact_key) column who shopped in 2021 (A.TXN_MTH) into new and 'returning' with returning meaning that they had not shopped in the last 12 months (YYYYMM in X.Fiscal_mth_idnt column).
I am using CASE WHEN A.TXN_MTH = MIN(X.FISCAL_MTH_IDNT) THEN 'NEW' which is correct. The next case when should be when the max month before X.TXN_MTH is 12 or more months previous. I have added the 12 months part in the Where statement. Should I be nesting 3 CASE WHEN'S instead of WHERE?
SELECT
T.CONTACT_KEY
, A.TXN_MTH
, CASE WHEN A.TXN_MTH = MIN(X.FISCAL_MTH_IDNT) THEN 'NEW'
WHEN (MAX(CASE WHEN X.FISCAL_MTH_IDNT < A.TXN_MTH THEN X.FISCAL_MTH_IDNT ELSE NULL END)) THEN 'RETURNING'
END AS CUST_TYPE
FROM B_TRANSACTION T
INNER JOIN B_TIME X
ON T.TRANSACTION_DT_KEY = X.DATE_KEY
INNER JOIN A
ON A.CONTACT_KEY = T.CONTACT_KEY AND A.BU_KEY = T.BU_KEY
WHERE (MAX(CASE WHEN X.FISCAL_MTH_IDNT < A.TXN_MTH THEN X.FISCAL_MTH_IDNT ELSE NULL END)) < A.TXN_MTH - (date_format(add_months(concat_ws('-',substr(yearmonth,1,4),substr(yearmonth,5,2),'01'),-12),'yyyyMM')
GROUP BY
T.CONTACT_KEY
, TXN_MTH;
You have not described your tables, so assuming fiscal_mth_idnt is a DATE column then you can use the LAG analytic function to find the previous row's value:
SELECT contact_key,
txn_mth,
CASE
WHEN prev_fiscal_mth_idnt IS NULL
THEN 'NEW'
WHEN ADD_MONTHS(prev_fiscal_mth_idnt, 12) < fiscal_mth_idnt
THEN 'RETURNING'
ELSE 'CURRENT'
END AS cust_type
FROM (
SELECT T.CONTACT_KEY,
A.TXN_MTH,
yearmonth,
X.FISCAL_MTH_IDNT,
LAG(X.FISCAL_MTH_IDNT) OVER (
PARTITION BY T.CONTACT_KEY
ORDER BY X.FISCAL_MTH_IDNT
) AS prev_fiscal_mth_idnt
FROM B_TRANSACTION T
INNER JOIN B_TIME X
ON T.TRANSACTION_DT_KEY = X.DATE_KEY
INNER JOIN A
ON A.CONTACT_KEY = T.CONTACT_KEY AND A.BU_KEY = T.BU_KEY
)
WHERE yearmonth LIKE '2021%';

How select second max value from a subquery

I need to select a max date (DATA_RIF_PATR_KU) for the key PROT, CODICE_COM, but there are also 9999-12-31 values. So when there are others date than 9999-12-31 I need to select the "real max date" else 9999-12-31.
For example:
For the combination:
PROT = '202000060300' AND
CODICE_COM='Z01'
I have 2 dates of DATA_RIF_PATR_KU: 03-11-20, 9999-12-31. I need to select the first one.
I wrote this code:
'''
SELECT
A.GRADO,
A.COMM,
A.PROT,
MAX(A.FLAG_VAL_IND) AS FLAG_VAL_IND,
SUM(A.CONT_VERS) AS CONT_VERS,
SUM(A.IMP_PREN) AS IMP_PREN,
SUM(A.CONT_DIFF) AS CONT_DIFF,
MIN(A.DATA_VER) AS DATA_VER,
MAX(A.PREN_DEB) AS PREN_DEB,
MAX(A.PREN_DEB_VER) AS PREN_DEB_VER,
MAX(A.GRAT_PATR_KU) AS GRAT_PATR_KU,
MAX(CASE WHEN (A.MAX_DATA_RIF_PATR_NON_9999)='SI' THEN A.DATA_RIF_PATR_KU
WHEN (A.MAX_DATA_RIF_PATR_NON_9999_2)='SI2' THEN A.DATA_RIF_PATR_KU END) AS DATA_RIF_PATR_KU,
SUM(A.MAGG_PEC) AS MAGG_PEC,
MAX(A.ASS_PEC) AS ASS_PEC,
MAX(A.ASS_CF) AS ASS_CF,
MAX(A.ASS_VAL) AS ASS_VAL,
MAX(A.FLAG_LAV) AS FLAG_LAV,
SUM(A.CONT_DIC) AS CONT_DIC,
SUM(A.CONT_TOT) AS CONT_TOT,
SUM(A.CONT_DIFF_VERIF) AS CONT_DIFF_VERIF,
MAX(A.DATA_AGG) AS DATA_ULTIMA_VALIDAZIONE,
MAX(CASE WHEN (A.DATA_INV)<'9999-12-31'
THEN '1'
ELSE '0' END) AS PRESENZA_INVITO_PAG,
MAX(CASE WHEN (A.DATA_VERS_INV)<'9999-12-31'
THEN '1'
ELSE '0' END) AS PRESENZA_VERS_INVITO
FROM
(
SELECT
distinct BILS05_GRADO AS GRADO,
BILS05_CODICE_COM AS COMM,
BILS05_PROT AS PROT,
CASE WHEN BILS05_FLAG_VAL_IND IN ('0','9')
THEN 'D'
ELSE 'I' END AS FLAG_VAL_IND,
bils05_tipo_doc, bils05_prog_alleg, bils05_pren_deb as pren_deb,
BILS05_PREN_DEB_VERIF AS PREN_DEB_VER,
CASE WHEN BILS05_PREN_DEB='1' AND BILS05_PREN_DEB_VERIF='1'
THEN 0
ELSE BILS05_CONT_VERS_VERIF END AS CONT_VERS,
CASE WHEN BILS05_PREN_DEB='1' AND BILS05_PREN_DEB_VERIF='1'
THEN BILS05_CONT_VERS_VERIF
ELSE 0 END AS IMP_PREN,
BILS05_CONT_DIFF AS CONT_DIFF,
BILS05_DATA_ACQ_KU AS DATA_VER,
BILS05_MAGG_PEC AS MAGG_PEC,
BILS05_ASS_PEC AS ASS_PEC,BILS05_ASS_CF AS ASS_CF,BILS05_ASS_VAL AS ASS_VAL,
BILS05_FLAG_LAV AS FLAG_LAV,
BILS05_CONT_VERS AS CONT_DIC,
BILS05_CONT_TOT AS CONT_TOT,
BILS05_CONT_DIFF_VERIF AS CONT_DIFF_VERIF,
BILS05_DATA_AGG_KU AS DATA_AGG,
BILS05_DATA_INV_PAG AS DATA_INV,
BILS05_DATA_VERS_INV AS DATA_VERS_INV,
BILS05_GRAT_PATR as GRAT_PATR_KU,
BILS05_DATA_RIF_PATR AS DATA_RIF_PATR_KU,
ROW_NUMBER() OVER(PARTITION BY BILS05_PROT, BILS05_CODICE_COM ORDER BY BILS05_DATA_RIF_PATR) AS RN,
LEAD(DATA_RIF_PATR_KU) OVER(PARTITION BY BILS05_PROT, BILS05_CODICE_COM ORDER BY DATA_RIF_PATR_KU) AS FOLLOW_DATA,
LAG(DATA_RIF_PATR_KU) OVER(PARTITION BY BILS05_PROT, BILS05_CODICE_COM ORDER BY DATA_RIF_PATR_KU) AS PREV_DATA,
CASE WHEN (DATA_RIF_PATR_KU < FOLLOW_DATA AND DATA_RIF_PATR_KU NOT = '9999-12-31' AND FOLLOW_DATA='9999-12-31' )THEN 'SI'
ELSE 'NO' END AS MAX_DATA_RIF_PATR_NON_9999,
CASE WHEN (DATA_RIF_PATR_KU NOT = '9999-12-31' AND FOLLOW_DATA IS NULL) OR (DATA_RIF_PATR_KU = '9999-12-31' AND FOLLOW_DATA IS NULL)
THEN 'SI2' ELSE 'NO' END AS MAX_DATA_RIF_PATR_NON_9999_2
FROM ZUCOW.BILS05
WHERE
BILS05_FLAG_LAV='2'
) A
WHERE
A.PROT='202000060300' AND
A.COMM='Z01'
GROUP BY A.GRADO, A.COMM, A.PROT;
'''
but it doesn't work properly. It select 9999-12-31 instead of 03-11-20 in the above example
Your example is rather long and complex, but the general idea would be
(assuming DATA_RIF_PATR_KU is a date field):
select
max(case when DATA_RIF_PATR_KU='9999-12-31' then null else DATA_RIF_PATR_KU end),
PROT, CODICE_COM
FROM FROM ZUCOW.BILS05
GROUP BY 2,3
yes that's the general idea.. Anyway at the end, I solved it in another way:
MAX(CASE WHEN TO_CHAR(A.DATA_RIF_PATR_KU, 'YYYY-MM-DD' ) LIKE ('9999-12-31')THEN '0000-12-31' ELSE TO_CHAR(A.DATA_RIF_PATR_KU, 'YYYY-MM-DD') END) AS DATA_RIF_PATR_KU_STR
So, I converted the 9999-12-31 values in 0000-12-31 (I had to to transform the date value in string), so then I could do the max and then I converted the values back in the ETL software.
Thank You all for helping

Return the latest date based from a range of columns

I have a query that counts the number of completed tasks as well as returning the original values.
I'd like to add a new column which returns the most recent date (in this case task_1_completed_date or task_2_completed_date but in reality there are 20 task fields)
(CASE WHEN task_1_completed_date IS NOT NULL THEN 1 ELSE 0 END +
CASE WHEN task_2_completed_date IS NOT NULL THEN 1 ELSE 0 END
) AS task_completed_total
from (select JSON_EXTRACT_SCALAR(data, '$.task1.date') as task_1_completed_date
JSON_EXTRACT_SCALAR(data, '$.task2.date') as task_2_completed_date
from table
WHERE pet_store = 'london'
)
Not sure how to proceed, should I use a subquery here to order the task completion dates?
Use order by
(CASE WHEN task_1_completed_date IS NOT
NULL THEN 1 ELSE 0 END +
CASE WHEN task_2_completed_date IS NOT
NULL THEN 1 ELSE 0 END
) AS task_completed_total
from (select JSON_EXTRACT_SCALAR(data,
'$.task1.date') as task_1_completed_date
JSON_EXTRACT_SCALAR(data,
'$.task2.date') as task_2_completed_date
from table
WHERE pet_store = 'london'
)where rownum=1 order by
task_completed_total desc
-- if rownum doesn't work use Limit 1
I think you could use GROUP BY and MAX to get the most recent date. see here: https://learn.microsoft.com/en-us/sql/t-sql/queries/select-group-by-transact-sql?view=sql-server-ver15

How to show 0 value using COUNT and SELECTon a SQL query

I have ONLY 1 table called Meeting that stores all meeting requests.
This table can be EMPTY.
It has several columns including requestType (which can only be "MT") meetingStatus (can only be either pending, approved, denied or canceled) and meetingCreatedTime
I want to count how many requests of each status's type (in other words how many requests are pending, how many are approved, denied and canceled) for the last 30 days
Problem is that if there is no request then nothing display but I want to display 0, how do I do it? Here is my query now:
SELECT [requestType],
( SELECT COUNT ([requestType]) FROM [Meeting] WHERE CAST([meetingCreatedTime] AS DATE) >= CAST(DateAdd(DAY,-30,Getdate()) AS DATE) AND [meetingStatus] = 'Approved') As 'Approved',
( SELECT COUNT ([requestType]) FROM [Meeting] WHERE CAST([meetingCreatedTime] AS DATE) >= CAST(DateAdd(DAY,-30,Getdate()) AS DATE) AND [meetingStatus] = 'Pending') As 'Pending',
( SELECT COUNT ([requestType]) FROM [Meeting] WHERE CAST([meetingCreatedTime] AS DATE) >= CAST(DateAdd(DAY,-30,Getdate()) AS DATE) AND [meetingStatus] = 'Canceled') As 'Canceled',
( SELECT COUNT ([requestType]) FROM [Meeting] WHERE CAST([meetingCreatedTime] AS DATE) >= CAST(DateAdd(DAY,-30,Getdate()) AS DATE) AND [meetingStatus] = 'Denied') As 'Denied'
FROM [Meeting]
WHERE CAST([meetingCreatedTime] AS DATE) >= CAST(DateAdd(DAY,-30,Getdate()) AS DATE) GROUP BY [requestType]
Result:
What I want is:
SELECT
RT.requestType,
SUM(CASE WHEN M.meetingStatus = 'Approved' THEN 1 ELSE 0 END) AS Approved,
SUM(CASE WHEN M.meetingStatus = 'Pending' THEN 1 ELSE 0 END) AS Pending,
SUM(CASE WHEN M.meetingStatus = 'Canceled' THEN 1 ELSE 0 END) AS Canceled,
SUM(CASE WHEN M.meetingStatus = 'Denied' THEN 1 ELSE 0 END) AS Denied,
FROM
(SELECT DISTINCT requestType FROM Meeting) RT
LEFT OUTER JOIN Meeting M ON
M.requestType = RT.requestType AND
M.meetingCreatedTime >= DATEADD(DAY, -30, GETDATE())
GROUP BY
RT.requestType
The SUMs are a much clearer (IMO) and much more efficient way of getting the counts that you need. Using the requestType table (assuming that you have one) lets you get results for every request type even if there are no meetings of that type in the date range. The LEFT OUTER JOIN to the meeting table allows the request type to still show up even if there are no meetings for that time period.
All of your CASTs between date values seem unnecessary.
Move those subqueries into simple sum/case statements:
select rt.request_type,
sum(case when [meetingStatus] = 'Approved' then 1 else 0 end),
sum(case when [meetingStatus] = 'Pending' then 1 else 0 end),
sum(case when [meetingStatus] = 'Canceled' then 1 else 0 end),
sum(case when [meetingStatus] = 'Denied' then 1 else 0 end)
from ( select 'MT' ) rt (request_type) --hopefully you have lookup table for this
left
join [Meeting] m on
rt.request_type = m.request_type and
CAST([meetingCreatedTime] AS DATE) >= CAST(DateAdd(DAY,-30,Getdate()) AS DATE)
group
by rt.request_type;
This is one possible approach to force one line to be visible in any case. Adapt this to your needs...
Copy it into an empty query window and execute... play around with the WHERE part...
DECLARE #Test TABLE (ID INT IDENTITY, GroupingKey VARCHAR(100));
INSERT INTO #Test VALUES ('a'),('a'),('b');
SELECT TOP 1 tbl.CountOfA
,tbl.CountOfB
,tbl.CountOfC
FROM
(
SELECT 1 AS Marker
,(SELECT COUNT(*) FROM #Test WHERE GroupingKey='a') AS CountOfA
,(SELECT COUNT(*) FROM #Test WHERE GroupingKey='b') AS CountOfB
,(SELECT COUNT(*) FROM #Test WHERE GroupingKey='c') AS CountOfC
WHERE (1=1) --play here with (1=0) and (1=1)
UNION ALL
SELECT 2,0,0,0
) AS tbl
ORDER BY Marker

counting events over flexible ranges

I am trying to count events (which are rows in the event_table) in the year before and the year after a particular target date for each person. For example, say I have a person 100 and target date is 10/01/2012. I would like to count events in 9/30/2011-9/30/2012 and in 10/02/2012-9/30/2013.
My query looks like:
select *
from (
select id, target_date
from subsample_table
) as i
left join (
select id, event_date, count(*) as N
, case when event_date between target_date-365 and target_date-1 then 0
when event_date between target_date+1 and target_date+365 then 1
else 2 end as after
from event_table
group by id, target_date, period
) as h
on i.id = h.id
and i.target_date = h.event_date
The output should look something like:
id target_date after N
100 10/01/2012 0 1000
100 10/01/2012 1 0
It's possible that some people do not have any events in the before or after periods (or both), and it would be nice to have zeros in that case. I don't care about the events outside the 730 days.
Any suggestions would be greatly appreciated.
I think the following may approach what you are trying to accomplish.
select id
, target_date
, event_date
, count(*) as N
, SUM(case when event_date between target_date-365 and target_date-1
then 1
else 0
end) AS Prior_
, SUM(case when event_date between target_date+1 and target_date+365
then 1
else 0
end) as After_
from subsample_table i
left join
event_table h
on i.id = h.id
and i.target_date = h.event_date
group by id, target_date, period
This is a generic answer. I don't know what date functions teradata has, so I will use sql server syntax.
select id, target_date, sum(before) before, sum(after) after, sum(righton) righton
from yourtable t
join (
select id, target_date td
, case when yourdate >= dateadd(year, -1, target_date)
and yourdate < target_date then 1 else 0 end before
, case when yourdate <= dateadd(year, 1, target_date)
and yourdate > target_date then 1 else 0 end after
, case when yourdate = target_date then 1 else 0 end righton
from yourtable
where whatever
group by id, target_date) sq on t.id = sq.id and target_date = dt
where whatever
group by id, target_date
This answer assumes that an id can have more than one target date.