Identifying premature expiration - sql

The dataset I have is a bit tricky. It’s a rolling calendar for a period of 24 months, and the data is published only once a month.
The relevant data points are as follows:
• CaseNumber (int)
• Start_date (date)
• Reporting_month (date)
• Months_old (int)
A ‘CaseNumber’ can appear in as many as 24 ‘Reporting_months’ (at 0-23 ‘Months_old’). However, each ‘CaseNumber’ appears only once in each ‘Reporting_month’.
So if you list any number of months in chronological order (Jan, Feb, Mar, Apr, etc.), a single ‘CaseNumber’ will show up in each of those ‘Reporting_months’ as long as the ‘CaseNumber’ is <= 23 ‘Months_old’.
However, once a ‘CaseNumber’ reaches 24 ‘Months_old’ it no longer reports in this data set. So the oldest any particular ‘CaseNumber’ will ever be in this reporting cycle is 23 ‘Months_old’. Any older than that and it will not appear on this report.
What I’m interested in doing is tracking these ‘CaseNumbers’ to see if any are dropping off this report prematurely. To do so, I need to compare the current ‘Reporting_month’ to the previous ‘Reporting_month’ and determine whether any ‘CaseNumbers’ dropped off prematurely.
Example:
Case #   Previous Months_old   Current Months_old   Status
1234     22                    23                   Correct age
5678     23                    NULL                 Dropped due to age
9101     18                    NULL                 Premature drop
The only means by which I've been able to achieve this is a VLOOKUP formula in Excel, done manually. I'd like to get away from having to do this manually.
SELECT
    a.[CaseNumber]
    ,CONVERT(DATE,MAX(a.[Month]),111) 'Month'
    ,CASE WHEN m2.[CaseNumber] IS NOT NULL
          AND m1.[CaseNumber] IS NULL
        THEN 'Yes'
        ELSE 'No'
    END as 'New Default'
FROM
    [dbo].['v2-2yrTotalDefault$'] a
    LEFT OUTER JOIN (
        SELECT DISTINCT
            [CaseNumber]
        FROM
            [dbo].['v2-2yrTotalDefault$']
        WHERE
            LEFT(CONVERT(varchar,[Month],112),6) = '201902') m1
        ON m1.CaseNumber = a.CaseNumber --most current month
    LEFT OUTER JOIN (
        SELECT DISTINCT
            [CaseNumber]
        FROM
            [dbo].['v2-2yrTotalDefault$']
        WHERE
            LEFT(CONVERT(varchar,[Month],112),6) = '201903') m2
        ON m2.CaseNumber = a.CaseNumber --previous month
WHERE
    a.[Month] > '12/01/2018'
GROUP BY
    a.[CaseNumber]
ORDER BY
    a.[CaseNumber]
The query continually errors out with the following error:
Msg 8120, Level 16, State 1, Line 8
Column 'm2.CaseNumber' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Msg 8120, Level 16, State 1, Line 9
Column 'm1.CaseNumber' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Additionally, I don't want to have to hard-code the months in the SELECT statement. I'd like to be able to control which month I'm viewing in the WHERE clause.
In the end I'd like the results to return two columns: one reflecting the previous month's age in months, and the second showing the current month's age. If a CaseNumber dropped off prematurely, I'd like the current month to say 'premature_expiration'.
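One possible shape for the comparison described above, sketched in Python against an in-memory SQLite database rather than SQL Server: take every case present in the previous reporting month and left join it to the current month, so a case with no match comes back as NULL. The table name `cases`, the sample start dates, the 'dropped_due_to_age' label, and the `compare_months` helper are assumptions made for illustration; the month is passed in as a parameter instead of being hard-coded.

```python
import sqlite3


def compare_months(conn, current_month):
    """Compare a reporting month (ISO date, first of month) with the month before it."""
    sql = """
        SELECT p.CaseNumber,
               p.Months_old AS previous_months_old,
               CASE
                   WHEN c.CaseNumber IS NOT NULL THEN CAST(c.Months_old AS TEXT)
                   WHEN p.Months_old >= 23 THEN 'dropped_due_to_age'
                   ELSE 'premature_expiration'
               END AS current_months_old
        FROM cases p
        LEFT JOIN cases c
               ON c.CaseNumber = p.CaseNumber
              AND c.Reporting_month = :current
        WHERE p.Reporting_month = date(:current, '-1 month')
        ORDER BY p.CaseNumber
    """
    return conn.execute(sql, {"current": current_month}).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cases (CaseNumber INTEGER, Start_date TEXT, "
    "Reporting_month TEXT, Months_old INTEGER)"
)
# Sample rows mirroring the example table in the question (start dates are made up).
conn.executemany(
    "INSERT INTO cases VALUES (?, ?, ?, ?)",
    [
        (1234, "2017-04-01", "2019-02-01", 22),
        (1234, "2017-04-01", "2019-03-01", 23),
        (5678, "2017-03-01", "2019-02-01", 23),
        (9101, "2017-08-01", "2019-02-01", 18),
    ],
)

result = compare_months(conn, "2019-03-01")
for row in result:
    print(row)
```

Because the join is driven from the previous month, no GROUP BY is needed, which also sidesteps the Msg 8120 error. The same left-join shape should carry over to T-SQL with DATEADD in place of SQLite's date modifier, though that port is untested here.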

Related

How to carry over latest observed record when grouping by on SQL?

(I've created a similar question before, but I messed it up beyond repair. Hopefully, I can express myself better this time.)
I have a table containing records that change through time, each row representing a modification in Stage and Amount. I need to group these records by Day and Stage, summing up the Amount.
The tricky part is: ids might not change on some days. Since there won't be any record on those days, I need to carry over the latest record observed.
Find below the records table and the expected result. MRE on dbfiddle (PostgreSQL)
Records
Expected Result
I created this basic visualization to demonstrate how the Amounts and Stages change throughout the days. Each number/color change represents a modification.
The logic behind the expected result can be found below.
Total Amount by Stage on Day 2
Id A was modified on Day 2, let's take that Amount: Negotiation 60.
Id B wasn't modified on Day 2, so we carry over the most recent modification (Day 1): Open 10.
Open 10
Negotiation 60
Closed 0
Total Amount by Stage on Day 3
Id A wasn't modified on Day 3, so we carry over the most recent modification (Day 2): Negotiation 60.
Id B was modified on Day 3: Negotiation 30
Total Amount by Stage on Day 3
Open 0
Negotiation 90
Closed 0
Basically, you seem to want the most recent value for each id --- and it only gets counted for the most recent stage.
You can get this using a formulation like this:
select d.DateDay, s.stage, coalesce(sh.amount, 0)
from (select distinct sh.DateDay from stage_history sh) d cross join
     (select distinct sh.stage from stage_history sh) s left join lateral
     (select sum(sh.amount) as amount
      from (select distinct on (sh.id) sh.*
            from stage_history sh
            where sh.DateDay <= d.DateDay
            order by sh.id, sh.DateDay desc
           ) sh
      where sh.stage = s.stage
     ) sh
     on 1=1
order by d.DateDay, s.stage;
Here is a db<>fiddle.
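For anyone without lateral joins or DISTINCT ON, the same carry-over idea can be sketched with a correlated subquery. The version below runs in Python against in-memory SQLite, not PostgreSQL. The original records table was not preserved here, so the rows are an assumption reconstructed from the Day 2 and Day 3 walkthrough; the Day 1 amount for Id A and the Day 4 row are made up. It also assumes at most one record per id per day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stage_history (id TEXT, DateDay INTEGER, stage TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO stage_history VALUES (?, ?, ?, ?)",
    [
        ("A", 1, "Open", 50),         # hypothetical starting row for A
        ("B", 1, "Open", 10),
        ("A", 2, "Negotiation", 60),
        ("B", 3, "Negotiation", 30),
        ("A", 4, "Closed", 60),       # hypothetical, so that 'Closed' exists as a stage
    ],
)

sql = """
    WITH days AS (SELECT DISTINCT DateDay FROM stage_history),
         stages AS (SELECT DISTINCT stage FROM stage_history),
         latest AS (
             -- For each day, keep each id's most recent record on or before that day.
             SELECT d.DateDay, sh.id, sh.stage, sh.amount
             FROM days d
             JOIN stage_history sh
               ON sh.DateDay = (SELECT MAX(x.DateDay)
                                FROM stage_history x
                                WHERE x.id = sh.id
                                  AND x.DateDay <= d.DateDay)
         )
    SELECT d.DateDay, s.stage, COALESCE(SUM(l.amount), 0) AS amount
    FROM days d
    CROSS JOIN stages s
    LEFT JOIN latest l
           ON l.DateDay = d.DateDay
          AND l.stage = s.stage
    GROUP BY d.DateDay, s.stage
    ORDER BY d.DateDay, s.stage
"""

# Map (day, stage) -> total amount carried into that day.
totals = {(day, stage): amount for day, stage, amount in conn.execute(sql)}
for key in sorted(totals):
    print(key, totals[key])
```

The cross join of days and stages is what produces the explicit zero rows, the same role it plays in the query above.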

How to find if there is a match in a certain time interval between two different date fields SQL?

I have a column in my fact table that defines whether a Supplier is old or new based on the following case-statement:
CASE
    WHEN (SUBSTRING([Reg date], 1, 6) = SUBSTRING([Invoice date], 1, 6))
    THEN ('New supplier')
    ELSE ('Old supplier')
END as [Old/New supplier]
So for example, if a Supplier was registered in 201910 and the Invoice date was 201910, then the Supplier would be considered a 'New supplier' that month. Now I want to calculate the number of Old/New suppliers for each month by doing a distinct count on Supplier no, which is not a problem. The last step is where it gets tricky: I want to count the number of New/Old suppliers over a 12-month period (if there has been a match on Invoice date and Reg date in any of the lagging 12 months). So I created the following MDX expression:
aggregate(parallelperiod([D Time].[Year-Month-Day].[Year],1,[D Time].[Year-Month-Day].currentmember).lead(1) : [D Time].[Year-Month-Day].currentmember ,[Measures].[Supplier No Distinct Count])
The issue I am facing is that it counts Supplier no "1234" twice, since it has been both new and old during that time period. What I want is that, if it finds one match, the supplier is considered a "New" supplier for that 12-month period.
This is how the result ends up looking, but I want it to be zero for "Old": since Reg date and Invoice date matched once during that 12-month period, the supplier should be considered new for the whole rolling 12 months on 201910.
Any help, possible approaches or ideas are highly appreciated.
Best regards,
Rubrix
Aggregate first at the supplier level and then at the type level:
select type, count(*)
from (select supplierid,
             (case when min(substring(regdate, 1, 6)) = min(substring(invoicedate, 1, 6))
                   then 'new' else 'old'
              end) as type
      from t
      group by supplierid
     ) s
group by type;
Note: I assume your date columns are in some obscure string format for your code to work. Otherwise, you should be using appropriate date functions.
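To make the two-level aggregation concrete, here is a small sketch in Python with in-memory SQLite. It is a variation on the query above, not the same query: instead of comparing the two minimums, it flags a supplier as 'new' if any of its rows has a matching registration and invoice month, which is how I read the "one match makes it new" requirement. The table `t`, the yyyymmdd string format, the sample rows, and the `count_by_type` helper are assumptions.

```python
import sqlite3


def count_by_type(rows):
    """Classify each supplier once, then count suppliers per type."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (supplierid TEXT, regdate TEXT, invoicedate TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    sql = """
        SELECT type, COUNT(*)
        FROM (SELECT supplierid,
                     CASE WHEN MAX(CASE WHEN substr(regdate, 1, 6) = substr(invoicedate, 1, 6)
                                        THEN 1 ELSE 0 END) = 1
                          THEN 'new' ELSE 'old'
                     END AS type
              FROM t
              GROUP BY supplierid) s
        GROUP BY type
        ORDER BY type
    """
    result = dict(conn.execute(sql).fetchall())
    conn.close()
    return result


# Supplier 1234 is invoiced in its registration month and again a month later;
# supplier 5678 registered long before its only invoice.
sample = [
    ("1234", "20191001", "20191015"),
    ("1234", "20191001", "20191120"),
    ("5678", "20180301", "20191005"),
]
counts = count_by_type(sample)
print(counts)
```

Supplier 1234 lands in exactly one bucket even though it has both a matching and a non-matching row, which is the double-counting the question was trying to avoid.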
SELECT COUNT(*) OVER () AS TotalCount
FROM Facts
WHERE Regdate BETWEEN olddate AND newdate
   OR InvoiceDate BETWEEN olddate AND newdate
GROUP BY
    Supplier
The above query will return all the suppliers within that time period and then group them, so COUNT(*) only counts unique suppliers.
You might want to change the WHERE clause, because I didn't quite understand how you are getting the 12-month period. Generally, if your WHERE clause returns the suppliers within that time period (they don't have to be unique), the GROUP BY and COUNT will handle the rest.

Oracle - Count the same value used on consecutive days

Date      jm    Text
--------  ----  ----
6/3/2015  ne    Good
6/4/2015  ne    Good
6/5/2015  ne    Same
6/8/2015  ne    Same
I want to count how often the "Same" value occurs in a run of consecutive days.
I don't want to count the value for the whole database. For the current date it is 2 (example above).
It is very important for me that "Same" never occurs...
The query has to ignore the weekend (6 and 7 June).
Date      jm    Text
--------  ----  ----
6/3/2015  ne    Same
6/4/2015  ne    Same
6/5/2015  ne    Good
6/8/2015  ne    Good
In this example the count is zero
Okay, I'm starting to get the picture, although at first I thought you wanted to count by jm, and now it seems you want to count by Text = 'Same'. Anyway, that's what this query should do. It gets the row for the current date. It connects all previous rows and counts them. It also shows whether the current Text (and that of the connected rows) is 'Same'.
So the query will return one row (if there is one for today), which will show the date, jm and Text of the current date, the number of consecutive days for which the Text has been the same (just in case you want to know how many days it is 'Good'), and the number of days (either 0 or the same as the other count) for which the Text has been 'Same'.
I hope this query is right, or at least it gives you an idea of how to solve the problem using CONNECT BY. I should mention I based the 'Friday-detection' on this question.
Also, I don't have Oracle at hand, so please forgive me for any minor syntax errors.
WITH
VW_SAMESTATUSES AS
( SELECT t.*
  FROM YourTable t
  START WITH -- Start with the row for today
    t.Date = trunc(sysdate)
  CONNECT BY -- Connect to the previous row, which has a lower date.
    -- Note that PRIOR refers to the prior record, which is
    -- actually the NEXT day. :)
    PRIOR t.Date = t.Date +
      CASE MOD(TO_CHAR(t.Date, 'J'), 7) + 1
        WHEN 5 THEN 3 -- Friday, so add 3
        ELSE 1 -- Other days, so add one
      END
    -- And the Text also has to match to the one of the next day.
    AND t.Text = PRIOR t.Text)
SELECT s.Date,
  s.jm,
  MAX(Text) AS CurrentText, -- Not really MAX, they are actually all the same
  COUNT(*) AS ConsecutiveDays,
  COUNT(CASE WHEN Text = 'Same' THEN 1 END) as SameCount
FROM VW_SAMESTATUSES s
GROUP BY s.Date,
  s.jm
This recursive query (available from Oracle version 11g Release 2) might be useful:
with s(tcode, tdate) as (
  select tcode, tdate from test where tdate = date '2015-06-08'
  union all
  select t.tcode, t.tdate from test t, s
  where s.tcode = t.tcode
    and t.tdate = s.tdate - decode(s.tdate-trunc(s.tdate, 'iw'), 0, 3, 1) )
select count(1) cnt from s
SQLFiddle
I prepared sample data according to your original question, without the later edits; you can see it in the attached SQLFiddle. Additional conditions for the column 'Text' are very simple: just add something like ... and Text = 'Same' in the where clauses.
In its current version the query counts the number of previous days, starting from the given date (change it in line 2), where the dates are consecutive (excluding weekend days) and the value in column tcode is the same for all days.
The part decode(s.tdate-trunc(s.tdate, 'iw'), 0, 3, 1) subtracts the right number of days depending on whether it's Monday or another day, and should work independently of NLS settings.
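For readers without an Oracle instance, the walk-backwards idea can be tried out with a recursive CTE in SQLite, driven from Python. This is only a sketch of the same technique, not the Oracle query: the `test` table layout, the `txt` column name, and the `count_same_streak` helper are assumptions, and the Monday check uses SQLite's strftime('%w') in place of the trunc(..., 'iw') trick.

```python
import sqlite3


def count_same_streak(rows, today):
    """Count 'Same' rows in the run of consecutive business days ending at `today`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (tdate TEXT, jm TEXT, txt TEXT)")
    conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)
    sql = """
        WITH RECURSIVE s(tdate, txt) AS (
            SELECT tdate, txt FROM test WHERE tdate = :today
            UNION ALL
            -- Step back one day, or three days when the current row is a Monday,
            -- and only continue while the text stays the same.
            SELECT t.tdate, t.txt
            FROM test t
            JOIN s ON t.txt = s.txt
                  AND t.tdate = date(s.tdate,
                                     CASE strftime('%w', s.tdate)
                                         WHEN '1' THEN '-3 days'
                                         ELSE '-1 days'
                                     END)
        )
        SELECT COUNT(*) FROM s WHERE txt = 'Same'
    """
    (count,) = conn.execute(sql, {"today": today}).fetchone()
    conn.close()
    return count


first = [
    ("2015-06-03", "ne", "Good"),
    ("2015-06-04", "ne", "Good"),
    ("2015-06-05", "ne", "Same"),
    ("2015-06-08", "ne", "Same"),
]
second = [
    ("2015-06-03", "ne", "Same"),
    ("2015-06-04", "ne", "Same"),
    ("2015-06-05", "ne", "Good"),
    ("2015-06-08", "ne", "Good"),
]
print(count_same_streak(first, "2015-06-08"))   # 2
print(count_same_streak(second, "2015-06-08"))  # 0
```

The two sample sets are the tables from the question, and the results match the counts the question asks for (2 and zero).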

Display only the previous three months even if earlier months do not exist in the database

Below is my new SQL so far, as I did not manage to use Dale M's advice:
SELECT
    all_months.a_month_id AS month,
    year($P{date}) as year,
    count(case when clixsteraccount.rem_joindate between DATE_FORMAT($P{date}-INTERVAL 2 MONTH, '%Y-%m-01') AND $P{date} THEN clixsteraccount.rem_registerbycn end) AS
        total_activation,
    'ACTIVATION(No)' AS fake_column
FROM clixsteraccount right join all_months on all_months.a_month_id = date_format(clixsteraccount.rem_joindate,'%m') and
    (clixsteraccount.rem_registrationtype = 'Normal')and(clixsteraccount.rem_kapowstatus='pending' or clixsteraccount.rem_kapowstatus='success')
GROUP BY year,month
HAVING month BETWEEN month(date_sub($P{date},interval 2 month)) and month($P{date})
So, what I did is create a table with two fields, a_month_id (1, 2, 3, ..., 12) and a_month (name of the month). The SQL above does give me what I want, which is to display the previous 3 months even if the earlier months do not exist in the data.
Example: data starts in July, so I want to display May, June and July data like 0, 0, 100.
The problem occurs when it comes to the following months or the next year. When I try to run the SQL with a January parameter, it doesn't work like I thought. I realize the problem is with the HAVING condition. Does anyone have an idea how to improve this SQL so that it keeps working across year boundaries?
Thank you in advance.
OK, I will make a few suggestions and give you an answer that will work on SQL Server - you will need to make any translations yourself.
I note that your query will aggregate all years together, i.e. Dec 2012 + Dec 2013 + Dec 2014 etc. Based on your question I don't think that is your intention so I will keep each distinct. You can change the query if that was your intention. I have also not included your selection criteria (other than by the month).
I suggest that you utilize an index table. This is a table stored in your database (or the master database if possible) with a clustered-indexed integer column running from 0 to n, where n is a sufficiently large number; 10,000 will be more than enough for this application (there are 12 months in a year, so 10,000 represents 833 years). This table is so useful everyone should have one.
SELECT DATEADD(month, it.id, 0) AS month
    ,ISNULL(COUNT(ca.rem_registerbycn), 0) AS registration
    ,'REGISTRATION(No)' AS fake_column
FROM cn
INNER JOIN clixsteraccount ca
    ON ca.rem_registerbycn = cn.cn_id
RIGHT JOIN IndexTable it
    ON it.id = DATEDIFF(month, 0, ca.rem_joindate)
WHERE it.id BETWEEN DATEDIFF(month, 0, @StartDate) - 3 AND DATEDIFF(month, 0, GETDATE())
GROUP BY it.id
The way it works is by converting the clixsteraccount.rem_joindate to an integer that represents the number of months since date 0 (01-01-1900 in SQL Server). This is then matched to the id column of the IndexTable and limited by the dates you select. Because every number exists in the index table and we are using an outer join it doesn't matter if there are months missing from your data.
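The month-number idea is easy to try outside SQL Server. The sketch below uses Python with in-memory SQLite and numbers months as year * 12 + (month - 1) instead of DATEDIFF from date 0; the `index_table` and `account` tables, the sample join dates, and both helper functions are assumptions made for illustration, and the original selection criteria are left out.

```python
import sqlite3


def month_index(iso_date):
    """Month number counted from year 0, e.g. '2014-07-31' -> 2014 * 12 + 6."""
    return int(iso_date[:4]) * 12 + int(iso_date[5:7]) - 1


def last_three_months(conn, as_of):
    """Return (YYYY-MM, count) for the month of `as_of` and the two months before it."""
    hi = month_index(as_of)
    rows = conn.execute(
        """
        SELECT it.id, COUNT(a.rem_joindate)
        FROM index_table it
        LEFT JOIN account a
               ON CAST(strftime('%Y', a.rem_joindate) AS INTEGER) * 12
                + CAST(strftime('%m', a.rem_joindate) AS INTEGER) - 1 = it.id
        WHERE it.id BETWEEN ? AND ?
        GROUP BY it.id
        ORDER BY it.id
        """,
        (hi - 2, hi),
    ).fetchall()
    # Turn the month number back into a readable label.
    return [(f"{i // 12:04d}-{i % 12 + 1:02d}", n) for i, n in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE index_table (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO index_table VALUES (?)", [(i,) for i in range(30000)])
conn.execute("CREATE TABLE account (rem_joindate TEXT)")
conn.executemany(
    "INSERT INTO account VALUES (?)",
    [("2014-07-03",), ("2014-07-20",), ("2014-12-05",)],
)

print(last_three_months(conn, "2014-07-31"))
print(last_three_months(conn, "2015-01-15"))
```

Because the months are numbered continuously, the window for a January date simply runs from November of the previous year, which is the case the HAVING clause on month numbers 1-12 could not handle.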

Confused on count(*) and self joins

I want to return all application dates for the current month and for the current year. This must be simple, but I cannot figure it out. I know I have 2 dates for the current month and 90 dates for the current year. Right, left, outer, inner: I have tried them all, just throwing code at the wall to see what sticks, and none of it works. I either get 2 for both columns or 180 for both columns. Here is my latest select statement.
SELECT count(a.evdtApplication) AS monthApplicationEntered,
       count(b.evdtApplication) AS yearApplicationEntered
FROM tblEventDates a
RIGHT OUTER JOIN tblEventDates b ON a.LOANid = b.loanid
WHERE datediff(mm,a.evdtApplication,getdate()) = 0
  AND datediff(yy,a.evdtApplication, getdate()) = 0
  AND datediff(yy,b.evdtApplication,getdate()) = 0
You don't need any joins at all.
You want to count the loanID column from tblEventDates, and you want to do it conditionally based on the date matching the current month or the current year.
So:
SELECT SUM(CASE WHEN MONTH(a.evdtApplication) = MONTH(GETDATE()) THEN 1 ELSE 0 END) AS monthTotal,
       COUNT(*) AS yearTotal
FROM tblEventDates a
WHERE a.evdtApplication BETWEEN '2008-01-01' AND '2008-12-31'
What that does is select all the event dates this year, and add up the ones which match your conditions. If a row doesn't match the current month, it won't add 1. You don't even need a condition for the year, because you're already querying only rows for that year.
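Here is a small runnable sketch of the same conditional aggregation, using Python with in-memory SQLite instead of SQL Server. The `application_counts` helper, the explicit `today` parameter (used so the result does not depend on the clock), and the sample dates are assumptions; the year filter is derived from `today` rather than typed in as a literal range.

```python
import sqlite3


def application_counts(conn, today):
    """Return (applications this month, applications this year) in one pass, no joins."""
    sql = """
        SELECT COALESCE(SUM(CASE WHEN strftime('%m', evdtApplication) = strftime('%m', :today)
                                 THEN 1 ELSE 0 END), 0) AS monthTotal,
               COUNT(*) AS yearTotal
        FROM tblEventDates
        WHERE strftime('%Y', evdtApplication) = strftime('%Y', :today)
    """
    return conn.execute(sql, {"today": today}).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblEventDates (loanid INTEGER, evdtApplication TEXT)")
conn.executemany(
    "INSERT INTO tblEventDates VALUES (?, ?)",
    [
        (1, "2008-01-10"),
        (2, "2008-03-05"),
        (3, "2008-06-02"),
        (4, "2008-06-20"),
        (5, "2007-06-15"),  # same month, previous year: excluded by the WHERE clause
    ],
)

counts = application_counts(conn, "2008-06-25")
print(counts)  # (2, 4)
```

Each row is read once: the WHERE clause limits the rows to the year, COUNT(*) counts all of them, and the CASE expression adds 1 only for rows in the current month.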