I have a table PolicyStatusLog as shown below:
IdPolicyStatusLog  IdPolicy  IdStatusChangedFrom  IdStatusChangedTo  DateChanged
28834              24142     3                    10                 2020-11-19
28847              24142     10                   1                  2020-11-20
If the last IdStatusChangedTo of a Policy is 1, then the Policy is still active.
Let's say I want to get all active policies (i.e. IdPolicy values) for the month of January. This will include policies where the status was last changed to active (i.e. 1) before or during January.
I hope I explained the problem clearly, but I can always give more details as required.
How do I write a query for this in SQL? Thanks.
Assuming you have a policy table...
Get the latest PolicyStatusLog record (prior to Feb 1st) for each policy, and keep only the policies whose latest status is 1.
SELECT
    *
FROM
    policy AS p
CROSS APPLY
(
    SELECT TOP 1 *
    FROM PolicyStatusLog
    WHERE IdPolicy = p.ID  -- refer to the outer table by its alias p
      AND DateChanged < '2022-02-01'
    ORDER BY DateChanged DESC
)
    AS s
WHERE
    s.IdStatusChangedTo = 1
Assumes all policies have at least one row in the log. For example, if there's a new policy, is there an initial row in the log with status 1?
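To sanity-check the "latest log row before the cutoff" logic, here is a minimal sketch in Python using the standard-library sqlite3 module. SQLite has no CROSS APPLY, so this swaps in ROW_NUMBER() for the same idea (window functions need SQLite 3.25 or later). The function name active_policies and the second policy (24143) are illustrative assumptions, not part of the question.

```python
import sqlite3

def active_policies(rows, cutoff):
    """Return IdPolicy values whose latest status change before `cutoff` is to status 1."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE PolicyStatusLog ("
        " IdPolicyStatusLog INTEGER, IdPolicy INTEGER,"
        " IdStatusChangedFrom INTEGER, IdStatusChangedTo INTEGER,"
        " DateChanged TEXT)"
    )
    con.executemany("INSERT INTO PolicyStatusLog VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT IdPolicy
        FROM (
            SELECT IdPolicy, IdStatusChangedTo,
                   ROW_NUMBER() OVER (
                       PARTITION BY IdPolicy
                       ORDER BY DateChanged DESC, IdPolicyStatusLog DESC
                   ) AS rn                      -- rn = 1 is the latest row per policy
            FROM PolicyStatusLog
            WHERE DateChanged < ?               -- ISO dates compare correctly as text
        ) AS ranked
        WHERE rn = 1 AND IdStatusChangedTo = 1
        ORDER BY IdPolicy
    """
    result = [r[0] for r in con.execute(query, (cutoff,))]
    con.close()
    return result

rows = [
    (28834, 24142, 3, 10, "2020-11-19"),  # from the question
    (28847, 24142, 10, 1, "2020-11-20"),  # from the question
    (28900, 24143, 1, 10, "2020-12-05"),  # hypothetical: a policy that left status 1
]
print(active_policies(rows, "2022-02-01"))
```

Like the query above, this only sees policies that have at least one log row before the cutoff.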
Related
(I've created a similar question before, but I messed it up beyond repair. Hopefully, I can express myself better this time.)
I have a table containing records that change through time, each row representing a modification in Stage and Amount. I need to group these records by Day and Stage, summing up the Amount.
The tricky part is: ids might not change on some days. Since there won't be any record for those days, I need to carry over the latest record observed.
Find below the records table and the expected result. MRE on dbfiddle (PostgreSQL)
Records
Expected Result
I created this basic visualization to demonstrate how the Amounts and Stages change throughout the days. Each number/color change represents a modification.
The logic behind the expected result can be found below.
Total Amount by Stage on Day 2
Id A was modified on Day 2, let's take that Amount: Negotiation 60.
Id B wasn't modified on Day 2, so we carry over the most recent modification (Day 1): Open 10.
Open 10
Negotiation 60
Closed 0
Total Amount by Stage on Day 3
Id A wasn't modified on Day 3, so we carry over the most recent modification (Day 2): Negotiation 60.
Id B was modified on Day 3: Negotiation 30
Total Amount by Stage on Day 3
Open 0
Negotiation 90
Closed 0
Basically, you seem to want the most recent value for each id as of each day, counted only under that id's most recent stage.
You can get this using a formulation like this:
select d.DateDay, s.stage, coalesce(sh.amount, 0)
from (select distinct sh.DateDay from stage_history sh) d cross join
     (select distinct sh.stage from stage_history sh) s left join lateral
     (select sum(sh.amount) as amount
      from (select distinct on (sh.id) sh.*
            from stage_history sh
            where sh.DateDay <= d.DateDay
            order by sh.id, sh.DateDay desc
           ) sh
      where sh.stage = s.stage
     ) sh
     on 1=1
order by d.DateDay, s.stage;
Here is a db<>fiddle.
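The carry-over logic can also be sketched outside the database. The following is a minimal Python illustration, not the query above: it walks the days in order, remembers the latest (stage, amount) per id, and sums by stage. The function name, the tuple layout, and Id A's Day 1 record are assumptions made for the example (the original records table is not reproduced here); the Day 2 and Day 3 rows follow the walkthrough in the question. It assumes at most one record per id per day.

```python
from collections import defaultdict

def totals_by_day_and_stage(records, stages):
    """records: list of (day, id, stage, amount). Returns {day: {stage: total}}."""
    by_day = defaultdict(list)
    for day, rec_id, stage, amount in records:
        by_day[day].append((rec_id, stage, amount))

    latest = {}   # id -> (stage, amount) of the most recent record seen so far
    result = {}
    for day in sorted(by_day):
        # Apply this day's modifications; untouched ids keep their previous value.
        for rec_id, stage, amount in by_day[day]:
            latest[rec_id] = (stage, amount)
        totals = {stage: 0 for stage in stages}
        for stage, amount in latest.values():
            totals[stage] += amount
        result[day] = totals
    return result

records = [
    (1, "A", "Open", 40),         # hypothetical Day 1 value for Id A
    (1, "B", "Open", 10),
    (2, "A", "Negotiation", 60),
    (3, "B", "Negotiation", 30),
]
result = totals_by_day_and_stage(records, ["Open", "Negotiation", "Closed"])
print(result[2])
print(result[3])
```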
I have a "daily changes" table that records when a customer "upgrades" or "downgrades" their membership level. In the table, let's say field 1 is the customer ID, field 2 is the membership type, and field 3 is the date of the change. Customers 123 and ABC each have two rows in the table: the values in field 1 (ID) are the same, but the values in field 2 (TYPE) and field 3 (DATE) are different. I'd like to write a SQL query that tells me how many customers "upgraded" from membership type 1 to membership type 2, and how many "downgraded" from membership type 2 to membership type 1, in any given time frame.
The table also shows other types of changes. To identify the records with changes in the membership type field, I've created the following code:
SELECT *
FROM member_detail_daily_changes_new
WHERE customer IN (
SELECT customer
FROM member_detail_daily_changes_new
GROUP BY customer
HAVING COUNT(distinct member_type_cd) > 1)
I'd like to see an end report which tells me:
For Fiscal 2018,
X,XXX customers moved from Member Type 1 to Member Type 2 and
X,XXX customers moved from Member Type 2 to Member type 1
Sounds like a good time to use the LEAD() analytic function to look ahead to a given customer's next member_Type, compare it to the current record, evaluate whether that's an upgrade or a downgrade, and then sum the results.
DEMO
WITH CTE AS (SELECT case when lead(Member_Type_Code) over (partition by Customer order by date asc) > member_Type_Code then 1 else 0 end as Upgrade
                  , case when lead(Member_Type_Code) over (partition by Customer order by date asc) < member_Type_Code then 1 else 0 end as DownGrade
             FROM member_detail_daily_changes_new
             WHERE Date between '20190101' and '20190201')
SELECT sum(Upgrade) upgrades, sum(downgrade) downgrades
FROM CTE
Using my sample data, this gives us:
+----+----------+------------+
| | upgrades | downgrades |
+----+----------+------------+
| 1 | 3 | 2 |
+----+----------+------------+
I'm not sure if SQL express on rex tester just doesn't support the sum() on the analytic itself which is why I had to add the CTE or if that's a rule in non-SQL express versions too.
Some other notes:
I let the system implicitly cast the dates in the where clause
I assume the member_Type_Code itself tells me if it's an upgrade or downgrade which long term probably isn't right. Say we add membership type 3 and it goes between 1 and 2... now what... So maybe we need a decimal number outside of the Member_Type_Code so we can handle future memberships and if it's an upgrade/downgrade or a lateral...
I assumed all upgrades/downgrades are counted and a user can be counted multiple times if membership changed that often in time period desired.
I assume an upgrade/downgrade can't occur on the same date/time. Otherwise the sorting for lead may not work right. (but if it's a timestamp field we shouldn't have an issue)
So how does this work?
We use a Common table expression (CTE) to generate the desired evaluations of downgrade/upgrade per customer. This could be done in a derived table as well in-line but I find CTE's easier to read; and then we sum it up.
Lead(Member_Type_Code) over (partition by customer order by date asc) does the following
It organizes the data by customer and then sorts it by date in ascending order.
So we end up with all of a customer's records in consecutive rows, ordered by date. Lead(field) then starts on record 1, looks ahead to record 2 for the same customer, and returns the Member_Type_Code of record 2 on record 1. We can then compare those type codes to determine whether an upgrade or downgrade occurred, and sum the results of the comparison to get the desired totals.
And now we have a long winded explanation for a very small query :P
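To make the look-ahead concrete, here is a small sketch using Python's standard-library sqlite3 module (window functions need SQLite 3.25 or later). The table name changes, its column names, and the four rows are made up for illustration; they are not the sample data behind the result table above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE changes (customer TEXT, member_type INTEGER, change_date TEXT)")
con.executemany(
    "INSERT INTO changes VALUES (?, ?, ?)",
    [
        ("123", 1, "2019-01-05"),
        ("123", 2, "2019-01-12"),  # upgrade 1 -> 2
        ("ABC", 2, "2019-01-03"),
        ("ABC", 1, "2019-01-20"),  # downgrade 2 -> 1
    ],
)

# Each row gets the member_type of the same customer's next row (NULL on the last one).
rows = con.execute("""
    SELECT customer, member_type,
           LEAD(member_type) OVER (PARTITION BY customer ORDER BY change_date) AS next_type
    FROM changes
    ORDER BY customer, change_date
""").fetchall()

for row in rows:
    print(row)

upgrades = sum(1 for _, cur, nxt in rows if nxt is not None and nxt > cur)
downgrades = sum(1 for _, cur, nxt in rows if nxt is not None and nxt < cur)
print(upgrades, downgrades)
```

The last row of each customer has no following row, so LEAD() returns NULL there and the row counts as neither an upgrade nor a downgrade.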
You want to use lag() for this, but you need to be careful about the date filtering. So, I think you want:
SELECT prev_membership_type, membership_type,
       COUNT(*) as num_changes,
       COUNT(DISTINCT customer_id) as num_members
FROM (SELECT mddc.*,
             LAG(mddc.membership_type) OVER (PARTITION BY mddc.customer_id ORDER BY mddc.date) as prev_membership_type
      FROM member_detail_daily_changes_new mddc
     ) mddc
WHERE prev_membership_type <> membership_type AND
      date >= '2018-01-01' AND
      date < '2019-01-01'
GROUP BY membership_type, prev_membership_type;
Notes:
The filtering on date needs to occur after the calculation of lag().
This takes into account that members may have a certain type in 2017 and then change to a new type in 2018.
The date filtering is compatible with indexes.
Two values are calculated. One is the overall number of changes. The other counts each member only once for each type of change.
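The first note is easy to verify with a tiny experiment. This is a sketch using Python's standard-library sqlite3 module (SQLite 3.25 or later for window functions); the table name changes, the column change_date, and the two rows are hypothetical. A customer has type 1 in 2017 and switches to type 2 in 2018: filtering on the date after LAG() finds the change, while filtering before it hides the 2017 row and loses the change.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE changes (customer_id TEXT, membership_type INTEGER, change_date TEXT)"
)
con.executemany(
    "INSERT INTO changes VALUES (?, ?, ?)",
    [("123", 1, "2017-11-15"), ("123", 2, "2018-03-01")],
)

LAG_EXPR = "LAG(membership_type) OVER (PARTITION BY customer_id ORDER BY change_date)"

# Date filter applied AFTER the window function: the 2017 row is visible to LAG().
filter_after = con.execute(f"""
    SELECT COUNT(*)
    FROM (SELECT membership_type, change_date, {LAG_EXPR} AS prev_type
          FROM changes) AS t
    WHERE prev_type <> membership_type
      AND change_date >= '2018-01-01' AND change_date < '2019-01-01'
""").fetchone()[0]

# Date filter applied BEFORE the window function: LAG() sees only 2018 rows,
# so prev_type is NULL and the comparison never matches.
filter_before = con.execute(f"""
    SELECT COUNT(*)
    FROM (SELECT membership_type, {LAG_EXPR} AS prev_type
          FROM changes
          WHERE change_date >= '2018-01-01' AND change_date < '2019-01-01') AS t
    WHERE prev_type <> membership_type
""").fetchone()[0]

print(filter_after, filter_before)
```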
With conditional aggregation after self joining the table:
select
2018 fiscal,
sum(case when m.member_type_cd > t.member_type_cd then 1 else 0 end) upgrades,
sum(case when m.member_type_cd < t.member_type_cd then 1 else 0 end) downgrades
from member_detail_daily_changes_new m inner join member_detail_daily_changes_new t
on
t.customer = m.customer
and
t.changedate = (
select max(changedate) from member_detail_daily_changes_new
where customer = m.customer and changedate < m.changedate
)
where year(m.changedate) = 2018
This will work even if there are more than 2 types of membership level.
I have the following data.
Clientid Accountype Dateapplied
1 Current 01/01/2018
1 Savings 03/01/2018
1 Current 17/01/2018
2 Current 01/04/2018
2 Current 15/04/2018
3 Savings 13/04/2018
3 Savings 15/04/2018
3 Current 14/04/2018
How do I select the latest dated entry per client where Accountype = Current? Basically I want to be able to flag the latest entry per client, so once I can work out the select I would set a new field to True.
So results I want to bring back are:
Clientid Accountype Dateapplied
1 Current 17/01/2018
2 Current 15/04/2018
3 Current 14/04/2018
I've also tried grouping by ClientID and then selecting using max, but whatever I try I can't pick out the latest one per Clientid. It should be simple, but I'm racking my brains over it.
I tried things like this, but it's not working. I'd appreciate any help.
select Dateapplied,Clientid, Accountype
from Clienttable t1
WHERE EXISTS(SELECT 1
FROM Clienttable t2
WHERE Accountype = 'Current'
and t2.Clientid = t1.X_Clientid
GROUP BY t2.Clientid,
t2.Dateapplied
HAVING t1.Dateapplied= MAX(t2.Dateapplied))
You don't need a subquery. Just get the max(dateapplied) per client:
SELECT ClientId, AccountType, max(dateapplied) dateapplied
from clienttable
where accountType = 'Current'
Group by ClientId, AccountType
I'm using postgres to run some analytics on user activity. I have a table of all requests(pageviews) made by every user and the timestamp of the request, and I'm trying to find the number of distinct sessions for every user. For the sake of simplicity, I'm considering every set of requests an hour or more apart from others as a distinct session. The data looks something like this:
id| request_time| user_id
1 2014-01-12 08:57:16.725533 1233
2 2014-01-12 08:57:20.944193 1234
3 2014-01-12 09:15:59.713456 1233
4 2014-01-12 10:58:59.713456 1234
How can I write a query to get the number of sessions per user?
To start a new session after every gap >= 1 hour:
SELECT user_id, count(*) AS distinct_sessions
FROM (
SELECT user_id
,(lag(request_time, 1, '-infinity') OVER (PARTITION BY user_id
ORDER BY request_time)
<= request_time - '1h'::interval) AS step -- start new session
FROM tbl
) sub
WHERE step
GROUP BY user_id
ORDER BY user_id;
Assuming request_time NOT NULL.
Explain:
In subquery sub, check for every row if a new session begins. Using the third parameter of lag() to provide the default -infinity, which is lower than any timestamp and therefore always starts a new session for the first row.
In the outer query count how many times new sessions started. Eliminate step = FALSE and count per user.
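The same gap rule can be traced by hand on the question's four rows. Below is a minimal Python sketch of the logic (not of the SQL itself); the function name count_sessions and the tuple layout are assumptions for illustration. A request opens a new session when it is the user's first, or when it comes at least one hour after the user's previous request.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def count_sessions(requests, gap=timedelta(hours=1)):
    """requests: iterable of (user_id, request_time). Returns {user_id: session count}."""
    times_by_user = defaultdict(list)
    for user_id, ts in requests:
        times_by_user[user_id].append(ts)

    sessions = {}
    for user_id, times in times_by_user.items():
        times.sort()
        count = 0
        previous = None
        for ts in times:
            # First request (no previous one) or a gap of >= 1 hour starts a session.
            if previous is None or ts - previous >= gap:
                count += 1
            previous = ts
        sessions[user_id] = count
    return sessions

requests = [
    (1233, datetime.fromisoformat("2014-01-12 08:57:16.725533")),
    (1234, datetime.fromisoformat("2014-01-12 08:57:20.944193")),
    (1233, datetime.fromisoformat("2014-01-12 09:15:59.713456")),
    (1234, datetime.fromisoformat("2014-01-12 10:58:59.713456")),
]
sessions = count_sessions(requests)
print(sessions)
```

User 1233's two requests are about 18 minutes apart (one session); user 1234's are about two hours apart (two sessions).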
Alternative interpretation
If you really wanted to count hours where at least one request happened (I don't think you do, but another answer assumes as much), you would:
SELECT user_id
, count(DISTINCT date_trunc('hour', request_time)) AS hours_with_req
FROM tbl
GROUP BY 1
ORDER BY 1;
I'm working with PostgreSQL and trying to build a reporting query for my logs, but so far without success.
Basically I have a LOG table which logs status changes of another entity. For the sake of simplicity, let's say it has columns STATUS and STATUS_CHANGE_DATE. Each status change updates this logging table with the new status and the time it was changed. What I need, for each status, is the total time spent in it and the number of times it occurred (the same status can be used multiple times, e.g. going from status 1 to 2 and then back to 1). I would like to build a view for it and use it in my Java application's reporting by mapping that view directly to a Hibernate entity. Unfortunately I'm not that experienced with SQL, so maybe someone can give me some hints on what the best solution would be; I tried a few things but basically don't know how to do it.
Let's say we have:
STATUS STATUS_CHANGE_DATE
1 2013 01 01
2 2013 01 03
1 2013 01 06
3 2013 01 07
My wanted result would be a table that contains status 1 with 2 occurrences and a total duration of 3 days, and status 2 with 1 occurrence and a duration of 3 days too (assuming status 3 is the end (or close) status and its duration is not required).
Any ideas?
If your status changes on every row, you can do this:
with cte as (
select
status,
lead(status_change_date) over(order by status_change_date) as next_date,
status_change_date
from Table1
)
select
status, count(*) as cnt,
sum(next_date - status_change_date) as duration
from cte
where next_date is not null
group by status
sql fiddle demo
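As a cross-check against the expected result in the question, the pairing of each row with the next row's date can be sketched in plain Python. This is an illustration of the lead() idea, not the query itself; the function name status_durations and the tuple layout are assumptions.

```python
from datetime import date, timedelta

def status_durations(log):
    """log: list of (status, change_date). Returns {status: (count, total_duration)}."""
    ordered = sorted(log, key=lambda row: row[1])
    summary = {}
    # Pair every row with its successor; the last row has no successor and is
    # skipped, mirroring "where next_date is not null".
    for (status, start), (_, next_start) in zip(ordered, ordered[1:]):
        count, total = summary.get(status, (0, timedelta(0)))
        summary[status] = (count + 1, total + (next_start - start))
    return summary

log = [
    (1, date(2013, 1, 1)),
    (2, date(2013, 1, 3)),
    (1, date(2013, 1, 6)),
    (3, date(2013, 1, 7)),
]
summary = status_durations(log)
for status, (count, total) in sorted(summary.items()):
    print(status, count, total.days)
```

Status 1 covers 1 Jan to 3 Jan and 6 Jan to 7 Jan (2 occurrences, 3 days); status 2 covers 3 Jan to 6 Jan (1 occurrence, 3 days).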
Try this:
SELECT "STATUS",
       "STATUS_CHANGE_DATE" - lag("STATUS_CHANGE_DATE") OVER (ORDER BY "STATUS_CHANGE_DATE") AS "DURATION"
FROM "LOG"  -- assumed name of the asker's log table; a bare "table" is a reserved word
ORDER BY "STATUS";
This works for me in a similar case, where I needed to calculate the average time between sessions in a log table. I hope this works for you.