Get post status for each day from status change history - sql

There is a table post_status_changes, which holds the history of post status changes:
post_id | created_at | status
---------+---------------------+---------
3 | 2016-09-02 04:00:00 | 1
3 | 2016-09-04 19:59:21 | 2
6 | 2016-09-03 15:00:00 | 5
6 | 2016-09-03 19:52:46 | 1
6 | 2016-09-04 20:53:22 | 2
What I want is a list, for each day from DayA to DayB, of each post's status at the end of that day.
DayA = 2016-09-01
DayB = 2016-09-05
post_id | date | status
-----------+-------------+---------
3 | 2016-09-01 | null
3 | 2016-09-02 | 1
3 | 2016-09-03 | 1
3 | 2016-09-04 | 2
3 | 2016-09-05 | 2
6 | 2016-09-01 | null
6 | 2016-09-02 | null
6 | 2016-09-03 | 1
6 | 2016-09-04 | 2
6 | 2016-09-05 | 2
Any solutions?

A solution was found here: "PHP: Return all dates between two dates in an array".
$period = new DatePeriod(
    new DateTime('2010-10-01'),
    new DateInterval('P1D'),
    new DateTime('2010-10-05')
);
// Note: DatePeriod excludes the end date by default
foreach ($period as $each) {
    // ... run the QUERY here, filtering on the calendar day of created_at = $each
}

with a as (
    select convert(varchar(10), created_at, 102) as [date],
           [status],
           post_id,
           -- rank the changes within each post and day, latest first
           rank() over (partition by post_id, convert(varchar(10), created_at, 102)
                        order by created_at desc) as r
    from post_status_changes
)
-- only days that have at least one status change are returned
select post_id, [date], [status]
from a
where r = 1  -- the last change of the day
  and @DayA <= [date] and @DayB >= [date]
order by post_id, [date];

For each post_id you want as many rows as there are days between the start and end date. This can be done by cross joining the list of dates with the post_ids and then join that result back to the table to get the status for each day:
select x.post_id, t.created, p.status
from generate_series(date '2016-09-01', date '2016-09-05', interval '1' day) as t(created)
    cross join (
        select distinct post_id
        from post_status_changes
    ) x
    left join post_status_changes p
        on p.created_at::date = t.created
        and p.post_id = x.post_id  -- match each status row to its own post
order by 1, 2;
Running example: http://rextester.com/CSX38222
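To make the expected output concrete, here is a minimal Python sketch (not any answerer's method) that carries the last known status forward to the end of each day. The function name end_of_day_statuses and the CHANGES list are made up for illustration; the rows are the sample data from the question.

```python
from bisect import bisect_left
from datetime import date, datetime, timedelta

# Sample rows from the question: (post_id, created_at, status)
CHANGES = [
    (3, datetime(2016, 9, 2, 4, 0, 0), 1),
    (3, datetime(2016, 9, 4, 19, 59, 21), 2),
    (6, datetime(2016, 9, 3, 15, 0, 0), 5),
    (6, datetime(2016, 9, 3, 19, 52, 46), 1),
    (6, datetime(2016, 9, 4, 20, 53, 22), 2),
]


def end_of_day_statuses(changes, day_a, day_b):
    """Return (post_id, day, status) for every post and every day in [day_a, day_b]."""
    history = {}
    for post_id, created_at, status in sorted(changes):
        history.setdefault(post_id, []).append((created_at, status))

    rows = []
    for post_id in sorted(history):
        times = [t for t, _ in history[post_id]]
        day = day_a
        while day <= day_b:
            # Every change strictly before the next midnight belongs to this day or earlier
            cutoff = datetime.combine(day + timedelta(days=1), datetime.min.time())
            idx = bisect_left(times, cutoff)
            # No change yet means the status is unknown (null in the question)
            status = history[post_id][idx - 1][1] if idx else None
            rows.append((post_id, day, status))
            day += timedelta(days=1)
    return rows


for row in end_of_day_statuses(CHANGES, date(2016, 9, 1), date(2016, 9, 5)):
    print(row)
```

The same idea in SQL would need the last change at or before each generated day, rather than only the changes made on that day.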

Related

Get users who took ride for 3 or more consecutive dates

I have below table, it shows user_id and ride_date.
+---------+------------+
| user_id | ride_date |
+---------+------------+
| 1 | 2019-11-01 |
| 1 | 2019-11-03 |
| 1 | 2019-11-05 |
| 2 | 2019-11-03 |
| 2 | 2019-11-04 |
| 2 | 2019-11-05 |
| 2 | 2019-11-06 |
| 3 | 2019-11-03 |
| 3 | 2019-11-04 |
| 3 | 2019-11-05 |
| 3 | 2019-11-06 |
| 4 | 2019-11-05 |
| 4 | 2019-11-07 |
| 4 | 2019-11-08 |
| 4 | 2019-11-09 |
| 5 | 2019-11-11 |
| 5 | 2019-11-13 |
+---------+------------+
I want the user_id of users who took rides on 3 or more consecutive days, along with the days on which they took those consecutive rides.
The desired result is as below:
+---------+-----------------------+
| user_id | consecutive_ride_date |
+---------+-----------------------+
| 2 | 2019-11-03 |
| 2 | 2019-11-04 |
| 2 | 2019-11-05 |
| 2 | 2019-11-06 |
| 3 | 2019-11-03 |
| 3 | 2019-11-04 |
| 3 | 2019-11-05 |
| 3 | 2019-11-06 |
| 4 | 2019-11-07 |
| 4 | 2019-11-08 |
| 4 | 2019-11-09 |
+---------+-----------------------+
With LAG() and LEAD() window functions:
with cte as (
    select *,
        datediff(
            day,
            lag([ride_date]) over (partition by [user_id] order by [ride_date]),
            [ride_date]
        ) prev1,
        datediff(
            day,
            lag([ride_date], 2) over (partition by [user_id] order by [ride_date]),
            [ride_date]
        ) prev2,
        datediff(
            day,
            [ride_date],
            lead([ride_date]) over (partition by [user_id] order by [ride_date])
        ) next1,
        datediff(
            day,
            [ride_date],
            lead([ride_date], 2) over (partition by [user_id] order by [ride_date])
        ) next2
    from Table1
)
select [user_id], [ride_date]
from cte
where
    (prev1 = 1 and prev2 = 2) or
    (prev1 = 1 and next1 = 1) or
    (next1 = 1 and next2 = 2)
See the demo.
Results:
> user_id | ride_date
> ------: | :---------
> 2 | 03/11/2019
> 2 | 04/11/2019
> 2 | 05/11/2019
> 2 | 06/11/2019
> 3 | 03/11/2019
> 3 | 04/11/2019
> 3 | 05/11/2019
> 3 | 06/11/2019
> 4 | 07/11/2019
> 4 | 08/11/2019
> 4 | 09/11/2019
Here is one way to address this gaps-and-islands problem:
First, assign a rank to each user ride with row_number(), and recover the previous ride_date (aliased lag_ride_date).
Then, compare the date of the previous ride to the current one in a conditional running sum that increases when the dates are consecutive; subtracting this sum from the rank of the ride gives groups (aliased grp) of consecutive rides spaced 1 day apart.
Next, do a window count of how many records belong to each group (aliased cnt).
Finally, filter on records whose window count is at least 3.
Query:
select user_id, ride_date
from (
    select
        t.*,
        count(*) over(partition by user_id, grp) cnt
    from (
        select
            t.*,
            rn1
                - sum(case when ride_date = dateadd(day, 1, lag_ride_date) then 1 else 0 end)
                    over(partition by user_id order by ride_date) grp
        from (
            select
                t.*,
                row_number() over(partition by user_id order by ride_date) rn1,
                lag(ride_date) over(partition by user_id order by ride_date) lag_ride_date
            from Table1 t
        ) t
    ) t
) t
where cnt >= 3
Demo on DB Fiddle
This is a typical gaps-and-islands problem.
We can solve it as follows:
with data as (
    select user_id
          ,ride_date
          ,dateadd(day
                  ,-row_number() over(partition by user_id order by ride_date asc)
                  ,ride_date) as grp_field
    from Table1
)
,consecutive_days as (
    select user_id
          ,ride_date
          ,count(*) over(partition by user_id, grp_field) as cnt
    from data
)
select *
from consecutive_days
where cnt >= 3
order by user_id, ride_date
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=7bb851d9a12966b54afb4d8b144f3d46
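The grouping trick in the query above (ride date minus row number stays constant inside a run of consecutive days) can be sketched outside SQL as well. This is only an illustration in Python; consecutive_rides, RIDE_DAYS and RIDES are made-up names, and the data is the sample table from the question.

```python
from collections import defaultdict
from datetime import date, timedelta

# Day-of-month of each ride in November 2019, keyed by user_id (question's sample data)
RIDE_DAYS = {1: [1, 3, 5], 2: [3, 4, 5, 6], 3: [3, 4, 5, 6], 4: [5, 7, 8, 9], 5: [11, 13]}
RIDES = [(u, date(2019, 11, d)) for u, days in RIDE_DAYS.items() for d in days]


def consecutive_rides(rides, min_run=3):
    """Return (user_id, ride_date) pairs that belong to a run of at least min_run days."""
    by_user = defaultdict(list)
    for user_id, ride_date in rides:
        by_user[user_id].append(ride_date)

    result = []
    for user_id, dates in by_user.items():
        islands = defaultdict(list)
        # Equivalent of dateadd(day, -row_number(), ride_date): the key is
        # identical for all dates of one unbroken run
        for rn, ride_date in enumerate(sorted(set(dates)), start=1):
            islands[ride_date - timedelta(days=rn)].append(ride_date)
        for run in islands.values():
            if len(run) >= min_run:
                result.extend((user_id, d) for d in run)
    return sorted(result)


for user_id, ride_date in consecutive_rides(RIDES):
    print(user_id, ride_date)
```

Duplicate dates per user are collapsed with set() here; the SQL version assumes each user has at most one row per date.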
There is no need to apply gaps-and-islands methodologies to this problem. The problem is much simpler to solve.
You can return the users and first date just by using LEAD():
SELECT t1.*
FROM (SELECT t1.*,
             LEAD(ride_date, 2) OVER (PARTITION BY user_id ORDER BY ride_date) as ride_date_2
      FROM table1 t1
     ) t1
WHERE ride_date_2 = DATEADD(day, 2, ride_date);
If you want the actual dates, you can unpivot the results:
SELECT DISTINCT t1.user_id, v.ride_date
FROM (SELECT t1.*,
             LEAD(ride_date, 2) OVER (PARTITION BY user_id ORDER BY ride_date) as ride_date_2
      FROM table1 t1
     ) t1 CROSS APPLY
     (VALUES (t1.ride_date),
             (DATEADD(day, 1, t1.ride_date)),
             (DATEADD(day, 2, t1.ride_date))
     ) v(ride_date)
WHERE t1.ride_date_2 = DATEADD(day, 2, t1.ride_date)
ORDER BY t1.user_id, v.ride_date;
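As a rough cross-check of the LEAD() idea, the sketch below does the same thing in Python: if the ride two positions ahead is exactly two days later, all three days form a run. It assumes at most one ride per user per day, as the SQL does; runs_via_lead and the data names are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

# Day-of-month of each ride in November 2019, keyed by user_id (question's sample data)
RIDE_DAYS = {1: [1, 3, 5], 2: [3, 4, 5, 6], 3: [3, 4, 5, 6], 4: [5, 7, 8, 9], 5: [11, 13]}
RIDES = [(u, date(2019, 11, d)) for u, days in RIDE_DAYS.items() for d in days]


def runs_via_lead(rides):
    """Return the distinct (user_id, ride_date) pairs covered by any 3-day run."""
    by_user = defaultdict(list)
    for user_id, ride_date in rides:
        by_user[user_id].append(ride_date)

    found = set()
    for user_id, dates in by_user.items():
        dates = sorted(set(dates))
        for i in range(len(dates) - 2):
            # dates[i + 2] plays the role of LEAD(ride_date, 2)
            if dates[i + 2] == dates[i] + timedelta(days=2):
                # "Unpivot": emit the start date and the two following days
                for offset in range(3):
                    found.add((user_id, dates[i] + timedelta(days=offset)))
    return sorted(found)


for user_id, ride_date in runs_via_lead(RIDES):
    print(user_id, ride_date)
```

Using a set mirrors the SELECT DISTINCT, since overlapping 3-day windows (for example days 3 to 5 and 4 to 6) emit the same date more than once.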

Join two tables based on date from first

I have two tables like below (date format: yyyy-MM-dd):
1) Table1 - SURGERY
P_ID | SURGERY_DATE
------------------------------------------------
1 | 2012-04-01
2 | 2012-08-14
1 | 2012-07-22
4 | 2012-10-30
3 | 2012-06-07
2) Table2 - VISIT
P_ID | VISIT_DATE
-----------------------------------------
1 | 2012-03-28
1 | 2012-04-14
1 | 2012-05-17
1 | 2012-09-12
3 | 2012-07-15
4 | 2012-10-10
3 | 2012-06-01
The tables SURGERY and VISIT are joined from other tables. I would like to find all records that meet the following criteria: VISIT_DATE >= SURGERY_DATE
3) Result table
P_ID | SURGERY_DATE | NUMBER OF VISITS
-------------------------------------------------------
1 | 2012-04-01 | 4
2 | 2012-08-14 | 0
1 | 2012-07-22 | 2
4 | 2012-10-30 | 1
3 | 2012-06-07 | 1
Using group by and count can solve your problem.
Please try the code below.
(https://i.stack.imgur.com/NFzdf.jpg)
You can use a correlated subquery:
select s.*,
       (select count(*)
        from visit v
        where v.p_id = s.p_id and v.visit_date >= s.surgery_date  -- >= as stated in the question
       ) as num_visits_after
from surgery s;
You need to use group by and a conditional count with the condition you mentioned, like the following:
SELECT
    S.P_ID,
    S.SURGERY_DATE,
    SUM(CASE
            WHEN V.VISIT_DATE >= S.SURGERY_DATE THEN 1
            ELSE 0  -- so that patients without matching visits get 0 rather than NULL
        END) AS NUM_VISITS_AFTER
FROM
    SURGERY S
    LEFT JOIN VISIT V ON ( S.P_ID = V.P_ID )
GROUP BY
    S.P_ID,
    S.SURGERY_DATE;
Cheers!!
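For reference, here is a small Python sketch of the stated criterion (VISIT_DATE >= SURGERY_DATE for the same P_ID), run against the sample rows from the question. The function name visits_on_or_after is made up; this is an illustration, not a replacement for the SQL above.

```python
from datetime import date

# Sample rows from the question: (P_ID, date)
SURGERY = [
    (1, date(2012, 4, 1)),
    (2, date(2012, 8, 14)),
    (1, date(2012, 7, 22)),
    (4, date(2012, 10, 30)),
    (3, date(2012, 6, 7)),
]
VISIT = [
    (1, date(2012, 3, 28)),
    (1, date(2012, 4, 14)),
    (1, date(2012, 5, 17)),
    (1, date(2012, 9, 12)),
    (3, date(2012, 7, 15)),
    (4, date(2012, 10, 10)),
    (3, date(2012, 6, 1)),
]


def visits_on_or_after(surgeries, visits):
    """For each surgery, count the same patient's visits dated on or after it."""
    return [
        (p_id, surgery_date,
         sum(1 for v_id, visit_date in visits
             if v_id == p_id and visit_date >= surgery_date))
        for p_id, surgery_date in surgeries
    ]


for row in visits_on_or_after(SURGERY, VISIT):
    print(row)
```

With these sample rows the counts come out as 3, 0, 1, 0, 1, which does not match every number in the question's result table, so the intended criterion may differ from the one stated.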

How to get total number of users in each status at End of Day based on event log table?

I have an event log table which captures the status changes of all users, say status A, status B and status C. They can change it whenever they want. How can I get a snapshot of how many users are in each status at the end of every day (from the earliest day in the event log table to the latest day)?
I would appreciate it if anyone could show me how to do this in PostgreSQL in an elegant way. Thanks!
Edit: the event log table captures a bunch of events (one of them is a status change) for every user; log_id records the order of the events of that particular user.
user_id | log_time | status | event_A | log_id |
----------------------------------------------------------
456 | 2019-01-05 15:00 | C | | 5 |
123 | 2019-01-05 14:00 | C | | 4 |
123 | 2019-01-05 13:00 | | xxx | 3 |
456 | 2019-01-04 22:00 | B | | 4 |
456 | 2019-01-04 10:00 | C | xxx | 3 |
987 | 2019-01-04 05:00 | C | | 3 |
123 | 2019-01-03 23:00 | B | | 2 |
987 | 2019-01-03 15:00 | | xxx | 2 |
456 | 2019-01-02 22:00 | A | xxx | 2 |
123 | 2019-01-01 23:00 | C | | 1 |
456 | 2019-01-01 09:00 | B | xxx | 1 |
987 | 2019-01-01 04:00 | A | | 1 |
So I want to get the total number of users in each status at the end of each day:
Date | status A | status B | status C |
---------------------------------------------
2019-01-05 | 0 | 0 | 3 |
2019-01-04 | 0 | 2 | 1 |
2019-01-03 | 2 | 1 | 0 |
2019-01-02 | 2 | 0 | 1 |
2019-01-01 | 1 | 1 | 1 |
This was quite challenging to do :). I tried to split the query into sub-queries for readability. It is probably not a very efficient way to do what you want, but it does the job.
-- collect all days to make sure there are no missing days
WITH all_days_cte(dt) as (
    SELECT
        generate_series(
            (SELECT min(date_trunc('day', log_time)) from your_table),
            (SELECT max(date_trunc('day', log_time)) from your_table),
            '1 day'
        )::DATE
),
-- collect all users
all_users_cte as (
    select distinct
        user_id
    from your_table
),
-- set up the table with the info needed, i.e. only the last status by day and user_id
infos_to_aggregate_cte as (
    select
        s.user_id,
        s.dt,
        s.status
    from (
        select
            user_id,
            date_trunc('day', log_time)::DATE as dt,
            status,
            row_number() over (partition by user_id, date_trunc('day', log_time) order by log_time desc) rn
        from your_table
        where status is not null
    ) s
    -- only the last status of the day
    where s.rn = 1
),
-- now we still have a problem: we need to find the last status if there was no change on a day
completed_infos_cte as (
    select
        u.user_id,
        d.dt,
        -- not very efficient, but I found no other way (first_value(...) would be nice,
        -- but there is no simple way to exclude nulls)
        (select
            status
        from infos_to_aggregate_cte i2
        where i2.user_id = u.user_id
            and i2.dt <= d.dt
            and i2.status is not null
        order by i2.dt desc
        limit 1) status
    from all_days_cte d
    -- cross product of all dates and users (that is what we need for our aggregation)
    cross join all_users_cte u
    left outer join infos_to_aggregate_cte i on u.user_id = i.user_id
        and d.dt = i.dt
)
select
    c.dt,
    sum(case when status = 'A' then 1 else 0 end) status_a,
    sum(case when status = 'B' then 1 else 0 end) status_b,
    sum(case when status = 'C' then 1 else 0 end) status_c
from completed_infos_cte c
group by c.dt
order by c.dt desc
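The logic of the query above (walk the days in order, remember each user's latest non-null status, count per status at the end of each day) can be checked with a short Python sketch. The names LOG and daily_status_counts are made up for illustration; the rows are the sample data from the question, with None standing in for an empty status.

```python
from collections import Counter
from datetime import date, datetime, timedelta

# (user_id, log_time, status); None means the event was not a status change
LOG = [
    (456, "2019-01-05 15:00", "C"), (123, "2019-01-05 14:00", "C"),
    (123, "2019-01-05 13:00", None), (456, "2019-01-04 22:00", "B"),
    (456, "2019-01-04 10:00", "C"), (987, "2019-01-04 05:00", "C"),
    (123, "2019-01-03 23:00", "B"), (987, "2019-01-03 15:00", None),
    (456, "2019-01-02 22:00", "A"), (123, "2019-01-01 23:00", "C"),
    (456, "2019-01-01 09:00", "B"), (987, "2019-01-01 04:00", "A"),
]


def daily_status_counts(log):
    """Return {day: Counter(status -> number of users)} for every day in the log's range."""
    parsed = sorted(
        ((datetime.strptime(ts, "%Y-%m-%d %H:%M"), user, status) for user, ts, status in log),
        key=lambda row: row[0],
    )
    day, last_day = parsed[0][0].date(), parsed[-1][0].date()
    current = {}    # user_id -> latest known status
    snapshots = {}
    i = 0
    while day <= last_day:
        # Apply every event up to and including this day
        while i < len(parsed) and parsed[i][0].date() <= day:
            _, user, status = parsed[i]
            if status is not None:
                current[user] = status
            i += 1
        snapshots[day] = Counter(current.values())
        day += timedelta(days=1)
    return snapshots


for day, counts in sorted(daily_status_counts(LOG).items(), reverse=True):
    print(day, counts["A"], counts["B"], counts["C"])
```

A single pass over the sorted events avoids the per-row correlated subquery, which is the part the answer itself flags as inefficient.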

SQL query for top 3 sold out product based on Date in last 7 days

I want top 3 sold out products of last week.
Here is my sql query.
select ProductId, sum(Quantity) as quantity, createdOn
from (
    SELECT inv.Id, invd.ProductId, invd.Quantity, cast(inv.CreatedOn as date) as createdOn
    FROM Invoice as inv
    INNER JOIN InvoiceDetail invd
        ON invd.InvoiceId = inv.id
    WHERE inv.CreatedOn >= DATEADD(day, -11117, GETDATE())
) as tbl
group by createdOn, ProductId
ORDER BY createdOn DESC
But I am not getting the top 3 products per date. If I use TOP 3, it returns only the top 3 products overall, while I want the top 3 products per day for the last week.
This is the output I am getting, but I want only 3 records per day.
EXPECTED OUTPUT:
If I understand correctly, you can use Row_number with a window function to get the top 3 quantities per day.
Number the rows within each createdon day, ordered by the quantity column from high to low.
;WITH CTE AS (
    SELECT productid, quantity, createdon,
           Row_number() over(partition by createdon ORDER BY quantity DESC, productid DESC) as RN
    FROM
    (
        SELECT invd.productid,
               sum(invd.quantity) as quantity,
               cast(inv.createdon AS date) AS createdon
        FROM invoice AS inv INNER JOIN invoicedetail invd
            ON invd.invoiceid = inv.id
        -- -11117 is taken from the question; use -7 for the last seven days
        WHERE inv.createdon >= dateadd(day, -11117, getdate())
        GROUP BY cast(inv.createdon AS date), invd.productid
    ) AS tbl
)
SELECT *
FROM CTE
WHERE RN <= 3
sqlfiddle
[Results]:
| productID | quantity | createdon | rn |
|-----------|----------|------------|----|
| 94 | 7 | 2018-07-25 | 1 |
| 1119 | 2 | 2018-07-25 | 2 |
| 1115 | 2 | 2018-07-25 | 3 |
| 94 | 4 | 2018-07-26 | 1 |
| 1117 | 2 | 2018-07-26 | 2 |
| 1114 | 2 | 2018-07-26 | 3 |
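The same "aggregate, rank within each day, keep the first three" pattern can be sketched in Python. Everything here is hypothetical: the function name top_n_per_day, the LINES list, and its rows, which were invented only so that one day has more than three products. The tie-break (higher product id first) copies the ORDER BY in the query above.

```python
from collections import defaultdict
from datetime import date

# Hypothetical invoice detail lines: (created_on, product_id, quantity)
LINES = [
    (date(2018, 7, 25), 94, 3),
    (date(2018, 7, 25), 94, 4),
    (date(2018, 7, 25), 1119, 2),
    (date(2018, 7, 25), 1115, 2),
    (date(2018, 7, 25), 1110, 1),
    (date(2018, 7, 26), 94, 4),
    (date(2018, 7, 26), 1117, 2),
]


def top_n_per_day(lines, n=3):
    """Return {created_on: [(product_id, total_quantity), ...]} with at most n entries per day."""
    # Step 1: sum(quantity) grouped by day and product
    totals = defaultdict(int)
    for created_on, product_id, quantity in lines:
        totals[(created_on, product_id)] += quantity

    # Step 2: collect the per-day totals
    per_day = defaultdict(list)
    for (created_on, product_id), quantity in totals.items():
        per_day[created_on].append((product_id, quantity))

    # Step 3: rank by quantity desc, then product id desc, and keep the first n
    return {
        created_on: sorted(items, key=lambda pq: (-pq[1], -pq[0]))[:n]
        for created_on, items in per_day.items()
    }


for created_on, items in sorted(top_n_per_day(LINES).items()):
    print(created_on, items)
```

Like row_number(), this keeps exactly n rows per day even when quantities tie; keeping all tied products instead would correspond to rank() or dense_rank().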

Conditionally grouping by date

I'm having a bit of trouble figuring this one out.
I have two tables, items and stocks:
items
id | name
1 | item_1
2 | item_2
stocks
id | item_id | quantity | expired_on
1 | 1 | 5 | 2015-11-12
2 | 1 | 5 | 2015-11-13
3 | 2 | 5 | 2015-11-12
4 | 2 | 5 | 2015-11-14
I want to be able to retrieve a big table grouped by date, and for each date, group by item_id and show the sum of the quantity that's not expired.
result
date | item_id | unexpired
2015-11-11 | 1 | 10
2015-11-11 | 2 | 10
2015-11-12 | 1 | 5
2015-11-12 | 2 | 5
2015-11-13 | 1 | 0
2015-11-13 | 2 | 5
2015-11-14 | 1 | 0
2015-11-14 | 2 | 0
I'm able to retrieve the result if it's just one day
SELECT
items.id, SUM(stocks.quantity) as unexpired
FROM
items LEFT OUTER JOIN stocks
ON items.id = stocks.item_id
WHERE
stocks.expired_on > '2015-11-11'
GROUP BY
items.id, stocks.quantity
I searched around, found something called DatePart, but it doesn't seem like what I need.
Use the convenient cast from boolean to integer, which yields 0, 1 or null, to sum only the unexpired quantities:
select
to_char(d, 'YYYY-MM-DD') as date,
item_id,
sum(quantity * (expired_on > d)::int) as unexpired
from
stocks
cross join
generate_series(
'2015-11-11'::date, '2015-11-14', '1 day'
) d(d)
group by 1, 2
order by 1, 2
;
date | item_id | unexpired
------------+---------+-----------
2015-11-11 | 1 | 10
2015-11-11 | 2 | 10
2015-11-12 | 1 | 5
2015-11-12 | 2 | 5
2015-11-13 | 1 | 0
2015-11-13 | 2 | 5
2015-11-14 | 1 | 0
2015-11-14 | 2 | 0
The cross join to the generate_series supplies all dates in the given range.
The data used above:
create table stocks (
id int,
item_id int,
quantity int,
expired_on date
);
insert into stocks (id,item_id,quantity,expired_on) values
(1,1,5,'2015-11-12'),
(2,1,5,'2015-11-13'),
(3,2,5,'2015-11-12'),
(4,2,5,'2015-11-14');
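The boolean-to-integer trick has a direct Python counterpart, since True and False behave as 1 and 0 in arithmetic. The sketch below recomputes the result table from the same stock rows; unexpired_by_day and STOCKS are hypothetical names used only for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta

# Same rows as the insert above, without the id column: (item_id, quantity, expired_on)
STOCKS = [
    (1, 5, date(2015, 11, 12)),
    (1, 5, date(2015, 11, 13)),
    (2, 5, date(2015, 11, 12)),
    (2, 5, date(2015, 11, 14)),
]


def unexpired_by_day(stocks, first_day, last_day):
    """Return (day, item_id, unexpired_quantity) for every day in the range and every item."""
    rows = []
    day = first_day
    while day <= last_day:
        totals = defaultdict(int)
        for item_id, quantity, expired_on in stocks:
            # (expired_on > day) is True/False, which multiplies as 1/0
            totals[item_id] += quantity * (expired_on > day)
        rows.extend((day, item_id, totals[item_id]) for item_id in sorted(totals))
        day += timedelta(days=1)
    return rows


for row in unexpired_by_day(STOCKS, date(2015, 11, 11), date(2015, 11, 14)):
    print(row)
```

Every stock row is visited once per day, which matches what the cross join does and is fine for short date ranges.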
You need to generate the list of dates and then use a cross join to get all combinations of dates and items. Then, a left join to the stocks table gives the quantity expiring on each date. A cumulative sum, taken in reverse date order, calculates the unexpired quantity:
select d.dte, i.id as item_id,
       coalesce(sum(s.quantity) over (partition by i.id
                                      order by d.dte desc
                                      rows between unbounded preceding and 1 preceding
                                     ), 0) as unexpired  -- 0 instead of null on the last date
from (select generate_series(min(expired_on) - interval '1 day', max(expired_on), interval '1 day') as dte
      from stocks
     ) d cross join
     items i left join
     stocks s
     on d.dte = s.expired_on and i.id = s.item_id;
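The reverse cumulative sum may be easier to follow as a loop: walk the dates from latest to earliest and keep a running total of everything that expires after the current date. This is a sketch under the same sample data; unexpired_reverse_scan and STOCKS are made-up names.

```python
from collections import defaultdict
from datetime import date, timedelta

# Stock rows from the question: (item_id, quantity, expired_on)
STOCKS = [
    (1, 5, date(2015, 11, 12)),
    (1, 5, date(2015, 11, 13)),
    (2, 5, date(2015, 11, 12)),
    (2, 5, date(2015, 11, 14)),
]


def unexpired_reverse_scan(stocks, item_ids):
    """Return (day, item_id, unexpired) using a running total over descending dates."""
    expiring = defaultdict(int)  # (item_id, day) -> quantity expiring on that day
    for item_id, quantity, expired_on in stocks:
        expiring[(item_id, expired_on)] += quantity

    first_day = min(e for _, _, e in stocks) - timedelta(days=1)
    last_day = max(e for _, _, e in stocks)

    rows = []
    for item_id in item_ids:
        running = 0  # quantity expiring strictly after the current day
        day = last_day
        while day >= first_day:
            rows.append((day, item_id, running))
            # Stock expiring today still counts as unexpired for earlier days
            running += expiring[(item_id, day)]
            day -= timedelta(days=1)
    return sorted(rows)


for row in unexpired_reverse_scan(STOCKS, item_ids=[1, 2]):
    print(row)
```

Recording the total before adding the current day's quantity plays the role of the "1 preceding" frame bound in the window definition.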