Find most visited Hotel by month in PostgreSQL - sql

I have a table with couple of customers resided in a hotel for a month or months. I need to find 3 most visited hotels by month. In case one customer lived in a hotel for three months, then it refers for three month. To be more precise below table hotel I have:
id
usr_id
srch_ci
srch_co
hotel_id
1
13
2021-10-01
2021-11-22
200
2
12
2021-10-11
2021-10-22
300
3
11
2021-10-28
2021-11-05
200
4
10
2021-10-28
2021-12-03
100
Result should look like below:
mnth
hotel_id
rnk
visits
2021-10
200
1
2
2021-10
100
2
1
2021-10
300
2
1
2021-11
200
1
2
2021-11
100
2
1
2021-12
100
1
1
As we can see above, user_id = 10 stayed in a hotel = 100 for 3 different months. That means it is counted for 3 different month for a hotel as 1 count. And for 2021-12 month only user = 10 stayed, for this reason in 2021-12 month hotel = 100 is ranked as 1st.

I solved problem using generate_series function in Postgres. That is what I was looking for. This link helped me. Splitting single row into multiple rows based on date
SELECT hotel_id,mnth,visits,
ROW_NUMBER() OVER (PARTITION BY mnth ORDER BY visits DESC) AS rnk FROM (
SELECT hotel_id,to_char(live_mnth,'YYYY-MM') AS mnth,count(*) AS visits FROM (
SELECT id,usr_id,hotel_id,date_in,date_out,
generate_series(date_in, date_out, '1 MONTH')::DATE AS live_mnth
FROM (
SELECT *,TO_CHAR(srch_ci, 'yyyy-mm-01')::date AS date_in,
TO_CHAR(srch_co, 'yyyy-mm-01')::date AS date_out
FROM hotels
) s
) s GROUP BY hotel_id,to_char(live_mnth,'YYYY-MM')
) t

Related

How to get the last day of the month without LAST_DAY() or EOMONTH()?

I have a table t with:
DATE
LOCATION
PRODUCT_ID
AMOUNT
2021-10-29
1
123
10
2021-10-30
1
123
9
2021-10-31
1
123
8
2021-10-29
1
456
100
2021-10-30
1
456
90
2021-10-31
1
456
80
2021-10-29
2
123
18
2021-10-30
2
123
17
2021-11-29
2
456
18
I need to find the AMOUNT of each PRODUCT_ID for each combination of LOCATION + PRODUCT_ID.
If a PRODUCT_ID has no entry for that day the AMOUNT is NULL.
So the result should look like:
DATE
LOCATION
PRODUCT_ID
AMOUNT
2021-10-31
1
123
8
2021-10-31
1
456
80
2021-10-31
2
123
NULL
2021-11-30
2
456
NULL
Sadly EXASOL has no LAST_DAY() or EOMONTH() function. How can I solve this?
You can get to the last day of the month using a date_trunc function in combination with date_add:
case
when t.date = date_add('day', -1, date_add('month', 1, date_trunc('month', t.date)))
then 'Y' else 'N' end as end_of_month
That being said, if you group your table for all combinations of locations and products, you will not get NULLs for products without sales on the last day of the month as shown in your output table.
When you group your data, any value that does not exist will simply not show up in your output table. If you want to force nulls to show up, you can create a new table that contains all combinations of products, locations, and hard-coded end of month dates.
Then, you can left join your old table with this new hard-coded table by date, location, and product. This method will give you the NULL values you expect.

how to calculate occupancy on the basis of admission and discharge dates

Suppose I have patient admission/claim wise data like the sample below. Data type of patient_id and hosp_id columns is VARCHAR
Table name claims
rec_no
patient_id
hosp_id
admn_date
discharge_date
1
1
1
01-01-2020
10-01-2020
2
2
1
31-12-2019
11-01-2020
3
1
1
11-01-2020
15-01-2020
4
3
1
04-01-2020
10-01-2020
5
1
2
16-01-2020
17-01-2020
6
4
2
01-01-2020
10-01-2020
7
5
2
02-01-2020
11-01-2020
8
6
2
03-01-2020
12-01-2020
9
7
2
04-01-2020
13-01-2020
10
2
1
31-12-2019
10-01-2020
I have another table wherein bed strength/max occupancy strength of hospitals are stored.
table name beds
hosp_id
bed_strength
1
3
2
4
Expected Results I want to find out hospital-wise dates where its declared bed-strength has exceeded on any day.
Code I have tried Nothing as I am new to SQL. However, I can solve this in R with the following strategy
pivot_longer the dates
tidyr::complete() missing dates in between
summarise or aggregate results for each date.
Simultaneously, I also want to know that whether it can be done without pivoting (if any) in sql because in the claims table there are 15 million + rows and pivoting really really slows down the process. Please help.
You can use generate_series() to do something very similar in Postgres. For the occupancy by date:
select c.hosp_id, gs.date, count(*) as occupanyc
from claims c cross join lateral
generate_series(admn_date, discharge_date, interval '1 day') gs(date)
group by c.hosp_id, gs.date;
Then use this as a subquery to get the dates that exceed the threshold:
select hd.*, b.strength
from (select c.hosp_id, gs.date, count(*) as occupancy
from claims c cross join lateral
generate_series(c.admn_date, c.discharge_date, interval '1 day') gs(date)
group by c.hosp_id, gs.date
) hd join
beds b
using (hosp_id)
where h.occupancy > b.strength

Count first occurring record per time period

In my table trips , I have two columns: created_at and user_id
Unique users take many different trips. My goal is to count the very first trip made unique per each user_ids per year-month. I understand that in this case the min() function should be applied.
In a previous query, all unique users per year-month were aggregated:
SELECT to_char(created_at, 'YYYY-MM') as yyyymm, COUNT(DISTINCT user_id)
FROM trips
GROUP BY yyyymm
ORDER BY yyyymm;
Where in the above query should min() be integrated? In other words, instead of counting all unique user id's per month, I only need to count the first occurrence of unique user id per month.
The sample input would look like:
> routes
user_id created_at
1 1 2015-08-07 07:18:21
2 2 2015-05-06 20:43:52
3 3 2015-05-06 20:53:54
4 1 2015-03-30 20:09:07
5 2 2015-10-01 18:28:32
6 3 2015-08-07 07:29:29
7 1 2015-08-28 13:45:44
8 2 2015-08-07 07:37:31
9 3 2015-03-30 20:14:04
10 1 2015-08-07 07:08:50
And the output would be:
count Y-m
1 0 2015-01
2 0 2015-02
3 2 2015-03
4 0 2015-04
5 1 2015-05
Because the first occurrences of user_id 1 and 3 were in March and the first occurrence of user_id 2 was in May
You can do this with 2 levels of aggregation. Get the min time per user_id and then count.
SELECT to_char(first_time, 'YYYY-MM'),count(*)
from (
SELECT user_id,MIN(created_at) as first_time
FROM trips
GROUP BY user_id
) t
GROUP BY to_char(first_time, 'YYYY-MM')

Get the latest price SQLITE

I have a table which contain _id, underSubheadId, wefDate, price.
Whenever a product is created or price is edited an entry is made in this table also.
What I want is if I enter a date, I get the latest price of all distinct UnderSubheadIds before the date (or on that date if no entry found)
_id underHeadId wefDate price
1 1 2016-11-01 5
2 2 2016-11-01 50
3 1 2016-11-25 500
4 3 2016-11-01 20
5 4 2016-11-11 30
6 5 2016-11-01 40
7 3 2016-11-20 25
8 5 2016-11-15 52
If I enter 2016-11-20 as date I should get
1 5
2 50
3 25
4 30
5 52
I have achieved the result using ROW NUMBER function in SQL SERVER, but I want this result in Sqlite which don't have such function.
Also if a date like 2016-10-25(which have no entries) is entered I want the price of the date which is first.
Like for 1 we will get price as 5 as the nearest and the 1st entry is 2016-11-01.
This is the query for SQL SERVER which is working fine. But I want it for Sqlite which don't have ROW_NUMBER function.
select underSubHeadId,price from(
select underSubHeadId,price, ROW_NUMBER() OVER (Partition By underSubHeadId order by wefDate desc) rn from rates
where wefDate<='2016-11-19') newTable
where newTable.rn=1
Thank You
This is a little tricky, but here is one way:
select t.*
from t
where t.wefDate = (select max(t2.wefDate)
from t t2
where t2.underSubHeadId = t.underSubHeadId and
t2.wefdate <= '2016-11-20'
);
select underHeadId, max(price)
from t
where wefDate <= "2016-11-20"
group by underHead;

Isolating the date a value turns 0 and aggregating another value from that date back

I'm looking to see two things:
When a customer closed all of their accounts with us (date when
accounts goes to 0)
The total interactions a customer had with us up
until that point (sum of interactions from when accounts was a
number greater than one).
The total interactions a customer had with us up
until that point (sum of interactions from when accounts was a
number greater than one).
Basically I'm trying to get from the top table to the bottom table in the attached image.
Customer month Accounts Interactions
12345 Jan-15 3 5
12345 Feb-15 3 1
12345 Mar-15 2 7
12345 Apr-15 1 3
12345 May-15 1 9
12345 Jun-15 1 2
12345 Jul-15 0 3
67890 Feb-15 1 4
67890 Mar-15 1 4
67890 Apr-15 1 9
67890 May-15 0 5
Customer Month close date Interactions
12345 Jul-15 30
67890 May-15 23
When I first read the question it sounded like there would be a neat solution with window functions, but after re-reading it, I don't think that's necessary. Assuming that closing his last account would be the last interaction a customer would have with you, you just need the last interaction date per customer, which means this problem can be solved with simple aggregate functions:
SELECT customer, MAX(month), SUM(interactions)
FROM mytable
GROUP BY customer
To get the last three months you need an OLAP-function:
SELECT Customer, MAX(months), SUM(Interactions)
FROM
(
SELECT Customer, month, Interactions
FROM mytable
QUALIFY
-- only closed accounts
MIN(Accounts) OVER (PARTITION BY Customer) = 0
-- last three months
AND month >= oADD_MONTHS(MAX(month) OVER (PARTITION BY Customer), -3)
) AS dt
GROUP BY customer