concurrent bookings - sql

I have a table of booking orders:
Bookings (order_no, user_id, booking_time, complete_time)
I am trying to write a query that returns the order_no of every row where a customer made concurrent bookings (the customer made a new booking before completing the previous one).
Explanation:
Customer X booked #000 at 1:15, and completed it at 1:25.
Customer X booked #001 at 1:20, and completed it at 1:25.
Customer X booked #002 at 5:30, and completed it at 6:00.
Customer Y booked #020 at 1:20, and completed it at 2:10.
Customer Y booked #021 at 6:55, and completed it at 7:16.
Only Customer X had a concurrent booking. The correct query would return order_no #000 and #001.
Output should be
000
001
I have tried using a subquery in the criteria, but I still don't get the logic.
Can someone help me with this?

If you want both bookings on separate rows, then one method is window functions:
select b.*
from (select b.*,
lag(booking_time) over (partition by user_id order by booking_time) as prev_booking_time,
lead(booking_time) over (partition by user_id order by booking_time) as next_booking_time,
lag(coalesce(complete_time, cancel_time)) over (partition by user_id order by booking_time) as prev_end_time
from bookings b
) b
where (next_booking_time >= booking_time and
next_booking_time < coalesce(complete_time, cancel_time)
) or
(booking_time > prev_booking_time and
booking_time < prev_end_time
);
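As a rough check of the window-function idea, the sketch below runs a simplified version of that query on the question's sample data. It uses Python's built-in sqlite3 module as a stand-in database (SQLite 3.25 or later is assumed for window functions). The cancel_time column is the answer's assumption, not part of the question's table, and the WHERE clause is trimmed to the two comparisons that matter once rows are ordered by booking_time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookings (order_no TEXT, user_id TEXT, "
    "booking_time TEXT, complete_time TEXT, cancel_time TEXT)"
)
# Sample rows from the question; #020 is treated as cancelled to exercise coalesce().
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?, ?, ?, ?)",
    [
        ("000", "X", "01:15", "01:25", None),
        ("001", "X", "01:20", "01:25", None),
        ("002", "X", "05:30", "06:00", None),
        ("020", "Y", "01:20", None, "02:10"),
        ("021", "Y", "06:55", "07:16", None),
    ],
)

query = """
    SELECT order_no
    FROM (SELECT b.*,
                 LEAD(booking_time)
                     OVER (PARTITION BY user_id ORDER BY booking_time) AS next_booking_time,
                 LAG(COALESCE(complete_time, cancel_time))
                     OVER (PARTITION BY user_id ORDER BY booking_time) AS prev_end_time
          FROM bookings b
         ) b
    WHERE next_booking_time < COALESCE(complete_time, cancel_time)  -- next booking starts before this one ends
       OR booking_time < prev_end_time                              -- this booking starts before the previous one ends
    ORDER BY order_no
"""
concurrent_orders = [row[0] for row in conn.execute(query)]
print(concurrent_orders)
```

Note that lag() and lead() only look at the adjacent booking, so a long booking that overlaps a later, non-adjacent one would be missed by this approach.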
If you want the overlaps on one row, then you can do:
select b1.*, b2.*
from bookings b1 join
bookings b2
on b2.user_id = b1.user_id and
b2.order_no <> b1.order_no and -- do not pair a booking with itself
b2.booking_time >= b1.booking_time and
(b2.booking_time <= b1.complete_time or
b2.booking_time <= b1.cancel_time
);
Note that for multiple overlaps on the same booking, this produces a row for each pair.

This is just the overlapping date range problem. You may solve this via a self join:
SELECT b1.*
FROM Bookings b1
INNER JOIN Bookings b2
ON b1.user_id = b2.user_id AND
b1.order_no <> b2.order_no
WHERE
b2.booking_time < b1.complete_time AND
b2.complete_time > b1.booking_time;
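To see the self join at work, here is a minimal sketch that loads the question's five sample bookings into an in-memory SQLite database through Python's sqlite3 module. The helper name overlapping_orders and the zero-padded HH:MM strings are choices made for this illustration only; DISTINCT is added so an order that overlaps several others is listed once.

```python
import sqlite3

def overlapping_orders(rows):
    """Return order_no values whose interval overlaps another booking by the same user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Bookings (order_no TEXT, user_id TEXT, "
        "booking_time TEXT, complete_time TEXT)"
    )
    conn.executemany("INSERT INTO Bookings VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT DISTINCT b1.order_no
        FROM Bookings b1
        INNER JOIN Bookings b2
            ON b1.user_id = b2.user_id AND
               b1.order_no <> b2.order_no
        WHERE b2.booking_time < b1.complete_time AND
              b2.complete_time > b1.booking_time
        ORDER BY b1.order_no
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result

# Sample data from the question (zero-padded times compare correctly as text).
SAMPLE = [
    ("000", "X", "01:15", "01:25"),
    ("001", "X", "01:20", "01:25"),
    ("002", "X", "05:30", "06:00"),
    ("020", "Y", "01:20", "02:10"),
    ("021", "Y", "06:55", "07:16"),
]

print(overlapping_orders(SAMPLE))
```

Two intervals overlap exactly when each one starts before the other ends, which is what the two WHERE conditions express.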

Related

how to filter data in sql based on percentile

I have 2 tables. The first contains customer information such as id, age, and name. The second contains the customer id, the product they purchased, and the purchase_date (dates range from 2016 to 2018).
Table 1
-------
customer_id
customer_age
customer_name
Table2
------
customer_id
product
purchase_date
My desired result is a table containing the customer_name and product of customers who made a purchase in 2017 and are older than 75% of the customers who made a purchase in 2016.
Depending on your flavor of SQL, you can get quartiles using the more general ntile analytical function. This basically adds a new column to your query.
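SELECT MIN(customer_age) AS min_age FROM (
SELECT customer_id, customer_age, ntile(4) OVER (ORDER BY customer_age) AS q4 FROM table1
WHERE customer_id IN (
-- purchase_date is a date, so filter on the 2016 range instead of comparing it with the number 2016
SELECT customer_id FROM table2
WHERE purchase_date >= '2016-01-01' AND purchase_date < '2017-01-01')
) q
WHERE q4 = 4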
This returns the lowest age of the 4th-quartile customers, which can be used in a subquery against the customers who made purchases in 2017.
The argument to ntile is how many buckets you want to divide into. In this case 75%+ equals 4th quartile, so 4 buckets is OK. The OVER() clause specifies what you want to sort by (customer_age in our case), and also lets us partition (group) the data if we want to, say, create multiple rankings for different years or countries.
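A small illustration of how ntile(4) assigns buckets, using Python's sqlite3 module as a stand-in database (SQLite 3.25 or later assumed). The eight ages are invented for this sketch, and the 2016 purchase filter is left out to keep the focus on the bucketing itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (customer_id INTEGER, customer_age INTEGER, customer_name TEXT)"
)
# Eight made-up customers, so each of the 4 buckets receives exactly 2 rows.
ages = [20, 25, 30, 35, 40, 45, 50, 55]
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(i, age, f"customer{i}") for i, age in enumerate(ages, start=1)],
)

# Bucket number per customer, youngest first.
buckets = conn.execute(
    "SELECT customer_age, ntile(4) OVER (ORDER BY customer_age) AS q4 "
    "FROM table1 ORDER BY customer_age"
).fetchall()

# Lowest age inside the top quartile, i.e. the threshold for 'older than 75%'.
min_age_top_quartile = conn.execute(
    "SELECT MIN(customer_age) FROM ("
    "  SELECT customer_age, ntile(4) OVER (ORDER BY customer_age) AS q4 FROM table1"
    ") q WHERE q4 = 4"
).fetchone()[0]

print(buckets)
print(min_age_top_quartile)
```

When the row count is not a multiple of 4, ntile puts the extra rows in the lower-numbered buckets, so the threshold is approximate for small data sets.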
Age is a horrible field to include in a database. Every day it changes. You should have date-of-birth or something similar.
To get the 75% oldest value in 2016, there are several possibilities. I usually go for row_number() and count(*):
select min(customer_age)
from (select c.*,
row_number() over (order by customer_age) as seqnum,
count(*) over () as cnt
from customers c
where exists (select 1
from customer_products cp
where cp.customer_id = c.customer_id and
cp.purchase_date >= '2016-01-01' and
cp.purchase_date < '2017-01-01'
)
) c
where seqnum >= 0.75 * cnt;
Then, to use this for a query for 2017:
with a2016 as (
select min(customer_age) as customer_age
from (select c.*,
row_number() over (order by customer_age) as seqnum,
count(*) over () as cnt
from customers c
where exists (select 1
from customer_products cp
where cp.customer_id = c.customer_id and
cp.purchase_date >= '2016-01-01' and
cp.purchase_date < '2017-01-01'
)
) c
where seqnum >= 0.75 * cnt
)
select c.*, cp.product_id
from customers c join
customer_products cp
on cp.customer_id = c.customer_id and
cp.purchase_date >= '2017-01-01' and
cp.purchase_date < '2018-01-01' join
a2016 a
on c.customer_age >= a.customer_age;
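The following sketch runs essentially the same query end to end on a handful of invented rows, again through Python's sqlite3 module (SQLite 3.25 or later assumed). The table and column names follow this answer (customers, customer_products, product_id); the customer names, ages, and products are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, customer_age INTEGER, customer_name TEXT);
    CREATE TABLE customer_products (customer_id INTEGER, product_id TEXT, purchase_date TEXT);
    INSERT INTO customers VALUES
        (1, 30, 'Ann'), (2, 40, 'Bob'), (3, 50, 'Cid'), (4, 60, 'Dee'), (5, 70, 'Eve');
    INSERT INTO customer_products VALUES
        (1, 'p1', '2016-03-01'), (2, 'p2', '2016-05-10'),
        (3, 'p3', '2016-07-20'), (4, 'p1', '2016-11-02'),
        (1, 'p2', '2017-02-14'), (3, 'p1', '2017-06-30'), (5, 'p3', '2017-09-09');
""")

query = """
    WITH a2016 AS (
        -- age at the 75% mark among customers who purchased in 2016
        SELECT MIN(customer_age) AS customer_age
        FROM (SELECT c.customer_age,
                     ROW_NUMBER() OVER (ORDER BY customer_age) AS seqnum,
                     COUNT(*) OVER () AS cnt
              FROM customers c
              WHERE EXISTS (SELECT 1
                            FROM customer_products cp
                            WHERE cp.customer_id = c.customer_id
                              AND cp.purchase_date >= '2016-01-01'
                              AND cp.purchase_date < '2017-01-01')
             ) c
        WHERE seqnum >= 0.75 * cnt
    )
    SELECT c.customer_name, cp.product_id
    FROM customers c
    JOIN customer_products cp
      ON cp.customer_id = c.customer_id
     AND cp.purchase_date >= '2017-01-01'
     AND cp.purchase_date < '2018-01-01'
    JOIN a2016 a
      ON c.customer_age >= a.customer_age
    ORDER BY c.customer_name
"""
buyers_2017 = conn.execute(query).fetchall()
print(buyers_2017)
```

With four 2016 purchasers aged 30, 40, 50 and 60, the threshold is 50, so only the 2017 purchases by Cid (50) and Eve (70) are returned. Use > instead of >= in the last join if "older than" should be strict.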

how to query a table date against a series of dates on another table

I have two tables, INVOICES and INV_PRICES. I am trying to look up the price of each invoiced part in INV_PRICES based on the INVOICE_DT in INVOICES: if the INVOICE_DT falls between two EFF_DT values (on or after one, but before the next), or is on or after the max EFF_DT in INV_PRICES, then return the price for that EFF_DT.
I have tried variations on the following code, but with no luck. I either do not get all the parts or I get multiple records.
SELECT DISTINCT A.INVOICE_NBR, A.INVOICE_DT, A.PART_NO,
CASE WHEN TRUNC(A.INVOICE_DT) >= TRUNC(B.EFF_DT) THEN B.DLR_NET_PRC_AM
WHEN (TRUNC(A.INVOICE_DT)||ROWNUM >= TRUNC(B.EFF_DT)||ROWNUM) AND (TRUNC(B.EFF_DT)||ROWNUM <= TRUNC(A.INVOICE_DT)||ROWNUM) THEN B.DLR_NET_PRC_AM
/*MAX(B.EFF_DT) THEN B.DLR_NET_PRC_AM*/
ELSE 0
END AS PRICE
FROM INVOICES A,
INV_PRICES B
WHERE A.PART_NO = B.PART_NO
ORDER BY A.INVOICE_NBR
Can someone assist? I have a sample of each table if needed.
Doesn't it work to put the condition in the JOIN conditions? You can calculate the period when a price is valid using LEAD():
SELECT i.INVOICE_NBR, i.INVOICE_DT, i.PART_NO,
COALESCE(ip.DLR_NET_PRC_AM, 0) as price
FROM INVOICES i LEFT JOIN
(SELECT ip.*, LEAD(eff_dt) OVER (PARTITION BY PART_NO ORDER BY eff_dt) as next_eff_dt
FROM INV_PRICES ip
) ip
ON i.PART_NO = ip.PART_NO AND
i.invoice_dt >= ip.eff_dt AND
(i.invoice_dt < ip.next_eff_dt or ip.next_eff_dt is null)
ORDER BY i.INVOICE_NBR
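A quick way to convince yourself that the LEAD() join picks exactly one price per invoice is to run it on a tiny data set. The sketch below does that with Python's sqlite3 module (SQLite 3.25 or later assumed); the parts, dates, and prices are invented, and dates are stored as ISO text so they compare correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE INVOICES (INVOICE_NBR INTEGER, INVOICE_DT TEXT, PART_NO TEXT);
    CREATE TABLE INV_PRICES (PART_NO TEXT, EFF_DT TEXT, DLR_NET_PRC_AM REAL);
    INSERT INTO INV_PRICES VALUES ('A', '2020-01-01', 10.0), ('A', '2020-06-01', 12.0);
    INSERT INTO INVOICES VALUES
        (1, '2020-03-15', 'A'),  -- between the two effective dates
        (2, '2020-06-01', 'A'),  -- exactly on the second effective date
        (3, '2020-12-31', 'A'),  -- after the max effective date
        (4, '2019-12-01', 'A'),  -- before any price exists
        (5, '2020-02-01', 'B');  -- part with no price rows at all
""")

query = """
    SELECT i.INVOICE_NBR, COALESCE(ip.DLR_NET_PRC_AM, 0) AS price
    FROM INVOICES i
    LEFT JOIN (SELECT ip.*,
                      LEAD(EFF_DT) OVER (PARTITION BY PART_NO ORDER BY EFF_DT) AS next_eff_dt
               FROM INV_PRICES ip
              ) ip
      ON i.PART_NO = ip.PART_NO
     AND i.INVOICE_DT >= ip.EFF_DT
     AND (i.INVOICE_DT < ip.next_eff_dt OR ip.next_eff_dt IS NULL)
    ORDER BY i.INVOICE_NBR
"""
prices = conn.execute(query).fetchall()
print(prices)
```

Each price row is valid from its EFF_DT up to, but not including, the next EFF_DT for the same part, so the validity periods never overlap and an invoice can match at most one of them.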

SQL: count available rooms within an interval

I'm working on a hotel booking module, and after a few days I'm stuck.
Tables
+++++++++++++++++++++++++++++++
rooms            bookings
========         =============
room_id          b_id
avabile_rooms    b_room_id
                 check_in
                 check_out
                 b_rooms
== Where ==
avabile_rooms = the number of rooms available
b_rooms = the number of rooms booked under this b_id
Values needed
++++++++++++++++++++++++++++++++
room_id and number_of_rooms_avabile for each available room in the interval A (check_in) to B (check_out)
Current Query
++++++++++++++++++++++++++++++++
SELECT r.* FROM rooms AS r WHERE r.room_id NOT IN
(
SELECT b.b_room_id FROM bookings AS b
WHERE (b.check_out >= ? AND b.check_in <= ?)
OR (b.check_out <= ? AND b.check_in >= ?)
)
Currently I get the available rooms without taking avabile_rooms/b_rooms into account.
The Unknown Query (I think it needs to be something like this)
++++++++++++++++++++++++++++++++
SELECT r.* FROM rooms AS r WHERE r.room_id IS IN
(
SELECT b.b_room_id FROM bookings AS b
OR b.b_id IS NULL
OR (b.check_out >= ? AND b.check_in <= ?)
OR (b.check_out <= ? AND b.check_in >= ?)
)
I haven't figured out how to get the room_id/number_of_rooms_avabile.
P.S.: I did a deep search but did not find a solution that takes into account the number of rooms.
Thanks.
Here is how I would do it:
select rt.room_type_id, available = rt.available_rooms - isnull(sum(b.b_rooms), 0)
from room_types rt
left join bookings b
on rt.room_type_id = b.b_room_type_id
and b.check_in <= #end_date
and b.check_out >= #start_date
group by rt.room_type_id, rt.available_rooms
I renamed your rooms table to room_types as discussed in the comments, and renamed columns as appropriate.
This will take the number of available rooms and subtract all bookings that have any overlap with the selected date range, which I believe is what you want.
Depending on your business rules, you may want this instead:
where b.check_in < #end_date
and b.check_out > #start_date
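Here is a minimal, runnable sketch of that idea using Python's sqlite3 module. It keeps the question's original table and column names (rooms, avabile_rooms, b_room_id) rather than the renamed ones, uses the strict comparison variant so a checkout on the arrival day does not count as an overlap, and swaps isnull for COALESCE, which SQLite supports. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rooms (room_id INTEGER, avabile_rooms INTEGER);
    CREATE TABLE bookings (b_id INTEGER, b_room_id INTEGER,
                           check_in TEXT, check_out TEXT, b_rooms INTEGER);
    INSERT INTO rooms VALUES (1, 5), (2, 3);
    INSERT INTO bookings VALUES
        (1, 1, '2024-05-01', '2024-05-05', 2),
        (2, 1, '2024-05-04', '2024-05-08', 1),
        (3, 2, '2024-05-10', '2024-05-12', 3),
        (4, 1, '2024-05-08', '2024-05-10', 1);
""")

def rooms_available(start_date, end_date):
    """Rooms left per room_id after subtracting every booking that overlaps the interval."""
    query = """
        SELECT r.room_id,
               r.avabile_rooms - COALESCE(SUM(b.b_rooms), 0) AS number_of_rooms_avabile
        FROM rooms r
        LEFT JOIN bookings b
          ON b.b_room_id = r.room_id
         AND b.check_in < :end_date
         AND b.check_out > :start_date
        GROUP BY r.room_id, r.avabile_rooms
        ORDER BY r.room_id
    """
    return conn.execute(query, {"start_date": start_date, "end_date": end_date}).fetchall()

print(rooms_available("2024-05-03", "2024-05-08"))
```

For the interval 2024-05-03 to 2024-05-08, bookings 1 and 2 overlap and booking 4 does not, leaving 2 of 5 rooms for room_id 1. Because all overlapping bookings are summed, the count is conservative: two bookings that each overlap the interval but not each other are still both subtracted.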

Detecting duplicates which fall outside of a date interval

I searched SO but couldn't find a direct answer.
There are patients, hospitals, medical branches(ER,urology,orthopedics,internal disease etc), medical operation codes (examination,surgical operation, MRI, ultrasound or sth. else) and patient visiting dates.
Patient visits doctor, doctor prescribes medicine and asks to come again for control check.
If patient returns after 10 days, (s)he has to pay another examination fee to the same hospital. Hospitals may appoint a date after 10 days telling there are no available slots in following 10 days, in order to get the examination fee.
Table structure is like:
Patient id.no   Hospital   Medical Branch   Medical Op. Code   Date
1               H1         M0               P1                 01/05/2011
5               H1         M1               P9                 03/05/2011
3               H2         M0               P2                 09/05/2011
1               H1         M0               P1                 14/05/2011
3               H1         M0               P2                 20/05/2011
5               H1         M2               P9                 25/05/2011
1               H1         M0               P3                 26/05/2011
Here, patients no. 3 and 5 are not a problem: patient no. 3 visits different hospitals and patient no. 5 visits different medical branches. They would pay the examination fee even if they returned within 10 days.
Patient no. 1, however, visits the same hospital and the same branch, and undergoes the same procedure (P1: examination), on 01/05 and 14/05.
26/05 doesn't count because it is not a medical examination.
What I want to flag is: same patient, same hospital, same branch and same medical operation code (specifically medical examination, P1), with more than 10 days between the visits.
The format of resulting table:
HOSPITAL   TOTAL NUM. of PATIENTS   NUM. of PATIENTS OUT OF DATE RANGE
H1         x                        a
H2         y                        b
H3         z                        c
Thanks.
Once again, it's analytic functions to the rescue.
This query uses the LAG() function to link a record in YOUR_TABLE with the previous (defined by DATE) matching record (defined by PATIENT_ID) in the table.
select hospital_id
, count(*) as total_num_of_patients
, sum (out_of_range) as num_of_patients_out_of_range
from (
select patient_id
, hospital_id_1 as hospital_id
, case
when hospital_id_1 = hospital_id_0
and visit_1 > visit_0 + 10
and med_op_code_1 = med_op_code_0
then 1
else 0
end as out_of_range
from (
select patient_id
, hospital_id as hospital_id_1
, date as visit_1
, med_op_code as med_op_code_1
, lag (date) over (partition by patient_id order by date) as visit_0
, lag (hospital_id) over (partition by patient_id order by date) as hospital_id_0
, lag (med_op_code) over (partition by patient_id order by date) as med_op_code_0
from your_table
where med_op_code = 'P1'
)
)
group by hospital_id
/
Caveat: I haven't tested this code, so it may contain syntax errors. I will check it the next time I can access an Oracle database.
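Since the query above is untested, here is a small runnable sketch of the same LAG() idea on the question's sample rows, using Python's sqlite3 module in place of Oracle (SQLite 3.25 or later assumed). The table name visits and its column names are invented for this illustration, dates are stored as ISO text, and julianday() is SQLite's way of getting a difference in days. Unlike the query above, it partitions by hospital, branch and operation code as well, and counts distinct patients rather than rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (patient_id INTEGER, hospital TEXT, branch TEXT, "
    "op_code TEXT, visit_date TEXT)"
)
# The seven sample visits from the question, with dates rewritten as YYYY-MM-DD.
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?, ?, ?)",
    [
        (1, "H1", "M0", "P1", "2011-05-01"),
        (5, "H1", "M1", "P9", "2011-05-03"),
        (3, "H2", "M0", "P2", "2011-05-09"),
        (1, "H1", "M0", "P1", "2011-05-14"),
        (3, "H1", "M0", "P2", "2011-05-20"),
        (5, "H1", "M2", "P9", "2011-05-25"),
        (1, "H1", "M0", "P3", "2011-05-26"),
    ],
)

query = """
    SELECT hospital,
           COUNT(DISTINCT patient_id) AS total_patients,
           COUNT(DISTINCT CASE
                     WHEN op_code = 'P1'
                      AND julianday(visit_date) - julianday(prev_date) > 10
                     THEN patient_id
                 END) AS patients_out_of_range
    FROM (SELECT v.*,
                 -- previous visit by the same patient to the same hospital,
                 -- branch and operation code
                 LAG(visit_date) OVER (
                     PARTITION BY patient_id, hospital, branch, op_code
                     ORDER BY visit_date) AS prev_date
          FROM visits v
         ) v
    GROUP BY hospital
    ORDER BY hospital
"""
report = conn.execute(query).fetchall()
print(report)
```

Patient 1 returns to H1/M0 for P1 after 13 days, so H1 gets one out-of-range patient out of three; H2 has a single patient and none out of range.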
This is a little rough, as I haven't got an Oracle DB to hand, but the key feature is the same: the analytical function LAG(). Along with its companion function, LEAD(), they're great for helping to deal with things like periods of activity.
Here's my attempt at the code:
select n.hospital, COUNT(n.patient_id) as patients_out_of_date_range
from (
select *
from (
select d.*, lag(date, 1) over (partition by d.patient_id, d.hospital, d.medical_branch, d.medical_op_code order by d.date) as prev_date
from datatable d inner join
(
select d.patient_id, d.hospital, d.medical_branch, d.medical_op_code
from datatable d
where d.medical_op_code = 'P1'
group by d.patient_id, d.hospital, d.medical_branch, d.medical_op_code
having COUNT(d.date) > 1
) t on d.patient_id = t.patient_id and d.hospital = t.hospital and d.medical_branch = t.medical_branch and d.medical_op_code = t.medical_op_code
) m
where date - prev_date > 10
) n
group by n.hospital
Like I say, this isn't tested, but it should at least get you started in the right direction.
Some references:
http://www.adp-gmbh.ch/ora/sql/analytical/lag.html
http://www.oracle-base.com/articles/misc/LagLeadAnalyticFunctions.php
I think this is what you're trying for:
WITH Patient_Visits (Patient_Id, Hospital_Id, Branch_Id, Visit_Date, Visit_Order) as (
SELECT Patient_Id, Hospital_Id, Branch_Id, Visit_Date,
ROW_NUMBER() OVER(PARTITION BY Patient_Id, Hospital_Id, Branch_Id
ORDER BY Patient_Id, Hospital_Id, Branch_Id, Visit_Date)
FROM Hospital_Visits
WHERE Procedure_Id = 'P1'),
Hospital_Recent_Visits (Hospital_Id, Recent_Visitor_Count) as (
SELECT a.Hospital_Id, COUNT(DISTINCT a.Patient_Id)
FROM Patient_Visits as a
JOIN Patient_Visits as b
ON b.Hospital_Id = a.Hospital_Id
AND b.Branch_Id = a.Branch_Id
AND b.Patient_Id = a.Patient_Id
AND b.Visit_Order = a.Visit_Order - 1
AND b.Visit_Date + 10 > a.Visit_Date
GROUP BY a.Hospital_Id),
Hospital_Patient_Count (Hospital_Id, Patient_Count) as (
SELECT Hospital_Id, COUNT(DISTINCT Patient_Id)
FROM Hospital_Visits
GROUP BY Hospital_Id)
SELECT a.Hospital_Id, b.Patient_Count, c.Recent_Visitor_Count
FROM Hospitals as a
LEFT JOIN Hospital_Patient_Count as b
ON b.Hospital_Id = a.Hospital_Id
LEFT JOIN Hospital_Recent_Visits as c
ON c.Hospital_id = a.Hospital_Id
Please note that this was written and tested against a DB2 system. I think Oracle databases have the relevant functionality, so the query should still work as written. However, DB2 appears to lack some of the OLAP functions Oracle has (my version, at least), which could be useful in knocking out some of the CTEs.

sql query to find customers who order too frequently?

My database isn't actually customers and orders, it's customers and prescriptions for their eye tests (just in case anyone was wondering why I'd want my customers to make orders less frequently!)
I have a database for a chain of opticians. The prescriptions table has the branch ID number, the patient ID number, and the date they had their eyes tested. Over time, patients will have more than one eye test listed in the database. How can I get a list of patients who have had a prescription entered on the system more than once in six months? In other words, where the date of one prescription is, for example, within three months of the date of the previous prescription for the same patient.
Sample data:
Branch Patient DateOfTest
1 1 2007-08-12
1 1 2008-08-30
1 1 2008-08-31
1 2 2006-04-15
1 2 2007-04-12
I don't need to know the actual dates in the result set, and it doesn't have to be exactly three months, just a list of patients who have a prescription too close to the previous prescription. In the sample data given, I want the query to return:
Branch Patient
1 1
This sort of query isn't going to be run very regularly, so I'm not overly bothered about efficiency. On our live database I have a quarter of a million records in the prescriptions table.
Something like this
select p1.branch, p1.patient
from prescription p1, prescription p2
where p1.patient=p2.patient
and p1.dateoftest > p2.dateoftest
and datediff('day', p2.dateoftest, p1.dateoftest) < 90;
should do... you might want to add
and p1.dateoftest > getdate()
to limit to future test prescriptions.
This one will efficiently use an index on (Branch, Patient, DateOfTest) which you of course should have:
SELECT Branch, Patient, DateOfTest, pDate
FROM (
SELECT p.Branch, p.Patient, p.DateOfTest,
(
SELECT TOP 1 pp.DateOfTest
FROM Patients pp
WHERE pp.Branch = p.Branch
AND pp.Patient = p.Patient
-- strictly earlier than the current test, so a row never matches itself
AND pp.DateOfTest >= DATEADD(month, -3, p.DateOfTest)
AND pp.DateOfTest < p.DateOfTest
ORDER BY
pp.DateOfTest DESC
) AS pDate
FROM Patients p
) po
WHERE pDate IS NOT NULL
One way:
select d.branch, d.patient
from data d
where exists
( select null from data d1
where d1.branch = d.branch
and d1.patient = d.patient
and "difference (d1.dateoftest ,d.dateoftest) < 6 months"
);
This part needs changing - I'm not familiar with SQL Server's date operations:
"difference (d1.dateoftest ,d.dateoftest) < 6 months"
Self-join:
select a.branch, a.patient
from prescriptions a
join prescriptions b
on a.branch = b.branch
and a.patient = b.patient
and a.dateoftest > b.dateoftest
and a.dateoftest - b.dateoftest < 180
group by a.branch, a.patient
This assumes you want patients who visit the same branch twice. If you don't, take out the branch part.
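The self-join can be tried out on the question's sample data with the sketch below, which uses Python's sqlite3 module. Date subtraction is dialect-specific, so this version computes the gap in days with SQLite's julianday() function; the 180-day limit is the one used in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prescriptions (branch INTEGER, patient INTEGER, dateoftest TEXT)")
# Sample data from the question.
conn.executemany(
    "INSERT INTO prescriptions VALUES (?, ?, ?)",
    [
        (1, 1, "2007-08-12"),
        (1, 1, "2008-08-30"),
        (1, 1, "2008-08-31"),
        (1, 2, "2006-04-15"),
        (1, 2, "2007-04-12"),
    ],
)

query = """
    SELECT a.branch, a.patient
    FROM prescriptions a
    JOIN prescriptions b
      ON a.branch = b.branch
     AND a.patient = b.patient
     AND a.dateoftest > b.dateoftest                                -- a is the later test
     AND julianday(a.dateoftest) - julianday(b.dateoftest) < 180    -- less than 180 days apart
    GROUP BY a.branch, a.patient
"""
too_frequent = conn.execute(query).fetchall()
print(too_frequent)
```

Patient 1 has tests one day apart and is returned; patient 2's tests are almost a year apart and are not. The GROUP BY collapses multiple qualifying pairs into one row per patient.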
SELECT Branch
,Patient
FROM (SELECT P1.Branch
,P1.Patient
,P1.DateOfTest
,P2.DateOfTest AS DateOfOtherTest
FROM Prescriptions P1
JOIN Prescriptions P2
ON P2.Branch = P1.Branch
AND P2.Patient = P1.Patient
-- P2 must be the later test; with <> a negative difference would also pass the filter below
AND P2.DateOfTest > P1.DateOfTest
) AS SubQuery
WHERE DATEDIFF(day, SubQuery.DateOfTest, SubQuery.DateOfOtherTest) < 90