SQL derived attribute - sql

I want to make ROOM_STATUS in Table 1 (ROOM) a derived attribute based on the CHECK_IN and CHECK_OUT dates in Table 2 (Booking Records), but I don't know whether the room status can be determined dynamically from those dates. For example, room 103 should be unavailable from 19/02/2020 to 20/02/2020 because it is already booked by someone else, so its status should be displayed as unavailable ('N'); after 20/02/2020 the room should be available again.
I also want to calculate the number of days each room is available, based on the CHECK_IN and CHECK_OUT dates in Table 2. For example, if room 103 is booked from 19/02/2020 to 20/02/2020 and another customer has booked it from 22/02/2020, the room is available for only 1 day in between. How should I calculate the days available?
Table 1 (ROOM)
ROOM_NO ROOM_STATUS ('Y' represent 'Available' , 'N' represent 'Unavailable')
======= ============
1 Y
2 Y
3 Y
4 Y
5 Y
6 Y
7 Y
8 Y
9 Y
10 Y
more rooms.....
Table 2 (Booking Records)
BOOKING_ID CHECK_IN   CHECK_OUT  SPECIAL_REQ                                    CANCEL_REASON DATE_BOOK  ROOM_NO GUEST
========== ========== ========== ============================================== ============= ========== ======= ============
1          19/02/2020 20/02/2020 Prepare hot bath tub when check in.                          17/02/2020 103     980315070652
2          20/05/2020 27/05/2020 Prepare scented candle and meal when check in                10/05/2020 10      C00001549
3          21/05/2020 23/05/2020 Prepare latest news paper in room                            10/05/2020 9       C00001894
4          20/05/2020 24/05/2020 Prepare hot bath tub when check in.                          17/05/2020 124     980315070652

Not sure I'm following, something like this? Strictness of the inequality (<,<=) will depend on how you want to define available on a given day.
SELECT DISTINCT
Room_No,
CASE WHEN EXISTS
( SELECT 1
FROM Booking_Records BR2
WHERE BR.room_no = BR2.room_no
AND CURRENT_DATE >= BR2.Check_In
AND CURRENT_DATE < BR2.Check_Out
) THEN 'N'
ELSE 'Y'
END AS Currently_Available
FROM Booking_Records BR
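The query above reads only Booking_Records, so a room that has never been booked would not show up at all. Here is a minimal sketch of the same EXISTS idea driven from the ROOM table instead. It runs against SQLite from Python purely so it can be executed as-is; the table layout, the ISO date strings, the extra room 1, and the inclusive check-out comparison (`<=`, following the wording of the question) are my assumptions, not something taken from your schema.

```python
import sqlite3

# In-memory SQLite database; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE room (room_no INTEGER PRIMARY KEY);
CREATE TABLE booking_records (
    booking_id INTEGER PRIMARY KEY,
    check_in   TEXT,    -- ISO 'YYYY-MM-DD' so string order equals date order
    check_out  TEXT,
    room_no    INTEGER
);
-- Room 1 is an assumed extra room with no bookings at all.
INSERT INTO room VALUES (1), (9), (10), (103), (124);
INSERT INTO booking_records VALUES
    (1, '2020-02-19', '2020-02-20', 103),
    (2, '2020-05-20', '2020-05-27', 10),
    (3, '2020-05-21', '2020-05-23', 9),
    (4, '2020-05-20', '2020-05-24', 124);
""")

def room_status(on_date):
    """Return {room_no: 'Y' or 'N'} for every room on the given ISO date."""
    rows = conn.execute("""
        SELECT r.room_no,
               CASE WHEN EXISTS (
                        SELECT 1
                        FROM booking_records b
                        WHERE b.room_no = r.room_no
                          AND ? >= b.check_in
                          AND ? <= b.check_out   -- assumption: check-out day is still occupied
                    )
                    THEN 'N' ELSE 'Y'
               END AS room_status
        FROM room r
        ORDER BY r.room_no
    """, (on_date, on_date)).fetchall()
    return dict(rows)

print(room_status("2020-02-20"))
print(room_status("2020-02-21"))
```

In a real database you would probably wrap the SELECT in a view and compare against the current date rather than passing a parameter, so the status is always derived and never stored.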

This will get the rooms from the booking_records table that are available on 2020-05-22:
SELECT room_no,
CASE COUNT(
CASE
WHEN DATE '2020-05-22' BETWEEN CHECK_IN AND CHECK_OUT
AND cancel_reason IS NULL
THEN 1
END
)
WHEN 0
THEN 'Y'
ELSE 'N'
END AS room_status_20200522
FROM booking_records
GROUP BY room_no
Which, for your sample data:
CREATE TABLE booking_records ( BOOKING_ID, CHECK_IN, CHECK_OUT, CANCEL_REASON, ROOM_NO ) AS
SELECT 1, DATE '2020-02-19', DATE '2020-02-20', CAST( NULL AS VARCHAR2(50) ), 103 FROM DUAL UNION ALL
SELECT 2, DATE '2020-05-20', DATE '2020-05-27', NULL, 10 FROM DUAL UNION ALL
SELECT 3, DATE '2020-05-21', DATE '2020-05-23', NULL, 9 FROM DUAL UNION ALL
SELECT 4, DATE '2020-05-20', DATE '2020-05-24', NULL, 124 FROM DUAL;
Outputs:
ROOM_NO | ROOM_STATUS_20200522
------: | :-------------------
9 | N
103 | Y
10 | N
124 | N

So you need to fetch the room status and the number of days the room stays available once it is free.
I am assuming that the booking_records table can contain more than one reservation for the same room_no.
The following query will give you the desired details as of sysdate.
Select r.room_no,
Max(Case when b.booking_id is not null then 'N' else 'Y' end) as booking_status,
Min(bn.check_in - coalesce(b.check_out,sysdate)) as available_days
From rooms r
Left Join booking_records b
On b.room_no = r.room_no
And sysdate between b.check_in and b.check_out
Left join booking_records bn
On bn.room_no = r.room_no
And bn.check_in > sysdate
Group by r.room_no
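For the "days available between bookings" part, here is a rough sketch of the gap calculation, again using SQLite from Python so it can be run directly. Booking 5 is a hypothetical row I added to mirror the 22/02/2020 example in the question, and subtracting 1 assumes the check-out day itself still counts as occupied (that is what makes the example come out as 1 day). Drop the `- 1` if a room can be re-let on its check-out day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE booking_records (
    booking_id INTEGER PRIMARY KEY,
    check_in   TEXT,    -- ISO 'YYYY-MM-DD'
    check_out  TEXT,
    room_no    INTEGER
);
INSERT INTO booking_records VALUES
    (1, '2020-02-19', '2020-02-20', 103),
    (5, '2020-02-22', '2020-02-25', 103);  -- hypothetical follow-up booking
""")

def free_days_after_each_booking():
    """For each booking, find the next check-in for the same room and the free days in between."""
    return conn.execute("""
        SELECT room_no, check_out, next_check_in,
               CAST(julianday(next_check_in) - julianday(check_out) - 1 AS INTEGER) AS free_days
        FROM (
            SELECT b.room_no,
                   b.check_out,
                   (SELECT MIN(n.check_in)
                    FROM booking_records n
                    WHERE n.room_no = b.room_no
                      AND n.check_in > b.check_out) AS next_check_in
            FROM booking_records b
        )
        ORDER BY room_no, check_out
    """).fetchall()

for row in free_days_after_each_booking():
    print(row)
```

The last booking of a room has no successor, so next_check_in and free_days come back as NULL (None in Python), which you can read as "free indefinitely".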

Related

how to fetch count data of 2 date fields in same month in SQL

I am trying to create a query where I have 3 column.
C_Time: contains task Creation date time
Done_Time: Contains Task completion date time
User ID: Unique id of user
I want a result that shows, for a particular month, the total count of tasks created and the total count of tasks completed in that same month, grouped by user ID.
The output should look like this:
UserID | CreatedCount | DoneCount
------------------------------------------
U12    | 12           | 12
U13    | 7            | 5
Here user U12 created 12 tasks and completed 12 tasks in January 2020, while user U13 created 7 tasks in January 2020 and completed 5 tasks in the same month.
You can use apply to unpivot the data and then aggregation:
select t.user_id, sum(is_create), sum(is_complete)
from t cross apply
(values (t.c_time, 1, 0), (t.done_time, 0, 1)
) v(t, is_create, is_complete)
where v.t >= '2020-01-01' and v.t < '2020-02-01'
group by t.user_id;
You can also do this with conditional aggregation:
select user_id,
sum(case when c_time >= '2020-01-01' and c_time < '2020-02-01' then 1 else 0 end),
sum(case when done_time >= '2020-01-01' and done_time < '2020-02-01' then 1 else 0 end)
from t
group by user_id;
This is probably a little faster for your particular example. However, the first version is more generalizable -- for instance, it allows you to summarize easily by both user and month.
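To make the conditional-aggregation version concrete, here is a small runnable sketch using SQLite from Python. The rows are made up for illustration (they are not from the question), and the timestamps are stored as ISO strings so the half-open range comparison works as plain string comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (user_id TEXT, c_time TEXT, done_time TEXT);
-- Illustrative rows only.
INSERT INTO t VALUES
    ('U12', '2020-01-03 09:00:00', '2020-01-04 10:00:00'),
    ('U12', '2020-01-10 09:00:00', '2020-01-31 23:00:00'),
    ('U13', '2020-01-05 09:00:00', '2020-01-06 12:00:00'),
    ('U13', '2020-01-20 09:00:00', '2020-02-02 08:00:00'),  -- done in February
    ('U13', '2019-12-30 09:00:00', NULL);                   -- created in December, never done
""")

def monthly_counts(month_start, next_month_start):
    """Count tasks created and tasks done inside [month_start, next_month_start) per user."""
    return conn.execute("""
        SELECT user_id,
               SUM(CASE WHEN c_time >= ? AND c_time < ? THEN 1 ELSE 0 END) AS created_count,
               SUM(CASE WHEN done_time >= ? AND done_time < ? THEN 1 ELSE 0 END) AS done_count
        FROM t
        GROUP BY user_id
        ORDER BY user_id
    """, (month_start, next_month_start, month_start, next_month_start)).fetchall()

print(monthly_counts("2020-01-01", "2020-02-01"))
```

A NULL done_time makes the comparison NULL, so the CASE falls through to 0 and unfinished tasks are not counted as done.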

Vertica SQL for running count distinct and running conditional count

I'm trying to build a department level score table based on a deeper product url level score table.
Date is not consecutive
Not all urls got score updates at same day (independent to each other)
dist_url should be running count distinct (cumulative count distinct)
dist urls and urls score >=30 are both count distinct
What I have now is:
Date url Store Dept Page Score
10/1 a US A X 10
10/1 b US A X 30
10/1 c US A X 60
10/4 a US A X 20
10/4 d US A X 60
10/6 b US A X 22
10/9 a US A X 40
10/9 e US A X 10
Date Store Dept Page dist urls urls score >=30
10/1 US A X 3 2
10/4 US A X 4 3
10/6 US A X 4 2
10/9 US A X 5 2
I think the dist_url can be done by using window function, just not sure on query.
Current query is as below, but it's wrong since not cumulative count distinct:
SELECT
bm.AnalysisDate,
su.SoID AS Store,
su.DptCaID AS DTID,
su.PageTypeID AS PTID,
COUNT(DISTINCT bm.SeoURLID) AS NumURLsWithDupScore,
SUM(CASE WHEN bm.DuplicationScore > 30 THEN 1 ELSE 0 END) AS Over30Count
FROM csn_seo.tblBotifyMetrics bm
INNER JOIN csn_seo.tblSEOURLs su
ON bm.SeoURLID = su.ID
WHERE su.DptCaID IS NOT NULL
AND su.DptCaID <> 0
AND su.PageTypeID IS NOT NULL
AND su.PageTypeID <> -1
AND bm.iscompliant = 1
GROUP BY bm.AnalysisDate, su.SoID, su.DptCaID, su.PageTypeID;
Please let me know if anyone has any idea.
Based on your question, you seem to want two levels of logic:
select date, store, dept, page,
sum(sum(start)) over (partition by store, dept, page order by date) as distinct_urls,
sum(sum(start_30)) over (partition by store, dept, page order by date) as distinct_urls_30
from ((select store, dept, page, url, min(date) as date, 1 as start, 0 as start_30
from t
group by store, dept, page, url
) union all
(select store, dept, page, url, min(date) as date, 0, 1
from t
where score >= 30
group by store, dept, page, url
)
) t
group by date, store, dept, page;
I don't understand how your query is related to your question.
Try as I might, I don't get your expected output either.
But I think you can avoid the UNION ALL of two SELECTs. Does this do what you expect?
NULLs are ignored by COUNT(DISTINCT ...), and here you can nest an aggregate expression inside an OLAP (window) one.
Vertica also has named windows to improve readability.
WITH
input(Date,url,Store,Dept,Page,Score) AS (
SELECT DATE '2019-10-01','a','US','A','X',10
UNION ALL SELECT DATE '2019-10-01','b','US','A','X',30
UNION ALL SELECT DATE '2019-10-01','c','US','A','X',60
UNION ALL SELECT DATE '2019-10-04','a','US','A','X',20
UNION ALL SELECT DATE '2019-10-04','d','US','A','X',60
UNION ALL SELECT DATE '2019-10-06','b','US','A','X',22
UNION ALL SELECT DATE '2019-10-09','a','US','A','X',40
UNION ALL SELECT DATE '2019-10-09','e','US','A','X',10
)
SELECT
date
, store
, dept
, page
, SUM(COUNT(DISTINCT url) ) OVER(w) AS dist_urls
, SUM(COUNT(DISTINCT CASE WHEN score >=30 THEN url END)) OVER(w) AS dist_urls_gt_30
FROM input
GROUP BY
date
, store
, dept
, page
WINDOW w AS (PARTITION BY store,dept,page ORDER BY date)
;
-- out date | store | dept | page | dist_urls | dist_urls_gt_30
-- out ------------+-------+------+------+-----------+-----------------
-- out 2019-10-01 | US | A | X | 3 | 2
-- out 2019-10-04 | US | A | X | 5 | 3
-- out 2019-10-06 | US | A | X | 6 | 3
-- out 2019-10-09 | US | A | X | 8 | 4
-- out (4 rows)
-- out
-- out Time: First fetch (4 rows): 45.321 ms. All rows formatted: 45.364 ms
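Note that the dist_urls column in the output above (3, 5, 6, 8) is a running sum of per-day distinct counts, so a URL that is scored on several days is counted more than once. If what you want is the 3, 4, 4, 5 from the question, one simple (though not the fastest) way is a correlated subquery that recounts distinct URLs up to each date. This is a sketch in SQLite driven from Python, not Vertica; the table name `scores` and the column name `score_date` are my own placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scores (score_date TEXT, url TEXT, store TEXT, dept TEXT, page TEXT, score INTEGER)"
)
# Sample rows from the question, with the year assumed to be 2019.
conn.executemany("INSERT INTO scores VALUES (?, ?, 'US', 'A', 'X', ?)", [
    ("2019-10-01", "a", 10), ("2019-10-01", "b", 30), ("2019-10-01", "c", 60),
    ("2019-10-04", "a", 20), ("2019-10-04", "d", 60),
    ("2019-10-06", "b", 22),
    ("2019-10-09", "a", 40), ("2019-10-09", "e", 10),
])

def cumulative_distinct_urls():
    """For each (date, store, dept, page), count distinct URLs seen on or before that date."""
    return conn.execute("""
        SELECT d.score_date, d.store, d.dept, d.page,
               (SELECT COUNT(DISTINCT s.url)
                FROM scores s
                WHERE s.store = d.store
                  AND s.dept = d.dept
                  AND s.page = d.page
                  AND s.score_date <= d.score_date) AS dist_urls
        FROM (SELECT DISTINCT score_date, store, dept, page FROM scores) d
        ORDER BY d.store, d.dept, d.page, d.score_date
    """).fetchall()

for row in cumulative_distinct_urls():
    print(row)
```

Because the outer query lists every date that appears in the data, a day such as 10/6 that introduces no new URL still gets a row. On a large table the first-appearance-plus-running-sum approach will usually scale better than this recount.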

How to flag active customers who have at least one transaction per month?

Objective is to create a flag for active customers.
An active customer is someone who has at least one transaction every month.
Time frame - May 2018 to May 2019
Data is at transaction level
-------------------------------------
txn_id | txn_date | name | amount
-------------------------------------
101 2018-05-01 ABC 100
102 2018-05-02 ABC 200
-------------------------------------
output should be like this -
----------------
name | flag
----------------
ABC active
BCF inactive
You can use aggregation to get the active customers:
select name
from t
where txn_date >= '2018-05-01' and txn_date < '2019-06-01'
group by name
having count(distinct last_day(txn_date)) = 13 -- all months accounted for
EDIT:
If you want a flag, just move the condition to a case expression:
select name,
(case when count(distinct case when txn_date >= '2018-05-01' and txn_date < '2019-06-01' then last_day(txn_date) end) = 13
then 'active' else 'inactive'
end) as flag
from t
group by name;
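Here is a small runnable sketch of the flag query, using SQLite from Python. SQLite has no last_day(), so I bucket by strftime('%Y-%m', ...) instead, which serves the same purpose of collapsing each transaction to its month. The generated rows are made up: ABC gets a transaction in each of the 13 months and BCF only in the first three.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (txn_id INTEGER PRIMARY KEY, txn_date TEXT, name TEXT, amount INTEGER)")

# The 13 months from May 2018 through May 2019.
months = [(2018, m) for m in range(5, 13)] + [(2019, m) for m in range(1, 6)]

rows = [(f"{y:04d}-{m:02d}-01", "ABC", 100) for y, m in months]       # every month
rows.append(("2018-05-02", "ABC", 200))                               # second txn in the same month
rows += [(f"{y:04d}-{m:02d}-15", "BCF", 50) for y, m in months[:3]]   # only three months
conn.executemany("INSERT INTO t (txn_date, name, amount) VALUES (?, ?, ?)", rows)

def activity_flags():
    """Flag a customer 'active' when all 13 months in the window have a transaction."""
    return dict(conn.execute("""
        SELECT name,
               CASE WHEN COUNT(DISTINCT CASE
                                 WHEN txn_date >= '2018-05-01' AND txn_date < '2019-06-01'
                                 THEN strftime('%Y-%m', txn_date)
                               END) = 13
                    THEN 'active' ELSE 'inactive'
               END AS flag
        FROM t
        GROUP BY name
        ORDER BY name
    """).fetchall())

print(activity_flags())
```

Putting the date filter inside the CASE rather than in WHERE is what keeps customers with no transactions in the window in the result, flagged as inactive.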

Find the monthly count, for every month from the start date till member is cancelled

Problem: Monthly distinct count of members from the first date of the gene reading, till the member is cancelled.
Members can have more than one reading per month. They can continue to have as many readings as they want.
Example:
member_id date gene_a_measurement_done gene_b_measurement_done
5557153 1/1/2010 y
5557153 2/1/2010 y
222458 2/1/2010 y y
222458 1/1/2011 y
707222 1/1/2011 y
Another table has members cancellation date:
member_id status date
5557153 Cancelled 5/1/2011
222458 Cancelled 12/1/9999
707222 Cancelled 12/1/9999
Expected result :
month distinct_count_of_member_with_gene_a_measurement distinct_count_of_member_with_gene_b_measurement
1/1/10 1 0
2/1/10 2 2
3/1/10 2 2
4/1/10 2 2
5/1/10 1 1
6/1/10 1 1
7/1/10 1 1
8/1/10 1 1
9/1/10 1 1
10/1/10 1 1
11/1/10 1 1
12/1/10 1 1
1/1/11 2 1
Query tried:
SELECT
sub.last_day,
sum(sub.distinct_count_of_member_with_gene_a_measurement) as distinct_count_of_member_with_gene_a_measurement,
sum(sub.distinct_count_of_member_with_gene_b_measurement) as distinct_count_of_member_with_gene_b_measurement
FROM
(SELECT last_day(date) as last_day,
COUNT(DISTINCT member_id) as distinct_count_of_member_with_gene_a_measurement,
null as distinct_count_of_member_with_gene_b_measurement
FROM measurement
WHERE gene_a_measurement_done is not null
GROUP BY last_day(date)
UNION ALL
SELECT last_day(date) as last_day,
null as distinct_count_of_member_with_gene_a_measurement,
COUNT(DISTINCT member_id) as distinct_count_of_member_with_gene_b_measurement
FROM measurement
WHERE gene_b_measurement_done is not null
GROUP BY last_day(date)) as sub
GROUP BY sub.last_day
The above query only gives the distinct count of members for the months in which a measurement was done, and I am not sure how best to take the cancellation date into account. (An inner join with the member_status table on member_id, with a condition to filter out cancelled members?)

Can I use Oracle SQL to plot actual dates from Schedule Information?

I asked this question in regard to SQL Server, but what's the answer for an Oracle environment (10g)?
If I have a table containing schedule information that implies particular dates, is there a SQL statement that can be written to convert that information into actual rows, perhaps using something like MSSQL's Common Table Expressions?
Consider a payment schedule table with these columns:
StartDate - the date the schedule begins (1st payment is due on this date)
Term - the length in months of the schedule
Frequency - the number of months between recurrences
PaymentAmt - the payment amount :-)
SchedID StartDate Term Frequency PaymentAmt
-------------------------------------------------
1 05-Jan-2003 48 12 1000.00
2 20-Dec-2008 42 6 25.00
Is there a single SQL statement to allow me to go from the above to the following?
Running
SchedID Payment Due Expected
Num Date Total
--------------------------------------
1 1 05-Jan-2003 1000.00
1 2 05-Jan-2004 2000.00
1 3 05-Jan-2005 3000.00
1 4 05-Jan-2006 4000.00
2 1 20-Dec-2008 25.00
2 2 20-Jun-2009 50.00
2 3 20-Dec-2009 75.00
2 4 20-Jun-2010 100.00
2 5 20-Dec-2010 125.00
2 6 20-Jun-2011 150.00
2 7 20-Dec-2011 175.00
Your thoughts are appreciated.
Oracle actually has syntax for hierarchical queries using the CONNECT BY clause. SQL Server's use of the WITH clause looks like a hack in comparison:
SELECT t.SchedId,
CASE LEVEL
WHEN 1 THEN
t.StartDate
ELSE
ADD_MONTHS(t.StartDate, t.frequency)
END AS DueDate,
CASE LEVEL
WHEN 1 THEN
t.PaymentAmt
ELSE
SUM(t.paymentAmt)
END AS RunningExpectedTotal
FROM PaymentScheduleTable t
WHERE t.PaymentNum <= t.Term / t.Frequency
CONNECT BY PRIOR t.startdate = t.startdate
GROUP BY t.schedid, t.startdate, t.frequency, t.paymentamt
ORDER BY t.SchedId, t.PaymentNum
I'm not 100% on that - I'm more confident about using:
SELECT t.SchedId,
t.StartDate AS DueDate,
t.PaymentAmt AS RunningExpectedTotal
FROM PaymentScheduleTable t
WHERE t.PaymentNum <= t.Term / t.Frequency
CONNECT BY PRIOR t.startdate = t.startdate
ORDER BY t.SchedId, t.PaymentNum
...but it doesn't include the logic to handle when you're dealing with the 2nd+ entry in the chain to add months & sum the amounts. The summing could be done with GROUP BY CUBE or ROLLUP depending on the detail needed.
I don't understand why there are 5 payment days for schedid = 1 and 7 for schedid = 2.
48 /12 = 4 and 42 / 6 = 7. So I expected 4 payment days for schedid = 1.
Anyway I use the model clause:
create table PaymentScheduleTable
( schedid number(10)
, startdate date
, term number(3)
, frequency number(3)
, paymentamt number(5)
);
insert into PaymentScheduleTable
values (1,to_date('05-01-2003','dd-mm-yyyy')
, 48
, 12
, 1000);
insert into PaymentScheduleTable
values (2,to_date('20-12-2008','dd-mm-yyyy')
, 42
, 6
, 25);
commit;
And now the select with model clause:
select schedid, to_char(duedate,'dd-mm-yyyy') duedate, expected, i paymentnum
from paymentscheduletable
model
partition by (schedid)
dimension by (1 i)
measures (
startdate duedate
, paymentamt expected
, term
, frequency)
rules
( expected[for i from 1 to term[1]/frequency[1] increment 1]
= nvl(expected[cv()-1],0) + expected[1]
, duedate[for i from 1 to term[1]/frequency[1] increment 1]
= add_months(duedate[1], (cv(i)-1) * frequency[1])
)
order by schedid,i;
This outputs:
SCHEDID DUEDATE EXPECTED PAYMENTNUM
---------- ---------- ---------- ----------
1 05-01-2003 1000 1
1 05-01-2004 2000 2
1 05-01-2005 3000 3
1 05-01-2006 4000 4
2 20-12-2008 25 1
2 20-06-2009 50 2
2 20-12-2009 75 3
2 20-06-2010 100 4
2 20-12-2010 125 5
2 20-06-2011 150 6
2 20-12-2011 175 7
11 rows selected.
I didn't set out to answer my own question, but I'm doing work with Oracle now and I have had to learn some new Oracle-flavored things.
Anyway, the CONNECT BY statement is really nice (yes, much nicer than MSSQL's hierarchical query approach), and using that construct I was able to produce a very clean query that does what I was looking for:
SELECT DISTINCT
t.SchedID
,level as PaymentNum
,add_months(t.StartDate, (level - 1) * t.Frequency) as DueDate
,(level * t.PaymentAmt) as RunningTotal
FROM SchedTest t
CONNECT BY level <= (t.Term / t.Frequency)
ORDER BY t.SchedID, level
My only remaining issue is that I had to use DISTINCT because I couldn't figure out how to select my rows from DUAL (the affable one-row Oracle table) instead of from my table of schedule data, which has at least 2 rows. If I could do the above with FROM DUAL, then my DISTINCT indicator wouldn't be necessary. Any thoughts?
Other than that, I think this is pretty nice. Et tu?
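For comparison, here is a sketch of the same expansion written as a recursive CTE, which needs no DISTINCT because each schedule row seeds its own chain of payments. It runs in SQLite from Python so it can be tried directly. It is not Oracle syntax, and as far as I know recursive WITH is not available in Oracle 10g, so treat it as an illustration of the idea rather than a drop-in replacement. The table and column names are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sched (
    sched_id    INTEGER PRIMARY KEY,
    start_date  TEXT,     -- ISO 'YYYY-MM-DD'
    term        INTEGER,  -- length of the schedule in months
    frequency   INTEGER,  -- months between payments
    payment_amt REAL
);
INSERT INTO sched VALUES
    (1, '2003-01-05', 48, 12, 1000.0),
    (2, '2008-12-20', 42,  6,   25.0);
""")

def expand_schedule():
    """Expand each schedule row into one row per payment with due date and running total."""
    return conn.execute("""
        WITH RECURSIVE pay(sched_id, payment_num, start_date, term, frequency, payment_amt) AS (
            -- Anchor: the first payment of every schedule.
            SELECT sched_id, 1, start_date, term, frequency, payment_amt
            FROM sched
            UNION ALL
            -- Step: the next payment, until term / frequency payments exist.
            SELECT sched_id, payment_num + 1, start_date, term, frequency, payment_amt
            FROM pay
            WHERE payment_num < term / frequency
        )
        SELECT sched_id,
               payment_num,
               date(start_date, '+' || ((payment_num - 1) * frequency) || ' months') AS due_date,
               payment_num * payment_amt AS running_total
        FROM pay
        ORDER BY sched_id, payment_num
    """).fetchall()

for row in expand_schedule():
    print(row)
```

The due date is offset by (payment_num - 1) * frequency months, so a frequency of 12 produces yearly dates and a frequency of 6 produces half-yearly ones.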