Show specific rows in SQL by not existing in another table

I'm not very experienced with advanced SQL or Stack Overflow, so I'll do my best to explain what I need.
Let's say I have a table called 'Shift' and a table called 'Schedule'.
'Shift' has columns 'shift_id, shift_start, shift_end, shift_day, shift_function_id'.
shift_start represents a start time.
shift_end represents an end time.
shift_day represents the number of the day in the week (0-6, starting on Sunday).
shift_function_id represents a function_id belonging to the shift.
'Schedule' has columns 'schedule_id, schedule_date, schedule_start, schedule_end, schedule_function_id'.
schedule_date represents the date of a schedule.
schedule_start represents a start time.
schedule_end represents an end time.
schedule_function_id represents a function_id belonging to the schedule.
What I'm trying to do: show a row from 'Shift' only if, for a given date, table 'Schedule' has no row where shift_start = schedule_start AND shift_end = schedule_end AND shift_function_id = schedule_function_id.
Here's an example:
SELECT shift_id, shift_day, shift_start, shift_end, function_id, function_name, function_color
FROM Shift
LEFT JOIN Function
ON function_id = shift_function_id
WHERE shift_day = $day
AND NOT EXISTS (
SELECT 1
FROM Schedule
WHERE schedule_date = '$date' AND schedule_start = shift_start AND schedule_end = shift_end AND schedule_function_id = shift_function_id
)
ORDER BY function_name, shift_start, shift_end;
The problem is
If I have 2 shifts with the same start and end time and table 'Schedule' contains ONE row with the same function_id, start time and end time, NEITHER of the 2 shifts shows up.
Here's an example of the table content:
Schedule
schedule_id: 310
schedule_date: 2020-01-11
schedule_start: 16:30:00
schedule_end: 20:00:00
schedule_function_id: 27
Shift
shift_id: 45
shift_day: 6
shift_start: 16:30:00
shift_end: 20:00:00
shift_function_id: 27
shift_id: 46
shift_day: 6
shift_start: 16:30:00
shift_end: 20:00:00
shift_function_id: 27
Neither of the 2 rows from 'Shift' shows up anymore.
What I want
If 'Schedule' has only 1 row with the same information as the rows in 'Shift', I want the other 'Shift' row to still show up.
If 'Schedule' has 2 rows with the same information, neither should show up. The number of hidden 'Shift' rows should depend on how many matching rows 'Schedule' has.
IMAGES
When nothing is filled in, it shows 2 rows with same start and end
When I put a record with same start and end time, it removes both shift rows
I need this, when I only fill in one record with same start and end time

The only way I can think of is to let the NOT EXISTS clause remove all matching rows, then append one row back from each set of duplicate rows, picking the MIN (or MAX) shift_id:
SELECT shift_id,
shift_day,
shift_start,
shift_end,
function_id,
function_name,
function_color
FROM shift
LEFT JOIN function
ON function_id = shift_function_id
WHERE shift_day = $day
AND NOT EXISTS (SELECT 1
FROM schedule
WHERE schedule_date = '$date'
AND schedule_start = shift_start
AND schedule_end = shift_end
AND schedule_function_id = shift_function_id)
UNION ALL
SELECT Min(S1.shift_id),
S1.shift_day,
S1.shift_start,
S1.shift_end,
function_id,
function_name,
function_color
FROM shift S1
JOIN shift S2
ON S1.shift_id <> S2.shift_id
AND S1.shift_day = S2.shift_day
AND S1.shift_start = S2.shift_start
AND S1.shift_end = S2.shift_end
LEFT JOIN function
ON function_id = S1.shift_function_id
GROUP BY S1.shift_day,
S1.shift_start,
S1.shift_end,
function_id,
function_name,
function_color
ORDER BY function_name,
shift_start,
shift_end;

If your version of MariaDB supports window functions (10.2 or later), you could use row_number() in a subquery to disambiguate records that have the same (shift_start, shift_end, shift_function_id) in the shift table, or the same (schedule_start, schedule_end, schedule_function_id) in the schedule table.
You can then use that row number in the join conditions (I changed the NOT EXISTS subquery to a LEFT JOIN anti-join, but this is essentially the same logic).
select s.*, f.*
from (
    select
        s.*,
        row_number() over(
            partition by shift_start, shift_end, shift_function_id
            order by shift_id
        ) rn
    from shift s
    where shift_day = @shift_day
) s
left join (
    select
        c.*,
        row_number() over(
            partition by schedule_start, schedule_end, schedule_function_id
            order by schedule_id
        ) rn
    from schedule c
    where schedule_date = @schedule_date
) c
    on c.schedule_start = s.shift_start
    and c.schedule_end = s.shift_end
    and c.schedule_function_id = s.shift_function_id
    and c.rn = s.rn
left join function f
    on f.function_id = s.shift_function_id
where c.rn is null
order by f.function_name, s.shift_start, s.shift_end
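To check the row-number pairing against the sample rows from the question, here is a minimal sketch that runs the same idea in an in-memory SQLite database from Python. It rests on a few assumptions: SQLite (3.25 or later, for window functions) stands in for MariaDB, the `:day` and `:date` placeholders and the `open_shifts` helper are illustrative names, and the join to Function is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shift (shift_id INTEGER, shift_day INTEGER, shift_start TEXT,
                    shift_end TEXT, shift_function_id INTEGER);
CREATE TABLE schedule (schedule_id INTEGER, schedule_date TEXT, schedule_start TEXT,
                       schedule_end TEXT, schedule_function_id INTEGER);
-- sample rows from the question: two identical shifts, one matching schedule row
INSERT INTO shift VALUES (45, 6, '16:30:00', '20:00:00', 27),
                         (46, 6, '16:30:00', '20:00:00', 27);
INSERT INTO schedule VALUES (310, '2020-01-11', '16:30:00', '20:00:00', 27);
""")

QUERY = """
SELECT s.shift_id
FROM (SELECT sh.*,
             ROW_NUMBER() OVER (PARTITION BY shift_start, shift_end, shift_function_id
                                ORDER BY shift_id) AS rn
      FROM shift sh
      WHERE shift_day = :day) s
LEFT JOIN (SELECT sc.*,
                  ROW_NUMBER() OVER (PARTITION BY schedule_start, schedule_end,
                                                  schedule_function_id
                                     ORDER BY schedule_id) AS rn
           FROM schedule sc
           WHERE schedule_date = :date) c
       ON c.schedule_start = s.shift_start
      AND c.schedule_end = s.shift_end
      AND c.schedule_function_id = s.shift_function_id
      AND c.rn = s.rn            -- the n-th duplicate shift pairs with the n-th duplicate schedule
WHERE c.rn IS NULL               -- keep only shifts left without a partner
ORDER BY s.shift_id
"""

def open_shifts(day, date):
    """Return the shift_ids for `day` that have no matching schedule row on `date`."""
    return [row[0] for row in conn.execute(QUERY, {"day": day, "date": date})]

print(open_shifts(6, "2020-01-11"))  # [46]: shift 45 is paired with schedule 310
```

Adding a second identical schedule row for the same date would pair off shift 46 as well, so nothing would be returned.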

Related

Improving the performance of a query

My background is Oracle but we've moved to Hadoop on AWS and I'm accessing our logs using Hive SQL. I've been asked to return a report where the number of high severity errors on the system of any given type exceeds 9 in any rolling period of 30 days (9 but I use 2 in the example to keep the example data volumes down) by uptime. I've written code to do this but I don't really understand performance tuning in Hive. A lot of the stuff I learned in Oracle doesn't seem applicable.
Can this be improved?
Data is roughly
CREATE TABLE LOG_TABLE
(SYSTEM_ID VARCHAR(1),
EVENT_TYPE VARCHAR(2),
EVENT_ID VARCHAR(3),
EVENT_DATE DATE,
UPTIME INT);
INSERT INTO LOG_TABLE
VALUES
('1','A1','138','2018-10-29',34),
('1','A2','146','2018-11-13',49),
('1','A3','140','2018-11-02',38),
('1','B1','130','2018-10-13',18),
('1','B1','150','2018-11-19',55),
('1','B2','137','2018-10-27',32),
('2','A1','128','2018-10-11',59),
('2','A1','131','2018-10-16',64),
('2','A1','136','2018-10-25',73),
('2','A2','139','2018-10-31',79),
('2','A2','145','2018-11-11',90),
('2','A2','147','2018-11-14',93),
('2','A3','135','2018-10-24',72),
('2','B1','124','2018-10-03',51),
('2','B1','133','2018-10-19',67),
('2','B2','134','2018-10-22',70),
('2','B2','142','2018-11-06',85),
('2','B2','148','2018-11-15',94),
('2','B2','149','2018-11-17',96),
('3','A2','127','2018-10-10',122),
('3','A3','123','2018-10-01',113),
('3','A3','125','2018-10-06',118),
('3','A3','126','2018-10-07',119),
('3','A3','141','2018-11-05',148),
('3','A3','144','2018-11-10',153),
('3','B1','132','2018-10-18',130),
('3','B1','143','2018-11-08',151),
('3','B2','129','2018-10-12',124);
The code that works is as follows. I do a self join on the log table to pair each record with the others of the same event type and system, compute the gap between them, and keep the pairs with a gap of 30 days or less. I then select those where there are more than 2 events into a second CTE, and from these I count distinct event types and event ids by system and uptime range.
WITH EVENTGAP AS
(SELECT T1.EVENT_TYPE,
T1.SYSTEM_ID,
T1.EVENT_ID,
T2.EVENT_ID AS EVENT_ID2,
T1.EVENT_DATE,
T2.EVENT_DATE AS EVENT_DATE2,
T1.UPTIME,
DATEDIFF(T2.EVENT_DATE,T1.EVENT_DATE) AS EVENT_GAP
FROM LOG_TABLE T1
INNER JOIN LOG_TABLE T2
ON (T1.EVENT_TYPE=T2.EVENT_TYPE
AND T1.SYSTEM_ID=T2.SYSTEM_ID)
WHERE DATEDIFF(T2.EVENT_DATE,T1.EVENT_DATE) BETWEEN 0 AND 30
AND T1.UPTIME BETWEEN 0 AND 299
AND T2.UPTIME BETWEEN 0 AND 330),
EVENTCOUNT
AS (SELECT EVENT_TYPE,
SYSTEM_ID,
EVENT_ID,
EVENT_DATE,
COUNT(1)
FROM EVENTGAP
GROUP BY EVENT_TYPE,
SYSTEM_ID,
EVENT_ID,
EVENT_DATE
HAVING COUNT(1)>2)
SELECT EVENTGAP.SYSTEM_ID,
CASE WHEN FLOOR(UPTIME/50) = 0 THEN '0-49'
WHEN FLOOR(UPTIME/50) = 1 THEN '50-99'
WHEN FLOOR(UPTIME/50) = 2 THEN '100-149'
WHEN FLOOR(UPTIME/50) = 3 THEN '150-199'
WHEN FLOOR(UPTIME/50) = 4 THEN '200-249'
WHEN FLOOR(UPTIME/50) = 5 THEN '250-299' END AS UPTIME_BAND,
COUNT(DISTINCT EVENTGAP.EVENT_ID2) AS EVENT_COUNT,
COUNT(DISTINCT EVENTGAP.EVENT_TYPE) AS TYPE_COUNT
FROM EVENTGAP
WHERE EVENTGAP.EVENT_ID IN (SELECT DISTINCT EVENTCOUNT.EVENT_ID FROM EVENTCOUNT)
GROUP BY EVENTGAP.SYSTEM_ID,
CASE WHEN FLOOR(UPTIME/50) = 0 THEN '0-49'
WHEN FLOOR(UPTIME/50) = 1 THEN '50-99'
WHEN FLOOR(UPTIME/50) = 2 THEN '100-149'
WHEN FLOOR(UPTIME/50) = 3 THEN '150-199'
WHEN FLOOR(UPTIME/50) = 4 THEN '200-249'
WHEN FLOOR(UPTIME/50) = 5 THEN '250-299' END
This gives the following result, which should be unique counts of event ids and event types that have 3 or more events falling in any rolling 30 day period. Some events may be in more than one period but will only be counted once.
EVENTGAP.SYSTEM_ID  UPTIME_BAND  EVENT_COUNT  TYPE_COUNT
2                   50-99        10           3
3                   100-149      4            1

In both Hive and Oracle, you would want to do this using window functions, using a window frame clause. The exact logic is different in the two databases.
In Hive you can use range between if you convert event_date to a number. A typical method is to subtract a fixed value from it. Another method is to use unix timestamps:
select lt.*
from (select lt.*,
             count(*) over (partition by system_id, event_type
                            order by unix_timestamp(event_date)
                            range between 2592000 preceding and current row -- 30 days in seconds (60*60*24*30)
                           ) as rolling_count
      from log_table lt
     ) lt
where rolling_count >= 2 -- or 9
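To make the window frame concrete, here is a small plain-Python sketch of the same rolling count: for each event, how many events of the same system and type fall within the preceding 30 days. The function name `rolling_counts` and the tuple layout are assumptions for illustration. One known difference from a RANGE frame: two events on the same date are counted in arrival order here, whereas RANGE would give both the same count.

```python
from collections import defaultdict, deque
from datetime import date, timedelta

def rolling_counts(events, window_days=30):
    """events: iterable of (system_id, event_type, event_date) tuples.
    Returns (system_id, event_type, event_date, count) where count is the number
    of same-system, same-type events dated within the preceding window (inclusive)."""
    window = timedelta(days=window_days)
    recent = defaultdict(deque)  # (system_id, event_type) -> dates still inside the window
    result = []
    for system_id, event_type, event_date in sorted(events):
        dates = recent[(system_id, event_type)]
        dates.append(event_date)
        # drop dates that have fallen out of the window ending at event_date
        while event_date - dates[0] > window:
            dates.popleft()
        result.append((system_id, event_type, event_date, len(dates)))
    return result

# System 3, type A3 rows from the sample data in the question
sample = [
    ("3", "A3", date(2018, 10, 1)),
    ("3", "A3", date(2018, 10, 6)),
    ("3", "A3", date(2018, 10, 7)),
    ("3", "A3", date(2018, 11, 5)),
    ("3", "A3", date(2018, 11, 10)),
]
for row in rolling_counts(sample):
    print(row)
```

Because the events are sorted once and each date enters and leaves the deque once, this is a single pass per partition, which is the same reason the window-function version tends to beat a self join.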

Datetime SQL statement (Working in SQL Developer)

I'm new to the SQL scene but I've started to gather some data that makes sense to me after learning a little about SQL Developer. Although, I do need help with a query.
My goal:
To use the current criteria I have and select records only when the date-time value is within 5 minutes of the latest date-time. Here is my current sql statement
SELECT ABAMS.T_WORKORDER_HIST.LINE_NO AS Line,
ABAMS.T_WORKORDER_HIST.STATE AS State,
ASMBLYTST.V_SEQ_SERIAL_ALL.BUILD_DATE,
ASMBLYTST.V_SEQ_SERIAL_ALL.SEQ_NO,
ASMBLYTST.V_SEQ_SERIAL_ALL.SEQ_NO_EXT,
ASMBLYTST.V_SEQ_SERIAL_ALL.UPD_REASON_CODE,
ABAMS.V_SERIAL_LINESET.LINESET_DATE AS "Lineset Time",
ABAMS.T_WORKORDER_HIST.SERIAL_NO AS ESN,
ABAMS.T_WORKORDER_HIST.ITEM_NO AS "Shop Order",
ABAMS.T_WORKORDER_HIST.CUST_NAME AS Customer,
ABAMS.T_ITEM_POLICY.PL_LOC_DROP_ZONE_NO AS PLDZ,
ABAMS.T_WORKORDER_HIST.CONFIG_NO AS Configuration,
ASMBLYTST.V_EDP_ENG_LAST_ABSN.LAST_ASMBLY_ABSN AS "Last Sta",
ASMBLYTST.V_LAST_ENG_LOCATION.LAST_ASMBLY_LOC,
ASMBLYTST.V_LAST_ENG_LOCATION.LAST_MES_LOC,
ASMBLYTST.V_LAST_ENG_LOCATION.LAST_ASMBLY_TIME,
ASMBLYTST.V_LAST_ENG_LOCATION.LAST_MES_TIME
FROM ABAMS.T_WORKORDER_HIST
LEFT JOIN ABAMS.V_SERIAL_LINESET
ON ABAMS.V_SERIAL_LINESET.SERIAL_NO = ABAMS.T_WORKORDER_HIST.SERIAL_NO
LEFT JOIN ASMBLYTST.V_EDP_ENG_LAST_ABSN
ON ASMBLYTST.V_EDP_ENG_LAST_ABSN.SERIAL_NO = ABAMS.T_WORKORDER_HIST.SERIAL_NO
LEFT JOIN ASMBLYTST.V_SEQ_SERIAL_ALL
ON ASMBLYTST.V_SEQ_SERIAL_ALL.SERIAL_NO = ABAMS.T_WORKORDER_HIST.SERIAL_NO
LEFT JOIN ABAMS.T_ITEM_POLICY
ON ABAMS.T_ITEM_POLICY.ITEM_NO = ABAMS.T_WORKORDER_HIST.ITEM_NO
LEFT JOIN ABAMS.T_CUR_STATUS
ON ABAMS.T_CUR_STATUS.SERIAL_NO = ABAMS.T_WORKORDER_HIST.SERIAL_NO
INNER JOIN ASMBLYTST.V_LAST_ENG_LOCATION
ON ASMBLYTST.V_LAST_ENG_LOCATION.SERIAL_NO = ABAMS.T_WORKORDER_HIST.SERIAL_NO
WHERE ABAMS.T_WORKORDER_HIST.LINE_NO = 10
AND (ABAMS.T_WORKORDER_HIST.STATE = 'PROD'
OR ABAMS.T_WORKORDER_HIST.STATE = 'SCHED')
AND ASMBLYTST.V_SEQ_SERIAL_ALL.BUILD_DATE BETWEEN TRUNC(SysDate) - 10 AND TRUNC(SysDate) + 1
AND (ABAMS.V_SERIAL_LINESET.LINESET_DATE IS NOT NULL
OR ABAMS.V_SERIAL_LINESET.LINESET_DATE IS NULL)
AND (ASMBLYTST.V_EDP_ENG_LAST_ABSN.LAST_ASMBLY_ABSN < '1800'
OR ASMBLYTST.V_EDP_ENG_LAST_ABSN.LAST_ASMBLY_ABSN IS NULL)
ORDER BY ASMBLYTST.V_EDP_ENG_LAST_ABSN.LAST_ASMBLY_ABSN DESC Nulls Last,
ABAMS.V_SERIAL_LINESET.LINESET_DATE Nulls Last,
ASMBLYTST.V_SEQ_SERIAL_ALL.BUILD_DATE,
ASMBLYTST.V_SEQ_SERIAL_ALL.SEQ_NO,
ASMBLYTST.V_SEQ_SERIAL_ALL.SEQ_NO_EXT
Here are some of the records I get from the table
ASMBLYTST.V_LAST_ENG_LOCATION.LAST_ASMBLY_TIME
2018-06-14 01:28:25
2018-06-14 01:29:26
2018-06-14 01:27:30
2018-06-13 22:44:03
2018-06-14 01:28:45
2018-06-14 01:27:37
2018-06-14 01:27:41
What I essentially want is for
2018-06-13 22:44:03
to be excluded from the query because it is not within the 5 minute window of the latest record, which in this data set is
2018-06-14 01:29:26
The one dynamic problem I seem to have is that the date-time values are constantly updating.
Any ideas?
Thank you!

Here are two different solutions; each uses a table called "ASET".
ASET contains 20 records, 1 minute apart:
WITH
aset (ttime, cnt)
AS
(SELECT systimestamp AS ttime, 1 AS cnt
FROM DUAL
UNION ALL
SELECT ttime + INTERVAL '1' MINUTE AS ttime, cnt + 1 AS cnt
FROM aset
WHERE cnt < 20)
select * from aset;
Now using ASET for our data (with the WITH clause above in front of each query), the following query finds the maximum date in ASET and restricts the results to the six records within 5 minutes of that maximum:
SELECT *
FROM aset
WHERE ttime >= (SELECT MAX (ttime)
FROM aset)
- INTERVAL '5' MINUTE;
An alternative is to use an analytic function:
with bset
AS
(SELECT ttime, cnt, MAX (ttime) OVER () - ttime AS delta
FROM aset)
SELECT *
FROM bset
WHERE delta <= INTERVAL '5' MINUTE
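The same filter can be sketched outside the database to check it against the timestamps posted in the question. This is only an illustration in Python; the helper name `within_window_of_latest` is made up here, and the list holds the LAST_ASMBLY_TIME values shown above.

```python
from datetime import datetime, timedelta

def within_window_of_latest(timestamps, window=timedelta(minutes=5)):
    """Keep only the timestamps that lie within `window` of the latest one."""
    if not timestamps:
        return []
    latest = max(timestamps)  # recomputed on every call, so it follows new data
    return [t for t in timestamps if latest - t <= window]

# LAST_ASMBLY_TIME values from the question
times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in (
    "2018-06-14 01:28:25",
    "2018-06-14 01:29:26",
    "2018-06-14 01:27:30",
    "2018-06-13 22:44:03",
    "2018-06-14 01:28:45",
    "2018-06-14 01:27:37",
    "2018-06-14 01:27:41",
)]

kept = within_window_of_latest(times)
print(len(kept))  # 6: only 2018-06-13 22:44:03 is dropped
```

Since the maximum is looked up each time the filter runs, constantly updating values are not a problem: the window simply moves with the newest record.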

Counting concurrent records based on startdate and enddate columns

The table structure:
StaffingRecords
PersonnelId int
GroupId int
StaffingStartDateTime datetime
StaffingEndDateTime datetime
How can I get a list of staffing records, given a date and a group id that employees belong to, where the count of present employees fell below a threshold, say, 3, at any minute of the day?
The way my brain works, I would call a stored proc repeatedly with each minute of the day, but of course this would be horribly inefficient:
SELECT COUNT(PersonnelId)
FROM DailyRosters
WHERE GroupId = @GroupId
AND StaffingStartTime <= @TimeParam
AND StaffingEndTime > @TimeParam
GROUP BY GroupId
HAVING COUNT(PersonnelId) < 3
Edit: If it helps to refine the question, employees may come and go throughout the day. Personnel may have a staffing record from 0800 - 0815, and another from 1000 - 1045, for example.

Here is a solution where I find all of the distinct start and end times, and then query to see how many people are clocked in at each of those times. Every time the count is less than 4, you know you are understaffed at that time, and presumably until the NEXT start time.
with meaningfulDtms(meaningfulTime, timeType, group_id)
as
(
select distinct StaffingStartTime , 'start' as timeType, group_id
from DailyRosters
union
select distinct StaffingEndTime , 'end' as timeType, group_id
from DailyRosters
)
select COUNT(*), meaningfulDtms.group_id, meaningfulDtms.meaningfulTime
from DailyRosters dr
inner join meaningfulDtms on dr.group_id = meaningfulDtms.group_id
and (
(dr.StaffingStartTime < meaningfulDtms.meaningfulTime
and dr.StaffingEndTime >= meaningfulDtms.meaningfulTime
and meaningfulDtms.timeType = 'start')
OR
(dr.StaffingStartTime <= meaningfulDtms.meaningfulTime
and dr.StaffingEndTime > meaningfulDtms.meaningfulTime
and meaningfulDtms.timeType = 'end')
)
group by meaningfulDtms.group_id, meaningfulDtms.meaningfulTime
having COUNT(*) < 4

Create a table with all minutes in the day, with dt as the PK.
It will have 1440 rows.
This will not give you a count of zero (no staff):
select allMinutes.dt, worktime.grpID, count(distinct(worktime.personID))
from allMinutes
join worktime
on allMinutes.dt > worktime.start
and allMinutes.dt < worktime.end
group by allMinutes.dt, worktime.grpID
having count(distinct(worktime.personID)) < 3
For times with zero staff, I think the best way is a master table of grpID, but I am not sure about this one:
select allMinutes.dt, grpMaster.grpID, count(distinct(worktime.personID))
from grpMaster
cross join allMinutes
left join worktime
on allMinutes.dt > worktime.start
and allMinutes.dt < worktime.end
and worktime.grpID = grpMaster.grpID
group by allMinutes.dt, grpMaster.grpID
having count(distinct(worktime.personID)) < 3
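Both answers rely on the fact that headcount can only change at a shift start or end, so it is enough to look at those boundaries rather than every minute. A rough Python sketch of that sweep for a single group follows; the function name `understaffed_spans`, the sample shifts and the date are all invented for illustration.

```python
from datetime import datetime

def understaffed_spans(shifts, day_start, day_end, threshold=3):
    """shifts: list of (start, end) datetimes for one group.
    Returns (span_start, span_end, headcount) for every stretch of the day
    where fewer than `threshold` people are present."""
    deltas = {}  # time -> net change in headcount at that instant
    for start, end in shifts:
        start, end = max(start, day_start), min(end, day_end)  # clip to the day
        if start < end:
            deltas[start] = deltas.get(start, 0) + 1
            deltas[end] = deltas.get(end, 0) - 1
    boundaries = sorted(set(deltas) | {day_start, day_end})
    spans, headcount = [], 0
    for left, right in zip(boundaries, boundaries[1:]):
        headcount += deltas.get(left, 0)  # headcount is constant on [left, right)
        if headcount < threshold:
            spans.append((left, right, headcount))
    return spans

def at(hour, minute):
    """Hypothetical helper: a time on an arbitrary sample day."""
    return datetime(2024, 1, 1, hour, minute)

shifts = [
    (at(8, 0), at(12, 0)),
    (at(8, 0), at(10, 0)),
    (at(8, 0), at(9, 0)),
    (at(9, 30), at(12, 0)),
]
for span in understaffed_spans(shifts, at(8, 0), at(12, 0)):
    print(span)
```

Unlike the inner-join queries, this also reports stretches with zero staff, because the day start and end are always included as boundaries.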

Getting ranges that are not in database

I want to get all times that an event is not taking place for each room. The start of the day is 9:00:00 and end is 22:00:00.
What my database looks like is this:
Event      EventStart  EventEnd  Days              Rooms  DayStarts
CISC 3660  09:00:00    12:30:00  Monday            7-3    9/19/2014
MATH 2501  15:00:00    17:00:00  Monday:Wednesday  7-2    10/13/2014
CISC 1110  14:00:00    16:00:00  Monday            7-3    9/19/2014
I want to get the times that aren't in the database.
ex. For SelectedDate (9/19/2014) the table should return:
Room  FreeTimeStart  FreeTimeEnd
7-3   12:30:00       14:00:00
7-3   16:00:00       22:00:00
ex2. SelectedDate (10/13/2014):
Room  FreeTimeStart  FreeTimeEnd
7-2   9:00:00        15:00:00
7-2   17:00:00       22:00:00
What I have tried is something like this:
select * from Events where ________ NOT BETWEEN eventstart AND eventend;
But I do not know what to put in the place of the space.

This was a pretty complex request. SQL works best with sets, and not looking at line by line. Here is what I came up with. To make it easier to figure out, I wrote it as a series of CTE's so I could work through the problem a step at a time. I am not saying that this is the best possible way to do it, but it doesn't require the use of any cursors. You need the Events table and a table of the room names (otherwise, you don't see a room that doesn't have any bookings).
Here is the query and I will explain the methodology.
DECLARE @Events TABLE (Event varchar(20), EventStart Time, EventEnd Time, Days varchar(50), Rooms varchar(10), DayStarts date)
INSERT INTO @Events
SELECT 'CISC 3660', '09:00:00', '12:30:00', 'Monday', '7-3', '9/19/2014' UNION
SELECT 'MATH 2501', '15:00:00', '17:00:00', 'Monday:Wednesday', '7-2', '10/13/2014' UNION
SELECT 'CISC 1110', '14:00:00', '16:00:00', 'Monday', '7-3', '9/19/2014'
DECLARE @Rooms TABLE (RoomName varchar(10))
INSERT INTO @Rooms
SELECT '7-2' UNION
SELECT '7-3'
DECLARE @SelectedDate date = '9/19/2014'
DECLARE @MinTimeInterval int = 30 --smallest time unit room can be reserved for
;WITH
D1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
),
D2(N) AS (SELECT 1 FROM D1 a, D1 b),
D4(N) AS (SELECT 1 FROM D2 a, D2 b),
Numbers AS (SELECT TOP 3600 ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS Number FROM D4),
AllTimes AS
(SELECT CAST(DATEADD(n,Numbers.Number*@MinTimeInterval,'09:00:00') as time) AS m FROM Numbers
WHERE DATEADD(n,Numbers.Number*@MinTimeInterval,'09:00:00') <= '22:00:00'),
OccupiedTimes AS (
SELECT e.Rooms, ValidTimes.m
FROM @Events E
CROSS APPLY (SELECT m FROM AllTimes WHERE m BETWEEN CASE WHEN e.EventStart = '09:00:00' THEN e.EventStart ELSE DATEADD(n,1,e.EventStart) END and CASE WHEN e.EventEnd = '22:00:00' THEN e.EventEnd ELSE DATEADD(n,-1,e.EventEnd) END) ValidTimes
WHERE e.DayStarts = @SelectedDate
),
AllRoomsAllTimes AS (
SELECT * FROM @Rooms R CROSS JOIN AllTimes
), AllOpenTimes AS (
SELECT a.*, ROW_NUMBER() OVER( PARTITION BY (a.RoomName) ORDER BY a.m) AS pos
FROM AllRoomsAllTimes A
LEFT OUTER JOIN OccupiedTimes o ON a.RoomName = o.Rooms AND a.m = o.m
WHERE o.m IS NULL
), Finalize AS (
SELECT a1.RoomName,
CASE WHEN a3.m IS NULL OR DATEDIFF(n,a3.m, a1.m) > @MinTimeInterval THEN a1.m else NULL END AS FreeTimeStart,
CASE WHEN a2.m IS NULL OR DATEDIFF(n,a1.m,a2.m) > @MinTimeInterval THEN A1.m ELSE NULL END AS FreeTimeEnd,
ROW_NUMBER() OVER( ORDER BY a1.RoomName ) AS Pos
FROM AllOpenTimes A1
LEFT OUTER JOIN AllOpenTimes A2 ON a1.RoomName = a2.RoomName and a1.pos = a2.pos-1
LEFT OUTER JOIN AllOpenTimes A3 ON a1.RoomName = a3.RoomName and a1.pos = a3.pos+1
WHERE A2.m IS NULL OR DATEDIFF(n,a1.m,a2.m) > @MinTimeInterval
OR
A3.m IS NULL OR DATEDIFF(n,a3.m, a1.m) > @MinTimeInterval
)
SELECT F1.RoomName, f1.FreeTimeStart, f2.FreeTimeEnd FROM Finalize F1
LEFT OUTER JOIN Finalize F2 ON F1.Pos = F2.pos-1 AND f1.RoomName = f2.RoomName
WHERE f1.pos % 2 = 1
In the first several lines, I create temp variables to simulate your tables Events and Rooms.
The variable @MinTimeInterval determines what time interval the room schedules can be on (every 30 min, 15 min, etc - this number needs to divide evenly into 60).
Since SQL cannot query data that is missing, we need to create a table that holds all of the times that we want to check for. The first several lines in the WITH create a table called AllTimes which are all the possible time intervals in your day.
Next, we get a list of all of the times that are occupied (OccupiedTimes), and then LEFT OUTER JOIN this table to the AllTimes table which gives us all the available times. Since we only want the start and end of each free time, create the Finalize table which self joins each record to the previous and next record in the table. If the times in these rows are greater than @MinTimeInterval, then we know it is either a start or end of a free time.
Finally we self join this last table to put the start and end times in the same row and only look at every other row.
This will need to be adjusted if a single row in Events spans multiple days or multiple rooms.

Here's a solution that will return the "complete picture" including rooms that aren't booked at all for the day in question:
Declare @Date char(8) = '20141013'
;
WITH cte as
(
SELECT *
FROM -- use your table name instead of the VALUES construct
(VALUES
('09:00:00','12:30:00' ,'7-3', '20140919'),
('15:00:00','17:00:00' ,'7-2', '20141013'),
('14:00:00','16:00:00' ,'7-3', '20140919')) x(EventStart , EventEnd,Rooms, DayStarts)
), cte_Days_Rooms AS
-- get a cartesian product for the day specified and all rooms as well as the start and end time to compare against
(
SELECT y.EventStart,y.EventEnd, x.rooms,a.DayStarts FROM
(SELECT @Date DayStarts) a
CROSS JOIN
(SELECT DISTINCT Rooms FROM cte)x
CROSS JOIN
(SELECT '09:00:00' EventStart,'09:00:00' EventEnd UNION ALL
SELECT '22:00:00' EventStart,'22:00:00' EventEnd) y
), cte_1 AS
-- merge the original data and the "base data"
(
SELECT * FROM cte WHERE DayStarts=@Date
UNION ALL
SELECT * FROM cte_Days_Rooms
), cte_2 as
-- use the ROW_NUMBER() approach to sort the data
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY DayStarts, Rooms ORDER BY EventStart) as pos
FROM cte_1
)
-- final query: self join with an offset of one row, eliminating duplicate rows if a room is booked starting 9:00 or ending 22:00
SELECT c2a.DayStarts, c2a.Rooms , c2a.EventEnd, c2b.EventStart
FROM cte_2 c2a
INNER JOIN cte_2 c2b on c2a.DayStarts = c2b.DayStarts AND c2a.Rooms =c2b.Rooms AND c2a.pos = c2b.pos -1
WHERE c2a.EventEnd <> c2b.EventStart
ORDER BY c2a.DayStarts, c2a.Rooms
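The "sort the bookings and compare each one with its neighbour" idea can be sketched in a few lines of Python, which may help when checking the SQL output. The function name `free_slots` is invented for this example, and it assumes all bookings for the room fall inside the 9:00 to 22:00 day.

```python
from datetime import time

def free_slots(bookings, day_start=time(9, 0), day_end=time(22, 0)):
    """bookings: list of (start, end) times for one room on one day.
    Returns the (free_start, free_end) gaps between day_start and day_end."""
    slots = []
    cursor = day_start  # end of the latest booking seen so far
    for start, end in sorted(bookings):
        if start > cursor:
            slots.append((cursor, start))  # gap before this booking begins
        cursor = max(cursor, end)          # max() copes with overlapping bookings
    if cursor < day_end:
        slots.append((cursor, day_end))    # gap after the last booking
    return slots

# Room 7-3 on 9/19/2014, from the question
room_7_3 = [(time(9, 0), time(12, 30)), (time(14, 0), time(16, 0))]
print(free_slots(room_7_3))

# Room 7-2 on 10/13/2014, from the question
room_7_2 = [(time(15, 0), time(17, 0))]
print(free_slots(room_7_2))
```

A room with no bookings at all comes back as one free slot covering the whole day, which corresponds to the "complete picture" case above.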

sql db2 select records from either table

I have an order file, with order id and ship date. Orders can only be shipped Monday - Friday. This means there are no records selected for Saturday and Sunday.
I use the same order file to get all order dates, with date in the same format (yyyymmdd).
I want to select a count of all the records from the order file based on order date, and (I believe) full outer join (or maybe right join?) the date file, because I would like to see
20120330 293
20120331 0
20120401 0
20120402 920
20120403 430
20120404 827
etc...
However, my SQL statement is still not returning a zero record for the 31st and 1st.
with DatesTable as (
select ohordt "Date" from kivalib.orhdrpf
where ohordt between 20120315 and 20120406
group by ohordt order by ohordt
)
SELECT ohscdt, count(OHTXN#) "Count"
FROM KIVALIB.ORHDRPF full outer join DatesTable dts on dts."Date" = ohordt
--/*order status = filled & order type = 1 & date between (some fill date range)*/
WHERE OHSTAT = 'F' AND OHTYP = 1 and ohscdt between 20120401 and 20120406
GROUP BY ohscdt ORDER BY ohscdt
Any ideas what I'm doing wrong?
Thanks!

Because there is no data for those days, they do not show up as rows. You can use a recursive CTE to build a contiguous list of dates between two values that the query can join on. The filters on the order columns go in the ON clause rather than in WHERE; otherwise the dates with no matching orders would be filtered out again.
It will look something like:
WITH dates (val) AS (
SELECT CAST('2012-04-01' AS DATE)
FROM SYSIBM.SYSDUMMY1
UNION ALL
SELECT val + 1 DAYS
FROM dates
WHERE val < CAST('2012-04-06' AS DATE)
)
SELECT d.val AS "Date", COUNT(o.ohtxn#) AS "Count"
FROM dates AS d
LEFT JOIN KIVALIB.ORHDRPF AS o
ON o.ohordt = TO_CHAR(d.val, 'YYYYMMDD')
AND o.ohstat = 'F'
AND o.ohtyp = 1
GROUP BY d.val
ORDER BY d.val
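To see the date-list technique produce zero rows end to end, here is a small sketch that runs it in an in-memory SQLite database from Python. SQLite stands in for DB2 here, and the `orders` table, its columns and its rows are invented for the example. The status filter sits in the join condition so that dates without filled orders still come back with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- hypothetical stand-in for the order file
CREATE TABLE orders (txn INTEGER, order_date TEXT, status TEXT);
INSERT INTO orders VALUES
    (1, '2012-04-02', 'F'),
    (2, '2012-04-02', 'F'),
    (3, '2012-04-02', 'X'),
    (4, '2012-04-03', 'F');
""")

rows = conn.execute("""
WITH RECURSIVE dates(val) AS (
    SELECT '2012-04-01'
    UNION ALL
    SELECT date(val, '+1 day') FROM dates WHERE val < '2012-04-06'
)
SELECT d.val, COUNT(o.txn)          -- COUNT of a column ignores the NULLs from unmatched dates
FROM dates d
LEFT JOIN orders o
       ON o.order_date = d.val
      AND o.status = 'F'            -- filter here, not in WHERE, to keep empty dates
GROUP BY d.val
ORDER BY d.val
""").fetchall()

for day, count in rows:
    print(day, count)
```

Moving `o.status = 'F'` into a WHERE clause would drop every date where `o.status` is NULL, which is exactly the symptom described in the question.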
