I am trying to create the following table in SQL Server:
Id Had_appointment Date Month Year Clinic
1 1 2019-01-03 January 2019 A
1 1 2019-01-05 January 2019 B
5 1 2019-04-03 April 2019 C
From the following:
Id Admin_codes Date Clinic
1 AAA2 2019-01-03 A
1 D22S 2019-01-03 A
1 FFD3 2019-01-05 B
1 E222 2019-01-05 B
5 EEE1 2019-04-03 C
5 P332 2019-04-03 C
5 AA33 2019-04-03 C
5 XC22 2019-04-03 C
6 A000 2019-02-19 C
7 A999 2019-03-11 C
How can I do this? I don't want to include any individuals in my table who 1) did not have appointments & 2) have specific Admin_codes such as 'A000' and 'A999'. Thanks in advance.
We can do this by selecting distinct records for the appointments (treating every admin_code other than 'A000' and 'A999' as an appointment).
SELECT DISTINCT t.Id, '1' AS 'Had_appointment', t.[date], datename(month, [date]) [Month], year([date]) [Year], t.Clinic
FROM #t t
WHERE Admin_codes NOT IN ('A000', 'A999')
Please see demo here.
By "table" I'm assuming you mean query/resultset. In addition to @sacse's answer, you can also accomplish the same with GROUP BY.
DECLARE @Data TABLE ( id INT, Admin_codes VARCHAR(4), [Date] DATE, Clinic VARCHAR(1) );
INSERT INTO @Data ( id, Admin_codes, [Date], Clinic ) VALUES
( 1, 'AAA2', '2019-01-03', 'A' ),( 1, 'D22S', '2019-01-03', 'A' ),( 1, 'FFD3', '2019-01-05', 'B' ),
( 1, 'E222', '2019-01-05', 'B' ),( 5, 'EEE1', '2019-04-03', 'C' ),( 5, 'P332', '2019-04-03', 'C' ),
( 5, 'AA33', '2019-04-03', 'C' ),( 5, 'XC22', '2019-04-03', 'C' ),( 6, 'A000', '2019-02-19', 'C' ),
( 7, 'A999', '2019-03-11', 'C' );
SELECT
id,
1 AS Had_appointment,
[Date],
DATENAME ( month, [Date] ) AS [Month],
YEAR ( [Date] ) AS [Year],
Clinic
FROM @Data
WHERE
Admin_codes NOT IN ( 'A000', 'A999' )
GROUP BY
id, [Date], Clinic
ORDER BY
id, [Date];
Returns
+----+-----------------+------------+---------+------+--------+
| id | Had_appointment | Date | Month | Year | Clinic |
+----+-----------------+------------+---------+------+--------+
| 1 | 1 | 2019-01-03 | January | 2019 | A |
| 1 | 1 | 2019-01-05 | January | 2019 | B |
| 5 | 1 | 2019-04-03 | April | 2019 | C |
+----+-----------------+------------+---------+------+--------+
Using GROUP BY will come in handy in the event you plan to do any additional aggregating (e.g., SUM, AVG, etc.).
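To make the logic of both answers explicit outside of SQL, here is a minimal Python sketch of the same steps: drop the excluded codes, collapse the remaining rows to one per (Id, Date, Clinic), and derive the month name and year. The names `rows`, `EXCLUDED` and `appointments` are made up for this illustration, and the month name assumes the default English locale.

```python
import calendar
from datetime import date

# Sample rows from the question: (Id, Admin_codes, Date, Clinic)
rows = [
    (1, "AAA2", date(2019, 1, 3), "A"),
    (1, "D22S", date(2019, 1, 3), "A"),
    (1, "FFD3", date(2019, 1, 5), "B"),
    (1, "E222", date(2019, 1, 5), "B"),
    (5, "EEE1", date(2019, 4, 3), "C"),
    (5, "P332", date(2019, 4, 3), "C"),
    (5, "AA33", date(2019, 4, 3), "C"),
    (5, "XC22", date(2019, 4, 3), "C"),
    (6, "A000", date(2019, 2, 19), "C"),
    (7, "A999", date(2019, 3, 11), "C"),
]

# Codes that do not count as an appointment (taken from the question)
EXCLUDED = {"A000", "A999"}


def appointments(rows):
    # WHERE Admin_codes NOT IN (...) followed by DISTINCT / GROUP BY id, date, clinic
    keys = sorted({(i, d, c) for i, code, d, c in rows if code not in EXCLUDED})
    # Add the constant flag plus the derived Month and Year columns
    return [(i, 1, d, calendar.month_name[d.month], d.year, c) for i, d, c in keys]


result = appointments(rows)
for line in result:
    print(line)
```

Building the set of keys plays the role of DISTINCT (or GROUP BY); the constant 1 is safe because every surviving row had at least one appointment code.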
I have a table from which I am trying to return the quantity per day that each article was in the system.
For example, in the table Bestand there are multiple pallets of different articles, each with a booking-in and a booking-out date. I am trying to find the Min and Max amount of stock that was in the system per article and month.
My thinking is that I can return the stock quantity for each day and then read out the Min and Max values.
The timespan would be set at the time of running the SQL and the articles would be fixed.
To find out the quantity for each day I have used the following SQL:
SELECT DISTINCT
a.artbez1 AS Artikelbezeichnung,
b.artikelnr AS Artikelnummer,
SUM(CASE WHEN TO_DATE('2019-11-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') BETWEEN b.neu_datum AND b.aender_datum THEN 1 * b.menge_ist ELSE 0 END) AS "01 Nov 2019"
FROM
artikel a, bestand b
WHERE
b.artikelnr IN ('273632002', .... (huge long list of numbers) ....)
AND b.artikelnr = a.artikelnr
GROUP BY
a.artbez1, b.artikelnr;
This returns for example:
ARTIKELBEZEICHNUNG  ARTIKELNUMMER  01 Nov 2019
SC-4400.CW          220450002      39
S-320.FK120         220502004      0
H-595.FK120         220800004      35
AC-548.FK209        220948032      0
AS-6800.CW          221355002      20
I would like to return this for each day of the month and then, from that, return the Min and Max value for each article.
I have the following SQL to return the days of a given month and was wondering if anyone had any ideas on how the two could be combined (if at all possible):
SELECT to_date('01.11.2019','dd.mm.yyyy')+LEVEL-1
FROM dual
CONNECT BY LEVEL <= TO_CHAR(LAST_DAY(to_date('01.11.2019','dd.mm.yyyy')),'DD')
DATES
2019-11-01 00:00:00
2019-11-02 00:00:00
2019-11-03 00:00:00
2019-11-04 00:00:00
2019-11-05 00:00:00
2019-11-06 00:00:00
2019-11-07 00:00:00
The result I am trying to get would be something like:
ARTIKELBEZEICHNUNG  ARTIKELNUMMER  Nov 19 Min  Nov 19 Max
SC-4400.CW          220450002      5           39
S-320.FK120         220502004      0           15
H-595.FK120         220800004      2           35
AC-548.FK209        220948032      0           0
AS-6800.CW          221355002      10          20
Is this at all possible in SQL?
Thanks for taking the time to read my post.
JeRi
You can use a partitioned outer join:
WITH calendar ( day ) AS (
SELECT DATE '2019-11-01'
FROM DUAL
UNION ALL
SELECT day + INTERVAL '1' DAY
FROM calendar
WHERE day < LAST_DAY( DATE '2019-11-01' )
),
daily_totals ( artbez1, Artikelnr, Day, total_menge_ist ) AS (
SELECT MAX( ab.artbez1 ),
ab.artikelnr,
c.day,
COALESCE( SUM( ab.menge_ist ), 0 )
FROM calendar c
LEFT OUTER JOIN
( SELECT a.artikelnr,
a.artbez1,
b.neu_datum,
b.aender_datum,
b.menge_ist
FROM artikel a
LEFT JOIN bestand b
ON ( a.artikelnr = b.artikelnr )
-- WHERE b.artikelnr IN ('273632002', .... (huge long list of numbers) ....)
) ab
PARTITION BY ( ab.artikelnr, ab.artbez1 )
ON ( c.day BETWEEN ab.neu_datum AND ab.aender_datum )
GROUP BY ab.artikelnr, c.day
)
SELECT MAX( artbez1 ) AS Artikelbezeichnung,
artikelnr AS Artikelnummer,
TRUNC( day, 'MM' ) AS month,
MIN( total_menge_ist ) AS min_total_menge_ist,
MAX( total_menge_ist ) AS max_total_menge_ist
FROM daily_totals
GROUP BY artikelnr, TRUNC( day, 'MM' );
Which, for the sample data:
CREATE TABLE artikel ( artikelnr, artbez1 ) AS
SELECT 220450002, 'SC-4400.CW' FROM DUAL UNION ALL
SELECT 220502004, 'S-320.FK120' FROM DUAL UNION ALL
SELECT 220800004, 'H-595.FK120' FROM DUAL UNION ALL
SELECT 220948032, 'AC-548.FK209' FROM DUAL UNION ALL
SELECT 221355002, 'AS-6800.CW' FROM DUAL;
CREATE TABLE bestand ( artikelnr, neu_datum, aender_datum, menge_ist ) AS
SELECT 220450002, DATE '2019-10-30', DATE '2019-11-01', 20 FROM DUAL UNION ALL
SELECT 220450002, DATE '2019-11-01', DATE '2019-11-05', 19 FROM DUAL UNION ALL
SELECT 220502004, DATE '2019-11-05', DATE '2019-11-03', 5 FROM DUAL UNION ALL
SELECT 220800004, DATE '2019-11-01', DATE '2019-11-15', 35 FROM DUAL UNION ALL
SELECT 221355002, DATE '2019-10-20', DATE '2019-11-05', 5 FROM DUAL UNION ALL
SELECT 221355002, DATE '2019-10-25', DATE '2019-11-10', 5 FROM DUAL UNION ALL
SELECT 221355002, DATE '2019-10-28', DATE '2019-11-13', 5 FROM DUAL UNION ALL
SELECT 221355002, DATE '2019-10-30', DATE '2019-11-15', 5 FROM DUAL UNION ALL
SELECT 221355002, DATE '2019-11-05', DATE '2019-11-20', 5 FROM DUAL;
Outputs:
ARTIKELBEZEICHNUNG | ARTIKELNUMMER | MONTH | MIN_TOTAL_MENGE_IST | MAX_TOTAL_MENGE_IST
:----------------- | ------------: | :------------------ | ------------------: | ------------------:
SC-4400.CW | 220450002 | 2019-11-01 00:00:00 | 0 | 39
S-320.FK120 | 220502004 | 2019-11-01 00:00:00 | 0 | 0
AC-548.FK209 | 220948032 | 2019-11-01 00:00:00 | 0 | 0
H-595.FK120 | 220800004 | 2019-11-01 00:00:00 | 0 | 35
AS-6800.CW | 221355002 | 2019-11-01 00:00:00 | 0 | 25
db<>fiddle here
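As a cross-check of what the query computes, here is a small Python sketch of the same idea, using the sample data from this answer: build a row for every day of the month, sum the quantity of every booking whose interval covers that day (0 when nothing covers it), then take the min and max per article. The function names `month_days` and `min_max_stock` are invented for this illustration.

```python
import calendar
from datetime import date, timedelta

# Sample data from the answer: (artikelnr, neu_datum, aender_datum, menge_ist)
bestand = [
    (220450002, date(2019, 10, 30), date(2019, 11, 1), 20),
    (220450002, date(2019, 11, 1), date(2019, 11, 5), 19),
    (220502004, date(2019, 11, 5), date(2019, 11, 3), 5),
    (220800004, date(2019, 11, 1), date(2019, 11, 15), 35),
    (221355002, date(2019, 10, 20), date(2019, 11, 5), 5),
    (221355002, date(2019, 10, 25), date(2019, 11, 10), 5),
    (221355002, date(2019, 10, 28), date(2019, 11, 13), 5),
    (221355002, date(2019, 10, 30), date(2019, 11, 15), 5),
    (221355002, date(2019, 11, 5), date(2019, 11, 20), 5),
]
artikel = [220450002, 220502004, 220800004, 220948032, 221355002]


def month_days(year, month):
    # One entry per calendar day, like the recursive calendar CTE
    n = calendar.monthrange(year, month)[1]
    return [date(year, month, 1) + timedelta(days=k) for k in range(n)]


def min_max_stock(artikel, bestand, year, month):
    out = {}
    for nr in artikel:
        # Daily total: bookings whose interval covers the day, 0 if none do
        totals = [
            sum(q for a, start, end, q in bestand if a == nr and start <= day <= end)
            for day in month_days(year, month)
        ]
        out[nr] = (min(totals), max(totals))
    return out


stock = min_max_stock(artikel, bestand, 2019, 11)
for nr, (lo, hi) in stock.items():
    print(nr, lo, hi)
```

Iterating over every article for every day is what the partitioned outer join guarantees in SQL: articles with no booking on a given day still get a row with a total of 0, which is why the minimum can drop to 0.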
I have the following data (the data is available from 2017 - Present)
SELECT * FROM TABLE1 WHERE DATE > TO_DATE('01/01/2019','MM/DD/YYYY')
Emp_ID Date Vehicle_ID Working_Hours
1005 01/01/2019 X500 7
1005 01/02/2019 X500 6
1005 01/03/2019 X700 7
1005 01/04/2019 X500 5
1005 01/05/2019 X700 7
1005 01/06/2019 X500 7
1006 01/01/2019 X500 7
1006 01/02/2019 X500 6
1006 01/03/2019 X700 7
1006 01/04/2019 X500 5
1006 01/05/2019 X700 7
1006 01/06/2019 X500 7
I need to calculate two columns.
LAST_6M_UNIQ_Vehicle_Count ==> Count of Unique Vehicle ID in the last(past) 6 months for that employee
LAST_6M_Vehicle_Count ==> Count of all vehicle ID for that employee in the Past 6 months
Note: Past 6 month from the date column
Expected output:
Emp_ID Date Vehicle_ID Working_Hours LAST_6M_UNIQ_Vehicle_Count LAST_6M_Vehicle_Count
1005 01/01/2019 X500 7 6 66
1005 01/02/2019 X500 6 7 62
1005 01/03/2019 X700 7 6 63
1005 01/04/2019 X500 5 7 67
1005 01/05/2019 X700 7 7 66
1005 01/06/2019 X500 7 7 67
. . . .
. . . .
. . . .
1005 03/20/2019 X600 6 12 75
1006 01/01/2019 X500 7 11 74
1006 01/02/2019 X500 6 10 66
1006 01/03/2019 X700 7 11 72
1006 01/04/2019 X500 5 13 67
1006 01/05/2019 X700 7 12 64
1006 01/06/2019 X500 7 12 63
For example, in the first row, the value for LAST_6M_UNIQ_Vehicle_Count is 6 because employee 1005 used 6 different vehicle IDs between (01/01/2019 - 6 months) and 01/01/2019.
I tried OVER with PARTITION BY, but the 6-month interval is missing:
SELECT t.*, COUNT(DISTINCT t.VEHICLE_ID) OVER (PARTITION BY t.EMP_ID ORDER BY t.DATE)
AS LAST_6M_UNIQ_Vehicle_Count
FROM TABLE1 t
I am not able to calculate the values based on a 6-month interval for each row.
Your help is much appreciated.
Oracle doesn't like COUNT( DISTINCT ... ) OVER ( ... ) when used in a windowed analytic function with a range and will raise an ORA-30487: ORDER BY not allowed here exception (otherwise, that would be the solution). It will work without the DISTINCT keyword but not with it.
Instead, you can use a correlated sub-query:
SELECT t.*,
( SELECT COUNT( DISTINCT vehicle_id )
FROM table_name c
WHERE c.emp_id = t.emp_id
AND c."DATE" <= t."DATE"
AND ADD_MONTHS( t."DATE", -6 ) <= c."DATE"
) AS last_6m_uniq_vehicle_count,
COUNT(t.vehicle_id) OVER (
PARTITION BY t.emp_id
ORDER BY t."DATE"
RANGE BETWEEN INTERVAL '6' MONTH PRECEDING
AND CURRENT ROW
) AS last_6m_vehicle_count
FROM table_name t
Which for the sample data:
CREATE TABLE table_name ( vehicle_id, emp_id, "DATE" ) AS
SELECT 1, 1, DATE '2020-08-31' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-07-31' FROM DUAL UNION ALL
SELECT 1, 1, DATE '2020-06-30' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-05-31' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-04-30' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-03-31' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-02-29' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-01-31' FROM DUAL UNION ALL
SELECT 3, 1, DATE '2020-01-31' FROM DUAL;
Outputs:
VEHICLE_ID | EMP_ID | DATE | LAST_6M_UNIQ_VEHICLE_COUNT | LAST_6M_VEHICLE_COUNT
---------: | -----: | :-------- | -------------------------: | --------------------:
2 | 1 | 31-JAN-20 | 2 | 2
3 | 1 | 31-JAN-20 | 2 | 2
2 | 1 | 29-FEB-20 | 2 | 3
2 | 1 | 31-MAR-20 | 2 | 4
2 | 1 | 30-APR-20 | 2 | 5
2 | 1 | 31-MAY-20 | 2 | 6
1 | 1 | 30-JUN-20 | 3 | 7
2 | 1 | 31-JUL-20 | 3 | 8
1 | 1 | 31-AUG-20 | 2 | 7
db<>fiddle here
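To see why the counts come out this way, here is a Python sketch of the correlated sub-query logic on the same sample data. For simplicity both columns are computed with the same inclusive window (the sub-query's bounds), and `add_months` is a hand-written helper that only approximates Oracle's ADD_MONTHS (a month-end date maps to the month end of the target month); both function names are assumptions made for this illustration.

```python
import calendar
from datetime import date

# Sample data from the answer: (vehicle_id, emp_id, date)
rows = [
    (1, 1, date(2020, 8, 31)), (2, 1, date(2020, 7, 31)), (1, 1, date(2020, 6, 30)),
    (2, 1, date(2020, 5, 31)), (2, 1, date(2020, 4, 30)), (2, 1, date(2020, 3, 31)),
    (2, 1, date(2020, 2, 29)), (2, 1, date(2020, 1, 31)), (3, 1, date(2020, 1, 31)),
]


def add_months(d, n):
    # Approximation of Oracle's ADD_MONTHS: clamp the day to the target month,
    # and send a last-day-of-month input to the last day of the target month.
    year, month0 = divmod(d.year * 12 + (d.month - 1) + n, 12)
    month = month0 + 1
    last = calendar.monthrange(year, month)[1]
    src_last = calendar.monthrange(d.year, d.month)[1]
    day = last if d.day == src_last else min(d.day, last)
    return date(year, month, day)


def rolling_counts(rows):
    out = {}
    for vehicle, emp, d in rows:
        lower = add_months(d, -6)
        # Same employee, date within [d - 6 months, d], both ends inclusive
        window = [v for v, e, cd in rows if e == emp and lower <= cd <= d]
        out[(vehicle, d)] = (len(set(window)), len(window))
    return out


counts = rolling_counts(rows)
for (vehicle, d), (uniq, total) in sorted(counts.items(), key=lambda kv: (kv[0][1], kv[0][0])):
    print(vehicle, d, uniq, total)
```

This scans all rows for every row, just like the correlated sub-query; on a large table an index on (emp_id, "DATE") would be the natural way to keep the SQL version fast.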
You can do this with window functions, and a range frame specification.
Computing the distinct count is a bit tricky: Oracle does not support it directly, but we can proceed in two steps. First perform a window count within employee/vehicle partitions, and then take into account only the first occurrence of each vehicle in the employee partition.
So:
select vehicle_id, emp_id, "DATE",
sum(case when flag = 1 then 1 else 0 end) over(
partition by emp_id
order by "DATE"
range between interval '6' month preceding and current row
) as last_6m_uniq_vehicle_count,
count(*) over (
partition by emp_id
order by "DATE"
range between interval '6' month preceding and current row
) as last_6m_vehicle_count
from (
select t.*,
count(*) over (
partition by emp_id , vehicle_id
order by "DATE"
range between interval '6' month preceding and current row
) as flag
from table_name t
) t
order by "DATE", vehicle_id
As MTO points out, count(distinct) cannot be used as a window function to solve this.
For that reason, I would go for a lateral join:
select t.*, l.*
from t cross join lateral
(select count(*) as last_6m_vehicle_count, count(distinct t2.vehicle_id) as last_6m_uniq_vehicle_count
from t t2
where t2.emp_id = t.emp_id and
t2.dte <= t.dte and
t2.dte > add_months(t.dte, -6)
) l;
Here is a db<>fiddle.
I have a table with 200,000 rows in a SQL Server 2014 database looking like this:
CREATE TABLE DateRanges
(
Contract VARCHAR(8),
Sector VARCHAR(8),
StartDate DATE,
EndDate DATE
);
INSERT INTO DateRanges (Contract, Sector, StartDate, Enddate)
SELECT '111', '999', '01-01-2014', '03-31-2014'
union
SELECT '111', '999', '04-01-2014', '06-30-2014'
union
SELECT '111', '999', '07-01-2014', '09-30-2014'
union
SELECT '111', '999', '10-01-2014', '12-31-2014'
union
SELECT '111', '888', '08-01-2014', '08-31-2014'
union
SELECT '111', '777', '08-15-2014', '08-31-2014'
union
SELECT '222', '999', '01-01-2014', '03-31-2014'
union
SELECT '222', '999', '04-01-2014', '06-30-2014'
union
SELECT '222', '999', '07-01-2014', '09-30-2014'
union
SELECT '222', '999', '10-01-2014', '12-31-2014'
union
SELECT '222', '666', '11-01-2014', '11-30-2014'
UNION
SELECT '222', '555', '11-15-2014', '11-30-2014';
As you can see, there can be multiple overlaps for each contract, and what I would like to have is a result like this:
Contract Sector StartDate EndDate
---------------------------------------------
111 999 01-01-2014 07-31-2014
111 888 08-01-2014 08-14-2014
111 777 08-15-2014 08-31-2014
111 999 09-01-2014 12-31-2014
222 999 01-01-2014 10-31-2014
222 666 11-01-2014 11-14-2014
222 555 11-15-2014 11-30-2014
222 999 12-01-2014 12-31-2014
I cannot figure out how this can be done, and the examples I have seen on this site do not quite fit my problem.
This answer makes use of a few different techniques. The first is a recursive-cte that creates a table with every relevant cal_date which then gets cross apply'd with unique Contract values to get every combination of both values. The second is window-functions such as lag and row_number to determine a variety of things detailed in the comments below. Lastly, and probably most importantly, gaps-and-islands to determine when one Contract/Sector combination ends and the next begins.
Answer:
--determine range of dates
declare @bgn_dt date = (select min(StartDate) from DateRanges)
, @end_dt date = (select max(EndDate) from DateRanges)
--use a recursive CTE to create a record for each day / Contract
; with dates as
(
select @bgn_dt as cal_date
union all
select dateadd(d, 1, a.cal_date) as cal_date
from dates as a
where a.cal_date < @end_dt
)
)
select d.cal_date
, c.Contract
into #contract_dates
from dates as d
cross apply (select distinct Contract from DateRanges) as c
option (maxrecursion 0)
--Final Select
select f.Contract
, f.Sector
, min(f.cal_date) as StartDate
, max(f.cal_date) as EndDate
from (
--Use the sum-over to obtain the Island Numbers
select dr.Contract
, dr.Sector
, dr.cal_date
, sum(dr.IslandBegin) over (partition by dr.Contract order by dr.cal_date asc) as IslandNbr
from (
--Determine if the record is the start of a new Island
select a.Contract
, a.Sector
, a.cal_date
, case when lag(a.Sector, 1, NULL) over (partition by a.Contract order by a.cal_date asc) = a.Sector then 0 else 1 end as IslandBegin
from (
--Determine which Contract/Date combinations are valid, and rank the Sectors that are in effect
select cd.cal_date
, dr.Contract
, dr.Sector
, dr.EndDate
, row_number() over (partition by dr.Contract, cd.cal_date order by dr.StartDate desc) as ContractSectorRnk
from #contract_dates as cd
left join DateRanges as dr on cd.Contract = dr.Contract
and cd.cal_date between dr.StartDate and dr.EndDate
) as a
where a.ContractSectorRnk = 1
and a.Contract is not null
) as dr
) as f
group by f.Contract
, f.Sector
, f.IslandNbr
order by f.Contract asc
, min(f.cal_date) asc
Output:
+----------+--------+------------+------------+
| Contract | Sector | StartDate | EndDate |
+----------+--------+------------+------------+
| 111 | 999 | 2014-01-01 | 2014-07-31 |
| 111 | 888 | 2014-08-01 | 2014-08-14 |
| 111 | 777 | 2014-08-15 | 2014-08-31 |
| 111 | 999 | 2014-09-01 | 2014-12-31 |
| 222 | 999 | 2014-01-01 | 2014-10-31 |
| 222 | 666 | 2014-11-01 | 2014-11-14 |
| 222 | 555 | 2014-11-15 | 2014-11-30 |
| 222 | 999 | 2014-12-01 | 2014-12-31 |
+----------+--------+------------+------------+
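The three steps (expand to days, pick the winning sector per day, merge consecutive days into islands) can also be sketched in Python, which may make the gaps-and-islands part easier to follow. This is only an illustration on the question's sample data; `resolve` is a made-up name, and the "latest StartDate wins" rule mirrors the row_number() ordering in the query.

```python
from datetime import date, timedelta
from itertools import groupby

# Sample data from the question: (Contract, Sector, StartDate, EndDate)
ranges = [
    ("111", "999", date(2014, 1, 1), date(2014, 3, 31)),
    ("111", "999", date(2014, 4, 1), date(2014, 6, 30)),
    ("111", "999", date(2014, 7, 1), date(2014, 9, 30)),
    ("111", "999", date(2014, 10, 1), date(2014, 12, 31)),
    ("111", "888", date(2014, 8, 1), date(2014, 8, 31)),
    ("111", "777", date(2014, 8, 15), date(2014, 8, 31)),
    ("222", "999", date(2014, 1, 1), date(2014, 3, 31)),
    ("222", "999", date(2014, 4, 1), date(2014, 6, 30)),
    ("222", "999", date(2014, 7, 1), date(2014, 9, 30)),
    ("222", "999", date(2014, 10, 1), date(2014, 12, 31)),
    ("222", "666", date(2014, 11, 1), date(2014, 11, 30)),
    ("222", "555", date(2014, 11, 15), date(2014, 11, 30)),
]


def resolve(ranges):
    result = []
    for contract in sorted({r[0] for r in ranges}):
        own = [r for r in ranges if r[0] == contract]
        day = min(r[2] for r in own)
        last = max(r[3] for r in own)
        daily = []  # (sector, day) for every covered day, in date order
        while day <= last:
            covering = [r for r in own if r[2] <= day <= r[3]]
            if covering:
                # The range with the latest StartDate wins (rank = 1 in the SQL)
                winner = max(covering, key=lambda r: r[2])
                daily.append((winner[1], day))
            day += timedelta(days=1)
        # Islands: runs of consecutive days that share the same sector
        for sector, grp in groupby(daily, key=lambda x: x[0]):
            days = [d for _, d in grp]
            result.append((contract, sector, days[0], days[-1]))
    return result


resolved = resolve(ranges)
for line in resolved:
    print(line)
```

`itertools.groupby` only merges adjacent equal keys, which is exactly the island behaviour: sector 999 shows up twice per contract because another sector interrupts it.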
I am trying to write a SQL query.
I have a very simple table with two fields:
CREATE TABLE [CHECKINOUT](
[USERID] [int] NOT NULL,
[CHECKTIME] [datetime] NOT NULL DEFAULT (getdate())
);
GO
USERID CHECKTIME
1 2014-11-04 08:24:49.000
1 2014-11-03 16:57:00.000
1 2014-11-03 08:15:54.000
1 2014-10-28 12:57:58.000
1 2014-10-28 08:22:46.000
1 2014-10-24 16:58:33.000
1 2014-10-24 12:53:06.000
1 2014-10-24 08:21:38.000
1 2014-10-22 16:19:55.000
1 2014-10-21 08:26:21.000
The sample data is shown above.
I want to write a simple query over it using PIVOT.
I wrote the pivot query below, but the values it returns are all NULL:
SELECT [USERID],[MORN_IN],[MORN_OUT],[NOON_IN],[NOON_OUT] FROM
(
SELECT [USERID], convert(NVARCHAR, ([CHECKTIME]), 104) as DATE_TIME FROM [CHECKINOUT]
) AS IN_OUT
PIVOT
(
MAX(DATE_TIME) --TO DATE
FOR DATE_TIME -- MY ROW COLUMN
IN
(
[MORN_IN],[MORN_OUT],[NOON_IN],[NOON_OUT] -- MY ROW COLUMN
)
) AS PIVOT_TABLE
The incorrect query results:
USERID MORN_IN MORN_OUT NOON_IN NOON_OUT
1 NULL NULL NULL NULL
2 NULL NULL NULL NULL
3 NULL NULL NULL NULL
4 NULL NULL NULL NULL
5 NULL NULL NULL NULL
6 NULL NULL NULL NULL
7 NULL NULL NULL NULL
What I want to do:
split the check-ins of the same user on the same day into time slots,
for example:
00:00-11:00 =>MORN_IN
11:00-13:00 =>MORN_OUT(first record ONLY MIN(11:00-13:00))
12:00-15:00 =>NOON_IN (second record max(12:00-13:00) NOON_IN > MORN_OUT)
15:00-00:00 =>NOON_OUT
SELECT TOP 3 [USERID]
,[CHECKTIME]
FROM [CHECKINOUT] ORDER BY [USERID],[CHECKTIME] DESC
USERID CHECKTIME
1 2014-10-24 16:58:33.000
1 2014-10-24 12:53:06.000
1 2014-10-24 08:21:38.000
Now I want to turn this into the pivoted result (this is the part I cannot do, but it should return results like this):
USERID MORN_IN MORN_OUT NOON_IN NOON_OUT
1 2014-10-24 08:21:38.000 2014-10-24 12:53:06.000 NULL 2014-10-24 16:58:33.000
If time interval 13:00 - 16:30 is considered to be NOON_IN, then the following query:
SELECT DAY_DIVISION, [MORN_IN], [MORN_OUT], [NOON_IN], [NOON_OUT]
FROM
(SELECT CHECKTIME, CASE
WHEN CAST(CHECKTIME as time) >= '00:00:00' AND CAST(CHECKTIME as time) < '11:00:00' THEN 'MORN_IN'
WHEN CAST(CHECKTIME as time) >= '11:00:00' AND CAST(CHECKTIME as time) < '13:00:00' THEN 'MORN_OUT'
WHEN CAST(CHECKTIME as time) >= '13:00:00' AND CAST(CHECKTIME as time) < '16:30:00' THEN 'NOON_IN'
WHEN CAST(CHECKTIME as time) >= '16:30:00' THEN 'NOON_OUT'
END AS TIME_DIVISION,
RANK() OVER ( ORDER BY CAST(CHECKTIME as date) ASC) AS DAY_DIVISION
FROM CHECKINOUT) AS SourceTable
PIVOT
(
MAX(CHECKTIME)
FOR TIME_DIVISION IN ([MORN_IN], [MORN_OUT], [NOON_IN], [NOON_OUT])
) AS PivotTable;
yields this output:
DAY_DIVISION MORN_IN MORN_OUT NOON_IN NOON_OUT
------------------------------------------------------------------------------------
1 2014-10-21 08:26:21.000 NULL NULL NULL
2 NULL NULL 2014-10-22 16:19:55.000 NULL
3 2014-10-24 08:21:38.000 2014-10-24 12:54:06.000 NULL 2014-10-24 16:58:33.000
7 2014-10-28 08:22:46.000 2014-10-28 12:57:58.000 NULL NULL
9 2014-11-03 08:15:54.000 NULL NULL 2014-11-03 16:57:00.000
11 2014-11-04 08:24:49.000 NULL NULL NULL
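The same classify-then-pivot idea can be sketched in Python to make the two steps visible. This is an illustration under the same assumption as above (13:00 to 16:30 counts as NOON_IN); unlike the query, it keys the result by user and calendar day rather than by a RANK() value, and the names `slot` and `pivot` are invented here.

```python
from datetime import date, datetime, time

SLOTS = ("MORN_IN", "MORN_OUT", "NOON_IN", "NOON_OUT")

# Sample data from the question: (USERID, CHECKTIME)
rows = [(1, datetime.fromisoformat(s)) for s in (
    "2014-11-04 08:24:49", "2014-11-03 16:57:00", "2014-11-03 08:15:54",
    "2014-10-28 12:57:58", "2014-10-28 08:22:46", "2014-10-24 16:58:33",
    "2014-10-24 12:53:06", "2014-10-24 08:21:38", "2014-10-22 16:19:55",
    "2014-10-21 08:26:21",
)]


def slot(ts):
    # Mirrors the CASE expression: half-open time intervals per slot
    t = ts.time()
    if t < time(11, 0):
        return "MORN_IN"
    if t < time(13, 0):
        return "MORN_OUT"
    if t < time(16, 30):
        return "NOON_IN"
    return "NOON_OUT"


def pivot(rows):
    table = {}
    for user, ts in rows:
        # One output row per (user, day), one column per slot
        row = table.setdefault((user, ts.date()), dict.fromkeys(SLOTS))
        s = slot(ts)
        # MAX(CHECKTIME) per slot, as in the PIVOT aggregate
        if row[s] is None or ts > row[s]:
            row[s] = ts
    return table


pivoted = pivot(rows)
for key in sorted(pivoted):
    print(key, pivoted[key])
```

The asker's original query returned only NULLs because it pivoted on the formatted date string itself, which never equals a slot name; the slot label has to be derived first, as `slot` does here.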
Taking care of logins between 12:00 and 13:00 and multiple userIDs. (Tested on Oracle 11.2)
WITH
CheckInOutRaw(userID, checkTime) AS(
SELECT 1, '2014-11-04 08:24:49.000' FROM DUAL UNION ALL
SELECT 1, '2014-11-03 16:57:00.000' FROM DUAL UNION ALL
SELECT 1, '2014-11-03 08:15:54.000' FROM DUAL UNION ALL
SELECT 1, '2014-10-28 12:57:58.000' FROM DUAL UNION ALL
SELECT 1, '2014-10-28 08:22:46.000' FROM DUAL UNION ALL
SELECT 1, '2014-10-24 16:58:33.000' FROM DUAL UNION ALL
SELECT 1, '2014-10-24 12:53:06.000' FROM DUAL UNION ALL
SELECT 1, '2014-10-24 08:21:38.000' FROM DUAL UNION ALL
SELECT 1, '2014-10-22 16:19:55.000' FROM DUAL UNION ALL
SELECT 1, '2014-10-21 08:26:21.000' FROM DUAL UNION ALL
SELECT 2, '2014-11-04 08:24:49.000' FROM DUAL UNION ALL
SELECT 2, '2014-11-03 16:57:00.000' FROM DUAL UNION ALL
SELECT 2, '2014-11-03 08:15:54.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-29 11:07:58.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-29 12:07:58.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-29 16:57:58.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-28 11:07:58.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-28 12:07:58.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-28 16:57:58.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-28 08:22:46.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-27 12:57:58.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-27 12:07:58.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-27 16:57:58.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-27 08:22:46.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-24 16:58:33.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-24 12:53:06.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-24 08:21:38.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-22 13:19:55.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-22 16:19:55.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-21 08:26:21.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-20 12:19:55.000' FROM DUAL UNION ALL
SELECT 2, '2014-10-20 16:19:55.000' FROM DUAL
),
CheckInOutTemp AS (
SELECT
userID
, TO_TIMESTAMP(checkTime, 'YYYY-MM-DD HH24:MI:SSXFF3') checkTime
FROM
CheckInOutRaw
),
CheckInOut AS (
SELECT
userID
, checkTime
, TRUNC(checkTime) dt
, EXTRACT(HOUR FROM checkTime) h
, COUNT(CASE EXTRACT(HOUR FROM checkTime) WHEN 11 THEN 1 ELSE NULL END) OVER (PARTITION BY TRUNC(checkTime), userID) h11
, COUNT(CASE EXTRACT(HOUR FROM checkTime) WHEN 12 THEN 1 ELSE NULL END) OVER (PARTITION BY TRUNC(checkTime), userID) h12
FROM
CheckInOutTemp
),
S AS (
SELECT
userID
, checkTime
, dt
, CASE
WHEN h < 11 THEN 'MORN_IN'
WHEN h = 11 THEN 'MORN_OUT'
WHEN h11 = 1 AND h = 12 THEN 'NOON_IN'
WHEN h = 12 THEN 'MORN_OUT'
WHEN h < 16 THEN 'NOON_IN'
ELSE 'NOON_OUT'
END slot
FROM CheckInOut
WHERE NOT ((h = 12) AND (h12 = 2))
UNION ALL
SELECT
userID
, MIN(checkTime) checkTime
, dt
, 'MORN_OUT' slot
FROM CheckInOut
WHERE ((h = 12) AND (h12 = 2))
GROUP BY userID, dt
UNION ALL
SELECT
userID
, MAX(checkTime) checkTime
, dt
, 'NOON_IN' slot
FROM CheckInOut
WHERE ((h = 12) AND (h12 = 2))
GROUP BY userID, dt
)
SELECT
userID, TO_CHAR(dt, 'YYYY-MM-DD') dt, TO_CHAR(MORN_IN, 'HH24:MI:SS') morn_in, TO_CHAR(MORN_OUT, 'HH24:MI:SS') morn_out, TO_CHAR(NOON_IN, 'HH24:MI:SS') noon_in, TO_CHAR(NOON_OUT, 'HH24:MI:SS') noon_out
FROM (SELECT * FROM S PIVOT(MAX(checkTime) FOR slot IN ('MORN_IN' morn_in, 'MORN_OUT' morn_out, 'NOON_IN' noon_in, 'NOON_OUT' noon_out))) ORDER BY userID, DT DESC
;
Returns:
| USERID | DT | MORN_IN | MORN_OUT | NOON_IN | NOON_OUT |
|--------|------------|----------|----------|----------|----------|
| 1 | 2014-11-04 | 08:24:49 | (null) | (null) | (null) |
| 1 | 2014-11-03 | 08:15:54 | (null) | (null) | 16:57:00 |
| 1 | 2014-10-28 | 08:22:46 | 12:57:58 | (null) | (null) |
| 1 | 2014-10-24 | 08:21:38 | 12:53:06 | (null) | 16:58:33 |
| 1 | 2014-10-22 | (null) | (null) | (null) | 16:19:55 |
| 1 | 2014-10-21 | 08:26:21 | (null) | (null) | (null) |
| 2 | 2014-11-04 | 08:24:49 | (null) | (null) | (null) |
| 2 | 2014-11-03 | 08:15:54 | (null) | (null) | 16:57:00 |
| 2 | 2014-10-29 | (null) | 11:07:58 | 12:07:58 | 16:57:58 |
| 2 | 2014-10-28 | 08:22:46 | 11:07:58 | 12:07:58 | 16:57:58 |
| 2 | 2014-10-27 | 08:22:46 | 12:07:58 | 12:57:58 | 16:57:58 |
| 2 | 2014-10-24 | 08:21:38 | 12:53:06 | (null) | 16:58:33 |
| 2 | 2014-10-22 | (null) | (null) | 13:19:55 | 16:19:55 |
| 2 | 2014-10-21 | 08:26:21 | (null) | (null) | (null) |
| 2 | 2014-10-20 | (null) | 12:19:55 | (null) | 16:19:55 |
The row
| 2 | 2014-10-20 | (null) | 12:19:55 | (null) | 16:19:55 |
probably does not make a lot of sense, but it seems to be in line with the specification.
SQL Fiddle
I have a query, which returns the following, EXCEPT for the last column, which is what I need to figure out how to create. For each given ObservationID I need to return the date on which the status changes; something like a LEAD() function that would take conditions and not just offsets. Can it be done?
I need to calculate the column Change Date; it should be the next date on which the status differs from the current status.
+---------------+--------+-----------+--------+-------------+
| ObservationID | Region | Date | Status | Change Date | <-This field
+---------------+--------+-----------+--------+-------------+
| 1 | 10 | 1/3/2012 | Ice | 1/4/2012 |
| 2 | 10 | 1/4/2012 | Water | 1/6/2012 |
| 3 | 10 | 1/5/2012 | Water | 1/6/2012 |
| 4 | 10 | 1/6/2012 | Gas | 1/7/2012 |
| 5 | 10 | 1/7/2012 | Ice | |
| 6 | 20 | 2/6/2012 | Water | 2/10/2012 |
| 7 | 20 | 2/7/2012 | Water | 2/10/2012 |
| 8 | 20 | 2/8/2012 | Water | 2/10/2012 |
| 9 | 20 | 2/9/2012 | Water | 2/10/2012 |
| 10 | 20 | 2/10/2012 | Ice | |
+---------------+--------+-----------+--------+-------------+
A MODEL clause (10g+) can do this in a compact way:
SQL> create table observation(ObservationID , Region ,obs_date, Status)
2 as
3 select 1, 10, date '2012-03-01', 'Ice' from dual union all
4 select 2, 10, date '2012-04-01', 'Water' from dual union all
5 select 3, 10, date '2012-05-01', 'Water' from dual union all
6 select 4, 10, date '2012-06-01', 'Gas' from dual union all
7 select 5, 10, date '2012-07-01', 'Ice' from dual union all
8 select 6, 20, date '2012-06-02', 'Water' from dual union all
9 select 7, 20, date '2012-07-02', 'Water' from dual union all
10 select 8, 20, date '2012-08-02', 'Water' from dual union all
11 select 9, 20, date '2012-09-02', 'Water' from dual union all
12 select 10, 20, date '2012-10-02', 'Ice' from dual ;
Table created.
SQL> select ObservationID, obs_date, Status, status_change
2 from observation
3 model
4 dimension by (Region, obs_date, Status)
5 measures ( ObservationID, obs_date obs_date2, cast(null as date) status_change)
6 rules (
7 status_change[any,any,any] = min(obs_date2)[cv(Region), obs_date > cv(obs_date), status != cv(status)]
8 )
9 order by 1;
OBSERVATIONID OBS_DATE STATU STATUS_CH
------------- --------- ----- ---------
1 01-MAR-12 Ice 01-APR-12
2 01-APR-12 Water 01-JUN-12
3 01-MAY-12 Water 01-JUN-12
4 01-JUN-12 Gas 01-JUL-12
5 01-JUL-12 Ice
6 02-JUN-12 Water 02-OCT-12
7 02-JUL-12 Water 02-OCT-12
8 02-AUG-12 Water 02-OCT-12
9 02-SEP-12 Water 02-OCT-12
10 02-OCT-12 Ice
fiddle: http://sqlfiddle.com/#!4/f6687/1
I.e., we dimension on region, date and status, as we want to look at cells with the same region but get the first date on which the status differs.
We also have to measure the date, so I created an alias obs_date2 to do that, and we want a new column status_change to hold the date the status changed.
This line is the one that does all the working out for us:
status_change[any,any,any] = min(obs_date2)[cv(Region), obs_date > cv(obs_date), status != cv(status)]
It says: for our three dimensions, only look at the rows with the same region (cv(Region)), where the date follows the date of the current row (obs_date > cv(obs_date)) and where the status is different from the current row (status != cv(status)); finally, get the minimum date that satisfies this set of conditions (min(obs_date2)) and assign it to status_change. The any,any,any part on the left means this calculation applies to all rows.
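If the MODEL syntax is hard to read, the rule boils down to a per-row search, which this Python sketch spells out on the same test data: for each observation, take the earliest later date in the same region whose status differs. The function name `status_change` is just a label chosen for this illustration.

```python
from datetime import date

# Test data from the answer: (ObservationID, Region, obs_date, Status)
observations = [
    (1, 10, date(2012, 3, 1), "Ice"),
    (2, 10, date(2012, 4, 1), "Water"),
    (3, 10, date(2012, 5, 1), "Water"),
    (4, 10, date(2012, 6, 1), "Gas"),
    (5, 10, date(2012, 7, 1), "Ice"),
    (6, 20, date(2012, 6, 2), "Water"),
    (7, 20, date(2012, 7, 2), "Water"),
    (8, 20, date(2012, 8, 2), "Water"),
    (9, 20, date(2012, 9, 2), "Water"),
    (10, 20, date(2012, 10, 2), "Ice"),
]


def status_change(observations):
    out = {}
    for oid, region, d, status in observations:
        # Same region, strictly later date, different status
        later = [
            d2 for _, r2, d2, s2 in observations
            if r2 == region and d2 > d and s2 != status
        ]
        # min(obs_date2) over the matching cells, or no value if none match
        out[oid] = min(later) if later else None
    return out


changes = status_change(observations)
for oid in sorted(changes):
    print(oid, changes[oid])
```

Each of the three conditions in the list comprehension corresponds to one dimension of the rule's right-hand side.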
I've tried many times to understand the MODEL clause and never really quite managed it, so I thought I would add another solution.
This solution takes some of what Ronnis has done but instead uses the IGNORE NULLS clause of the LEAD function. I think this is new in Oracle 11, but you could probably replace it with the FIRST_VALUE function for Oracle 10 if necessary.
select
observation_id,
region,
observation_date,
status,
lead(case when is_change = 'Y' then observation_date end) ignore nulls
over (partition by region order by observation_date) as change_observation_date
from (
select
a.observation_id,
a.region,
a.observation_date,
a.status,
case
when status = lag(status) over (partition by region order by observation_date)
then null
else 'Y' end as is_change
from observations a
)
order by 1
I frequently do this when cleaning up overlapping from/to-dates and duplicate rows.
Your case is much simpler though, since you only have the "from-date" :)
Setting up the test data
create table observations(
observation_id number not null
,region number not null
,observation_date date not null
,status varchar2(10) not null
);
insert
into observations(observation_id, region, observation_date, status)
select 1, 10, date '2012-03-01', 'Ice' from dual union all
select 2, 10, date '2012-04-01', 'Water' from dual union all
select 3, 10, date '2012-05-01', 'Water' from dual union all
select 4, 10, date '2012-06-01', 'Gas' from dual union all
select 5, 10, date '2012-07-01', 'Ice' from dual union all
select 6, 20, date '2012-06-02', 'Water' from dual union all
select 7, 20, date '2012-07-02', 'Water' from dual union all
select 8, 20, date '2012-08-02', 'Water' from dual union all
select 9, 20, date '2012-09-02', 'Water' from dual union all
select 10, 20, date '2012-10-02', 'Ice' from dual;
commit;
The query below has three points of interest:
1. Identifying repeated information (the recording shows the same as the previous recording)
2. Ignoring the repeated recordings
3. Determining the date from the "next" change
with lagged as(
select a.*
,case when status = lag(status, 1) over(partition by region
order by observation_date)
then null
else rownum
end as change_flag -- 1
from observations a
)
select observation_id
,region
,observation_date
,status
,lead(observation_date, 1) over(
partition by region
order by observation_date
) as change_date --3
,lead(observation_date, 1, sysdate) over(
partition by region
order by observation_date
) - observation_date as duration
from lagged
where change_flag is not null -- 2
;
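For readers who prefer to see the lag/filter/lead sequence step by step, here is a rough Python equivalent on the same test data. It keeps only the rows where the status differs from the previous row in the region, then pairs each kept row with the next kept row. The duration column is left out because it depends on the current date, and `change_rows` is a name made up for this sketch.

```python
from datetime import date
from itertools import groupby

# Test data from above: (observation_id, region, observation_date, status)
observations = [
    (1, 10, date(2012, 3, 1), "Ice"),
    (2, 10, date(2012, 4, 1), "Water"),
    (3, 10, date(2012, 5, 1), "Water"),
    (4, 10, date(2012, 6, 1), "Gas"),
    (5, 10, date(2012, 7, 1), "Ice"),
    (6, 20, date(2012, 6, 2), "Water"),
    (7, 20, date(2012, 7, 2), "Water"),
    (8, 20, date(2012, 8, 2), "Water"),
    (9, 20, date(2012, 9, 2), "Water"),
    (10, 20, date(2012, 10, 2), "Ice"),
]


def change_rows(observations):
    result = []
    ordered = sorted(observations, key=lambda o: (o[1], o[2]))
    for region, grp in groupby(ordered, key=lambda o: o[1]):
        kept, prev = [], None
        for o in grp:
            # Steps 1 and 2: keep a row only if its status differs from the previous one (lag)
            if o[3] != prev:
                kept.append(o)
            prev = o[3]
        # Step 3: the change date is the date of the next kept row (lead)
        for cur, nxt in zip(kept, kept[1:] + [None]):
            result.append((cur[0], region, cur[2], cur[3], nxt[2] if nxt else None))
    return result


collapsed = change_rows(observations)
for line in collapsed:
    print(line)
```

Note that, like the query, this returns one row per run of identical statuses rather than one row per observation.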