Calculate duration between Phase - sql

I have the following history table, which records user actions:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| parent_id | property_names | changed_property | time_c | outcome |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 123456 | {PhaseId,LastUpdateTime} | {"PhaseId":{"newValue":"Fulfill","oldValue":"Approve"},"LastUpdateTime":{"newValue":1671027321749,"oldValue":1671027321170}} | 1671027321749 | success |
| 123456 | {PhaseId,LastUpdateTime,ApprovalStatus} | {"PhaseId":{"newValue":"Approve","oldValue":"Log"},"LastUpdateTime":{"newValue":1671011168777,"oldValue":1671011168043},"ApprovalStatus":{"newValue":"InProgress"}} | 1671011168777 | success |
| 123456 | {LastUpdateTime,PhaseId,Urgency} | {"LastUpdateTime":{"newValue":1671011166077},"PhaseId":{"newValue":"Log"},"Urgency":{"newValue":"TotalLossOfService"}} | 1671011166077 | success |
| 123456 | {LastUpdateTime,ApprovalStatus} | {"LastUpdateTime":{"newValue":1671027321170,"oldValue":1671027320641},"ApprovalStatus":{"newValue":"Approved","oldValue":"InProgress"}} | 1671027321170 | success |
| 123456 | {PhaseId,LastUpdateTime,ExecutionEnd_c} | {"PhaseId":{"newValue":"Accept","oldValue":"Fulfill"},"LastUpdateTime":{"newValue":1671099802675,"oldValue":1671099801501},"ExecutionEnd_c":{"newValue":1671099802374}} | 1671099802675 | success |
| 123456 | {PhaseId,LastUpdateTime,CompletionCode} | {"PhaseId":{"newValue":"Review","oldValue":"Accept"},"LastUpdateTime":{"newValue":1671099984979,"oldValue":1671099982723},"CompletionCode":{"oldValue":"CompletionCodeAbandonedByUser"}} | 1671099984979 | success |
| 123456 | {PhaseId,LastUpdateTime,ExecutionStart_c} | {"PhaseId":{"newValue":"Fulfill","oldValue":"Review"},"LastUpdateTime":{"newValue":1671100012012,"oldValue":1671099984979},"ExecutionStart_c":{"newValue":1671100011728,"oldValue":1671027321541}} | 1671100012012 | success |
| 123456 | {UserAction,PhaseId,LastUpdateTime,ExecutionEnd_c} | {"UserAction":{"oldValue":"UserActionReject"},"PhaseId":{"newValue":"Accept","oldValue":"Fulfill"},"LastUpdateTime":{"newValue":1671100537178,"oldValue":1671100535959},"ExecutionEnd_c":{"newValue":1671100536730,"oldValue":1671099802374}} | 1671100537178 | success |
| 123456 | {PhaseId,Active,CloseTime,LastUpdateTime,LastActiveTime,ClosedByPerson} | {"PhaseId":{"newValue":"Close","oldValue":"Accept"},"Active":{"newValue":false,"oldValue":true},"CloseTime":{"newValue":1671101084529},"LastUpdateTime":{"newValue":1671101084788,"oldValue":1671101083903},"LastActiveTime":{"newValue":1671101084529},"ClosedByPerson":{"newValue":"511286"}} | 1671101084788 | success |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Description of the columns:
parent_id : link to the parent element
property_names : names of the properties that were modified
changed_property : old and new values of the modified properties, for example:
{
"PhaseId":{
"newValue":"Fulfill",
"oldValue":"Approve"
},
"LastUpdateTime":{
"newValue":1671027321749,
"oldValue":1671027321170
}
}
Here the property PhaseId changes its value from Approve to Fulfill.
time_c : Unix Timestamp of the update
outcome : Status of the update
My goal is to calculate the duration of each phase.
Expected output :
------------------------------------------------------------
| parent_id | Log | Approve | Fulfill | Accept | Review |
------------------------------------------------------------
| 123456 | 2700 | 16152972 | 73006092 | 729914 | 27033 |
------------------------------------------------------------
Log : 1671011168777 - 1671011166077 = 2700
Approve : 1671027321749 - 1671011168777 = 16152972
Fulfill : (1671100537178 - 1671100012012) + (1671099802675 - 1671027321749) = 73006092
Accept : (1671101084788 - 1671100537178) + (1671099984979 - 1671099802675) = 729914
Review : 1671100012012 - 1671099984979 = 27033
At the moment, I'm able to retrieve the new and old values of PhaseId and convert the Unix timestamp to a datetime.
My issue is how to calculate the duration of each phase using SQL.
My current SQL query:
SELECT * FROM
(SELECT
parent_id,
property_names,
changed_property,
time_c,
to_char(to_timestamp(time_c/1000.0) at time zone 'Europe/Paris', 'yyyy-mm-dd hh24:mi:ss') AS "time to datetime",
outcome,
changed_property::json->'PhaseId'->> 'newValue' AS "PhaseId (new)",
changed_property::json->'PhaseId'->> 'oldValue' AS "PhaseId (old)"
FROM history
WHERE array_to_string(property_names, ', ') like '%PhaseId%'
ORDER BY time_c DESC) AS temp_c
/*
WHERE "PhaseId (new)" = 'Close'
OR "PhaseId (old)" = 'Close'
*/
Result (irrelevant data hidden):
-----------------------------------------------------------------------------------
| parent_id | time_c | time to datetime | PhaseId (new) | PhaseId (old) |
-----------------------------------------------------------------------------------
| 123456 | 1671101084788 | 2022-12-15 11:44:44 | Close | Accept |
| 123456 | 1671100537178 | 2022-12-15 11:35:37 | Accept | Fulfill |
| 123456 | 1671100012012 | 2022-12-15 11:26:52 | Fulfill | Review |
| 123456 | 1671099984979 | 2022-12-15 11:26:24 | Review | Accept |
| 123456 | 1671099802675 | 2022-12-15 11:23:22 | Accept | Fulfill |
| 123456 | 1671027321749 | 2022-12-14 15:15:21 | Fulfill | Approve |
| 123456 | 1671011168777 | 2022-12-14 10:46:08 | Approve | Log |
| 123456 | 1671011166077 | 2022-12-14 10:46:06 | Log | null |
-----------------------------------------------------------------------------------
DB Fiddle: https://www.db-fiddle.com/f/ckqtYy3EuASF4RdF9dSEcv/2

select * from crosstab(
'
with ordered_changes as (select parent_id,
time_c,
changed_property::json -> ''PhaseId'' ->> ''newValue'' AS PhaseId_New,
changed_property::json -> ''PhaseId'' ->> ''oldValue'' AS PhaseId_Old,
property_names,
changed_property,
outcome
from history
where arraycontains(property_names, ARRAY [''PhaseId''])
order by parent_id, time_c desc),
all_stage_durations as (select oc.parent_id,
oc.time_c - lag(oc.time_c, 1) over (partition by oc.parent_id order by oc.time_c) as duration,
oc.PhaseId_old,
oc.time_c end_ts,
lag(oc.PhaseId_New, 1) over (partition by oc.parent_id order by oc.time_c),
lag(oc.time_c, 1) over (partition by oc.parent_id order by oc.time_c) start_ts
from ordered_changes oc)
select asd.parent_id, asd.PhaseId_old stage, sum(asd.duration) total_time
from all_stage_durations asd
where asd.PhaseId_old is not null
group by asd.parent_id, asd.PhaseId_old
order by parent_id, stage
',
'select stage from (' ||
'select distinct changed_property::json -> ''PhaseId'' ->> ''newValue'' AS stage from history union ' ||
'select distinct changed_property::json -> ''PhaseId'' ->> ''oldValue'' AS stage from history ) a ' ||
'where stage is not null order by stage'
)
as ct(parent_id int, Accept int, Approve int, Close int, Fulfill int, Log int, Review int)
;

Here is how I managed to calculate it:
WITH temp AS (
SELECT parent_id,
changed_property::json->'PhaseId'->> 'newValue' AS phase,
time_c,
LEAD(time_c,1) OVER (
PARTITION BY parent_id
ORDER BY parent_id,time_c
) next_time
FROM history
where 'PhaseId' = ANY(property_names)
)
SELECT parent_id,
phase,
justify_interval(make_interval(secs =>SUM((next_time-time_c)/1000))) AS "Durations"
FROM temp
GROUP BY parent_id,phase
ORDER BY parent_id
I used the LEAD window function to pair each phase change with the time of the next one.
Fiddle: https://www.db-fiddle.com/f/ckqtYy3EuASF4RdF9dSEcv/4
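The LEAD-based pairing can be cross-checked outside the database. The following is a minimal Python sketch, not part of the original answer: it takes the (time_c, new PhaseId) pairs from the question's result table, pairs each row with its successor, and sums the gaps per phase. The function name phase_durations is made up for illustration.

```python
from collections import defaultdict

# (time_c in milliseconds, new PhaseId) for parent_id 123456,
# copied from the result table shown in the question.
transitions = [
    (1671011166077, "Log"),
    (1671011168777, "Approve"),
    (1671027321749, "Fulfill"),
    (1671099802675, "Accept"),
    (1671099984979, "Review"),
    (1671100012012, "Fulfill"),
    (1671100537178, "Accept"),
    (1671101084788, "Close"),
]


def phase_durations(rows):
    """Sum, per phase, the time between entering it and the next transition."""
    ordered = sorted(rows)
    totals = defaultdict(int)
    # zip pairs each row with its successor, which plays the role of
    # LEAD(time_c) OVER (ORDER BY time_c); the last row has no successor.
    for (start, phase), (end, _next_phase) in zip(ordered, ordered[1:]):
        totals[phase] += end - start
    return dict(totals)


durations = phase_durations(transitions)
print(durations)
```

The totals are in milliseconds and agree with the expected output in the question. The final phase (Close) has no following transition, so it gets no duration, just as the SQL yields NULL for it.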

Related

Multiple SELECT statements on the same table in BigQuery?

Consider the following example table:
+-----------+--------------+----------+
| device_id | execution_id | severity |
+-----------+--------------+----------+
| id1 | 86g8g5t3tz4e | INFO |
| | 86g8g5t3tz4e | INFO |
| | 86g8g5t3tz4e | ERROR |
| id2 | 86g8t0gk9t8k | INFO |
| | 86g8t0gk9t8k | INFO |
| | 86g8t0gk9t8k | INFO |
| id3 | ox1fl5e4gpxa | INFO |
| | ox1fl5e4gpxa | INFO |
| | ox1fl5e4gpxa | ERROR |
+-----------+--------------+----------+
These are logs from an internal system. device_id is guaranteed to be present on the first row of each execution.
I'd like to get all device_ids whose execution ends with an ERROR. I can get all the execution_ids like this:
SELECT execution_id as id FROM `my_table` WHERE severity = "ERROR" LIMIT 1000
How do I correlate them with the device_ids? Am I looking for multiple SELECTs? A GROUP BY? A JOIN?
Thanks
Your problem is that the error rows lack the device IDs, so you must find them in the table using the execution ID.
Probably the easiest way to do that is aggregation:
select execution_id, max(device_id) as device_id
from my_table
group by execution_id
having max(case when severity = 'ERROR' then 1 else 0 end) = 1;
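To see why grouping by execution_id recovers the device, here is a small Python sketch of the same idea on the sample rows from the question. It assumes the blank device_id cells are NULL (None here), and the function name devices_with_errors is hypothetical.

```python
# Sample rows from the question: (device_id, execution_id, severity).
# Assumption: the blank device_id cells are NULL, represented as None.
rows = [
    ("id1", "86g8g5t3tz4e", "INFO"),
    (None, "86g8g5t3tz4e", "INFO"),
    (None, "86g8g5t3tz4e", "ERROR"),
    ("id2", "86g8t0gk9t8k", "INFO"),
    (None, "86g8t0gk9t8k", "INFO"),
    (None, "86g8t0gk9t8k", "INFO"),
    ("id3", "ox1fl5e4gpxa", "INFO"),
    (None, "ox1fl5e4gpxa", "INFO"),
    (None, "ox1fl5e4gpxa", "ERROR"),
]


def devices_with_errors(log_rows):
    """Map each execution_id that has an ERROR row to its device_id."""
    device_by_execution = {}
    failed = set()
    for device_id, execution_id, severity in log_rows:
        if device_id is not None:
            # Plays the role of max(device_id): the only non-NULL value wins.
            device_by_execution[execution_id] = device_id
        if severity == "ERROR":
            # Plays the role of the HAVING condition.
            failed.add(execution_id)
    return {e: device_by_execution.get(e) for e in failed}


result = devices_with_errors(rows)
print(sorted(result.values()))
```

Like the SQL, this matches any execution that contains an ERROR row, not only those whose last row is an ERROR; the sample has no ordering column to tell the two apart.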

How to use wm_concat on a column that already exists in the query?

So... I am currently using Oracle 11.1g and I need to create a query that uses the ID and CusCODE from Table_with_value and checks Table_with_status using the ID to find active CO_status but on different CusCODE.
This is what I have so far - obviously does not work as it should unless CusCODE and ID are provided manually:
SELECT wm_concat(CoID) as active_CO_Status_for_same_ID_but_different_CusCODE
FROM Table_with_status
WHERE
CoID IN (SELECT CoID FROM Table_with_status WHERE ID = Table_with_value.ID AND CusCODE != Table_with_value.CusCODE)) AND Co_status = 'active';
Table_with_value:
|CoID | CusCODE | ID | Value |
|--------|---------|----------|----|
|354223 | 1.432 | 0784296L | 99 |
|321232 | 4.212321.22 | 0432296L | 32 |
|938421 | 3.213 | 0021321L | 93 |
Table_with_status:
|CoID | CusCODE | ID | Co_status|
|--------|--------------|----------|--------|
|354223 | 1.432 | 0784296L | active|
|354232 | 1.432 | 0784296L | inactive |
|666698 | 1.47621 | 0784296L | active |
|666700 | 1.5217 | 0784296L | active |
|938421 | 3.213 | 0021321L | active |
|938422 | 3.213 | 0021321L | active |
|938423 | 3.213 | 0021321L | active |
|321232 | 4.212321.22 | 0432296L | active |
|321232 | 4.212321.22 | 0432296L | active |
|321232 | 1.689 | 0432296L | inactive |
Expected output:
|CoID | active_CO_Status_for_same_ID_but_different_CusCODE | CusCODE | ID | Value |
|--------|---------|---------|----------|----|
|354223 | 666698,666700 | 1.432 | 0784296L | 99 |
|321232 | N/A | 4.212321.22 | 0432296L | 32 |
|938421 | N/A | 3.213 | 0021321L | 93 |
Any idea how this can be implemented, ideally without PL/SQL for loops? Loops would be acceptable too, since the output dataset is expected to contain fewer than 300 IDs.
I apologize in advance for the cryptic nature in which I structured the question :) Let me know if something is not clear.
From your description and expected output, it looks like you need a left outer join, something like:
SELECT v.CoID,
wm_concat(s.CoID) as other_active_CusCODE, -- active_CO_Status_for_same_ID_but_different_CusCODE
v.CusCODE,
v.ID,
v.value
FROM Table_with_value v
LEFT JOIN Table_with_status s
ON s.ID = v.ID
AND s.CusCODE != v.CusCODE
AND s.Co_status = 'active'
GROUP BY v.CoID, v.CusCODE, v.ID, v.value;
SQL Fiddle using listagg() instead of the never-supported and now-removed wm_concat(); with a couple of different approaches if the logic isn't quite what I interpreted. With your sample data they all get:
COID OTHER_ACTIVE_CUSCODE CUSCODE ID VALUE
------ -------------------- ----------- -------- -----
321232 (null) 4.212321.22 0432296L 32
354223 666698,666700 1.432 0784296L 99
938421 (null) 3.213 0021321L 93
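The join logic can also be traced by hand. The Python sketch below is only an illustration of the left join plus string aggregation on the sample tables from the question; the function name other_active_coids is invented for this example.

```python
# Table_with_value from the question: (CoID, CusCODE, ID, Value).
value_rows = [
    (354223, "1.432", "0784296L", 99),
    (321232, "4.212321.22", "0432296L", 32),
    (938421, "3.213", "0021321L", 93),
]

# Table_with_status from the question: (CoID, CusCODE, ID, Co_status).
status_rows = [
    (354223, "1.432", "0784296L", "active"),
    (354232, "1.432", "0784296L", "inactive"),
    (666698, "1.47621", "0784296L", "active"),
    (666700, "1.5217", "0784296L", "active"),
    (938421, "3.213", "0021321L", "active"),
    (938422, "3.213", "0021321L", "active"),
    (938423, "3.213", "0021321L", "active"),
    (321232, "4.212321.22", "0432296L", "active"),
    (321232, "4.212321.22", "0432296L", "active"),
    (321232, "1.689", "0432296L", "inactive"),
]


def other_active_coids(values, statuses):
    """For each value row, join active status rows with same ID, other CusCODE."""
    result = {}
    for co_id, cus_code, id_, _value in values:
        matches = [
            str(s_co_id)
            for s_co_id, s_cus_code, s_id, s_status in statuses
            if s_id == id_ and s_cus_code != cus_code and s_status == "active"
        ]
        # A left join keeps the row even without matches; the aggregate is then NULL.
        result[co_id] = ",".join(matches) if matches else None
    return result


joined = other_active_coids(value_rows, status_rows)
print(joined)
```

Rows without a match keep a None value, which corresponds to the (null) entries in the output above.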
Your code looks like it should work, assuming you are referring to the correct tables:
SELECT wm_concat(s.CoID) as active_CO_Status_for_same_ID_but_different_CusCODE
FROM Table_with_status s
WHERE s.CoID IN (SELECT v.CoID
FROM Table_with_value v
WHERE v.ID = s.ID AND
v.CusCODE <> s.CusCODE
) AND
s.Co_status = 'active';

PostgreSQL: show trips within a bounding box

I have a trips table containing users' trip information, like so:
select * from trips limit 10;
trip_id | daily_user_id | session_ids | seconds_start | lat_start | lon_start | seconds_end | lat_end | lon_end | distance
---------+---------------+-------------+---------------+------------+------------+-------------+------------+------------+------------------
594221 | 16772 | {170487} | 1561324555 | 41.1175475 | -8.6298934 | 1561325119 | 41.1554091 | -8.6283493 | 5875.39697884959
563097 | 7682 | {128618} | 1495295471 | 41.1782829 | -8.5950303 | 1495299137 | 41.1783908 | -8.5948965 | 5364.81067787512
596303 | 17264 | {172851} | 1578011699 | 41.5195598 | -8.6393526 | 1578012513 | 41.4614024 | -8.717709 | 11187.7956426909
595648 | 17124 | {172119} | 1575620857 | 41.1553116 | -8.6439528 | 1575621885 | 41.1621821 | -8.6383042 | 1774.83365424607
566061 | 8720 | {133624} | 1509005051 | 41.1241975 | -8.5958988 | 1509006310 | 41.1424158 | -8.6101461 | 3066.40306678979
566753 | 8947 | {134662} | 1511127813 | 41.1887996 | -8.5844238 | 1511129839 | 41.2107519 | -8.5511712 | 5264.64026582458
561179 | 7198 | {125861} | 1493311197 | 41.1776935 | -8.5947254 | 1493311859 | 41.1773815 | -8.5947254 | 771.437257541019
541328 | 2119 | {46950} | 1461103381 | 41.1779 | -8.5949738 | 1461103613 | 41.1779129 | -8.5950202 | 177.610819150637
535519 | 908 | {6016} | 1460140650 | 41.1644658 | -8.6422775 | 1460141201 | 41.1642646 | -8.6423309 | 1484.61552373019
548460 | 3525 | {102026} | 1462289206 | 41.177689 | -8.594679 | 1462289843 | 41.1734476 | -8.5916326 | 1108.05119077308
(10 rows)
The task is to filter trips that start and end within the bounding box defined by upper left: 41.24895, -8.68494 and lower right: 41.11591, -8.47569.
If I understand correctly, you can just compare the starting and ending coordinates:
select t.*
from trips t
where lat_start >= 41.11591 and lat_start <= 41.24895 and
lat_end >= 41.11591 and lat_end <= 41.24895 and
lon_start >= -8.68494 and lon_start <= -8.47569 and
lon_end >= -8.68494 and lon_end <= -8.47569
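The plain comparison can be sanity-checked on a few of the sample rows. This is a minimal Python sketch under the same reading of the bounding box (latitude between 41.11591 and 41.24895, longitude between -8.68494 and -8.47569); the helper in_box is hypothetical.

```python
# Bounding box from the question: upper left (41.24895, -8.68494),
# lower right (41.11591, -8.47569).
LAT_MIN, LAT_MAX = 41.11591, 41.24895
LON_MIN, LON_MAX = -8.68494, -8.47569

# (trip_id, lat_start, lon_start, lat_end, lon_end) for three sample rows.
trips = [
    (594221, 41.1175475, -8.6298934, 41.1554091, -8.6283493),
    (596303, 41.5195598, -8.6393526, 41.4614024, -8.717709),
    (563097, 41.1782829, -8.5950303, 41.1783908, -8.5948965),
]


def in_box(lat, lon):
    """True when the point lies inside the box (edges included)."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


inside = [
    trip_id
    for trip_id, lat_s, lon_s, lat_e, lon_e in trips
    if in_box(lat_s, lon_s) and in_box(lat_e, lon_e)
]
print(inside)
```

Trip 596303 is dropped because both its start and end latitudes lie north of the box.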
Since your coordinates are stored in x,y columns, you have to use ST_MakePoint to create a proper geometry. After that, you can create a BBOX using the function ST_MakeEnvelope and check if start and end coordinates are inside the BBOX using ST_Contains, e.g.
WITH bbox(geom) AS (
VALUES (ST_MakeEnvelope(-8.68494,41.24895,-8.47569,41.11591,4326))
)
SELECT * FROM trips,bbox
WHERE
ST_Contains(bbox.geom,ST_SetSRID(ST_MakePoint(lon_start,lat_start),4326)) AND
ST_Contains(bbox.geom,ST_SetSRID(ST_MakePoint(lon_end,lat_end),4326));
Note: the CTE isn't really necessary and is in the query just for illustration purposes. You can repeat the ST_MakeEnvelope function on both conditions in the WHERE clause instead of bbox.geom. This query also assumes the SRS WGS84 (4326).

How can I summarize / pivot data with oracle sql

I have a table containing geological resource information.
| Property | Zone | Area | Category | Tonnage | Au_gt | Au_oz |
|----------|------|-------------|-----------|---------|-------|-------|
| Ket | Eel | Open Pit | Measured | 43400 | 5.52 | 7700 |
| Ket | Eel | Open Pit | Inferred | 51400 | 5.88 | 9700 |
| Ket | Eel | Open Pit | Indicated | 357300 | 6.41 | 73600 |
| Ket | Eel | Underground | Measured | 3300 | 7.16 | 800 |
| Ket | Eel | Underground | Inferred | 14700 | 6.16 | 2900 |
| Ket | Eel | Underground | Indicated | 168100 | 8.85 | 47800 |
I would like to summarize the data so that it can be read more easily by our clients.
| Property | Zone | Category | Open_Pit_Tonnage | Open_Pit_Au_gt | Open_Pit_Au_oz | Underground_tonnage | Underground_au_gt | Underground_au_oz | Combined_tonnage | Combined_au_gt | Combined_au_oz |
|----------|------|-----------|------------------|----------------|----------------|---------------------|-------------------|-------------------|------------------|----------------|----------------|
| Ket | Eel | Measured | 43,400 | 5.52 | 7,700 | 3,300 | 7.16 | 800 | 46,700 | 5.64 | 8,500 |
| Ket | Eel | Indicated | 357,300 | 6.41 | 73,600 | 168,100 | 8.85 | 47,800 | 525,400 | 7.19 | 121,400 |
| Ket | Eel | Inferred | 51,400 | 5.88 | 9,700 | 14,700 | 6.16 | 2,900 | 66,100 | 5.94 | 12,600 |
I'm fairly new to pivot tables. How could I write a query to translate and summarize the data?
Thanks!
If your Oracle version is 11.1 or higher (which it should be if you are a relatively new user!) then you can use the PIVOT operator, as shown below.
Note that the result of the PIVOT operation can be given an alias (I used p) - this makes it easier to write the SELECT clause.
I assumed the name of your table is geological_data - replace it with your actual table name.
select p.*
, open_pit_tonnage + underground_tonnage as combined_tonnage
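, round((open_pit_tonnage * open_pit_au_gt + underground_tonnage * underground_au_gt) / (open_pit_tonnage + underground_tonnage), 2) as combined_au_gt -- tonnage-weighted average grade, matching the expected output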
, open_pit_au_oz + underground_au_oz as combined_au_oz
from geological_data
pivot (sum(tonnage) as tonnage, sum(au_gt) as au_gt, sum(au_oz) as au_oz
for area in ('Open Pit' as open_pit, 'Underground' as underground)) p
;
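Judging by the expected output in the question, Combined_au_gt looks like a tonnage-weighted average of the two grades rather than their sum. The short Python check below tests that reading against the sample figures; the function name combined_grade is made up for illustration.

```python
# (tonnage, Au_gt) for Open Pit and Underground, per category, from the question.
samples = {
    "Measured": ((43400, 5.52), (3300, 7.16)),
    "Indicated": ((357300, 6.41), (168100, 8.85)),
    "Inferred": ((51400, 5.88), (14700, 6.16)),
}


def combined_grade(open_pit, underground):
    """Tonnage-weighted average grade of the two areas, rounded to 2 decimals."""
    (t1, g1), (t2, g2) = open_pit, underground
    return round((t1 * g1 + t2 * g2) / (t1 + t2), 2)


combined = {cat: combined_grade(op, ug) for cat, (op, ug) in samples.items()}
print(combined)
```

The three values reproduce the Combined_au_gt column of the expected output (5.64, 7.19 and 5.94), whereas adding the grades would give figures above 12.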
Conditional aggregation is a simple method:
select Property, Zone, Category,
max(case when area = 'Open Pit' then tonnage end) as open_pit_tonnage,
max(case when area = 'Open Pit' then Au_gt end) as open_pit_Au_gt,
max(case when area = 'Open Pit' then Au_oz end) as open_pit_Au_oz,
max(case when area = 'Underground' then tonnage end) as Underground_tonnage,
max(case when area = 'Underground' then Au_gt end) as Underground_Au_gt,
max(case when area = 'Underground' then Au_oz end) as Underground_Au_oz
from t
group by Property, Zone, Category
In SQL Server, the PIVOT operator is used to convert rows to columns.
The goal is to turn the category names from the first column of the output into multiple columns and count the number of products in each category.
The following query, written against a different schema, can serve as a reference for the table above:
SELECT * FROM
(
SELECT
category_name,
product_id,
model_year
FROM
production.products p
INNER JOIN production.categories c
ON c.category_id = p.category_id
) t
PIVOT(
COUNT(product_id)
FOR category_name IN (
[Children Bicycles],
[Comfort Bicycles],
[Cruisers Bicycles],
[Cyclocross Bicycles],
[Electric Bikes],
[Mountain Bikes],
[Road Bikes])
) AS pivot_table;

casting a REAL as INT and comparing

I am casting a real to an int and a float to an int and comparing the two like this:
where
cast(a.[SUM(PAID_AMT)] as int)!=cast(b.PAID_AMT as int)
But I am still getting results where the two are equal. For example:
+-----------+-----------+------------+------------+----------+
| accn | load_dt | pmtdt | sumpaidamt | Bpaidamt |
+-----------+-----------+------------+------------+----------+
| A133312 | 6/7/2011 | 11/28/2011 | 98.39 | 98.39 |
| A445070 | 6/2/2011 | 9/22/2011 | 204.93 | 204.93 |
| A465606 | 5/19/2011 | 10/19/2011 | 560.79 | 560.79 |
| A508742 | 7/12/2011 | 10/19/2011 | 279.65 | 279.65 |
| A567730 | 5/27/2011 | 10/24/2011 | 212.76 | 212.76 |
| A617277 | 7/12/2011 | 10/12/2011 | 322.02 | 322.02 |
| A626384 | 6/16/2011 | 10/21/2011 | 415.84 | 415.84 |
| AA0000044 | 5/12/2011 | 5/23/2011 | 197.38 | 197.38 |
+-----------+-----------+------------+------------+----------+
Here is the full query:
select
a.accn,
a.load_dt,
a.pmtdt,
a.[SUM(PAID_AMT)] sumpaidamt,
sum(b.paid_amt) Bpaidamt
from
[MILLENNIUM_DW_DEV].[dbo].[Millennium_Payment_Data_May2011_July2012] a
join
F_PAYOR_PAYMENTS_DAILY b
on
a.accn=b.ACCESSION_ID
and
a.final_rpt_dt=b.FINAL_REPORT_DATE
and
a.load_dt=b.LOAD_DATE
and
a.pmtdt=b.PAYMENT_DATE
where
cast(a.[SUM(PAID_AMT)] as int)!=cast(b.PAID_AMT as int)
group by
a.accn,
a.load_dt,
a.pmtdt,
a.[SUM(PAID_AMT)]
What am I doing wrong? How do I return only the records that are NOT equal?
I don't see why there is an issue.
The query is returning the sum of the payments in b (sum(b.paid_amt) Bpaidamt). The where clause is comparing individual payments. This just means that there is more than one payment.
Perhaps your intention is to have a HAVING clause instead:
having cast(a.[SUM(PAID_AMT)] as int)!=cast(sum(b.PAID_AMT) as int)
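The difference between filtering before and after aggregation can be shown with a tiny Python sketch. The account number and total come from the question's output; the split into two payments (50.00 and 48.39) is an assumption made up for illustration, since the individual rows of table b are not shown.

```python
# Pre-aggregated total from table a (account and amount taken from the question).
a_sum = {"A133312": 98.39}

# Hypothetical individual payments in table b; the split is an assumption.
b_payments = [("A133312", 50.00), ("A133312", 48.39)]

# WHERE-style: each individual payment is compared with the pre-aggregated total.
where_rows = [
    (accn, amt) for accn, amt in b_payments if int(a_sum[accn]) != int(amt)
]

# HAVING-style: payments are summed first, then the sums are compared.
b_totals = {}
for accn, amt in b_payments:
    b_totals[accn] = b_totals.get(accn, 0.0) + amt
having_rows = [
    accn for accn, total in b_totals.items() if int(a_sum[accn]) != int(total)
]

print(len(where_rows), having_rows)
```

With the WHERE-style filter both payments pass, because neither equals the total on its own, and the query then reports their sum next to an identical total. With the HAVING-style filter the account is excluded.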
You can also round both values and cast them to money before comparing:
cast(round(sumpaidamt,2) as money) <> cast(round(Bpaidamt,2) as money)
SQL Fiddle showing how it would work: http://sqlfiddle.com/#!3/4eb79/1