How to grab the DATE that the MAX occurred? - sql

So I have a table called Value. This table has columns called VALUE_ID, VALUE, HR, and VALUE_TYPE. I am trying to grab not only the maximum value but also the HR (and ultimately the day) on which the maximum value occurred.
Below is some sample data:
VALUE_ID VALUE HR VALUE_TYPE
1 75 DEC-25-2018 01:00:00 AM Bananas
2 10 DEC-25-2018 01:00:00 AM Bananas
3 787 DEC-25-2018 05:00:00 PM Bananas
I want:
(For Hourly)
MAX(Value) HR Value_Type
75 DEC-25-2018 01:00:00 AM Bananas
787 DEC-25-2018 05:00:00 PM Bananas
(For Day)
MAX(Value) HR(Day) Value_Type
787 DEC-25-2018 05:00:00 PM Bananas
I've tried the following (this is probably completely wrong, but I'm not sure how to combine columns from two separate queries into one table):
select max(value) as max_value
, Value_Type
from value
group by value_type
UNION
select HR from
from value
where value = (select max(value) as max_value
, Value_Type
from value
group by value_type;
Thanks in advance.

Analytic functions are perfect for this kind of question. They allow the base data to be read just once (rather than multiple times, as in solutions that aggregate, then compare to the original data).
In the sample session below, notice a few things. I create some sample data in a WITH clause (which you don't need, since you have the actual table). I use the TO_DATE function to create dates. So that I don't need to write the format model multiple times, I first alter my session to set the default date format to the one you used.
The query is written for the hourly intervals; you can modify it easily for daily, by changing the argument to TRUNC() in the inner query, from 'hh' to 'dd'. If you are new to analytic functions, select and run the inner query by itself first, to see what it produces. Then it will be trivial to understand what the outer query does.
alter session set nls_date_format = 'MON-dd-yyyy hh:mi:ss AM';
with simulated_table (VALUE_ID, VALUE, HR, VALUE_TYPE) as (
select 1, 75, to_date('DEC-25-2018 01:00:00 AM'), 'Bananas' from dual union all
select 2, 10, to_date('DEC-25-2018 01:00:00 AM'), 'Bananas' from dual union all
select 3, 787, to_date('DEC-25-2018 05:00:00 PM'), 'Bananas' from dual
)
select value_id, value, hr, value_type
from (
select s.*,
max(value) over (partition by value_type, trunc(hr, 'hh')) maxval
from simulated_table s
)
where value = maxval
;
VALUE_ID VALUE HR VALUE_TYPE
---------- ---------- ----------------------- ----------
1 75 DEC-25-2018 01:00:00 AM Bananas
3 787 DEC-25-2018 05:00:00 PM Bananas
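If you want to sanity-check this pattern without an Oracle instance, the same read-once window-function filter can be sketched with Python's built-in sqlite3 module (SQLite 3.25+ supports window functions). The table name value_t and the ISO date strings are stand-ins for the question's schema, and strftime replaces Oracle's TRUNC(hr, 'hh'):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE value_t (value_id INTEGER, value INTEGER, hr TEXT, value_type TEXT)"
)
conn.executemany(
    "INSERT INTO value_t VALUES (?, ?, ?, ?)",
    [
        (1, 75, "2018-12-25 01:00:00", "Bananas"),
        (2, 10, "2018-12-25 01:00:00", "Bananas"),
        (3, 787, "2018-12-25 17:00:00", "Bananas"),
    ],
)
# Keep each row whose value equals the max of its (type, hour) partition.
rows = conn.execute("""
    SELECT value_id, value, hr, value_type
    FROM (
        SELECT v.*,
               MAX(value) OVER (
                   PARTITION BY value_type, strftime('%Y-%m-%d %H', hr)
               ) AS maxval
        FROM value_t v
    )
    WHERE value = maxval
    ORDER BY value_id
""").fetchall()
print(rows)  # rows 1 and 3 survive, one per hour
```

For the daily version, partition on date(hr) instead of the hour-level strftime, just as the Oracle answer swaps 'hh' for 'dd'.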

You could do:
select v.*
from value v
where v.value = (select max(v2.value)
                 from value v2
                 where trunc(v2.hr, 'HH') = trunc(v.hr, 'HH')
                ) or
      v.value = (select max(v2.value)
                 from value v2
                 where trunc(v2.hr, 'DD') = trunc(v.hr, 'DD')
                );
This gets the maximum value rows for both the hour and the day. You can use one clause or the other for just hours or just days.
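The correlated-subquery version can be sketched the same way in sqlite3; strftime() and date() stand in for TRUNC(hr, 'HH') and TRUNC(hr, 'DD'), and the identifiers are again placeholders for the question's table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE value_t (value_id INTEGER, value INTEGER, hr TEXT, value_type TEXT)"
)
conn.executemany(
    "INSERT INTO value_t VALUES (?, ?, ?, ?)",
    [
        (1, 75, "2018-12-25 01:00:00", "Bananas"),
        (2, 10, "2018-12-25 01:00:00", "Bananas"),
        (3, 787, "2018-12-25 17:00:00", "Bananas"),
    ],
)
# A row survives if it holds the max of its hour OR the max of its day.
rows = conn.execute("""
    SELECT v.*
    FROM value_t v
    WHERE v.value = (SELECT MAX(v2.value) FROM value_t v2
                     WHERE strftime('%Y-%m-%d %H', v2.hr) =
                           strftime('%Y-%m-%d %H', v.hr))
       OR v.value = (SELECT MAX(v2.value) FROM value_t v2
                     WHERE date(v2.hr) = date(v.hr))
    ORDER BY v.value_id
""").fetchall()
```

Note the trade-off versus the analytic solution: each subquery re-scans the table per row, which is exactly the extra work the window-function answer avoids.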

Related

Oracle SQL Check current rows with previous rows dynamically

I am trying to work on something in Oracle SQL in the attached table.
The goal is to define a Visit Type where a Primary Visit refers to any visit that happened more than 3 months after the previous visit(s). However, the challenge is that sometimes I'll need to compare to the previous row and sometimes I need to compare with the previous N rows.
For e.g., Transaction ID 3 is a revisit because its start date is within the 'end date + 90 days' window of Transaction ID 1 (Dec 16). Transaction ID 4 is primary because it took place after the very first 'end date + 90 days', meaning I not only need to compare to the previous row but to the previous 3 rows.
Hope this is clear!
Thanks.
From Oracle 12 you can use MATCH_RECOGNIZE for row-by-row pattern matching:
SELECT transaction_id, start_date, end_date, visit_type
FROM table_name
MATCH_RECOGNIZE(
ORDER BY start_date
MEASURES
CLASSIFIER() AS visit_type
ALL ROWS PER MATCH
PATTERN (PRIMARY revisit*)
DEFINE revisit AS start_date <= FIRST(end_date) + INTERVAL '90' DAY
);
Which, for the sample data:
CREATE TABLE table_name (transaction_id, start_date, end_date) AS
SELECT 1, DATE '2020-08-05', DATE '2020-09-07' FROM DUAL UNION ALL
SELECT 2, DATE '2020-09-19', DATE '2020-10-27' FROM DUAL UNION ALL
SELECT 3, DATE '2020-11-01', DATE '2020-12-19' FROM DUAL UNION ALL
SELECT 4, DATE '2021-01-23', DATE '2021-01-26' FROM DUAL UNION ALL
SELECT 5, DATE '2021-02-27', DATE '2021-03-27' FROM DUAL;
Outputs:
TRANSACTION_ID  START_DATE           END_DATE             VISIT_TYPE
--------------  -------------------  -------------------  ----------
             1  2020-08-05 00:00:00  2020-09-07 00:00:00  PRIMARY
             2  2020-09-19 00:00:00  2020-10-27 00:00:00  REVISIT
             3  2020-11-01 00:00:00  2020-12-19 00:00:00  REVISIT
             4  2021-01-23 00:00:00  2021-01-26 00:00:00  PRIMARY
             5  2021-02-27 00:00:00  2021-03-27 00:00:00  REVISIT

Order Data based on previous row data

I have a query in Oracle SQL. My query gives three columns: old_data, new_data and transaction_date. I want to sort this data primarily by increasing transaction_date, and secondly in such a way that the new_data of the previous row equals the old_data of the next row. Both new_data and old_data are number fields that can decrease or increase.
If I sort just by transaction_date, some rows have exactly the same date and time, so the order will not be accurate, as I need the new_data of the previous row to match the old_data of the current row. I also cannot use a hierarchical query alone to meet the second sorting condition, since transaction_date sorting is the primary sorting condition.
Can anyone suggest a solution?
A sample output would need to look like the image attached to the question.
Thanks in advance
You could use a hierarchical query and connect by equal dates as well as the relationship between old- and new-data:
SELECT transaction_date,
       new_data,
       old_data
FROM   table_name
START WITH old_data IS NULL -- You need to define how to pick the first row
CONNECT BY
        PRIOR transaction_date = transaction_date
    AND PRIOR new_data = old_data
ORDER SIBLINGS BY
        transaction_date;
Which, for the sample data:
ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD HH24:MI:SS';
CREATE TABLE table_name ( transaction_date, new_data, old_data ) AS
SELECT DATE '2022-01-01', 1, NULL FROM DUAL UNION ALL
SELECT DATE '2022-01-01', 2, 1 FROM DUAL UNION ALL
SELECT DATE '2022-01-01', 3, 2 FROM DUAL UNION ALL
SELECT DATE '2022-01-02', 3, NULL FROM DUAL UNION ALL
SELECT DATE '2022-01-02', 1, 3 FROM DUAL UNION ALL
SELECT DATE '2022-01-02', 2, 1 FROM DUAL UNION ALL
SELECT DATE '2022-01-03', 4, NULL FROM DUAL;
Outputs:
TRANSACTION_DATE     NEW_DATA  OLD_DATA
-------------------  --------  --------
2022-01-01 00:00:00         1      null
2022-01-01 00:00:00         2         1
2022-01-01 00:00:00         3         2
2022-01-02 00:00:00         3      null
2022-01-02 00:00:00         1         3
2022-01-02 00:00:00         2         1
2022-01-03 00:00:00         4      null
fiddle
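To see why the CONNECT BY clause produces that order, here is a small procedural sketch of the same walk in Python: the roots are the rows with old_data IS NULL (taken in date order), and each next row is the same-date row whose old_data equals the current row's new_data. The tuples mirror the sample data above:

```python
# (transaction_date, new_data, old_data) rows from the sample data.
rows = [
    ("2022-01-01", 1, None),
    ("2022-01-01", 2, 1),
    ("2022-01-01", 3, 2),
    ("2022-01-02", 3, None),
    ("2022-01-02", 1, 3),
    ("2022-01-02", 2, 1),
    ("2022-01-03", 4, None),
]

def chain_order(rows):
    """Order rows the way the CONNECT BY query does: roots by date, then
    follow the new_data -> old_data chain within each date."""
    ordered = []
    roots = sorted((r for r in rows if r[2] is None), key=lambda r: r[0])
    for root in roots:
        node = root
        while node is not None:
            ordered.append(node)
            # Child: same transaction_date, old_data equal to this row's new_data.
            node = next(
                (r for r in rows
                 if r[0] == node[0] and r[2] == node[1] and r not in ordered),
                None,
            )
    return ordered

ordered = chain_order(rows)
```

This also shows why START WITH matters: without a well-defined root per date, the chain has nowhere to begin.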
Update
Given the updated sample data:
CREATE TABLE table_name ( transaction_date, old_data, new_data ) AS
SELECT DATE '2021-12-20'+INTERVAL '11:25' HOUR TO MINUTE, 0, 115.09903 FROM DUAL UNION ALL
SELECT DATE '2021-12-20'+INTERVAL '11:25' HOUR TO MINUTE, 115.09903, 115.13233 FROM DUAL UNION ALL
SELECT DATE '2021-12-20'+INTERVAL '11:25' HOUR TO MINUTE, 115.13233, 115.16490 FROM DUAL UNION ALL
SELECT DATE '2021-12-20'+INTERVAL '11:25' HOUR TO MINUTE, 115.16490, 115.19678 FROM DUAL UNION ALL
SELECT DATE '2021-12-20'+INTERVAL '11:35' HOUR TO MINUTE, 115.19678, 115.22799 FROM DUAL UNION ALL
SELECT DATE '2021-12-20'+INTERVAL '11:35' HOUR TO MINUTE, 115.22799, 115.25854 FROM DUAL UNION ALL
SELECT DATE '2021-12-20'+INTERVAL '11:35' HOUR TO MINUTE, 115.25854, 115.28846 FROM DUAL UNION ALL
SELECT DATE '2021-12-20'+INTERVAL '11:35' HOUR TO MINUTE, 115.28846, 115.31776 FROM DUAL;
Then, since both old_data and new_data increase with time, you could use:
SELECT *
FROM table_name
ORDER BY old_data;
or:
SELECT *
FROM table_name
ORDER BY new_data;
or, if you want to use the hierarchy then:
SELECT transaction_date,
old_data,
new_data
FROM table_name
START WITH old_data = 0
CONNECT BY
PRIOR transaction_date <= transaction_date
AND PRIOR new_data = old_data
ORDER SIBLINGS BY
transaction_date
Which all output:
TRANSACTION_DATE     OLD_DATA   NEW_DATA
-------------------  ---------  ---------
2021-12-20 11:25:00          0  115.09903
2021-12-20 11:25:00  115.09903  115.13233
2021-12-20 11:25:00  115.13233  115.1649
2021-12-20 11:25:00  115.1649   115.19678
2021-12-20 11:35:00  115.19678  115.22799
2021-12-20 11:35:00  115.22799  115.25854
2021-12-20 11:35:00  115.25854  115.28846
2021-12-20 11:35:00  115.28846  115.31776
fiddle

Query that counts total records per day and total records with same time timestamp and id per day in Bigquery

I have timeseries data like this:
time                     id  value
-----------------------  --  -----
2018-04-25 22:00:00 UTC  A   1
2018-04-25 23:00:00 UTC  A   2
2018-04-25 23:00:00 UTC  A   2.1
2018-04-25 23:00:00 UTC  B   1
2018-04-26 23:00:00 UTC  B   1.3
How do I write a query to produce an output table with these columns:
date: the truncated time
records: the number of records during this date
records_conflicting_time_id: the number of records during this date where the combination of (time, id) is not unique. In the example data above, the two records with id == A at 2018-04-25 23:00:00 UTC would be counted for date 2018-04-25
So the output of our query should be:
date        records  records_conflicting_time_id
----------  -------  ---------------------------
2018-04-25  4        2
2018-04-26  1        0
Getting records is easy: I just truncate the time to get the date and then group by date. But I'm really struggling to produce a column that counts the number of records where (time, id) is not unique over that date...
Consider below approach
select date(time) date,
sum(cnt) records,
sum(if(cnt > 1, cnt, 0)) conflicting_records
from (
select time, id, count(*) cnt
from your_table
group by time, id
)
group by date
If applied to the sample data in your question (reconstructed in the WITH clause below):
with YOUR_DATA as
(
select cast('2018-04-25 22:00:00 UTC' as timestamp) as `time`, 'A' as id, 1.0 as value
union all select cast('2018-04-25 23:00:00 UTC' as timestamp) as `time`, 'A' as id, 2.0 as value
union all select cast('2018-04-25 23:00:00 UTC' as timestamp) as `time`, 'A' as id, 2.1 as value
union all select cast('2018-04-25 23:00:00 UTC' as timestamp) as `time`, 'B' as id, 1.0 as value
union all select cast('2018-04-26 23:00:00 UTC' as timestamp) as `time`, 'B' as id, 1.3 as value
)
select cast(timestamp_trunc(t1.`time`, day) as date) as `date`,
count(*) as records,
case when count(*)-count(distinct cast(t1.`time` as string) || t1.id) = 0 then 0
else count(*)-count(distinct cast(t1.`time` as string) || t1.id)+1
end as records_conflicting_time_id
from YOUR_DATA t1
group by cast(timestamp_trunc(t1.`time`, day) as date)
;
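The group-then-sum idea in the first query is portable to any SQL engine; here it is replayed against the question's sample rows with Python's sqlite3 (the "UTC" suffix is dropped so SQLite's date() can parse the literals):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (time TEXT, id TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("2018-04-25 22:00:00", "A", 1),
        ("2018-04-25 23:00:00", "A", 2),
        ("2018-04-25 23:00:00", "A", 2.1),
        ("2018-04-25 23:00:00", "B", 1),
        ("2018-04-26 23:00:00", "B", 1.3),
    ],
)
# Count rows per (time, id) first; groups with cnt > 1 are the conflicts.
rows = conn.execute("""
    SELECT d,
           SUM(cnt) AS records,
           SUM(CASE WHEN cnt > 1 THEN cnt ELSE 0 END) AS conflicting
    FROM (SELECT date(time) AS d, id, COUNT(*) AS cnt
          FROM readings
          GROUP BY time, id)
    GROUP BY d
    ORDER BY d
""").fetchall()
print(rows)
```

The inner query collapses each (time, id) pair to one row with its count, so the outer query can both total the records per date and total only the duplicated groups.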

Getting last 4 months data from given date column when some months data is missing

I have below data
Record_date ID
28-feb-2022 xyz
31-Jan-2022 ABC
30-nov-2022 jkl
31-oct-2022 dcs
I want to get last 3 months data from given date column. We don't have to consider the missing month.
Output should be:
Record_date ID
28-feb-2022 xyz
31-Jan-2022 ABC
30-nov-2022 jkl
In the last 3 months, Dec is missing, but we have to ignore it as the data is not available. I've tried many things but nothing works.
Any suggestions?
Assuming you are using Oracle, you can use the Oracle ADD_MONTHS function and filter the data.
--- untested
-- Assumption Record_date is a date column
SELECT * FROM table1
where Record_date > ADD_MONTHS(SYSDATE, -3)
To get the data for the three months that are latest in the table, you can use:
SELECT record_date,
id
FROM (
SELECT t.*,
DENSE_RANK() OVER (ORDER BY TRUNC(Record_date, 'MM') DESC) AS rnk
FROM table_name t
)
WHERE rnk <= 3;
Which, for the sample data:
CREATE TABLE table_name (Record_date, ID) AS
SELECT DATE '2022-02-28', 'xyz' FROM DUAL UNION ALL
SELECT DATE '2022-01-31', 'ABC' FROM DUAL UNION ALL
SELECT DATE '2022-11-30', 'jkl' FROM DUAL UNION ALL
SELECT DATE '2022-10-31', 'dcs' FROM DUAL;
Outputs:
RECORD_DATE          ID
-------------------  ---
2022-11-30 00:00:00  jkl
2022-10-31 00:00:00  dcs
2022-02-28 00:00:00  xyz
db<>fiddle here
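The DENSE_RANK approach also works outside Oracle; here is a sketch with Python's sqlite3, where strftime('%Y-%m', ...) stands in for TRUNC(record_date, 'MM'):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (record_date TEXT, id TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [
        ("2022-02-28", "xyz"),
        ("2022-01-31", "ABC"),
        ("2022-11-30", "jkl"),
        ("2022-10-31", "dcs"),
    ],
)
# Rank distinct months from newest to oldest, then keep the top three;
# missing months (like Dec) simply never appear in the ranking.
rows = conn.execute("""
    SELECT record_date, id
    FROM (SELECT r.*,
                 DENSE_RANK() OVER (
                     ORDER BY strftime('%Y-%m', record_date) DESC
                 ) AS rnk
          FROM records r)
    WHERE rnk <= 3
    ORDER BY record_date DESC
""").fetchall()
```

DENSE_RANK (rather than RANK or ROW_NUMBER) is what makes gaps irrelevant: every row in the same month shares one rank, and consecutive months get consecutive ranks regardless of calendar gaps.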

How to get the total amount against person_number (and some more columns) in a single Oracle SQL query, plus the grand total amount

I have one table; the query is like this:
SELECT
    cost_center_name,
    person_number,
    person_full_name,
    TO_DATE(TO_CHAR(wfc_start_date, 'DD-MON-YYYY HH:MI:SS AM'), 'DD-MON-YYYY HH:MI:SS AM') start_date,
    TO_DATE(TO_CHAR(wfc_end_date, 'DD-MON-YYYY HH:MI:SS AM'), 'DD-MON-YYYY HH:MI:SS AM') end_date,
    TO_CHAR(wfc_start_date, 'DD-MON-YYYY HH:MI:SS AM') start_date_hours,
    TO_CHAR(wfc_end_date, 'DD-MON-YYYY HH:MI:SS AM') end_date_hours,
    pay_code_name,
    duration_dd_hh_mi_ss,
    wage_amount
FROM
    XX_pay_type a
WHERE
    person_number IN ('102', '103')
    AND pay_period_ending_date = '20-APR-2019'
The duration_dd_hh_mi_ss column has values like 00:4:32:00 and 00:3:20:00.
I want the sum(wage_amount) and sum(duration_dd_hh_mi_ss) from the SQL query, grouped against person_number. I also want the grand total of wage_amount. sum(duration_dd_hh_mi_ss) in this case would be 00:7:52:00.
I tried sum(wage_amount) over (partition by person_number), but I cannot get the sum of duration_dd_hh_mi_ss against person_number, nor the grand total of wage_amount.
If duration is the subtraction of end_date and start_date, then just use subtraction and skip step 1.
1. Otherwise use to_dsinterval(). But to be able to use it, we need to replace the first ':' in duration with a space, because to_dsinterval() does not accept that format:
select regexp_replace('00:2:32:00', ':', ' ', 1, 1) from dual; --> 00 2:32:00
select to_dsinterval(regexp_replace('00:2:32:00', ':', ' ', 1, 1)) from dual;
Now we have intervals, which can be summed with some effort. It is not trivial, but possible. Strings cannot be summed at all.
2. To get partial sums and grand totals in one query, use aggregation with CUBE, ROLLUP or GROUPING SETS. Here is an example with ROLLUP:
-- sample data
with t(center, id, name, duration, wage_amount) as (
select 'C1', '102', 'Tim', '01:2:00:00', 80 from dual union all
select 'C1', '102', 'Tim', '00:0:32:00', 70 from dual union all
select 'C2', '102', 'Tim', '00:2:00:00', 100 from dual union all
select 'C2', '103', 'Bob', '00:2:00:00', 120 from dual )
-- end of sample data
query:
select center, id, name, sum(wage_amount) amt,
numtodsinterval(sum(sysdate + to_dsinterval(regexp_replace(duration, ':', ' ', 1, 1))
- sysdate ), 'day') duration
from t
group by rollup((id, name), center)
Result:
CENTER ID NAME AMT DURATION
------ --- ---- ---------- -------------------
C1 102 Tim 150 +000000001 02:32:00
C2 102 Tim 100 +000000000 02:00:00
102 Tim 250 +000000001 04:32:00
C2 103 Bob 120 +000000000 02:00:00
103 Bob 120 +000000000 02:00:00
370 +000000001 06:32:00
You can also do it the way you started, using analytic sums, which will give you the totals in additional columns. Or make a union (union all) of your data and the grouping query.
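If you want to verify the interval totals outside the database, the DD:HH:MI:SS strings can be parsed and summed directly. A Python sketch using the sample rows from the answer above (same placeholder names and values):

```python
from collections import defaultdict
from datetime import timedelta

# (center, id, name, duration 'DD:HH:MI:SS', wage_amount) sample rows.
rows = [
    ("C1", "102", "Tim", "01:2:00:00", 80),
    ("C1", "102", "Tim", "00:0:32:00", 70),
    ("C2", "102", "Tim", "00:2:00:00", 100),
    ("C2", "103", "Bob", "00:2:00:00", 120),
]

def parse_duration(text):
    """Turn a 'DD:HH:MI:SS' string into a timedelta."""
    days, hours, minutes, seconds = (int(p) for p in text.split(":"))
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)

totals = defaultdict(lambda: [0, timedelta()])  # id -> [amount, duration]
for _center, person_id, _name, duration, amount in rows:
    totals[person_id][0] += amount
    totals[person_id][1] += parse_duration(duration)

grand_amount = sum(t[0] for t in totals.values())
grand_duration = sum((t[1] for t in totals.values()), timedelta())
print(dict(totals), grand_amount, grand_duration)
```

This reproduces the per-person subtotals and the grand total shown in the ROLLUP result (Tim: 250 and 1 day 04:32:00; grand total: 370 and 1 day 06:32:00).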
What is the data type of 'duration_dd_hh_mi_ss' column?