Oracle Database Temporal Query Implementation - Collapse Date Ranges

This is the result of one of my queries:
SURGERY_D
---------
01-APR-05
02-APR-05
03-APR-05
04-APR-05
05-APR-05
06-APR-05
07-APR-05
11-APR-05
12-APR-05
13-APR-05
14-APR-05
15-APR-05
16-APR-05
19-APR-05
20-APR-05
21-APR-05
22-APR-05
23-APR-05
24-APR-05
26-APR-05
27-APR-05
28-APR-05
29-APR-05
30-APR-05
I want to collapse the continuous date ranges into intervals, for example
[01-APR-05, 07-APR-05], [11-APR-05, 16-APR-05] and so on.
In terms of temporal databases, I want to 'collapse' the dates. Any idea how to do that in Oracle? I am using version 11. I searched for it and read a book but couldn't find or understand how to do it. It might be simple, but everyone has their own flaws and Oracle is mine. Also, I am new to SO, so my apologies if I have violated any rules. Thank you!

You can take advantage of the ROW_NUMBER analytic function to generate a unique, sequential number for each of the records (we'll assign that number to the dates in ascending order).
Then you group the dates by the difference between the date and the generated number. Consecutive dates will have the same difference:
Date       Number  Difference (Date - Number)
01-APR-05       1  31-MAR-05   -- MIN(date_val) in group with diff. = 31-MAR-05
02-APR-05       2  31-MAR-05
03-APR-05       3  31-MAR-05
04-APR-05       4  31-MAR-05
05-APR-05       5  31-MAR-05
06-APR-05       6  31-MAR-05
07-APR-05       7  31-MAR-05   -- MAX(date_val) in group with diff. = 31-MAR-05
11-APR-05       8  03-APR-05   -- MIN(date_val) in group with diff. = 03-APR-05
12-APR-05       9  03-APR-05
13-APR-05      10  03-APR-05
14-APR-05      11  03-APR-05
15-APR-05      12  03-APR-05
16-APR-05      13  03-APR-05   -- MAX(date_val) in group with diff. = 03-APR-05
Finally, you select the minimal and maximal date in each of the groups to get the beginning and ending of each range.
Here's the query:
SELECT
  MIN(date_val) start_date,
  MAX(date_val) end_date
FROM (
  SELECT
    date_val,
    row_number() OVER (ORDER BY date_val) AS rn
  FROM date_tab
)
GROUP BY date_val - rn
ORDER BY 1
;
Output:
START_DATE END_DATE
------------ ----------
01-04-2005 07-04-2005
11-04-2005 16-04-2005
19-04-2005 24-04-2005
26-04-2005 30-04-2005
You can check how that works on SQL Fiddle: Dates ranges example
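The grouping trick can also be sketched outside the database. The following Python snippet is only an illustration of the same idea (subtract each date's position from the date and group by the result); the function name `collapse_dates` and the variable `surgery_days` are made up for this example and are not part of the original answer.

```python
from datetime import date, timedelta
from itertools import groupby


def collapse_dates(dates):
    """Collapse distinct dates into (start, end) runs of consecutive days."""
    ordered = sorted(set(dates))
    # Key = date minus its 1-based position; it stays constant within a run
    # of consecutive days, just like date_val - rn in the SQL query.
    keyed = ((d - timedelta(days=rn), d) for rn, d in enumerate(ordered, start=1))
    ranges = []
    for _, group in groupby(keyed, key=lambda pair: pair[0]):
        run = [d for _, d in group]
        ranges.append((run[0], run[-1]))
    return ranges


# The sample data from the question (April 2005).
april_days = (list(range(1, 8)) + list(range(11, 17))
              + list(range(19, 25)) + list(range(26, 31)))
surgery_days = [date(2005, 4, n) for n in april_days]

for start, end in collapse_dates(surgery_days):
    print(start, end)
```

Like the SQL version, this assumes one row per date; duplicates are dropped up front with `set` so they cannot shift the numbering.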

Related

Using Parameter within timestamp_trunc in SQL Query for DataStudio

I am trying to use a custom parameter within DataStudio. The data is hosted in BigQuery.
SELECT
  timestamp_trunc(o.created_at, @groupby) AS dateMain,
  count(o.id) AS total_orders
FROM `x.default.orders` o
GROUP BY 1
When I try this, it returns an error saying that "A valid date part name is required at [2:35]"
I basically need to group the dates using a parameter (e.g. day, week, month).
I have also included a screenshot of how I have created the parameter in Google DataStudio. There is a default value set which is "day".
A workaround that might do the trick here is to use a rollup in the group by with the different levels of aggregation of the date, since I am not sure you can pass a DS parameter to work like that.
See the following example for clarity:
with default_orders as (
  select timestamp'2021-01-01' as created_at, 1 as id
  union all
  select timestamp'2021-01-01', 2
  union all
  select timestamp'2021-01-02', 3
  union all
  select timestamp'2021-01-03', 4
  union all
  select timestamp'2021-01-03', 5
  union all
  select timestamp'2021-01-04', 6
),
final as (
  select
    count(id) as count_orders,
    timestamp_trunc(created_at, day) as days,
    timestamp_trunc(created_at, week) as weeks,
    timestamp_trunc(created_at, month) as months
  from
    default_orders
  group by
    rollup(days, weeks, months)
)
select * from final
The output, then, would be similar to the following:
count | days | weeks | months
------+------------+----------+----------
6 | null | null | null <- this, represents the overall (counted 6 ids)
2 | 2021-01-01| null | null <- this, the 1st rollup level (day)
2 | 2021-01-01|2020-12-27| null <- this, the 1st and 2nd (day, week)
2 | 2021-01-01|2020-12-27|2021-01-01 <- this, all of them
And so on.
When visualizing this in Data Studio, you have two options: set the metric aggregation to Avg instead of Sum, because, as you can see, each value of the day column is repeated at every rollup level; or add another step to the query that gets rid of the nulls, like this:
select
  *
from
  final
where
  days is not null and
  weeks is not null and
  months is not null
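To make the three aggregation levels concrete, here is a small Python sketch of truncating a timestamp to a day, week, or month chosen by a parameter, and counting orders per bucket. It is an illustration only: `truncate` and `count_orders` are hypothetical helpers, and the week is assumed to start on Sunday, which matches the 2020-12-27 value in the output above.

```python
from collections import Counter
from datetime import datetime, timedelta


def truncate(ts, part):
    """Truncate a datetime to the start of its day, week (Sunday-based) or month."""
    day_start = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if part == "day":
        return day_start
    if part == "week":
        # weekday(): Monday=0 ... Sunday=6, so step back to the previous Sunday.
        return day_start - timedelta(days=(day_start.weekday() + 1) % 7)
    if part == "month":
        return day_start.replace(day=1)
    raise ValueError(f"unsupported date part: {part}")


def count_orders(created_at_values, part):
    """Count orders per truncated timestamp for the requested date part."""
    return Counter(truncate(ts, part) for ts in created_at_values)


# Same sample orders as in the default_orders CTE.
orders = [datetime(2021, 1, 1), datetime(2021, 1, 1), datetime(2021, 1, 2),
          datetime(2021, 1, 3), datetime(2021, 1, 3), datetime(2021, 1, 4)]

for part in ("day", "week", "month"):
    print(part, dict(count_orders(orders, part)))
```

In the report itself, the equivalent choice would be made by picking which of the days, weeks, or months columns to use as the dimension.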

SELECT-SQL-Statement - Transformation of a single data record with a date period into several single data records per day

I have the following example records in a table that contains records with time periods (originally imported data):
ID  DateFrom    DateTo      Value
1   01.01.2021  03.01.2021  A
2   02.03.2021  06.03.2021  B
...
The data is imported as individual records into a separate table.
I would like to put the data records into the following form with a SELECT query in order to be able to check in the 2nd step whether all data were imported as a single data record:
ID  DateFrom    DateTo      Value
1   01.01.2021  01.01.2021  A
1   02.01.2021  02.01.2021  A
1   03.01.2021  03.01.2021  A
2   02.03.2021  02.03.2021  B
2   03.03.2021  03.03.2021  B
2   04.03.2021  04.03.2021  B
2   05.03.2021  05.03.2021  B
2   06.03.2021  06.03.2021  B
...
Unfortunately, I have a knot in my head and cannot find a query approach.
I am sure a hierarchical query suits here. The problem is that I still can't make it fit without using distinct.
This query works under the assumption that the "datefrom" and "dateto" columns are of the DATE type.
Replace "test_data" with the name of the table you store the dates in.
select td.id,
       qq.day_date,
       value
  from test_data td
  join (select distinct id,
               datefrom + level - 1 day_date
          from test_data
       connect by level <= (dateto - datefrom + 1)) qq
    on qq.id = td.id
 order by td.id, qq.day_date;
If dateto and datefrom are just varchars, you can convert them to dates using the to_date function.
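As a way to check the expected result independently of the database, the expansion can be sketched in Python. This is only an illustration; `expand_periods` and `periods` are names invented for the example, with the sample rows taken from the question.

```python
from datetime import date, timedelta


def expand_periods(rows):
    """Yield one (id, day, value) tuple per day of each (id, from, to, value) period."""
    for row_id, date_from, date_to, value in rows:
        # Both end points are inclusive, hence the + 1.
        for offset in range((date_to - date_from).days + 1):
            yield (row_id, date_from + timedelta(days=offset), value)


periods = [
    (1, date(2021, 1, 1), date(2021, 1, 3), "A"),
    (2, date(2021, 3, 2), date(2021, 3, 6), "B"),
]

for row in expand_periods(periods):
    print(row)
```

The `+ 1` in the range plays the same role as `dateto - datefrom + 1` in the connect by condition.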

PostgreSQL - Generate series using subqueries

Using PostgreSQL, I need to accomplish the following scenario. I have a table called routine, where I store start_date and end_date columns. I have another table called exercises, where I store all the data related with each exercise and finally, I have a table called routine_exercise where I create the relationship between the routine and the exercise. Each routine can have seven days (one day indicates the day of the week, e.g: 1 means Monday, etc) of exercises and each day can have one or more exercise. For example:
Exercise Table
Exercise ID  Name
1            Exercise 1
2            Exercise 2
3            Exercise 3
Routine Table
Routine ID  Name
1           Routine 1
2           Routine 2
3           Routine 3
Routine_Exercise Table
Exercise ID  Routine ID  Day
1            1           1
2            1           1
3            1           1
1            1           2
2            1           3
3            1           4
What I'm trying to do is generate a series from start_date to end_date (e.g. 03-25-2020 to 05-25-2020, two months) and assign to each date the routine day number it corresponds to.
For example, using the data in the Routine_Exercise Table, the user should only work out on days 1, 2, 3 and 4, so I would like to attach that number to each date. For example, something like this:
Expected Result
Date        Number
03-25-2020  1
03-26-2020  2
03-27-2020  3
03-28-2020  4
03-29-2020  null
03-30-2020  null
03-31-2020  null
04-01-2020  1
04-02-2020  2
04-03-2020  3
04-04-2020  4
04-05-2020  null
Any suggestions or different ideas on how to implement this? Another solution that doesn't require series?
Thanks in advance!
You can generate the dates between start and end input dates using generate_series and then do left join with your routine_exercise table as follows:
SELECT t.d, re.day
FROM generate_series(timestamp '2020-03-25', timestamp '2020-05-25',
                     interval '1 day') AS t(d)
LEFT JOIN (SELECT DISTINCT day FROM Routine_Exercise WHERE routine_id = 1) re
  ON mod(extract(day from (t.d - timestamp '2020-03-25'))::int, 7) + 1 = re.day;
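The join condition boils down to "days since the start date, modulo 7, plus 1". A minimal Python sketch of that arithmetic, assuming the routine for ROUTINE_ID = 1 uses days 1 to 4 as in the sample table (`assign_routine_days` is a hypothetical helper, not something from the question):

```python
from datetime import date, timedelta


def assign_routine_days(start, end, routine_days):
    """Pair each date in [start, end] with its routine day number, or None."""
    result = []
    for offset in range((end - start).days + 1):
        # Day number cycles 1..7, starting at 1 on the start date.
        day_number = offset % 7 + 1
        matched = day_number if day_number in routine_days else None
        result.append((start + timedelta(days=offset), matched))
    return result


schedule = assign_routine_days(date(2020, 3, 25), date(2020, 5, 25), {1, 2, 3, 4})
for day, number in schedule[:12]:
    print(day, number)
```

Dates whose day number is not in the set come out as None, which corresponds to the nulls produced by the left join.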

Exponential decay in SQL for different dates page views

I have different dates with the number of products viewed on a webpage over a 30-day time frame. I am trying to create an exponential decay model in SQL. I am using exponential decay because I want to weight the latest events more heavily than older ones. I am not sure how to write this in SQL without getting an error. I have never worked with this type of model before, so I also want to make sure I am doing it correctly.
=================================
Data looks like this
product views date
a 1 2014-05-15
a 2 2014-05-01
b 2 2014-05-10
c 4 2014-05-02
c 1 2014-05-12
d 3 2014-05-11
================================
Code:
create table decay model as
select product,views,date
case when......
from table abc
group by product;
not sure what to write to do the model
I want to penalize products that were viewed that were older vs products that were viewed more recently
Thank you for your help
You can do it like this:
Choose the partition within which you want to apply exponential decay, and order the rows in each partition by date, descending.
Use the function ROW_NUMBER() over that ordering to number the rows within each partition, so that the most recent row gets 1.
Calculate power(your_variable_in_[0,1], rownum-1) and multiply your value by it.
Code might look like this (might work in Oracle SQL or db2):
SELECT <your_partitioning>, date, <whatever>*power(<your_variable>,rownum-1)
FROM (SELECT a.*
           , ROW_NUMBER() OVER (PARTITION BY <your_partitioning> ORDER BY a.date DESC) AS rownum
      FROM YOUR_TABLE a)
ORDER BY <your_partitioning>, date DESC
EDIT: I read again over your problem and think I understood now what you asked for, so here is a solution which might work (decay factor is 0.9 here):
SELECT product, sum(adjusted_views)                                               -- (i)
FROM (SELECT product, views*power(0.9, rownum-1) AS adjusted_views, date, rownum  -- (ii)
      FROM (SELECT product, views, date                                           -- (iii)
                 , ROW_NUMBER() OVER (PARTITION BY product ORDER BY a.date DESC) AS rownum
            FROM YOUR_TABLE a)
      ORDER BY product, date DESC)
GROUP BY product
The inner select statement (iii) creates a temporary table that might look like this
product views date rownum
--------------------------------------------------
a 1 2014-05-15 1
a 2 2014-05-14 2
a 2 2014-05-13 3
b 2 2014-05-10 1
b 3 2014-05-09 2
b 2 2014-05-08 3
b 1 2014-05-07 4
The next query (ii) then uses the rownumber to construct an exponentially decaying factor 0.9^(rownum-1) and applies it to views. The result is
product adjusted_views date rownum
--------------------------------------------------
a 1 * 0.9^0 2014-05-15 1
a 2 * 0.9^1 2014-05-14 2
a 2 * 0.9^2 2014-05-13 3
b 2 * 0.9^0 2014-05-10 1
b 3 * 0.9^1 2014-05-09 2
b 2 * 0.9^2 2014-05-08 3
b 1 * 0.9^3 2014-05-07 4
In a last step (the outer query) the adjusted views are summed up, as this seems to be the quantity you are interested in.
Note, however, that for the weighting to be consistent, the dates should be regularly spaced, e.g. always one day apart (not one day here and a month there, because those gaps would be weighted in the same way although they shouldn't be).
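For a quick check of the arithmetic, the same rank-based weighting can be sketched in Python. This is an illustration under the same assumptions as the query (decay factor 0.9, weight determined by the row's position rather than by the actual gap between dates); `decayed_views` and `page_views` are names invented for the example, with the rows taken from the question.

```python
from collections import defaultdict
from datetime import date


def decayed_views(rows, decay=0.9):
    """Sum views per product, weighting the n-th most recent row by decay**(n-1)."""
    by_product = defaultdict(list)
    for product, views, viewed_on in rows:
        by_product[product].append((viewed_on, views))

    totals = {}
    for product, entries in by_product.items():
        # Most recent first, so rank 0 (weight 1) goes to the latest date.
        entries.sort(key=lambda entry: entry[0], reverse=True)
        totals[product] = sum(views * decay ** rank
                              for rank, (_, views) in enumerate(entries))
    return totals


page_views = [
    ("a", 1, date(2014, 5, 15)),
    ("a", 2, date(2014, 5, 1)),
    ("b", 2, date(2014, 5, 10)),
    ("c", 4, date(2014, 5, 2)),
    ("c", 1, date(2014, 5, 12)),
    ("d", 3, date(2014, 5, 11)),
]

for product, total in sorted(decayed_views(page_views).items()):
    print(product, round(total, 3))
```

For product a this gives 1 * 0.9^0 + 2 * 0.9^1, regardless of the two-week gap between its rows, which is exactly the limitation noted above.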

Rank based on sequence of dates

I have data as below
Heading Date
A 2009-02-01
B 2009-02-03
c 2009-02-05
d 2009-02-06
e 2009-02-08
I need rank as below
Heading Date Rank
A 2009-02-01 1
B 2009-02-03 2
c 2009-02-05 1
d 2009-02-06 2
e 2009-02-07 3
I need the rank based on date. If the dates are continuous, the rank should be 1, 2, 3 etc. If there is any break in the dates, I need to start over with 1, 2, ...
Can anyone help me with this?
SELECT heading, thedate
      ,row_number() OVER (PARTITION BY grp ORDER BY thedate) AS rn
FROM  (
   SELECT *, thedate - (row_number() OVER (ORDER BY thedate))::int AS grp
   FROM   demo
   ) sub;
While you speak of "rank" you seem to want the result of the window function row_number().
Form groups of consecutive days (same date in grp) in subquery sub.
Number rows with another row_number() call, this time partitioned by grp.
One subquery is the bare minimum here, since window functions cannot be nested.
SQL Fiddle.
Note that I went with the second version of your contradictory sample data. And the result is as @mu suggested in his comment.
Also assuming that there are no duplicate dates. You'd have to aggregate first in this case.
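The two row_number() steps can be traced with a short Python sketch. It is only an illustration of the same logic, using the second version of the sample data and assuming no duplicate dates; `number_within_runs` and `demo` are names chosen for the example.

```python
from datetime import date, timedelta


def number_within_runs(rows):
    """Number (heading, date) rows 1, 2, 3, ... within each run of consecutive days."""
    ordered = sorted(rows, key=lambda row: row[1])
    result = []
    previous_grp = None
    rn_in_group = 0
    for rn, (heading, thedate) in enumerate(ordered, start=1):
        # Same idea as "thedate - row_number()": constant within a run.
        grp = thedate - timedelta(days=rn)
        rn_in_group = rn_in_group + 1 if grp == previous_grp else 1
        previous_grp = grp
        result.append((heading, thedate, rn_in_group))
    return result


demo = [
    ("A", date(2009, 2, 1)),
    ("B", date(2009, 2, 3)),
    ("c", date(2009, 2, 5)),
    ("d", date(2009, 2, 6)),
    ("e", date(2009, 2, 7)),
]

for heading, thedate, rn in number_within_runs(demo):
    print(heading, thedate, rn)
```

With this data, A and B each start their own group because neither is adjacent to another date, while c, d and e are numbered 1, 2, 3.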
Hi, this is not a correct answer yet, I am still trying. It is interesting. :) I am posting what I have got so far: sqlfiddle
SELECT
rank() over (order by thedate asc) as rank,
heading, thedate
FROM
demo
Order by
rank asc;
Now I am trying to detect the breaks in the dates, but I don't know how yet. These links may be useful:
SQL - computing end dates from a given start date with arbitrary breaks
How to rank in postgres query
I will update if I get anything.
Edit:
I got this for MySQL; I am posting it because it may be helpful. Check Emulate Row_Number() here.
Given a table with two columns i and j, generate a resultset that has
a derived sequential row_number column taking the values 1,2,3,... for
a defined ordering of j which resets to 1 when the value of i changes
Bangalore BLR - Bagmane Tech Park 2013-10-11 Data Centre 0
Bangalore BLR - Bagmane Tech Park 2013-10-11 BMS 0
Bangalore BLR - Bagmane Tech Park 2013-10-12 BMS 0
Bangalore BLR - Bagmane Tech Park 2013-10-15 BMS 3
I have data like this.
If the last column is zero, the rank should be based on all columns. If the dates are continuous,
like 2013-10-11, 2013-10-12, the rank should be 1, 2, ...
If there is any break in the dates, as between 2013-10-11, 2013-10-12 and 2013-10-15, the rank should start again from 1 for 2013-10-15.