Oracle Order By Seconds - sql

I have a table like the following:
TIME Quantity
200918 122
200919 333
200919 500
181222 32
181223 43
The output I would like for the above data is:
Time Quantity
200919 955
181223 75
Essentially I want to group based on a tolerance of one second and sum up the quantity taking the latest time. Any pointers?
Thanks

You can use lag() and cumulative group by:
select max(time), sum(quantity)
from (select t.*,
sum(case when prev_time < time - 1 then 1 else 0 end) over (order by time) as grp
from (select t.*, lag(time) over (order by time) as prev_time
from t
) t
)
group by grp;

Related

How to do a Min and Max of date but following the changes in price points

I'm not really sure how to word this question better so I'll provide the data that I have and the result that I'm after.
This is the data that I have
sku sales qty date
A 100 1 1-Jan-19
A 200 2 2-Jan-19
A 100 1 3-Jan-19
A 240 2 4-Jan-19
A 360 3 5-Jan-19
A 360 4 6-Jan-19
A 200 2 7-Jan-19
A 90 1 8-Jan-19
B 100 1 9-Jan-19
B 200 2 10-Jan-19
And this is the result that I'm after
sku price sum(qty) sum(sales) min(date) max(date)
A 100 4 400 1-Jan-19 3-Jan-19
A 120 5 600 4-Jan-19 5-Jan-19
A 90 4 360 6-Jan-19 6-Jan-19
A 100 2 200 7-Jan-19 7-Jan-19
A 90 1 90 8-Jan-19 8-Jan-19
B 100 3 300 9-Jan-19 10-Jan-19
As you can see, I'm trying to get the min and max date of each price point, where price = sales/qty. At this point, I can get the min and max date of the same price but I can separate it when there's another price in between. I think I have to use some sort of min(date) over (partition by sales/qty order by date) but I can't figure it out yet.
I'm using Redshift SQL
This is a gaps-and-islands query. You can do this by generating a sequence and subtracting that from the date. Then aggregate:
select sku, price, sum(qty), sum(sales),
min(date), max(date)
from (select t.*,
row_number() over (partition by sku, price order by date) as seqnum
from t
) t
group by sku, price, (date - seqnum * interval '1 day')
order by sku, price, min(date);
You can do with Sub Query and LAG
FIDDLE DEMO
SELECT SKU, Price, SUM(Qty) SumQty, SUM(Sales) SumSales, MIN(date) MinDate, MAX(date) MaxDate
FROM (
SELECT SKU,Price,SUM(is_change) OVER(order by SKU, date) is_change,Sales, Qty,date
FROM (SELECT SKU, Sales/Qty AS Price, Sales, Qty,date,
CASE WHEN Sales/Qty = lag(Sales/Qty) over (order by SKU, date)
and SKU = lag(SKU) OVER (order by SKU, date) then 0 ELSE 1 END AS is_change
FROM Tbl
)InnerSelect
) X GROUP BY sku, price,is_change
ORDER BY SKU,MIN(date)
Output

Find the longest streak of perfect scores per player

I have a the following result from a SELECT query with ORDER BY player_id ASC, time ASC in PostgreSQL database:
player_id points time
395 0 2018-06-01 17:55:23.982413-04
395 100 2018-06-30 11:05:21.8679-04
395 0 2018-07-15 21:56:25.420837-04
395 100 2018-07-28 19:47:13.84652-04
395 0 2018-11-27 17:09:59.384-05
395 100 2018-12-02 08:56:06.83033-05
399 0 2018-05-15 15:28:22.782945-04
399 100 2018-06-10 12:11:18.041521-04
454 0 2018-07-10 18:53:24.236363-04
675 0 2018-08-07 20:59:15.510936-04
696 0 2018-08-07 19:09:07.126876-04
756 100 2018-08-15 08:21:11.300871-04
756 100 2018-08-15 16:43:08.698862-04
756 0 2018-08-15 17:22:49.755721-04
756 100 2018-10-07 15:30:49.27374-04
756 0 2018-10-07 15:35:00.975252-04
756 0 2018-11-27 19:04:06.456982-05
756 100 2018-12-02 19:24:20.880022-05
756 100 2018-12-04 19:57:48.961111-05
I'm trying to find each player's longest streak where points = 100, with the tiebreaker being whichever streak began most recently. I also need to determine the time at which that player's longest streak began. The expected result would be:
player_id longest_streak time_began
395 1 2018-12-02 08:56:06.83033-05
399 1 2018-06-10 12:11:18.041521-04
756 2 2018-12-02 19:24:20.880022-05
A gaps-and-islands problem indeed.
Assuming:
"Streaks" are not interrupted by rows from other players.
All columns are defined NOT NULL. (Else you have to do more.)
This should be simplest and fastest as it only needs two fast row_number() window functions:
SELECT DISTINCT ON (player_id)
player_id, count(*) AS seq_len, min(ts) AS time_began
FROM (
SELECT player_id, points, ts
, row_number() OVER (PARTITION BY player_id ORDER BY ts)
- row_number() OVER (PARTITION BY player_id, points ORDER BY ts) AS grp
FROM tbl
) sub
WHERE points = 100
GROUP BY player_id, grp -- omit "points" after WHERE points = 100
ORDER BY player_id, seq_len DESC, time_began DESC;
db<>fiddle here
Using the column name ts instead of time, which is a reserved word in standard SQL. It's allowed in Postgres, but with limitations and it's still a bad idea to use it as identifier.
The "trick" is to subtract row numbers so that consecutive rows fall in the same group (grp) per (player_id, points). Then filter the ones with 100 points, aggregate per group and return only the longest, most recent result per player.
Basic explanation for the technique:
Select longest continuous sequence
We can use GROUP BY and DISTINCT ON in the same SELECT, GROUP BY is applied before DISTINCT ON. Consider the sequence of events in a SELECT query:
Best way to get result count before LIMIT was applied
About DISTINCT ON:
Select first row in each GROUP BY group?
This is a gap and island problem, you can try to use SUM condition aggravated function with window function, getting gap number.
then use MAX and COUNT window function again.
Query 1:
WITH CTE AS (
SELECT *,
SUM(CASE WHEN points = 100 THEN 1 END) OVER(PARTITION BY player_id ORDER BY time) -
SUM(1) OVER(ORDER BY time) RN
FROM T
)
SELECT player_id,
MAX(longest_streak) longest_streak,
MAX(cnt) longest_streak
FROM (
SELECT player_id,
MAX(time) OVER(PARTITION BY rn,player_id) longest_streak,
COUNT(*) OVER(PARTITION BY rn,player_id) cnt
FROM CTE
WHERE points > 0
) t1
GROUP BY player_id
Results:
| player_id | longest_streak | longest_streak |
|-----------|-----------------------------|----------------|
| 756 | 2018-12-04T19:57:48.961111Z | 2 |
| 399 | 2018-06-10T12:11:18.041521Z | 1 |
| 395 | 2018-12-02T08:56:06.83033Z | 1 |
One way to do this is to look at how many rows between the previous and next non-100 results. To get the lengths of the streaks:
with s as (
select s.*,
row_number() over (partition by player_id order by time) as seqnum,
count(*) over (partition by player_id) as cnt
from scores s
)
select s.*,
coalesce(next_seqnum, cnt + 1) - coalesce(prev_seqnum, 0) - 1 as length
from (select s.*,
max(seqnum) filter (where score <> 100) over (partition by player_id order by time) as prev_seqnum,
max(seqnum) filter (where score <> 100) over (partition by player_id order by time) as next_seqnum
from s
) s
where score = 100;
You can then incorporate the other conditions:
with s as (
select s.*,
row_number() over (partition by player_id order by time) as seqnum,
count(*) over (partition by player_id) as cnt
from scores s
),
streaks as (
select s.*,
coalesce(next_seqnum - prev_seqnum) over (partition by player_id) as length,
max(next_seqnum - prev_seqnum) over (partition by player_id) as max_length,
max(next_seqnum) over (partition by player_id) as max_next_seqnum
from (select s.*,
coalesce(max(seqnum) filter (where score <> 100) over (partition by player_id order by time), 0) as prev_seqnum,
coalesce(max(seqnum) filter (where score <> 100) over (partition by player_id order by time), cnt + 1) as next_seqnum
from s
) s
where score = 100
)
select s.*
from streaks s
where length = max_length and
next_seqnum = max_next_seqnum;
Here is my answer
select
user_id,
non_streak,
streak,
ifnull(non_streak,streak) strk,
max(time) time
from (
Select
user_id,time,
points,
lag(points) over (partition by user_id order by time) prev_point,
case when points + lag(points) over (partition by user_id order by time) = 100 then 1 end as non_streak,
case when points + lag(points) over (partition by user_id order by time) > 100 then 1 end as streak
From players
) where ifnull(non_streak,streak) is not null
group by 1,2,3
order by 1,2
) group by user_id`

Teradara SQL - Operation with max-min dates

suppose I have the following data frame in Reradata SQL.
How can I get the variation between the highest and lowest date, at user level? Regards
Initial table
user date price
1 1-1 10
1 2-1 20
1 3-1 30
2 1-1 12
2 2-1 22
2 3-1 32
3 1-1 13
3 2-1 23
3 3-1 33
Final table
user var_price
1 30/10-1
2 32/12-1
3 33/13-1
Try this-
SELECT B.[user],
CAST(SUM(B.max_price) AS VARCHAR)+'/'+CAST(SUM(B.min_price) AS VARCHAR)+ '-1' var_price,
SUM(B.max_price)/SUM(B.min_price) -1 calculated_var_price
FROM
(
SELECT * FROM
(
SELECT [user],0 max_price,price min_price,ROW_NUMBER() OVER (PARTITION BY [user] ORDER BY DATE) RN
FROM your_table
)A WHERE RN = 1
UNION ALL
SELECT * FROM
(
SELECT [user],price max_price,0 min_price, ROW_NUMBER() OVER (PARTITION BY [user] ORDER BY DATE DESC) RN
FROM your_table
)A WHERE RN = 1
)B
GROUP BY B.[user]
Output is-
user var_price calculated_var_price
1 30/10-1 2
2 32/12-1 1
3 33/13-1 1
Is this what you want?
select user, max(price) / min(price) - 1
from t
group by user;
Your values are monotonically increasing, so max() and min() seems like the simplest solution.
EDIT:
You can use window functions:
select user, max(last_price) / max(first_price) - 1
from (select t.*,
first_value(price) over (partition by user order by date rows between unbounded preceding and current_row) as first_price,
first_value(price) over (partition by user order by date desc rows between unbounded preceding and current_row) as last_price
from t
) t
group by user;
select user
,price as first_price
,last_value(price)
over (paritition by user
order by date
rows between unbounded preceding and unbounded following) as last_price
from mytab
qualify
row_number() -- lowest date only
over (paritition by user
order by date) = 1
This returns the row with the lowest date and adds the price of the latest date

Aging bucket in SQL query

I have a table which has member ID along with LM_Conversion_date and retired_date. I have managed to get the difference between two date but now i would like to have the aging bucket and reflect those membership number which falls under those bucket. Here is my table example and how i want to see the data,
Member_no LM_Conversion_date Retired_date Date_difference
100026 08/12/2017 31/12/2017 23
100114 31/08/2017 31/08/2017 0
100620 15/09/2017 30/09/2017 15
100726 10/01/2017 31/12/2016 -10
I want the output to be
All negative 0-15 15-30 >30
100726 100114 100026
100620
Any Help will be much appreciated
You can do this using conditional aggregation:
select max(case when grp = '<0' then member_no end) as all_negative,
max(case when grp = '<=15' then member_no end) as [0-15],
max(case when grp = '<=30' then member_no end) as [15-30],
max(case when grp = '>30' then member_no end) as [>30]
from (select t.*, v.grp,
row_number() over (partition by grp order by member_no) as seqnum
from t cross apply
(values (case when date_difference <= 0 then '<0'
when date_difference <= 15 then '<=15'
when date_difference <= 30 then '<=30'
else '>30'
end)
) v(grp)
) t
group by seqnum
order by seqnum;
The subquery basically enumerates the members in each group. These are aggregated into separate rows by the aggregation.

SQL partition by on date range

Assume this is my table:
ID NUMBER DATE
------------------------
1 45 2018-01-01
2 45 2018-01-02
2 45 2018-01-27
I need to separate using partition by and row_number where the difference between one date and another is greater than 5 days. Something like this would be the result of the above example:
ROWNUMBER ID NUMBER DATE
-----------------------------
1 1 45 2018-01-01
2 2 45 2018-01-02
1 3 45 2018-01-27
My actual query is something like this:
SELECT ROW_NUMBER() OVER(PARTITION BY NUMBER ODER BY ID DESC) AS ROWNUMBER, ...
But as you can notice, it doesn't work for the dates. How can I achieve that?
You can use lag function :
select *, row_number() over (partition by number, grp order by id) as [ROWNUMBER]
from (select *, (case when datediff(day, lag(date,1,date) over (partition by number order by id), date) <= 1
then 1 else 2
end) as grp
from table
) t;
by using lag and datediff funtion
select * from
(
select t.*,
datediff(day,
lag(DATE) over (partition by NUMBER order by id),
DATE
) as diff
from t
) as TT where diff>5
http://sqlfiddle.com/#!18/130ae/11
I think you want to identify the groups, using lag() and datediff() and a cumulative sum. Then use row_number():
select t.*,
row_number() over (partition by number, grp order by date) as rownumber
from (select t.*,
sum(grp_start) over (partition by number order by date) as grp
from (select t.*,
(case when lag(date) over (partition by number order by date) < dateadd(day, 5, date)
then 1 else 0
end) as grp_start
from t
) t
) t;