How to get the validity date range of a price from individual daily prices in SQL - sql

I have some prices for the month of January.
Date,Price
1,100
2,100
3,115
4,120
5,120
6,100
7,100
8,120
9,120
10,120
Now, the output I need is a non-overlapping date range for each price:
price,from,To
100,1,2
115,3,3
120,4,5
100,6,7
120,8,10
I need to do this using SQL only.
For now, if I simply group by price and take the min and max dates, I get the result below, where the ranges overlap:
price,from,to
100,1,7
115,3,3
120,4,10

This is a gaps-and-islands problem. The simplest solution is the difference of row numbers:
select price, min(date), max(date)
from (select t.*,
row_number() over (order by date) as seqnum,
row_number() over (partition by price order by date) as seqnum2
from t
) t
group by price, (seqnum - seqnum2)
order by min(date);
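Why this works is a little hard to explain, but if you look at the results of the subquery you will see it: within a run of adjacent rows with the same price, both row numbers increase by one on each row, so their difference stays constant, and it changes as soon as a different price intervenes.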
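To make the row-number trick concrete, here is a small sketch that runs the same query against the question's sample data. It assumes SQLite (through Python's built-in sqlite3 module, which needs SQLite 3.25 or later for window functions) rather than whatever database the asker uses; the table name t is taken from the query above.

```python
import sqlite3

# In-memory SQLite database holding the question's sample prices.
# (SQLite and the column types are assumptions made for this sketch.)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 100), (2, 100), (3, 115), (4, 120), (5, 120),
     (6, 100), (7, 100), (8, 120), (9, 120), (10, 120)],
)

# Inner query only: print both row numbers and their difference per row.
inner_rows = conn.execute("""
    SELECT date, price,
           ROW_NUMBER() OVER (ORDER BY date) AS seqnum,
           ROW_NUMBER() OVER (PARTITION BY price ORDER BY date) AS seqnum2
    FROM t
    ORDER BY date
""").fetchall()
for date, price, seqnum, seqnum2 in inner_rows:
    print(date, price, seqnum, seqnum2, seqnum - seqnum2)

# Full query: grouping on (price, difference) yields one row per island.
islands = conn.execute("""
    SELECT price, MIN(date) AS from_date, MAX(date) AS to_date
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (ORDER BY date) AS seqnum,
                 ROW_NUMBER() OVER (PARTITION BY price ORDER BY date) AS seqnum2
          FROM t) AS s
    GROUP BY price, (seqnum - seqnum2)
    ORDER BY MIN(date)
""").fetchall()
print(islands)
```

The difference alone is not unique (dates 4 to 7 all have a difference of 3 here), which is why price has to be part of the GROUP BY as well.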

SELECT Lag.price,Lag.[date] AS [From], MIN(Lead.[date]-Lag.[date])+Lag.[date] AS [to]
FROM
(
SELECT [date],[Price]
FROM
(
SELECT [date],[Price],LAG(Price) OVER (ORDER BY DATE,Price) AS LagID FROM #table1 A
)B
WHERE CASE WHEN Price <> ISNULL(LagID,1) THEN 1 ELSE 0 END = 1
)Lag
JOIN
(
SELECT [date],[Price]
FROM
(
SELECT [date],Price,LEAD(Price) OVER (ORDER BY DATE,Price) AS LeadID FROM [#table1] A
)B
WHERE CASE WHEN Price <> ISNULL(LeadID,1) THEN 1 ELSE 0 END = 1
)Lead
ON Lag.[Price] = Lead.[Price]
WHERE Lead.[date]-Lag.[date] >= 0
GROUP BY Lag.[date],Lag.[price]
ORDER BY Lag.[date]

Another method, using ROWS UNBOUNDED PRECEDING:
SELECT price, MIN([date]) AS [from], [end_date] AS [To]
FROM
(
SELECT *, MIN([abc]) OVER (ORDER BY DATE DESC ROWS UNBOUNDED PRECEDING ) end_date
FROM
(
SELECT *, CASE WHEN price = next_price THEN NULL ELSE DATE END AS abc
FROM
(
SELECT a.* , b.[date] AS next_date, b.price AS next_price
FROM #table1 a
LEFT JOIN #table1 b
ON a.[date] = b.[date]-1
)AA
)BB
)CC
GROUP BY price, end_date

Related

SQL - Return count of consecutive days where value was unchanged

I have a table like this:
date          ticker  Action
'2022-03-01'  AAPL    BUY
'2022-03-02'  AAPL    SELL
'2022-03-03'  AAPL    BUY
'2022-03-01'  CMG     SELL
'2022-03-02'  CMG     HOLD
'2022-03-03'  CMG     HOLD
'2022-03-01'  GPS     SELL
'2022-03-02'  GPS     SELL
'2022-03-03'  GPS     SELL
I want to group by ticker and then count how many consecutive preceding days the Action has had the same value it has as of the last date, which here is 2022-03-03. For this example table the result would be:
ticker  NumSequentialDaysAction
AAPL    0
CMG     1
GPS     2
It's fine to pass in 2022-03-03 as a value; I don't need to figure that out on the fly.
I tried something like this:
---Table Creation---
CREATE TABLE UserTable
([Date] DATETIME2, [Ticker] varchar(5), [Action] varchar(5))
;
INSERT INTO UserTable
([Date], [Ticker], [Action])
VALUES
('2022-03-01' , 'AAPL' , 'BUY'),
('2022-03-02' , 'AAPL' , 'SELL'),
('2022-03-03' , 'AAPL' , 'BUY'),
('2022-03-01' , 'CMG' , 'SELL'),
('2022-03-02' , 'CMG' , 'HOLD'),
('2022-03-03' , 'CMG' , 'HOLD'),
('2022-03-01' , 'GPS' , 'SELL'),
('2022-03-02' , 'GPS' , 'SELL'),
('2022-03-03' , 'GPS' , 'SELL')
;
---Attempted Solution---
I'm thinking that I need to do a subquery to get the last value and join it back to the table to get the matching values, then apply a window function, ordered by date, to check that the preceding value is sequential.
WITH CTE AS (SELECT Date, Ticker, Action,
ROW_NUMBER() OVER (PARTITION BY Ticker, Action ORDER BY Date) as row_num
FROM UserTable)
SELECT Ticker, COUNT(DISTINCT Date) as count_of_days
FROM CTE
WHERE row_num = 1
GROUP BY Ticker;
WITH CTE AS (SELECT Date, Ticker, Action,
DENSE_RANK() OVER (PARTITION BY Ticker ORDER BY Action,Date) as rank
FROM table)
SELECT Ticker, COUNT(DISTINCT Date) as count_of_days
FROM CTE
WHERE rank = 1
GROUP BY Ticker;
You can do this with the help of the LEAD function like so. You didn't specify which RDBMS you're using. This solution works in PostgreSQL:
WITH "withSequential" AS (
SELECT
ticker,
(LEAD("Action") OVER (PARTITION BY ticker ORDER BY date ASC) = "Action") AS "nextDayIsSameAction"
FROM UserTable
)
SELECT
ticker,
SUM(
CASE
WHEN "nextDayIsSameAction" IS TRUE THEN 1
ELSE 0
END
) AS "NumSequentialDaysAction"
FROM "withSequential"
GROUP BY ticker
Here is a way to do this using a gaps-and-islands solution.
Thanks for sharing the create and insert scripts, which helped to build the solution quickly.
dbfiddle link:
https://dbfiddle.uk/rZLDTrNR
with data
as (
select date
,ticker
,action
,case when lag(action) over(partition by ticker order by date) <> action then
1
else 0
end as marker
from usertable
)
,interim_data
as (
select *
,sum(marker) over(partition by ticker order by date) as grp_val
from data
)
,interim_data2
as (
select *
,count(*) over(partition by ticker,grp_val) - 1 as NumSequentialDaysAction -- minus 1: the last day itself is not counted
from interim_data
)
select ticker,NumSequentialDaysAction
from interim_data2
where date='2022-03-03'
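To see how the marker and the running sum carve the rows into islands, the sketch below prints the intermediate columns for the question's data. It assumes SQLite via Python's sqlite3 module (window functions need SQLite 3.25 or later); the column name island_size and the final dictionary are illustrative names that are not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usertable (date TEXT, ticker TEXT, action TEXT)")
conn.executemany(
    "INSERT INTO usertable VALUES (?, ?, ?)",
    [("2022-03-01", "AAPL", "BUY"), ("2022-03-02", "AAPL", "SELL"),
     ("2022-03-03", "AAPL", "BUY"), ("2022-03-01", "CMG", "SELL"),
     ("2022-03-02", "CMG", "HOLD"), ("2022-03-03", "CMG", "HOLD"),
     ("2022-03-01", "GPS", "SELL"), ("2022-03-02", "GPS", "SELL"),
     ("2022-03-03", "GPS", "SELL")],
)

rows = conn.execute("""
    WITH data AS (
        -- marker = 1 on every row whose action differs from the previous day
        SELECT date, ticker, action,
               CASE WHEN LAG(action) OVER (PARTITION BY ticker ORDER BY date) <> action
                    THEN 1 ELSE 0 END AS marker
        FROM usertable
    ),
    interim_data AS (
        -- the running sum of markers gives each island its own number
        SELECT *, SUM(marker) OVER (PARTITION BY ticker ORDER BY date) AS grp_val
        FROM data
    )
    SELECT ticker, date, action, marker, grp_val,
           COUNT(*) OVER (PARTITION BY ticker, grp_val) AS island_size
    FROM interim_data
    ORDER BY ticker, date
""").fetchall()
for row in rows:
    print(row)

# Size of the island containing the last date, minus the last day itself.
streaks = {ticker: size - 1
           for ticker, date, _action, _marker, _grp, size in rows
           if date == "2022-03-03"}
print(streaks)
```

Subtracting one from the island size gives the figures the question asks for, since the last day itself is not counted there.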
Another option is to use the difference between two row_numbers approach, as follows:
select [Ticker], count(*)-1 NumSequentialDaysAction -- you could use (distinct) to remove duplicate rows
from
(
select *,
row_number() over (partition by [Ticker] order by [Date]) -
row_number() over (partition by [Ticker], [Action] order by [Date]) grp
from UserTable
where [date] <= '2022-03-03'
) RN_Groups
/* get only rows where [Action] = last date [Action] */
where [Action] = (select top 1 [Action] from UserTable T
where T.[Ticker] = RN_Groups.[Ticker] and [date] <= '2022-03-03'
order by [Date] desc)
group by [Ticker], [Action], grp

SQL - get start & end balance for each member each year

I'd like to get the starting and ending balance for each member, for every year in which there is a record. For example, the query below gives me the latest balance for each member each year, based on the date column:
SELECT
T.MemberID,
T.DateCol,
T.Amount
FROM
(SELECT T.MemberID,
T.DateCol,
Amount,
ROW_NUMBER() OVER (PARTITION BY MemberID,
YEAR(DateCol)
ORDER BY
DateCol desc) AS seqnum
FROM
Tablet T
GROUP BY DateCol, MemberID, Amount
) T
WHERE
seqnum = 1 AND
MemberID = '1000009'
and the below would give me the earliest balance for each year
SELECT
T.MemberID,
T.DateCol,
T.Amount
FROM
(SELECT T.MemberID,
T.DateCol,
Amount,
ROW_NUMBER() OVER (PARTITION BY MemberID,
YEAR(DateCol)
ORDER BY
DateCol) AS seqnum
FROM
Tablet T
GROUP BY DateCol, MemberID, Amount
) T
WHERE
seqnum = 1 AND
MemberID = '1000009'
This would give me a result set like the below, column titles (MemberID, Date, Amount)
What I'm looking for is one query with YEAR, MEMBERID, STARTBALANCE and ENDBALANCE as the columns, which would look like the below.
What would be the best way to go about this?
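One possible way to combine the two queries is to compute both row numbers (ascending and descending by date) in a single subquery and then pick each one with a conditional aggregate. The sketch below is an illustration only: it assumes SQLite via Python's sqlite3 module, so strftime('%Y', ...) stands in for YEAR(), and the sample rows are made up because the question's data is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tablet (MemberID TEXT, DateCol TEXT, Amount INTEGER)")
# Hypothetical sample data: the original table contents are not available.
conn.executemany("INSERT INTO Tablet VALUES (?, ?, ?)", [
    ("1000009", "2019-01-15", 100),
    ("1000009", "2019-06-30", 250),
    ("1000009", "2019-12-20", 300),
    ("1000009", "2020-02-01", 320),
    ("1000009", "2020-11-11", 280),
])

rows = conn.execute("""
    SELECT yr, MemberID,
           MAX(CASE WHEN rn_first = 1 THEN Amount END) AS StartBalance,
           MAX(CASE WHEN rn_last  = 1 THEN Amount END) AS EndBalance
    FROM (SELECT MemberID, Amount,
                 strftime('%Y', DateCol) AS yr,
                 -- rn_first = 1 marks the earliest row of the member's year
                 ROW_NUMBER() OVER (PARTITION BY MemberID, strftime('%Y', DateCol)
                                    ORDER BY DateCol)      AS rn_first,
                 -- rn_last = 1 marks the latest row of the member's year
                 ROW_NUMBER() OVER (PARTITION BY MemberID, strftime('%Y', DateCol)
                                    ORDER BY DateCol DESC) AS rn_last
          FROM Tablet) AS t
    GROUP BY yr, MemberID
    ORDER BY yr, MemberID
""").fetchall()
print(rows)
```

On SQL Server the same shape should work with YEAR(DateCol) in place of strftime, though that variant has not been run here.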

PostgreSQL Percent Change using Row Number

I'm trying to find the percent change using row number with PostgreSQL but I'm running into an error where my "percent_change" column shows 0.
Here is what I have as my code.
WITH CTE AS (
SELECT date, sales, ROW_NUMBER() OVER (ORDER by date) AS rn
FROM sales_2019)
SELECT c1.date, c1.sales,
CAST(COALESCE (((c1.sales - c2.sales) * 1.0 / c2.sales) * 100, 0) AS INT) AS percent_change
FROM CTE AS c1
LEFT JOIN CTE AS c2
ON c1.date = c2.date AND c1.rn = c2.rn + 1
Here is my SQL table in case it's needed. Thank you in advance, I greatly appreciate it.
You can use LAG() for your requirement:
select
date,
sales,
round(coalesce((((sales-(lag(sales) over (order by date)))*1.0)/(lag(sales) over (order by date)))*100,0),2)
from sales_2019
Or you can try it with a WITH clause:
with cte as ( select
date,
sales,
coalesce(lag(sales) over (order by date),0) as previous_month
from sales_2019
)
select
date,
sales,
round( coalesce( (sales-previous_month)*1.0/nullif(previous_month,0),0 )*100,2)
from cte
EDIT as per requirement in comment
with cte as ( select
date_,
sales,
ROW_NUMBER() OVER (ORDER by date_) AS rn1,
ROW_NUMBER() OVER (ORDER by date_)-1 AS rn2
from sales_2019
)
select t1.date_,
t1.sales,
round( coalesce( (t1.sales-t2.sales)*1.0/nullif(t2.sales,0),0 )*100,2)
from cte t1 left join cte t2 on t1.rn2=t2.rn1
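As a quick check of the LAG() approach, the sketch below runs it on three made-up rows. It assumes SQLite via Python's sqlite3 module instead of PostgreSQL, and the sample values are hypothetical because the question's table is not shown. Note that the join in the question requires c1.date = c2.date and c1.rn = c2.rn + 1 at the same time, which cannot match when dates are unique, so every c2.sales is NULL and COALESCE turns the result into 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_2019 (date TEXT, sales INTEGER)")
# Hypothetical sample rows, chosen so the percentages are easy to check by hand.
conn.executemany("INSERT INTO sales_2019 VALUES (?, ?)", [
    ("2019-01-01", 100),
    ("2019-02-01", 150),
    ("2019-03-01", 120),
])

rows = conn.execute("""
    WITH cte AS (
        SELECT date, sales,
               LAG(sales) OVER (ORDER BY date) AS previous_sales
        FROM sales_2019
    )
    SELECT date, sales,
           -- NULLIF guards against division by zero; COALESCE covers the first row
           ROUND(COALESCE((sales - previous_sales) * 100.0
                          / NULLIF(previous_sales, 0), 0), 2) AS percent_change
    FROM cte
    ORDER BY date
""").fetchall()
print(rows)
```

The first row has no predecessor, so it reports 0; the other two report +50 and -20 percent.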

Oracle SQL: Show entries from component tables once apiece

My objective is to produce a dataset that shows a boatload of data from, in total, just shy of 50 tables, all in the same Oracle SQL database schema. Each table except the first consists of, as far as the report I'm building cares, two elements:
A foreign-key identifier that matches a row on the first table
A date
There may be many rows on one of these tables corresponding to one case, and it will NOT be the same number of rows from table to table.
My objective is to have each row in the first table show up as many times as needed to display all the results from the other tables once. So, something like this (except on a lot more tables):
CASE_FILE_ID INITIATED_DATE INSPECTION_DATE PAYMENT_DATE ACTION_DATE
------------ -------------- --------------- ------------ -----------
1000         10-JUL-1986    14-JUL-1987     10-JUL-1986
1000                        14-JUL-1988     10-JUL-1987
1000                        14-JUL-1989     10-JUL-1988
1000                                        10-JUL-1989
My current SQL code (shrunk down to five tables, but the rest all follow the same format as T1-T4):
SELECT DISTINCT
A.CASE_FILE_ID,
T1.DATE AS INITIATED_DATE,
T2.DATE AS INSPECTION_DATE,
T3.DATE AS PAYMENT_DATE,
T4.DATE AS ACTION_DATE
FROM
RECORDS.CASE_FILE A
LEFT OUTER JOIN RECORDS.INITIATE T1 ON A.CASE_FILE_ID = T1.CASE_FILE_ID
LEFT OUTER JOIN RECORDS.INSPECTION T2 ON A.CASE_FILE_ID = T2.CASE_FILE_ID
LEFT OUTER JOIN RECORDS.PAYMENT T3 ON A.CASE_FILE_ID = T3.CASE_FILE_ID
LEFT OUTER JOIN RECORDS.ACTION T4 ON A.CASE_FILE_ID = T4.CASE_FILE_ID
ORDER BY
A.CASE_FILE_ID
The problem is, the output this produces results in distinct combinations; so in the above example (where I added a 'WHERE' clause of A.CASE_FILE_ID = '1000'), instead of four rows for case 1000, it'd show twelve (1 Initiated Date * 3 Inspection Dates * 4 Payment Dates = 12 rows). Suffice it to say, as the number of tables increases, this would get very prohibitive in both display and runtime, very quickly.
What is the best way to get an output loosely akin to the ideal above, where any one date is only shown once? Failing that, is there a way to get it to only show as many lines for one CASE_FILE as it needs to show all the dates, even if some dates repeat within that?
There isn't a good way, but there are two ways. One method involves subqueries for each table and complex outer joins. The second involves subqueries and union all. Let's go with that one:
SELECT CASE_FILE_ID,
MAX(INITIATED_DATE) as INITIATED_DATE,
MAX(INSPECTION_DATE) as INSPECTION_DATE,
MAX(PAYMENT_DATE) as PAYMENT_DATE,
MAX(ACTION_DATE) as ACTION_DATE
FROM ((SELECT A.CASE_FILE_ID, NULL as INITIATED_DATE, NULL as INSPECTION_DATE,
NULL as PAYMENT_DATE, NULL as ACTION_DATE,
1 as seqnum
FROM RECORDS.CASE_FILE A
) UNION ALL
(SELECT CASE_FILE_ID, DATE as INITIATED_DATE, NULL as INSPECTION_DATE,
NULL as PAYMENT_DATE, NULL as ACTION_DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.INITIATE
) UNION ALL
(SELECT CASE_FILE_ID, NULL as INITIATED_DATE, DATE as INSPECTION_DATE,
NULL as PAYMENT_DATE, NULL as ACTION_DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.INSPECTION
) UNION ALL
(SELECT CASE_FILE_ID, NULL as INITIATED_DATE, NULL as INSPECTION_DATE,
DATE as PAYMENT_DATE, NULL as ACTION_DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.PAYMENT
) UNION ALL
(SELECT CASE_FILE_ID, NULL as INITIATED_DATE, NULL as INSPECTION_DATE,
NULL as PAYMENT_DATE, DATE as ACTION_DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.ACTION
)
) a
GROUP BY CASE_FILE_ID, seqnum;
Hmmm, a closely related solution is easier to maintain:
SELECT CASE_FILE_ID,
MAX(CASE WHEN type = 'INITIATED' THEN DATE END) as INITIATED_DATE,
MAX(CASE WHEN type = 'INSPECTION' THEN DATE END) as INSPECTION_DATE,
MAX(CASE WHEN type = 'PAYMENT' THEN DATE END) as PAYMENT_DATE,
MAX(CASE WHEN type = 'ACTION' THEN DATE END) as ACTION_DATE
FROM ((SELECT A.CASE_FILE_ID, NULL as TYPE, NULL as DATE,
1 as seqnum
FROM RECORDS.CASE_FILE A
) UNION ALL
(SELECT CASE_FILE_ID, 'INITIATED', DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.INITIATE
) UNION ALL
(SELECT CASE_FILE_ID, 'INSPECTION', DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.INSPECTION
) UNION ALL
(SELECT CASE_FILE_ID, 'PAYMENT', DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.PAYMENT
) UNION ALL
(SELECT CASE_FILE_ID, 'ACTION', DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.ACTION
)
) a
GROUP BY CASE_FILE_ID, seqnum;
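The UNION ALL plus ROW_NUMBER() idea can be tried out on a cut-down version of the question's example. This sketch assumes SQLite via Python's sqlite3 module rather than Oracle, drops the RECORDS schema prefix, uses only three of the component tables, and stores dates as ISO text; all of that is simplification for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE case_file  (case_file_id INTEGER);
    CREATE TABLE initiate   (case_file_id INTEGER, date TEXT);
    CREATE TABLE inspection (case_file_id INTEGER, date TEXT);
    CREATE TABLE payment    (case_file_id INTEGER, date TEXT);
    INSERT INTO case_file  VALUES (1000);
    INSERT INTO initiate   VALUES (1000, '1986-07-10');
    INSERT INTO inspection VALUES (1000, '1987-07-14'), (1000, '1988-07-14'),
                                  (1000, '1989-07-14');
    INSERT INTO payment    VALUES (1000, '1986-07-10'), (1000, '1987-07-10'),
                                  (1000, '1988-07-10'), (1000, '1989-07-10');
""")

rows = conn.execute("""
    SELECT case_file_id,
           MAX(CASE WHEN type = 'INITIATED'  THEN date END) AS initiated_date,
           MAX(CASE WHEN type = 'INSPECTION' THEN date END) AS inspection_date,
           MAX(CASE WHEN type = 'PAYMENT'    THEN date END) AS payment_date
    FROM (-- one placeholder row per case, so cases without any dates still appear
          SELECT case_file_id, NULL AS type, NULL AS date, 1 AS seqnum
          FROM case_file
          UNION ALL
          -- each component table numbers its own rows per case
          SELECT case_file_id, 'INITIATED', date,
                 ROW_NUMBER() OVER (PARTITION BY case_file_id ORDER BY date)
          FROM initiate
          UNION ALL
          SELECT case_file_id, 'INSPECTION', date,
                 ROW_NUMBER() OVER (PARTITION BY case_file_id ORDER BY date)
          FROM inspection
          UNION ALL
          SELECT case_file_id, 'PAYMENT', date,
                 ROW_NUMBER() OVER (PARTITION BY case_file_id ORDER BY date)
          FROM payment) AS a
    GROUP BY case_file_id, seqnum
    ORDER BY case_file_id, seqnum
""").fetchall()
for row in rows:
    print(row)
```

Case 1000 comes back as four rows, one per payment date, instead of the twelve combinations produced by the joins.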

T-SQL query to obtain the no of days an item was at the current price

Create table #sec_temp
(
sec_no varchar(10),
amount money,
price_date date
)
insert #sec_temp
values
('123ABC', 25, '2011-01-20'),
('123ABC', 25, '2011-01-19'),
('123ABC', 25, '2011-01-18'),
('123ABC', 20, '2011-01-15'),
('123ABC', 22, '2011-01-13'),
('456DEF', 22, '2011-01-13')
Problem: To list out the distinct sec_no with the latest price (amount) and the number of days it was at the current price. In this case,
Result:
sec_no amount no_of_days_at_price
123ABC 25 3 e.g. 01-18 to 01-20
456DEF 22 1 e.g. 01-13
select
a.sec_no,
a.amount,
No_of_days_at_price = COALESCE(DATEDIFF(d, c.price_date, a.price_date),0)
from (
select *, ROW_NUMBER() over (partition by sec_no order by price_date desc) rn
from #sec_temp) a
outer apply (
select top 1 *
from #sec_temp b
where a.sec_no=b.sec_no and a.amount <> b.amount
order by b.price_date desc
) c
where a.rn=1
The subquery A works out the greatest-1-per-group, which is to say the most recent price record for each sec_no. The subquery C finds the first prior record that holds a different price for the same sec_no. The difference in the two dates is the number of days sought. If you need it to be one for no prior date, change the end of the COALESCE line to 1 instead of 0.
EDITED for clarified question
To start counting from the first date equal to the current rate, use this query instead:
select
sec_no,
amount,
No_of_days_at_price = 1 + DATEDIFF(d, min(price_date), max(price_date))
from (
select *,
ROW_NUMBER() over (partition by sec_no order by price_date desc) rn,
ROW_NUMBER() over (partition by sec_no, amount order by price_date desc) rn2
from #sec_temp
) X
WHERE rn=rn2
group by sec_no, amount
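The rn = rn2 filter works because rn counts back from the latest row of a sec_no while rn2 counts back from the latest row of that sec_no at the same amount; the two agree only while every row so far carries the latest amount. A sketch of this on the question's data follows. It assumes SQLite via Python's sqlite3 module, so a regular table named sec_temp replaces #sec_temp and a julianday() difference replaces DATEDIFF.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sec_temp (sec_no TEXT, amount INTEGER, price_date TEXT)")
conn.executemany("INSERT INTO sec_temp VALUES (?, ?, ?)", [
    ("123ABC", 25, "2011-01-20"),
    ("123ABC", 25, "2011-01-19"),
    ("123ABC", 25, "2011-01-18"),
    ("123ABC", 20, "2011-01-15"),
    ("123ABC", 22, "2011-01-13"),
    ("456DEF", 22, "2011-01-13"),
])

rows = conn.execute("""
    SELECT sec_no, amount,
           -- julianday() difference stands in for T-SQL DATEDIFF(d, ...)
           1 + CAST(julianday(MAX(price_date)) - julianday(MIN(price_date))
                    AS INTEGER) AS no_of_days_at_price
    FROM (SELECT *,
                 ROW_NUMBER() OVER (PARTITION BY sec_no
                                    ORDER BY price_date DESC) AS rn,
                 ROW_NUMBER() OVER (PARTITION BY sec_no, amount
                                    ORDER BY price_date DESC) AS rn2
          FROM sec_temp) AS x
    WHERE rn = rn2          -- keeps only the most recent unbroken run
    GROUP BY sec_no, amount
    ORDER BY sec_no
""").fetchall()
print(rows)
```

For 123ABC only the three rows at 25 survive the filter, giving 3 days (01-18 to 01-20), which matches the result the question expects.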
And finally, if the required result is actually the number of days between the first date on which the price is equal to the current price and today, then the only part to change is this:
No_of_days_at_price = 1 + DATEDIFF(d, min(price_date), getdate())
Here's one approach, first looking up the latest price, and then the last price that was different:
select secs.sec_no
, latest.amount as price
, case when previous.price_date is null then 1
else datediff(day, previous.price_date, latest.price_date)
end as days_at_price
from (
select distinct sec_no
from #sec_temp
) secs
cross apply
(
select top 1 amount
, price_date
from #sec_temp
where sec_no = secs.sec_no
order by
price_date desc
) latest
outer apply
(
select top 1 price_date
from #sec_temp
where sec_no = secs.sec_no
and amount <> latest.amount
order by
price_date desc
) previous
This prints:
sec_no price days_at_price
123ABC 25,00 5
456DEF 22,00 1