SQL to find the date when the price last changed - sql

Input:
Date Price
12/27 5
12/21 5
12/20 4
12/19 4
12/15 5
Required Output:
The earliest date when the price was set in comparison to the current price.
For e.g., price has been 5 since 12/21.
The answer cannot be 12/15 as we are interested in finding the earliest date where the price was the same as the current price without changing in value(on 12/20, the price has been changed to 4)

This should be about right. You didn't provide table structures or names, so...
DECLARE #CurrentPrice MONEY
SELECT TOP 1 #CurrentPrice=Price FROM Table ORDER BY Date DESC
SELECT MIN(Date) FROM Table WHERE Price=#CurrentPrice AND Date>(
SELECT MAX(Date) FROM Table WHERE Price<>#CurrentPrice
)
In one query:
SELECT MIN(Date)
FROM Table
WHERE Date >
( SELECT MAX(Date)
FROM Table
WHERE Price <>
( SELECT TOP 1 Price
FROM Table
ORDER BY Date DESC
)
)

This question kind of makes no sense so im not 100% sure what you are after.
create four columns, old_price, new_price, old_date, new_date.
! if old_price === new_price, simply print the old_date.

What database server are you using? If it was Oracle, I would use their windowing function. Anyway, here is a quick version that works in mysql:
Here is the sample data:
+------------+------------+---------------+
| date | product_id | price_on_date |
+------------+------------+---------------+
| 2011-01-01 | 1 | 5 |
| 2011-01-03 | 1 | 4 |
| 2011-01-05 | 1 | 6 |
+------------+------------+---------------+
Here is the query (it only works if you have 1 product - will have to add a "and product_id = ..." condition on the where clause if otherwise).
SELECT p.date as last_price_change_date
FROM test.prices p
left join test.prices p2 on p.product_id = p2.product_id and p.date < p2.date
where p.price_on_date - p2.price_on_date <> 0
order by p.date desc
limit 1
In this case, it will return "2011-01-03".
Not a perfect solution, but I believe it works. Have not tested on a larger dataset, though.
Make sure to create indexes on date and product_id, as it will otherwise bring your database server to its knees and beg for mercy.
Bernardo.

Related

How to check if date is in parallel time interval?

I have data like this:
TimeID | StartDate | EndDate | Price
000001 | 03.04.20 | 10.10.20 | 12
000002 | 01.02.20 | 31.12.99 | 13
000003 | 01.01.20 | 31.01.20 | 15
For a given date eg. 05.05.20 I want to get the cheapest price.
This Date would fit in TimeID 1 and 2. But 1 would be cheaper.
The needet price would be 12.
Is it possible to group intervals or how can i find the lowest price in parallel time intervals?
I found a solution:
select TOP 1 Price from Tab
where StartDate <= '05.05.20'
and EndDate >= '05.05.20'
order by Price
This gives me the lowest price for fiting time intervals.
If there is a more elegant solution, let me know.
select top 1 * from table_name where date> startdate and date <enddate order by price
A slightly simpler query is to use the min function, that returns the lowest value on the selected records (interval).
select min(Price) from Tab where '05.05.20' between StartDate and EndDate
I've also used the between operator to simplify a couple of conditions xx >= aa and xx <= bb

SQL: Calculate number of days since last success

Following table represents results of given test.
Every result for the same test is either pass ( error_id=0) or fail ( error_id <> 0)
I need help to write a query, that returns the number of runs since last good run ( error_id= 0) and the date.
| Date | test_id | error_id |
-----------------------------------
| 2019-12-20 | 123 | 23
| 2019-12-19 | 123 | 23
| 2019-12-17 | 123 | 22
| 2019-12-18 | 123 | 0
| 2019-12-16 | 123 | 11
| 2019-12-15 | 123 | 11
| 2019-12-13 | 123 | 11
| 2019-12-12 | 123 | 0
So the result for this example should be:
| 2019-12-18 | 123 | 4
as the test 123 was PASS on 2019-12-18 and this happened 4 runs ago.
I have a query to determine whether given run is error or not, but I have trouble applying appropriate window function to it to get the wanted result
select test_id, Date, error_id, (CASE WHEN error_id 0 THEN 1 ELSE 0 END) as is_error
from testresults
You can generate a row number, in reverse order from the sorting of the query itself:
SELECT test_date, test_id, error_code,
(row_number() OVER (ORDER BY test_date asc) - 1) as runs_since_last_pass
FROM tests
WHERE test_date >= (SELECT MAX(test_date) FROM tests WHERE error_code=0)
ORDER BY test_date DESC
LIMIT 1;
Note that this will run into issues if test_date is not unique. Better use a timestamp (precise to the millisecond) instead of a date.
Here's a DBFiddle: https://www.db-fiddle.com/f/8gSHVcXMztuRiFcL8zLeEx/0
If there's more than one test_id, you'll want to add a PARTITION BY clause to the row number function, and the subquery would become a bit more complex. It may be more efficient to come up with a way to do this by a JOIN instead of a subquery, but it would be more cognitively complex.
I think you just want aggregation and some filtering:
select id, count(*),
max(date) over (filter where error_id = 0) as last_success_date
from t
where date > (select max(t2.date) from t t2 where t2.error_id = 0);
group by id;
You have to use the Maximum date of the good runs for every test_id in your query. You can try this query:
select tr2.Date_error, tr.test_id, count(tr.error_id) from
testresults tr inner join (select max(Date_error), test_id
from testresult where error_id=0 group by test_id) tr2 on
tr.test_id=tr2.test_id and tr.date_error >=tr2.date_error
group by test_id
This should do the trick:
select count(*) from table t,
(select max(date) date from table where error_id = 0) good
where t.date >= good.date
Basically you are counting the rows that have a date >= the date of the last success.
Please note: If you need the number of days, it is a complete different query:
select now()::date - max(test_date) last_valid from tests
where error_code = 0;

PostgreSQL: How to write a query for this scenario

I have this below table.
+_______+________+__________+________+
|Playid |billid| amount | Date |
+_______+________+__________+________+
|123 | 345 | 144.9 | 2015-09|
|123 | 456 | 200 | 2015-10|
+_______+________+__________+________+
I need to write a query to show only the bill amount that has most recent transaction date (Date) like below.
+_______+________+__________+________+
|Playid |billid| amount | Date |
+_______+________+__________+________+
|123 | 456 | 200 | 2015-10|
+_______+________+__________+________+
Please help me how do I do it.
MAX(Date) can be used if you want to display only the playid and the most recent date.
However, The issue with what you are trying to do, is that you want to display all the columns. And this where the ranking functions come into play. In this case you can use the row_number function like this:
SELECT PlayId, billid, amount, date
FROM
(
SELECT
PlayId, billid, amount, date,
row_number() over(partition by playid order by date dec) as rn
FROM tablename
) t
where rn = 1
The row_number() over(partition by playid order by date dec) will give each group of playid a ranking number, the first one (the lowest one) will be the one with the most recent date. Then you just need to filter on the row number equal to 1.
Postgres offers distinct on. This is simpler to write and often has the best performance:
select distinct on (playid) t.*
from t
order by playid, order by date desc;

SQL grouping by datetime with a maximum difference of x minutes

I have a problem with grouping my dataset in MS SQL Server.
My table looks like
# | CustomerID | SalesDate | Turnover
---| ---------- | ------------------- | ---------
1 | 1 | 2016-08-09 12:15:00 | 22.50
2 | 1 | 2016-08-09 12:17:00 | 10.00
3 | 1 | 2016-08-09 12:58:00 | 12.00
4 | 1 | 2016-08-09 13:01:00 | 55.00
5 | 1 | 2016-08-09 23:59:00 | 10.00
6 | 1 | 2016-08-10 00:02:00 | 5.00
Now I want to group the rows where the SalesDate difference to the next row is of a maximum of 5 minutes.
So that row 1 & 2, 3 & 4 and 5 & 6 are each one group.
My approach was getting the minutes with the DATEPART() function and divide the result by 5:
(DATEPART(MINUTE, SalesDate) / 5)
For row 1 and 2 the result would be 3 and grouping here would work perfectly.
But for the other rows where there is a change in the hour or even in the day part of the SalesDate, the result cannot be used for grouping.
So this is where I'm stuck. I would really appreciate, if someone could point me in the right direction.
You want to group adjacent transactions based on the timing between them. The idea is to assign some sort of grouping identifier, and then use that for aggregation.
Here is an approach:
Identify group starts using lag() and date arithmetic.
Do a cumulative sum of the group starts to identify each group.
Aggregate
The query looks like this:
select customerid, min(salesdate), max(saledate), sum(turnover)
from (select t.*,
sum(case when salesdate > dateadd(minute, 5, prev_salesdate)
then 1 else 0
end) over (partition by customerid order by salesdate) as grp
from (select t.*,
lag(salesdate) over (partition by customerid order by salesdate) as prev_salesdate
from t
) t
) t
group by customerid, grp;
EDIT
Thanks to #JoeFarrell for pointing out I have answered the wrong question. The OP is looking for dynamic time differences between rows, but this approach creates fixed boundaries.
Original Answer
You could create a time table. This is a table that contains one record for each second of the day. Your table would have a second column that you can use to perform group bys on.
CREATE TABLE [Time]
(
TimeId TIME(0) PRIMARY KEY,
TimeGroup TIME
)
;
-- You could use a loop here instead.
INSERT INTO [Time]
(
TimeId,
TimeGroup
)
VALUES
('00:00:00', '00:00:00'), -- First group starts here.
('00:00:01', '00:00:00'),
('00:00:02', '00:00:00'),
('00:00:03', '00:00:00'),
...
('00:04:59', '00:00:00'),
('00:05:00', '00:05:00'), -- Second group starts here.
('00:05:01', '00:05:00')
;
The approach works best when:
You need to reuse your custom grouping in several different queries.
You have two or more custom groups you often use.
Once populated you can simply join to the table and output the desired result.
/* Using the time table.
*/
SELECT
t.TimeGroup,
SUM(Turnover) AS SumOfTurnover
FROM
Sales AS s
INNER JOIN [Time] AS t ON t.TimeId = CAST(s.SalesDate AS Time(0))
GROUP BY
t.TimeGroup
;

Get rid of "semi-duplicates" in a query

I currently have a query that returns, for example, the following: (You can assume that this is what the table structure looks like)
customer_id | start_date | end_date
1 | 20120101 | 20120401
2 | 20120402 | 20121231
1 | 20130101 | 20130401
1 | 20130101 | 20130330
2 | 20130331 | 99991231
2 | 20130402 | 99991231
There's two points to consider:
A Customer can come back, so doing a normal max/min approach on this doesn't work.
This is actually an overview of multiple services, and sometimes one of them starts or ends in a different date. (Very uncommon, but I need to deal with this scenario.)
So taking the above into account, I want a query that will return the 1st, 2nd, 3rd, and 5th lines.
My idea & approach to this would be:
If start_dates are equal, display the max end date. (group by customer_id & start_date, max(end_date))
If end_dates are equal, display the min start date. (group by customer_id & end_date, min(start_date))
I can write a query that will do one of the above, but I'm not sure how I'd be able to go about doing both of them at the same time. Or if a different approach altogether would be better.
SQL Server 2008
Thank you!
I think you can do this with not exists condition -
the following query you can use for this output -
select customer_id , start_date , end_date
from table_name t_1
where not exists(
select 1 from table_name t_2
where t_2.customer_id = t_1.customer_id
and t_2.start_date = t_1.start_date
and t_2.end_date > t_1.end_date)
and not exists (
select 1 from table_name t_3
where t_3.customer_id = t_1.customer_id
and t_3.end_date = t_1.end_date
and t_3.start_date<t_1.start_date)