Analytic function - Comparing values using LAG()

Assume the following data:
| Col1 | Col2 |
| 3 | 20-dec-15 |
| 4 | 20-dec-15 |
| 8 | 25-dec-15 |
| 10 | 25-dec-15 |
I have to compare the values of column Col1 within each date.
For example, for 20-dec-15 a change occurred: 3 changed to 4.
I have to solve this using an analytic function.
This is the expression I am using:
decode(LAG(Col1,1,Col1) OVER (partition by Col2 order by Col2),Col1,0,1) Changes
Because Col2 is a date column, partitioning by it is not working for me. Can a date column be used in PARTITION BY?
Expected Result should be:
| Changes |
| 0 |
| 1 |
| 0 |
| 1 |
Here 1 means a change occurred when comparing values for the same date.

You need to use trunc() in the partition by to reset the time part to 00:00:00 (an Oracle DATE always carries a time part), but you should still keep order by col2 so that all rows on the same day are ordered by the time part.
I also prefer an explicit case for this kind of comparison; personally I find decode() really hard to read:
select case
         when col1 = lag(col1, 1, col1) over (partition by trunc(col2) order by col2) then 0
         else 1
       end as changes
from the_table;
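
To try the idea without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. Several things in it are assumptions rather than part of the answer: SQLite (3.25 or later, for window functions) stands in for Oracle, date(col2) plays the role of trunc(col2), and the time parts in the sample rows are invented for illustration.

```python
import sqlite3

# Hypothetical sample rows: dates come from the question, time parts are made up.
ROWS = [
    (3, "2015-12-20 09:00:00"),
    (4, "2015-12-20 17:30:00"),
    (8, "2015-12-25 08:15:00"),
    (10, "2015-12-25 21:45:00"),
]

# date(col2) strips the time part (SQLite stand-in for Oracle's trunc(col2)),
# while "order by col2" still orders rows within a day by their time part.
QUERY = """
select col1,
       case
         when col1 = lag(col1, 1, col1) over (partition by date(col2) order by col2) then 0
         else 1
       end as changes
from the_table
order by col2
"""


def changes_per_day(rows):
    """Return (col1, changes) pairs, comparing each row with the previous row of the same day."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table the_table (col1 integer, col2 text)")
    conn.executemany("insert into the_table values (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for col1, changed in changes_per_day(ROWS):
        print(col1, changed)
```

The first row of each day is compared with itself (the third argument of lag is the default), so it reports 0; any later row on the same day reports 1 when its value differs from the one before it.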

Related

How to find two consecutive rows sorted by date, containing a specific value?

I have a table with the following structure and data in it:
| ID | Date | Result |
|---- |------------ |-------- |
| 1 | 30/04/2020 | + |
| 1 | 01/05/2020 | - |
| 1 | 05/05/2020 | - |
| 2 | 03/05/2020 | - |
| 2 | 04/05/2020 | + |
| 2 | 05/05/2020 | - |
| 2 | 06/05/2020 | - |
| 3 | 01/05/2020 | - |
| 3 | 02/05/2020 | - |
| 3 | 03/05/2020 | - |
| 3 | 04/05/2020 | - |
I'm trying to write an SQL query (I'm using SQL Server) which returns the date of the first two consecutive negative results for a given ID.
For example, for ID no. 1, the first two consecutive negative results are on 01/05 and 05/05.
The first two consecutive negative results for ID no. 2 are on 05/05 and 06/05.
The first two consecutive negative results for ID no. 3 are on 01/05 and 02/05.
So the query should produce the following result:
| ID | FirstNegativeDate |
|---- |------------------- |
| 1 | 01/05 |
| 2 | 05/05 |
| 3 | 01/05 |
Please note that the dates aren't necessarily one day apart. Sometimes, two consecutive negative tests may be several days apart. But they should still be considered as "consecutive negative tests". In other words, two negative tests are not 'consecutive' only if there is a positive test result in between them.
How can this be done in SQL? I've done some reading and it looks like maybe the PARTITION BY statement is required but I'm not sure how it works.

This is a gaps-and-islands problem, where you want the start of the first island of '-'s that contains at least two rows.
I would recommend lead() and aggregation:
select id, min(date) first_negative_date
from (
    select t.*, lead(result) over(partition by id order by date) lead_result
    from mytable t
) t
where result = '-' and lead_result = '-'
group by id
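
The query above can be checked against the question's sample data with a small sketch. It assumes SQLite 3.25+ through Python's sqlite3 module as a stand-in for SQL Server, stores the dates as ISO strings so that they sort chronologically, and adds an order by id only to make the output deterministic.

```python
import sqlite3

# Sample data from the question, with dates rewritten as ISO strings (yyyy-mm-dd).
ROWS = [
    (1, "2020-04-30", "+"), (1, "2020-05-01", "-"), (1, "2020-05-05", "-"),
    (2, "2020-05-03", "-"), (2, "2020-05-04", "+"),
    (2, "2020-05-05", "-"), (2, "2020-05-06", "-"),
    (3, "2020-05-01", "-"), (3, "2020-05-02", "-"),
    (3, "2020-05-03", "-"), (3, "2020-05-04", "-"),
]

# A row qualifies when it is negative and the next row of the same id is negative too;
# min(date) then picks the earliest such row per id.
QUERY = """
select id, min(date) as first_negative_date
from (
    select t.*, lead(result) over (partition by id order by date) as lead_result
    from mytable t
) t
where result = '-' and lead_result = '-'
group by id
order by id
"""


def first_negative_pairs(rows):
    """Return (id, first_negative_date) for each id that has two negatives in a row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table mytable (id integer, date text, result text)")
    conn.executemany("insert into mytable values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in first_negative_pairs(ROWS):
        print(row)
```

An id that never has two negatives in a row simply does not appear in the result, because no row of that id passes the where clause.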

Use the LEAD or LAG function over a partition by ID, ordered by your Date column.
Then simply check where the LEAD/LAG column is equal to Result.
You'll also need to filter for the first match per ID.
The image attached just shows what LEAD/LAG would return

Column "f.price" must appear in the GROUP BY clause or be used in an aggregate function, but I've already used window function

I have a table with two columns: date and price. Neither of them is unique.
I need to get a running total in unique date order (for one date, the sum of its values; for the next date, its sum plus the previous total; and so on).
I know how to do this with a subquery, but I want to use window functions.
Here is a simple query:
SELECT f.date, SUM(f.price) OVER () FROM f GROUP BY f.date
It returns the error:
column f.price must appear in the GROUP BY clause or be used in an aggregate function
But I've already used aggregate function (SUM).
Can somebody tell me why this happened?

Try dropping over():
select f.date,
       SUM(f.price)
from f
group by f.date

You are mixing window functions and aggregation, which is generally not a good idea. You are getting the error because, indeed, column f.price is not used in an aggregate function (it is used in a window function).
I believe that the following query should give you what you want. It uses a window function, and relies on DISTINCT instead of aggregation.
SELECT DISTINCT fdate, SUM(fprice) OVER(ORDER BY fdate) FROM f ORDER BY fdate;
Demo on DB Fiddle:
Consider the following sample data, that seems to match your spec:
| fdate | fprice |
| ------------------------ | ------ |
| 2018-01-01T00:00:00.000Z | 1 |
| 2018-01-01T00:00:00.000Z | 2 |
| 2018-01-02T00:00:00.000Z | 3 |
| 2018-01-03T00:00:00.000Z | 4 |
| 2018-01-03T00:00:00.000Z | 1 |
The query would return:
| fdate | sum |
| ------------------------ | --- |
| 2018-01-01T00:00:00.000Z | 3 |
| 2018-01-02T00:00:00.000Z | 6 |
| 2018-01-03T00:00:00.000Z | 11 |
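
The same result can be reproduced locally with a short sketch. It assumes SQLite 3.25+ via Python's sqlite3 module instead of the database used in the demo, and shortens the timestamps to plain dates; the running_total alias is added for readability.

```python
import sqlite3

# Sample data from the demo, with timestamps shortened to dates.
ROWS = [
    ("2018-01-01", 1),
    ("2018-01-01", 2),
    ("2018-01-02", 3),
    ("2018-01-03", 4),
    ("2018-01-03", 1),
]

# With ORDER BY and no explicit frame, rows sharing an fdate are peers and get
# the same running sum, so DISTINCT collapses them into one row per date.
QUERY = """
SELECT DISTINCT fdate, SUM(fprice) OVER (ORDER BY fdate) AS running_total
FROM f
ORDER BY fdate
"""


def running_totals(rows):
    """Return one (fdate, running_total) pair per distinct date."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE f (fdate TEXT, fprice INTEGER)")
    conn.executemany("INSERT INTO f VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for fdate, total in running_totals(ROWS):
        print(fdate, total)
```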

Comparing two tables that are the same and listing out the max date

I was wondering if it's possible to compare dates within the same table for the same ID; the catch is that there is an additional column that displays the status. For instance, here's table A:
The result I would like to see is this:
I know I could use GROUP BY and the MAX aggregate on ID to find the max date; however, I would like the associated status (Running/Stopped) column to be included as well. It would help me a lot.

In most databases, the fastest method (assuming the right indexes) is a correlated subquery:
select t.*
from t
where t.date = (select max(t2.date)
                from t t2
                where t2.id = t.id);
Even if not the fastest, this should work in any database.
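
As a quick illustration of how the correlated subquery behaves, here is a sketch run through Python's sqlite3 module. The table layout (id, status, date) and the sample rows are assumptions made up for this example, since the question's tables are not shown; the order by is added only to make the output deterministic.

```python
import sqlite3

# Hypothetical sample rows: (id, status, date). Id 3 has two rows on its latest date.
ROWS = [
    (1, "Running", "2018-02-03"),
    (1, "Stopped", "2018-04-04"),
    (2, "Running", "2018-03-24"),
    (2, "Stopped", "2018-01-02"),
    (3, "Running", "2018-06-12"),
    (3, "Stopped", "2018-06-12"),
]

# For every row, the subquery looks up the latest date of that row's id;
# only rows sitting on that latest date are kept.
QUERY = """
select t.id, t.status, t.date
from t
where t.date = (select max(t2.date) from t t2 where t2.id = t.id)
order by t.id, t.status
"""


def latest_rows(rows):
    """Return all rows that carry the maximum date of their id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id integer, status text, date text)")
    conn.executemany("insert into t values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_rows(ROWS):
        print(row)
```

Note that when several rows of one id share the latest date, as id 3 does here, all of them are returned.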

In case of Oracle, you can use the KEEP clause like this:
SELECT t.id,
       MAX(t.status) KEEP (DENSE_RANK LAST ORDER BY t."DATE") AS corresponding_status,
       MAX(t."DATE") AS last_date
FROM tab t
GROUP BY t.id
ORDER BY 1
For this sample data:
+----+---------+------------+
| ID | STATUS | DATE |
+----+---------+------------+
| 1 | Running | 2018-02-03 |
| 1 | Stopped | 2018-04-04 |
| 2 | Running | 2018-03-24 |
| 2 | Stopped | 2018-01-02 |
| 3 | Running | 2018-06-12 |
| 3 | Stopped | 2018-06-12 |
+----+---------+------------+
This would return this result:
+----+----------------------+------------+
| ID | CORRESPONDING_STATUS | LAST_DATE |
+----+----------------------+------------+
| 1 | Stopped | 2018-04-04 |
| 2 | Running | 2018-03-24 |
| 3 | Stopped | 2018-06-12 |
+----+----------------------+------------+
As can be seen in this SQL Fiddle.
If there are multiple entries for the same ID and DATE combination, it picks one STATUS value: in this case the last one in alphabetical order, because I've used MAX on STATUS.
The part LAST ORDER BY t."DATE" determines which rows of the group are considered, i.e. those with the last DATE in the group.
See this Oracle Docs entry for more details.

Comparing consecutive rows using oracle

I have a table which looks something like this:
| ID | FROM_DATE | TO_DATE |
------------------------------
| 1 | 1/1/2001 | 2/1/2001|
| 1 | 2/1/2001 | 3/1/2001|
| 1 | 2/1/2001 | 6/1/2001|
| 1 | 3/1/2001 | 4/1/2001|
| 2 | 1/1/2001 | 2/1/2001|
| 2 | 1/1/2001 | 6/1/2001|
| 2 | 2/1/2001 | 3/1/2001|
| 2 | 3/1/2001 | 4/1/2001|
It is already sorted by ID, From_Date, To_date.
What I want to do is delete the rows where the from_date is earlier than the to_date from the previous line and the ID is equal to the ID from the previous line. So in this example, I would delete the 3rd and 6th rows only.
I know I need some kind of looping structure to accomplish this, but I don't know how since I'm really looking at two rows at a time here. How can I accomplish this within Oracle?
EDIT: While using the LAG function is quicker and easier, I end up deleting the 4th and 7th rows as well, which is not what I want. For example, when it gets to row 4, it should compare the from_date to the to_date from row 2 (instead of row 3, because row 3 should be deleted).

You could use the lag window function to identify these rows:
DELETE FROM mytable
WHERE rowid IN (SELECT rid
                FROM (SELECT rowid AS rid, -- aliased so the outer query can select it
                             from_date,
                             LAG(to_date) OVER
                               (PARTITION BY id
                                ORDER BY from_date, to_date)
                               AS lag_to_date
                      FROM mytable) t
                WHERE from_date < lag_to_date)
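
The difference between comparing with the immediately preceding row (what LAG does) and comparing with the last row that survives (what the question's EDIT asks for) can be seen in a small procedural sketch. This is plain Python rather than Oracle SQL, it is not part of the answer above, and it assumes the dates read as month/day/year; the function names are hypothetical.

```python
from datetime import date

# Rows from the question as (id, from_date, to_date), already sorted by id, from_date, to_date.
# Dates are interpreted as month/day/year, which is an assumption.
ROWS = [
    (1, date(2001, 1, 1), date(2001, 2, 1)),
    (1, date(2001, 2, 1), date(2001, 3, 1)),
    (1, date(2001, 2, 1), date(2001, 6, 1)),
    (1, date(2001, 3, 1), date(2001, 4, 1)),
    (2, date(2001, 1, 1), date(2001, 2, 1)),
    (2, date(2001, 1, 1), date(2001, 6, 1)),
    (2, date(2001, 2, 1), date(2001, 3, 1)),
    (2, date(2001, 3, 1), date(2001, 4, 1)),
]


def flagged_by_lag(rows):
    """1-based positions flagged when each row is compared with the row directly before it."""
    flagged = []
    for pos in range(1, len(rows)):
        prev_id, _, prev_to = rows[pos - 1]
        cur_id, cur_from, _ = rows[pos]
        if cur_id == prev_id and cur_from < prev_to:
            flagged.append(pos + 1)
    return flagged


def flagged_by_last_kept(rows):
    """1-based positions flagged when each row is compared with the last KEPT row of its id."""
    flagged = []
    last_kept_to = {}  # id -> to_date of the most recent surviving row
    for pos, (row_id, from_date, to_date) in enumerate(rows, start=1):
        prev_to = last_kept_to.get(row_id)
        if prev_to is not None and from_date < prev_to:
            flagged.append(pos)  # deleted rows do not update last_kept_to
        else:
            last_kept_to[row_id] = to_date
    return flagged


if __name__ == "__main__":
    print("LAG-style comparison:", flagged_by_lag(ROWS))
    print("Last-kept comparison:", flagged_by_last_kept(ROWS))
```

The LAG-style comparison flags rows 3, 4, 6 and 7, while the last-kept comparison flags only rows 3 and 6, which is the outcome the question describes.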

SQL: Update column with an index that resets if column changes

Firstly, sorry for the wording of the question. I'm not too sure how to express it. Hopefully the example below is clear.
If I have a table
Id | Type | Order
0 | Test | null
1 | Test | null
2 | Blah | null
3 | Blah | null
I want to turn it into this
Id | Type | Order
0 | Test | 1
1 | Test | 2
2 | Blah | 1
3 | Blah | 2
So I'm grouping the table by 'type' and allocating a number to 'order' incrementally. It'll start at 1 per type.
How should I go about doing it?
Db I'm using is Sybase 15.

select
    Id,
    Type,
    row_number() over(partition by Type order by Id) as [Order]
from YourTable
You should utilize the ROW_NUMBER function to get you what you're looking for.
ROW_NUMBER (Transact-SQL)
Returns the sequential number of a row within a partition of a result
set, starting at 1 for the first row in each partition.
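
To see the numbering restart per Type, here is a minimal sketch run against the question's sample rows. It assumes SQLite 3.25+ through Python's sqlite3 module rather than Sybase or SQL Server, so whether row_number() is available on the asker's database is not shown by it; the order by Id is added only to fix the output order.

```python
import sqlite3

# Sample rows from the question: (Id, Type); the Order column is what we compute.
ROWS = [(0, "Test"), (1, "Test"), (2, "Blah"), (3, "Blah")]

# row_number() restarts at 1 for every Type partition and counts up by Id.
QUERY = """
select
    Id,
    Type,
    row_number() over (partition by Type order by Id) as [Order]
from YourTable
order by Id
"""


def numbered_rows(rows):
    """Return (Id, Type, Order) with Order counting from 1 within each Type."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table YourTable (Id integer, Type text)")
    conn.executemany("insert into YourTable values (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in numbered_rows(ROWS):
        print(row)
```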
