Oracle LAG: determine if previous row columns are the same as current row - sql

Is it possible, using analytic functions in Oracle (LAG, for instance), to check the previous row and determine, based on two columns, whether its values are exactly the same as the current row's? If they are, output the letter 'Y', else 'N'.
Something like:
IF prev.col1 = curr.col1 and prev.col2 = curr.col2 THEN 'Y' ELSE 'N'
I want to use this Y or N to filter out these records in a Crystal Report that I am writing.

Yes, but as analytic functions are processed fairly late in query evaluation, you need to write the query with the analytic function in a subquery:
SELECT CASE WHEN col1 = prev_col1 AND col2 = prev_col2 THEN 'Y' ELSE 'N' END as yesno
FROM (
SELECT col1, col2,
LAG( col1, 1 ) OVER ( ORDER BY col1, col2 ) AS prev_col1,
LAG( col2, 1 ) OVER ( ORDER BY col1, col2 ) AS prev_col2
FROM mytable
)
You'll need to adjust the ORDER BY clause depending on how you're defining the "previous" row. You may also want to add a PARTITION BY clause if you don't want to treat the whole table as a single group of rows.
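As a rough illustration of the subquery pattern above, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database from Python. SQLite stands in for Oracle here, and the id column, the sample rows, and the choice to order by id are assumptions made only so that "previous row" is unambiguous.

```python
import sqlite3

# Hypothetical sample data; the id column exists only to define row order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 1, "a"), (3, 1, "b"), (4, 2, "b"), (5, 2, "b")],
)

query = """
SELECT id,
       CASE WHEN col1 = prev_col1 AND col2 = prev_col2 THEN 'Y' ELSE 'N' END AS yesno
FROM (
    SELECT id, col1, col2,
           LAG(col1, 1) OVER (ORDER BY id) AS prev_col1,
           LAG(col2, 1) OVER (ORDER BY id) AS prev_col2
    FROM mytable
)
ORDER BY id
"""
flags = [row[1] for row in conn.execute(query)]
print(flags)  # ['N', 'Y', 'N', 'N', 'Y']
```

The first row gets 'N' because LAG returns NULL when there is no previous row, so the comparison is not true.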

Related

How to express "either the single resulting record or NULL", without an inner-query LIMIT?

Consider the following query:
SELECT (SELECT MIN(col1) FROM table1) = 7;
Assuming col1 is non-NULLable, this will yield either true or false, or possibly NULL when table1 is empty.
But now suppose I have:
SELECT (
SELECT
FIRST_VALUE (col2) OVER (
ORDER BY col1
) AS col2_for_first_col1
FROM table1
) = 7;
(and assume col2 is also non-NULLable for simplicity.)
If there is a unique col2 value for the lowest col1 value, or the table is empty, then this works just like before. But if there are multiple col2 values for the lowest col1, I'm going to get a query runtime error.
My question: What is a short, elegant way to get NULL from this last query also in the case of multiple inner-query results? I could of course duplicate it and check the count, but I would rather avoid that.
Important caveat: I'm using MonetDB, and it doesn't seem to support ORDER BY ... LIMIT 1 on inner queries.
Without the MonetDB limitation, you would seem to want:
SELECT (SELECT col2
FROM table1
ORDER BY col1
LIMIT 1
) = 7;
With the limitation, you can use window functions differently:
SELECT (SELECT col2
FROM (SELECT col2, ROW_NUMBER() OVER (ORDER BY col1) as seqnum
FROM table1
) t
WHERE seqnum = 1
) = 7;
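To see how the ROW_NUMBER() variant behaves on an empty and a non-empty table, here is a small sketch using SQLite through Python. SQLite is only a stand-in for MonetDB, and the sample rows are made up. Note that when several rows tie on the lowest col1, ROW_NUMBER() still numbers exactly one of them 1, so the query returns one arbitrary candidate rather than NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER NOT NULL, col2 INTEGER NOT NULL)")

query = """
SELECT (SELECT col2
        FROM (SELECT col2, ROW_NUMBER() OVER (ORDER BY col1) AS seqnum
              FROM table1
             ) t
        WHERE seqnum = 1
       ) = 7
"""

# Empty table: the scalar subquery yields NULL, and NULL = 7 is NULL.
empty_result = conn.execute(query).fetchone()[0]

# Hypothetical rows: the lowest col1 (10) carries col2 = 7.
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(10, 7), (20, 3)])
match_result = conn.execute(query).fetchone()[0]

print(empty_result, match_result)  # None 1
```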

Return a column value only once in row 1 and return NULL for other rows

select col1, col2, col3 from table where *some condition*
Here, the number of rows returned is not known in advance, but the value of col2 will be the same in all of them. I want to display the col2 value in the first row only and NULL in the other rows.
How can I do this without complicating the query too much?
You can use row_number():
select t.col1, (case when t.seq = 1 then t.col2 end) as col2, t.col3
from (select t.col1, t.col2, t.col3,
row_number() over (partition by t.col2 order by ?) as seq
from t
where . . .
) t;
You can use row_number(), but there is no need for a subquery:
select
col1,
case when row_number() over(partition by col2 order by col1, col3) = 1 then col2 end col2,
col3
from mytable t
where ...
order by t.col2, col1, col3
Note that for this to work (and for your question to make sense at all), you do need some kind of ordering rule in the result set, so that it can be told unambiguously which row is the first. I assumed that you want to order rows using the two other columns (and I also ordered the result set accordingly).
Also, please note that this solution should work equally well if there is more than one distinct col2 in the result set; results will be sorted, and only the first occurrence of a given col2 value will be displayed.
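A quick way to check the no-subquery version is to run it on a few rows. The sketch below uses SQLite from Python with invented data. The output column is aliased col2_first (a name chosen for this example) so it cannot be confused with the underlying col2 in the ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 INTEGER, col2 TEXT, col3 TEXT)")
# Hypothetical rows that all share the same col2 value.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "x", "p"), (2, "x", "q"), (3, "x", "r")],
)

query = """
SELECT col1,
       CASE WHEN ROW_NUMBER() OVER (PARTITION BY col2 ORDER BY col1, col3) = 1
            THEN col2 END AS col2_first,
       col3
FROM mytable t
ORDER BY t.col2, col1, col3
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# (1, 'x', 'p')
# (2, None, 'q')
# (3, None, 'r')
```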

SQL DISTINCT based on a single column, but keep all columns as output

--mytable
col1 col2 col3
1 A red
2 A green
3 B purple
4 C blue
Let's call the table above mytable. I want to select only distinct values from col2:
SELECT DISTINCT
col2
FROM
mytable
When I do this the output looks like this, which is expected:
col2
A
B
C
But how do I perform the same type of query, yet keep all columns? The output would look like the table below. In essence, I'm going through mytable looking at col2, and when there are multiple occurrences of a col2 value I'm keeping only the first row.
col1 col2 col3
1 A red
3 B purple
4 C blue
Do SQL functions (eg DISTINCT) have arguments I could set? I could imagine it to be something like KeepAllColumns = TRUE for this DISTINCT function? Or do I need to perform JOINs to get what I want?
You can use window functions, particularly row_number():
select t.*
from (select t.*, row_number() over (partition by col2 order by col1) as seqnum
from mytable t
) t
where seqnum = 1;
row_number() enumerates the rows within each partition, starting with 1. The order by inside the window controls which row comes first (the oldest, newest, biggest, smallest, and so on).
You can use the QUALIFY clause in Teradata:
SELECT col1, col2, col3
FROM mytable
QUALIFY ROW_NUMBER() OVER(PARTITION BY col2 ORDER BY col1) = 1 -- Get 1st row per group
If you want to change the ordering for how to determine which col2 row to get, just change the expression in the ORDER BY.
With NOT EXISTS:
select m.* from mytable m
where not exists (
select 1 from mytable
where col2 = m.col2 and col1 < m.col1
)
This code will return the rows for which there is not another row with the same col2 and a smaller value in col1.
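The window-function and NOT EXISTS approaches can be compared side by side on the sample table from the question. This is a sketch using SQLite via Python as a stand-in. It assumes "first row" means the lowest col1 within each col2 group, so the window orders by col1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 INTEGER, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "A", "red"), (2, "A", "green"), (3, "B", "purple"), (4, "C", "blue")],
)

# Variant 1: number the rows per col2 group and keep the first one.
window_query = """
SELECT col1, col2, col3
FROM (SELECT t.*, ROW_NUMBER() OVER (PARTITION BY col2 ORDER BY col1) AS seqnum
      FROM mytable t
     ) s
WHERE seqnum = 1
ORDER BY col1
"""

# Variant 2: keep rows for which no row with the same col2 has a smaller col1.
not_exists_query = """
SELECT m.col1, m.col2, m.col3
FROM mytable m
WHERE NOT EXISTS (
    SELECT 1 FROM mytable
    WHERE col2 = m.col2 AND col1 < m.col1
)
ORDER BY m.col1
"""

window_rows = conn.execute(window_query).fetchall()
not_exists_rows = conn.execute(not_exists_query).fetchall()
print(window_rows)  # [(1, 'A', 'red'), (3, 'B', 'purple'), (4, 'C', 'blue')]
```

The two variants agree here because col1 is unique. If col1 could repeat within a group, NOT EXISTS would keep every tied row, while ROW_NUMBER() would keep exactly one.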

update value in row based on value in another row

I have a table with timestamped data.
t_stamp | col1 | col2 | col3
I want to update the value in say col3 based on values in col1 and col2 from the previous row in the table. By previous row I mean the row with the next lowest timestamp value. I also want to do this for every row in the table.
For example:
col3 = col1.prev + col2
Note: The operation here is only provided as an example. I want to calculate a value for col3 given a function of col1, col2 and/or previous values of either.
I was able to use a window function to create a SELECT query to give me the desired values for col3
SELECT lag(col1) OVER (ORDER BY t_stamp ASC) + col2 AS col3
FROM table1
but this does not update the values in the table. Can I somehow apply this to the original table? Or is there a way to format an update query in the same way?
You just need to use the FROM clause along with the query you already have:
UPDATE test set col3 = prev_col1 + prev_col2
FROM (
SELECT t_stamp,
lag(col1) OVER (ORDER BY t_stamp ASC) prev_col1,
lag(col2) OVER (ORDER BY t_stamp ASC) prev_col2
FROM test) prev
WHERE prev.t_stamp = test.t_stamp;
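UPDATE ... FROM is not available in every database or version. As a portable alternative (a different technique from the statement above, shown only as a sketch), the values can be computed with LAG in a SELECT and then written back keyed on t_stamp. The example uses SQLite via Python, invented sample data, and the formula from the question (previous col1 plus current col2). It assumes t_stamp is unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (t_stamp INTEGER PRIMARY KEY, col1 INTEGER, col2 INTEGER, col3 INTEGER)"
)
# Hypothetical rows; integers stand in for real timestamps.
conn.executemany(
    "INSERT INTO test (t_stamp, col1, col2) VALUES (?, ?, ?)",
    [(1, 10, 1), (2, 20, 2), (3, 30, 3)],
)

# Step 1: compute the new col3 for every row with a window function.
computed = conn.execute(
    """
    SELECT LAG(col1) OVER (ORDER BY t_stamp ASC) + col2 AS new_col3, t_stamp
    FROM test
    """
).fetchall()

# Step 2: write the computed values back, matching rows on t_stamp.
conn.executemany("UPDATE test SET col3 = ? WHERE t_stamp = ?", computed)
conn.commit()

result = conn.execute("SELECT t_stamp, col3 FROM test ORDER BY t_stamp").fetchall()
print(result)  # [(1, None), (2, 12), (3, 23)]
```

The earliest row ends up with NULL in col3 because it has no previous row.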
I think you want a cumulative sum:
SELECT (sum(col2 + col1) OVER (ORDER BY t_stamp ASC) - col1) AS col3
FROM table1
Create the first three columns and then use LAG to create the last column.
How to get previous row data in sql server
Use a subquery; that will solve it. Here is a sample:
select col2+p_col1 from
(
SELECT col1, col2, lag(col1) OVER (ORDER BY t_stamp ASC) as p_col1
FROM table1
) t

Numbering series of data in SQL

I have a little problem with a SQL SELECT. I want to number continuous groups of the same value in the second column:
1,'a'
2,'a'
3,'b'
4,'c'
5,'a'
6,'a'
7,'e'
8,'e'
The output I want:
1,'a',1
2,'a',1
3,'b',2
4,'c',3
5,'a',4
6,'a',4
7,'e',5
8,'e',5
Is it possible to do it with just a SELECT? I must do it in Vertica's SQL, which does not support operations on variables in a SELECT, so I can't just declare a variable beforehand and increment it somehow.
You could use CONDITIONAL_CHANGE_EVENT() which is pretty simple. Basically you send in the column that you want to trigger the sequence increment as a parameter, and you order it the way you need it in the window. It's a Vertica analytic function.
SELECT col1,
col2,
CONDITIONAL_CHANGE_EVENT(col2) OVER ( ORDER BY col1 ) + 1 AS newcol -- event numbers start at 0, so add 1 to start at 1
FROM mytable
You can do this with window functions. One method uses lag() and then does a cumulative sum of when the value changes:
select t.col1, t.col2,
sum(case when col2 = prev_col2 then 0 else 1 end) over (order by col1) as newcol
from (select t.*,
lag(col2) over (order by col1) as prev_col2
from t
) t
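The lag-plus-cumulative-sum approach can be checked against the data from the question. The sketch below runs it in SQLite from Python as a stand-in for Vertica. The table name mytable and the column names col1 and col2 are assumptions, since the question does not name them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "a"), (2, "a"), (3, "b"), (4, "c"), (5, "a"), (6, "a"), (7, "e"), (8, "e")],
)

query = """
SELECT col1, col2,
       -- Add 1 whenever the value differs from the previous row's value.
       SUM(CASE WHEN col2 = prev_col2 THEN 0 ELSE 1 END) OVER (ORDER BY col1) AS newcol
FROM (SELECT col1, col2,
             LAG(col2) OVER (ORDER BY col1) AS prev_col2
      FROM mytable
     ) t
ORDER BY col1
"""
group_numbers = [row[2] for row in conn.execute(query)]
print(group_numbers)  # [1, 1, 2, 3, 4, 4, 5, 5]
```

The first row counts as a change because its prev_col2 is NULL, which is why the numbering starts at 1.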