Oracle SQL LAG Function - sql

I'd appreciate some help with this code, I'm getting a 'missing keyword' error. I've never used the Lag function before, so hopefully I using it correctly. Thanks for your help. Gav
CREATE VIEW GS_Date AS
SELECT
DATE_DATE,
DATE_FLAG,
CASE WHEN LAG ( DATE_FLAG) OVER ( ORDER BY DATE_DATE ) = '1' THEN DATE_STEP = ( LAG ( DATE_FLAG) OVER ( ORDER BY DATE_DATE ) ) + '1'
WHEN LAG ( DATE_FLAG) OVER ( ORDER BY DATE_DATE ) = '0' AND LAG ( DATE_FLAG) OVER ( ORDER BY DATE_DATE ) = '-1' THEN DATE_STEP = ( LAG ( DATE_FLAG) OVER ( ORDER BY DATE_DATE ) ) + '1'
ELSE DATE_STEP = LAG ( DATE_FLAG) OVER ( ORDER BY DATE_DATE ) END AS DATE_STEP
FROM DATE_GROUP

The problem is with the CASE expression; you were using LAG correctly.
Other points: Don't add strings like '1' and '-1' to numbers. Add numbers - you don't need the single quotes.
Also, if in a computation something is common and only the "last part" is different, you can use the CASE expression "at the end". Like below:
Note: On re-reading the original post, the formula needs to be more complicated (I didn't get it exactly right). Not changing the answer, since it still illustrates the same ideas I meant to share. BUT: Looking at the original post, there is a condition "when LAG = 0 and LAG = -1" - that can never be true. What was meant is probably "OR" instead of "AND". In the formula I wrote below, this means one more WHEN...THEN... branch.
LAG(DATE_FLAG) OVER (ORDER BY DATE)
+ CASE LAG(DATE_FLAG) OVER (ORDER BY DATE ) WHEN 1 THEN 1
WHEN 0 THEN -1
ELSE 0 END AS DATE_STEP
Further edit: Looking at it again, it seems when the flag is 1, 0 or -1 then we must add 1, otherwise add 0... then it's easier to use a "simple CASE expression" instead of a "searched CASE expression" as I did. Something like:
LAG(...) ...
+ CASE WHEN LAG(...) ... IN (-1, 0, 1) THEN 1
ELSE 0 END AS DATE_STEP

Try like this
CREATE VIEW GS_Date AS
SELECT DATE_DATE,
DATE_FLAG,
CASE
WHEN LAG(DATE_FLAG) OVER(ORDER BY DATE_DATE) = '1' THEN
(LAG(DATE_FLAG) OVER(ORDER BY DATE_DATE)) + '1'
WHEN LAG(DATE_FLAG) OVER(ORDER BY DATE_DATE) = '0' AND LAG(DATE_FLAG) OVER(ORDER BY DATE_DATE) = '-1' THEN
(LAG(DATE_FLAG) OVER(ORDER BY DATE_DATE)) + '1'
ELSE
LAG(DATE_FLAG) OVER(ORDER BY DATE_DATE)
END AS DATE_STEP
FROM DATE_GROUP

So you don't have to keep writing LAG( ... ) OVER ( ... ) statements, get the LAG value in a sub-query and then use CASE or DECODE in the outer query:
CREATE VIEW GS_Date AS
SELECT DATE_DATE,
DATE_FLAG,
DECODE(
DATE_STEP,
1, 2,
0, 1,
-1, 0,
DATE_STEP
) AS DATE_STEP
FROM (
SELECT DATE_DATE,
DATE_FLAG,
LAG ( DATE_FLAG ) OVER ( ORDER BY DATE_DATE ) AS DATE_STEP
FROM DATE_GROUP
)'
Also, your second WHEN clause will never be true:
WHEN LAG ( DATE_FLAG ) OVER ( ORDER BY DATE_DATE ) = '0'
AND LAG ( DATE_FLAG ) OVER ( ORDER BY DATE_DATE ) = '-1'
THEN ...
Since the value can never be both -1 and 0. I've assumed you meant to use OR rather than AND.

Related

SQL calculation with previous row + current row

I want to make a calculation based on the excel file. I succeed to obtain 2 of the first records with LAG (as you can check on the 2nd screenshot). Im out of ideas how to proceed from now and need help. I just need the Calculation column take its previous data. I want to automatically calculate it over all the dates. I also tried to make a LAG for the calculation but manually and the result was +1 row more data instead of NULL. This is a headache.
LAG(Data ingested, 1) OVER ( ORDER BY DATE ASC ) AS LAG
You seem to want cumulative sums:
select t.*,
(sum(reconciliation + aves - microa) over (order by date) -
first_value(aves - microa) over (order by date)
) as calculation
from CalcTable t;
Here is a SQL Fiddle.
EDIT:
Based on your comment, you just need to define a group:
select t.*,
(sum(reconciliation + aves - microa) over (partition by grp order by date) -
first_value(aves - microa) over (partition by grp order by date)
) as calculation
from (select t.*,
count(nullif(reconciliation, 0)) over (order by date) as grp
from CalcTable t
) t
order by date;
Imo this could be solved using a "gaps and islands" approach. When Reconciliation>0 then create a gap. SUM(GAP) OVER converts the gaps into island groupings. In the outer query the 'sum_over' column (which corresponds to the 'Calculation') is a cumumlative sum partitioned by the island groupings.
with
gap_cte as (
select *, case when [Reconciliation]>0 then 1 else 0 end gap
from CalcTable),
grp_cte as (
select *, sum(gap) over (order by [Date]) grp
from gap_cte)
select *, sum([Reconciliation]+
(case when gap=1 then 0 else Aves end)-
(case when gap=1 then 0 else Microa end))
over (partition by grp order by [Date]) sum_over
from grp_cte;
[EDIT]
The CASE statement could be CROSS APPLY'ed instead
with
grp_cte as (
select c.*, v.gap, sum(v.gap) over (order by [Date]) grp
from #CalcTable c
cross apply (values (case when [Reconciliation]>0 then 1 else 0 end)) v(gap))
select *, sum([Reconciliation]+
(case when gap=1 then 0 else Aves end)-
(case when gap=1 then 0 else Microa end))
over (partition by grp order by [Date]) sum_over
from grp_cte;
Here is a fiddle

Writing a Single Query w/ Multiple CTE Subqueries SQL/R

I have some data I would like to pull from a database, I'm using RStudio for my query. What I intend to do is write:
The first CTE statement to pull all my necessary information.
The second CTE statement will add two new columns for two row numbers, which are partitioned by different groups. Two additional columns will be added for Lead and Lag values.
The third CTE will produce two more columns where the two columns use nested case_when statements to give me NewOpen and NewClosed dates.
What I have so far:
q5<- sqlQuery(ch,paste("
;with CTE AS
(
select
oz.id as AccountID
,ac.PROD_TYPE_CDE as ProductTypeCode
,CASE WHEN ac.OPEN_DTE='0001-01-01' then null else ac.OPEN_DTE END as OpenDate
,CASE WHEN ac.CLOS_DTE = '0001-01-01' then null else ac.CLOS_DTE END as ClosedDate
,df.proc_dte as FullDate
FROM
dbs.tb_dbs_acct_fact df
inner join
dbs.tb_acct_details ac on df.dw_serv_id = ac.dw_serv_id
left outer join
dbs.tb_oz_id oz on df.proc_dte = oz.proc_dte
),
cte1 as
(
select *
,row_nbr = row_number() over( partition by AccountID order by AccountID, FullDate asc )
,row_nbr2 = row_number() over( partition by AccountID,ProductTypeCode order by AccountID, FullDate asc )
,lag(ProductTypeCode) over(partition by AccountID order by FullDate asc ) as Lagging
,LEAD(ProductTypeCode) over(partition by AccountID order order by FullDate asc ) as Leading
FROM CTE
),
cte2 as (select *
,case when cte1.row_nbr = 1 & cte1.Lagging=cte1.ProductTypeCode then cte1.OpenDate else
case when cte1.Lagging<>cte1.ProductTypeCode then cte1.FullDate else NULL END END as NewOpen
,case when cte1.ClosedDate IS NOT NULL then cte1.ClosedDate else
case when cte1.Leading <> cte1.ProductTypeCode then cte1.FullDate else NULL END END as NewClosed
FROM cte1
);"))
This code, however won't run.
As mentioned, WITH is a statement to define CTEs to be used in a final query. Your query only contains CTE definitions but never actually use any in a final statement. Additionally, you can combine the first two CTEs since window functions can run at any level. Possibly the last CTE can serve as your final SELECT statement.
sql <- "WITH CTE AS
(SELECT
oz.id AS AccountID
, ac.PROD_TYPE_CDE as ProductTypeCode
, CASE
WHEN ac.OPEN_DTE='0001-01-01'
THEN NULL
ELSE ac.OPEN_DTE
END AS OpenDate
, CASE
WHEN ac.CLOS_DTE = '0001-01-01'
THEN NULL
ELSE ac.CLOS_DTE
END AS ClosedDate
, df.proc_dte AS FullDate
, ROW_NUMBER() OVER (PARTITION BY oz.id
ORDER BY oz.id, df.proc_dte) AS row_nbr
, ROW_NUMBER() OVER (PARTITION BY oz.id, ac.PROD_TYPE_CDE
ORDER BY oz.id, df.proc_dte) AS row_nbr2
, LAG(ac.PROD_TYPE_CDE) OVER (PARTITION BY oz.id
ORDER BY df.proc_dte) AS Lagging
, LEAD(ac.PROD_TYPE_CDE) OVER (PARTITION BY oz.id
ORDER BY df.proc_dte) AS Leading
FROM
dbs.tb_dbs_acct_fact df
INNER JOIN
dbs.tb_acct_details ac ON df.dw_serv_id = ac.dw_serv_id
LEFT OUTER JOIN
dbs.tb_oz_id oz ON df.proc_dte = oz.proc_dte
)
SELECT *
, CASE
WHEN row_nbr = 1 & Lagging = ProductTypeCode
THEN OpenDate
ELSE
CASE
WHEN Lagging <> ProductTypeCode
THEN FullDate
ELSE NULL
END
END AS NewOpen
, CASE
WHEN ClosedDate IS NOT NULL
THEN ClosedDate
ELSE
CASE
WHEN Leading <> ProductTypeCode
THEN FullDate
ELSE NULL
END
END AS NewClosed
FROM CTE;"
q5 <- sqlQuery(ch, sql)

Repurposing IIF statement for Redshift CASE

I'm working with a query I was given from a client, but we have different SQL languages. We use Redshift, which doesn't include iif functions, and frankly, I've never used. I know it's basically a different way of a CASE statement, right? Here is the query
select
*
,iif(datediff(day,
lag(event_date, 1, '1900-01-01') over (partition by client_id, error_id order by event_date),
event_date) <= 1
,'yes', 'no') flag
from table.a
I thought this would work but it keeps firing back an error:
select
*,
CASE WHEN datediff(day, lag(event_date, 1, '1900-01-01')) OVER (PARTITION BY client_id, errord_id ORDER BY event_date) <= 1 THEN 'YES' ELSE 'NO' END flag
from dsa.sas_days
Can someone help me in reconfiguring this?
In redshift Lag, There are only two parameters in the function value_expr and offset.
LAG (value_expr [, offset ])
[ IGNORE NULLS | RESPECT NULLS ]
OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering )
so You can try this.
select
*,CASE WHEN
datediff(day, lag(event_date, 1) OVER (PARTITION BY client_id, errord_id ORDER BY event_date),event_date) <= 1
THEN 'YES'
ELSE 'NO'
END flag
from dsa.sas_days

CASE Statement inside a subquery

I was able to create the following query after help from the post below
select * from duppri t
where exists (
select 1
from duppri
where symbolUP = t.symbolUP
AND date = t.date
and price <> t.price)
ORDER BY date
SQL to check when pairs don't match
I have now realized that I need to add a case statement to indicate when all the above criteria fits, but the type value is equal between duppri and t.duppri. This occurs because of case sensitivity. This query is an attempt to clean up a portfolio accounting system that unfortunately allowed numerous duplicates because it didn't have strong referential integrity or constraints.
I would like the case statement to produce the column 'isMatch'
Date |Type|Symbol |SymbolUP |Concatt |Price |IsMatch
6/30/1995 |gaus|313586U72|313586U72|gaus313586U72|109.25|Different
6/30/1995 |gbus|313586U72|313586U72|gbus313586U72|108.94|Different
6/30/1995 |agus|SRR |SRR |agusSRR |10.25 |Different
6/30/1995 |lcus|SRR |SRR |lcusSRR |0.45 |Different
11/27/1996|lcus|LLY |LLY |lcusLLY |76.37 |Matched
11/27/1996|lcus|lly |LLY |lcusLLY |76 |Matched
11/28/1996|lcus|LLY |LLY |lcusLLY |76.37 |Matched
11/28/1996|lcus|lly |LLY |lcusLLY |76 |Matched
I tried the following CASE statement but it is creating errors
SELECT * from duppri t
where exists (
select 1,
CASE IsMatch WHEN [type] = [t.TYPE] THEN 'Matched' ELSE 'Different' END
from duppri
where symbolUP = t.symbolUP
AND date = t.date
and price <> t.price)
ORDER BY date
You could just use window functions, if I understand correctly:
select d.*,
(case when mint = maxt
then 'Matched' else 'Different'
end)
from (select d.*,
min(type) over (partition by symbolup, date) as mint,
max(type) over (partition by symbolup, date) as maxt,
min(price) over (partition by symbolup, date) as minp,
max(price) over (partition by symbolup, date) as maxp
from duppri d
) d
where minp <> maxp
order by date;
The subquery used with the exists predicate can't and won't return anything other than true/false but you can accomplish what you want using a subquery like this, which should work:
select
*,
(select
CASE when count(distinct type) = 1 THEN 'Matched' ELSE 'Different' END
from duppri
where symbol = t.symbol and date = t.date
) IsMatch
from duppri t
where exists (
select 1
from duppri
where symbol = t.symbol
and price <> t.price);

Oracle Select Query, distributing frequency of labels

If i know how much data i am receiving, i can distribute the appearance of a label in this way:
select
value_column
, CASE WHEN TO_CHAR(VALUE_DATE, 'mm') = '01' and MOD(extract(year from VALUE_DATE), 3) = 0
THEN TO_CHAR(VALUE_DATE, 'MON-yyyy')
else ' '
END VALUE_DATE_STRING
from SomeTable
this will show the data label on January every 3rd year.
Now, if i don't know how many years are coming back, i'd like to figure this out in the same select and display a total of 5 labels.
i reckon i'd need something like this (pseudo code):
CASE WHEN MOD(allRows / 5, ROW_NUM) = 0
i guess the only challenging part is getting the allRows in the same select.. since i'm calling this sql from a telerik report, there's limited support for declaring vars and running multiple statements..
This might do what you want:
select value_column,
(CASE WHEN mod(rownum, trunc(cnt / 5)) = 0 o
THEN TO_CHAR(VALUE_DATE, 'MON-yyyy')
else ' '
END) as VALUE_DATE_STRING
from (select t.*, count(*) over () as cnt
from SomeTable t
) t;
It might be trunc((cnt - 1) / 5).
DIT:
If this does what you want, you don't need a CTE or subquery. I just thought that it made more sense this way (and the subquery does not effect performance). You can do:
select value_column,
(CASE WHEN mod(row_number() over (order by value_date),
trunc(count(*) over () / 5)) = 0
THEN TO_CHAR(VALUE_DATE, 'MON-yyyy')
else ' '
END) as VALUE_DATE_STRING
from SomeTable t;
By the way, you should include order by if you are using rownum. Oracle does not guarantee the ordering of rows in a result set without an order by.
here's my solution.. it could still be optimized, and i believe the rounding will not always work in my favor..
WITH base as (
select
value_column
, value_date
, cnt.c
, case mod(c,2) when 0 then 6 else 5 end as divider -- get 5 or 6 dates.. try to minimize truncation consequences.. this may need work.
, row_number() over (order by value_date) as row_number
from MyView
left join (select count(*) c from MyView where id = 250170) cnt on 1=1
where id = 250170
order by value_date
)
select
value_column
, value_date
, CASE WHEN MOD(row_number, trunc(c/divider)) = 0
THEN TO_CHAR(VALUE_DATE, 'MON-yyyy')
else ' '
END VALUE_DATE_STRING
from base
order by value_date
UPDATE
simplified based on Gordon's answer
select
value_column
, value_date
, (
CASE WHEN
mod(
row_number() over (order by value_date),
trunc(count(*) over () / 5)) = 0
THEN TO_CHAR(VALUE_DATE, 'MON-yyyy')
else ' '
END) as LABEL_STRING
from MyView