Optimize select query (inner select + group) - sql-server-2005

My current version is:
SELECT DT, AVG(DP_H2O) AS Tx,
(SELECT AVG(Abs_P) / 1000000 AS expr1
FROM dbo.BACS_MinuteFlow_1
WHERE (DT =
(SELECT MAX(DT) AS Expr1
FROM dbo.BACS_MinuteFlow_1
WHERE DT <= dbo.BACS_KongPrima.DT ))
GROUP BY DT) AS Px
FROM dbo.BACS_KongPrima
GROUP BY DT
but it runs very slowly.
Basically, the inner select picks the latest time at or before my time, and then I group by that nearest time.
Are there any possible optimizations? Maybe I can join the tables somehow, but the trouble is that I'm not sure how to group by this nearest date.
Thank you

You could try rearranging it to use CROSS APPLY, as in the code below. I'm not sure whether this will improve performance, but I generally try to avoid putting a subquery in a column of the select list, and SQL Server is pretty good at optimising APPLY.
WITH Bacs_MinuteFlow_1 (Abs_P ,DT ) AS
(SELECT 5.3,'2011/10/10'
UNION SELECT 6.2,'2011/10/10'
UNION SELECT 7.8,'2011/10/10'
UNION SELECT 5.0,'2011/03/10'
UNION SELECT 4.3,'2011/03/10'),
BACS_KongPrima (DP_H2O ,DT)AS
(SELECT 2.3,'2011/10/15'
UNION SELECT 2.6,'2011/10/15'
UNION SELECT 10.2,'2011/03/15')
SELECT DT, AVG(DP_H2O) AS Tx,
a.Px
FROM BACS_KongPrima
CROSS APPLY
(
SELECT AVG(Abs_P) / 1000000 AS Px
FROM BACS_MinuteFlow_1
WHERE DT =
(SELECT MAX(DT) AS maxdt
FROM BACS_MinuteFlow_1
WHERE DT <= BACS_KongPrima.DT
)
) a
GROUP BY DT,a.Px
Cheers
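The lookup that both queries perform (find the latest BACS_MinuteFlow_1 timestamp at or before each BACS_KongPrima timestamp, then take that timestamp's average) can be sketched outside the database. This is only an illustration of the logic, not a tuned replacement: it uses the sample rows from the answer above, and the helper names (average_by_dt, nearest_at_or_before) are made up for the example. The point it shows is that the per-timestamp averages can be computed once and then looked up, instead of being recomputed for every outer row.

```python
from bisect import bisect_right
from collections import defaultdict

# Sample rows copied from the answer's CTEs: (Abs_P, DT) and (DP_H2O, DT).
minute_flow = [(5.3, "2011/10/10"), (6.2, "2011/10/10"), (7.8, "2011/10/10"),
               (5.0, "2011/03/10"), (4.3, "2011/03/10")]
kong_prima = [(2.3, "2011/10/15"), (2.6, "2011/10/15"), (10.2, "2011/03/15")]


def average_by_dt(rows):
    """Group (value, dt) rows by dt and return {dt: average value}."""
    groups = defaultdict(list)
    for value, dt in rows:
        groups[dt].append(value)
    return {dt: sum(vals) / len(vals) for dt, vals in groups.items()}


# Aggregate the lookup table once, then keep its timestamps sorted.
flow_avg = average_by_dt(minute_flow)
flow_dts = sorted(flow_avg)  # yyyy/mm/dd strings sort chronologically


def nearest_at_or_before(dt):
    """Return the latest flow timestamp <= dt, or None if there is none."""
    i = bisect_right(flow_dts, dt)
    return flow_dts[i - 1] if i else None


# For each KongPrima timestamp: (Tx, matched flow timestamp, Px).
result = {}
for dt, tx in sorted(average_by_dt(kong_prima).items()):
    src = nearest_at_or_before(dt)
    px = flow_avg[src] / 1000000 if src is not None else None
    result[dt] = (tx, src, px)

for dt, (tx, src, px) in result.items():
    print(dt, tx, src, px)
```

In SQL terms the same idea would be to aggregate BACS_MinuteFlow_1 by DT first and match each BACS_KongPrima DT against that smaller result; whether that is faster on a given server depends on the indexes on DT.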

Related

Slow query operation with recursive CTE

I have a query that uses a recursive CTE. Unfortunately, when I extend the date range, the run time increases significantly.
Could anyone help me change the code so that the query runs more efficiently?
The problem seems to be in CTE4.
I am surprised that the query is so slow, because the calculation is a simple operation in Excel, like this:
Performance = x_prev * (1 + x)
DECLARE @BegOfPeriod DATE = '20100101'
,@EndOfPeriod DATE = '20191231'
,@clientID INT = 200010;
WITH
CTE AS
(
SELECT d.Date
,1+COALESCE(twr.day_diff_pct,0) as DayPct_plus1
FROM Days d
LEFT JOIN [dbo].[DailyTWR] TWR ON d.Date=twr.date AND Clientid=@clientID
WHERE d.Date between @BegOfPeriod and @EndOfPeriod AND d.Date>=(SELECT min(date) FROM DailyTWR WHERE ClientiD=@clientID)
),
CTE2 AS
(
SELECT *
,LAG(DayPct_plus1,1,1) OVER (order BY date) as DayPct_plus1_Prev
,ROW_NUMBER() OVER (order by date) as rownum
FROM cte
),
CTE3 AS
(
SELECT *
,c2.DayPct_plus1*c2.DayPct_plus1_Prev Performance
FROM CTE2 c2
),
CTE4 AS
(
SELECT c3.date,c3.DayPct_plus1,c3.DayPct_plus1_prev,c3.rownum,c3.Performance
FROM CTE3 c3
WHERE rownum=1
union all
SELECT c3.date,c3.DayPct_plus1,c3.DayPct_plus1_prev,c3.rownum
,c3.DayPct_plus1*c4.Performance as Performance
FROM CTE4 c4
JOIN CTE3 c3 ON c3.rownum=c4.rownum+1
)
SELECT c4.Date,c4.Performance
FROM CTE4 c4
option (maxrecursion 0)
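
The recurrence in the question, Performance = x_prev * (1 + x), is a running product of the daily factors, so it can be computed in a single pass with no recursion. The sketch below shows that with made-up daily percentages (daily_pct is hypothetical data, not from the question). It also checks that the product equals the exponential of a running sum of logarithms, which is the form commonly used in SQL Server 2012 and later as EXP(SUM(LOG(DayPct_plus1)) OVER (ORDER BY Date)); that form assumes every factor is strictly positive.

```python
from itertools import accumulate
from math import exp, log
from operator import mul

# Hypothetical day_diff_pct values for four consecutive days.
daily_pct = [0.01, -0.02, 0.0, 0.03]

# Same as DayPct_plus1 in the query: 1 + daily change.
factors = [1 + p for p in daily_pct]

# Running product: performance[i] = factors[0] * ... * factors[i].
performance = list(accumulate(factors, mul))

# Equivalent form using a running sum of logs (requires factors > 0).
via_logs = [exp(s) for s in accumulate(log(f) for f in factors)]

for day, (p, q) in enumerate(zip(performance, via_logs), start=1):
    print(day, round(p, 6), round(q, 6))
```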

Split one row into multiple rows in SQL Server table

Please see the attached screenshot. I'm trying to figure out how we can achieve that using SQL Server.
Thanks.
You can achieve this using a recursive CTE.
For example (I assumed your date column is in MM/DD/YYYY format):
;with orig_src as
(
select CAST('01/01/2018' AS DATETIME) As Dt, 'Alpha' Name, 3 Freq
UNION ALL
select CAST('12/01/2018' AS DATETIME) As Dt, 'Beta' Name, 2 Freq
), freq_cte as
(
--start/anchor row
select dt, name, 1 freq_new, Freq rn from orig_src
--recursion
union all
select DATEADD(MONTH, 1, a.dt), a.name, 1 freq_new, a.rn - 1 from freq_cte a
--terminator/constraint for recursion
where a.rn - 1 != 0
)
select convert(varchar, dt, 101) dt, name, freq_new from freq_cte
order by 2,1
The recursive logic works as follows:
First we get all the rows from the table into a CTE (freq_cte); then we recursively reference this CTE, adding one month and decrementing rn (the original Freq) on each step, until the terminating condition is met, that is, when (rn - 1) = 0.
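To watch the anchor row and the recursive step produce rows, here is a minimal sketch that runs the same shape of query in SQLite through Python's sqlite3 module. The SQLite dialect is an assumption made only so the example is runnable: it needs WITH RECURSIVE, uses date(dt, '+1 month') where SQL Server uses DATEADD, and stores dates as ISO strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

rows = conn.execute("""
WITH RECURSIVE
orig_src(dt, name, freq) AS (
    VALUES ('2018-01-01', 'Alpha', 3),
           ('2018-12-01', 'Beta', 2)
),
freq_cte(dt, name, freq_new, rn) AS (
    -- anchor: one row per source row, rn starts at the original freq
    SELECT dt, name, 1, freq FROM orig_src
    UNION ALL
    -- recursive step: next month, one fewer repetition left
    SELECT date(dt, '+1 month'), name, 1, rn - 1
    FROM freq_cte
    WHERE rn - 1 <> 0  -- terminator
)
SELECT dt, name, freq_new
FROM freq_cte
ORDER BY name, dt
""").fetchall()

for row in rows:
    print(row)
```

Alpha (Freq 3) expands to three monthly rows and Beta (Freq 2) to two, each with freq_new = 1.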

How to use Dynamic Lag function to avoid joining a table to itself to retrieve date value

I'm currently writing code in SQL to add the column in red to the following table:
The logic is the following:
For every row:
if flag for this row =1 then use date of this row
if flag for this row =0 then find the latest row (based on date) on which flag was = 1 for the same party and return the date of that row. If no such row exists, return null
I've found a way to do this by joining the table to itself but I would like to avoid doing that as the size of the table is pretty massive.
What I have
select b.*, a.date,
from table a left join table b on a.party=b.party
where a.flag =1
Someone told me I could use the LAG function with an OVER (PARTITION BY ...) clause and a CASE expression to return the value I'm after, but I haven't been able to figure it out.
Can someone help? Thank you so much!
Try this:
DECLARE @tab1 TABLE(PARTY CHAR(1),DATE DATE,Flag bit)
INSERT INTO @tab1
SELECT 'A','7-24-2018',1 Union ALL
SELECT 'A','7-28-2018',0 Union ALL
SELECT 'A','7-29-2018',0 Union ALL
SELECT 'A','7-29-2018',0 Union ALL
SELECT 'B','7-13-2018',1 Union ALL
SELECT 'B','7-17-2018',0 Union ALL
SELECT 'B','7-18-2018',0 Union ALL
SELECT 'C','7-8-2018',1 Union ALL
SELECT 'C','7-13-2018',0 Union ALL
SELECT 'C','7-19-2018',0 Union ALL
SELECT 'C','7-19-2018',0 Union ALL
SELECT 'C','7-20-2018',0
select t.*,
max(case when flag = 1 then date end) over (partition by PARTY order by date) as [Last Flag On Date]
from @tab1 t
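The windowed MAX in the query above works because the CASE expression yields a date only on flagged rows and NULL elsewhere, so the running maximum up to the current row is the most recent flagged date, or NULL when no flagged row has occurred yet. A minimal sketch of that behaviour, run in SQLite through Python's sqlite3 module (an assumption for runnability; it needs SQLite 3.25 or later for window functions). The sample rows are hypothetical and chosen so that party B starts with an unflagged row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (party TEXT, dt TEXT, flag INTEGER)")
conn.executemany(
    "INSERT INTO tab1 VALUES (?, ?, ?)",
    [
        ("A", "2018-07-24", 1),
        ("A", "2018-07-28", 0),
        ("A", "2018-07-29", 0),
        ("B", "2018-07-13", 0),  # no flagged row yet: expect NULL
        ("B", "2018-07-17", 1),
        ("B", "2018-07-18", 0),
    ],
)

rows = conn.execute("""
SELECT party, dt, flag,
       MAX(CASE WHEN flag = 1 THEN dt END)
           OVER (PARTITION BY party ORDER BY dt) AS last_flag_on_date
FROM tab1
ORDER BY party, dt
""").fetchall()

for row in rows:
    print(row)
```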
Try this:
select b.*, a.date
from table a
left join table b on a.party = b.party
where a.flag = CASE
    WHEN a.flag = 1 THEN a.date
    WHEN a.flag = 0 THEN (
        SELECT date
        FROM (
            SELECT TOP 1 row_number() OVER (ORDER BY a.date DESC) rs, a.date
            FROM a
            WHERE a.flag = 1
            GROUP BY a.date
        ) s
    )
END
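Use OUTER APPLY to obtain the latest row with flag 1 on or before the current row's date. OUTER APPLY (rather than CROSS APPLY) keeps rows that have no such match and returns NULL for them:
SELECT *
FROM yourtable t
OUTER APPLY
(
SELECT TOP 1 x.Date as [Last flag on date]
FROM yourtable x
WHERE x.Party = t.Party
AND x.Flag = 1
AND x.Date <= t.Date
ORDER BY x.Date desc
) d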
Yes, it can be done by joining the table to itself, if the join is written properly.
@Sahi's query is also good and simple.
Since you were asking for a dynamic LAG(), here is one.
This query may or may not be very performant, but it is certainly worth learning.
Test it with various sample data and tell me for which scenarios it does not work,
so that I can correct my script accordingly.
DECLARE @tab1 TABLE(PARTY CHAR(1),DATE DATE,Flag bit)
INSERT INTO @tab1
SELECT 'A','7-24-2018',1 Union ALL
SELECT 'A','7-28-2018',0 Union ALL
SELECT 'A','7-29-2018',0 Union ALL
SELECT 'A','7-29-2018',0 Union ALL
SELECT 'B','7-13-2018',1 Union ALL
SELECT 'B','7-17-2018',0 Union ALL
SELECT 'B','7-18-2018',0 Union ALL
SELECT 'C','7-8-2018',1 Union ALL
SELECT 'C','7-13-2018',0 Union ALL
SELECT 'C','7-19-2018',0 Union ALL
SELECT 'C','7-19-2018',0 Union ALL
SELECT 'C','7-20-2018',0;
WITH cte
AS (SELECT *,
Row_number()
OVER (
partition BY party
ORDER BY flag DESC, [date] DESC ) rn
FROM @tab1)
SELECT *,
CASE
WHEN flag = 1 THEN [date]
ELSE Lag([date], (SELECT TOP 1 a.rn - a1.rn
FROM cte a1
WHERE a1.party = a.party))
OVER (
ORDER BY party )
END
FROM cte a

BigQuery - Cannot join on repeated field

I'm trying to create a table with one column, where each row is a new date between two given dates. The query works fine until I add a WHERE clause that contains a subquery, i.e. NOT IN (SELECT ....). It works fine if I do something like NOT IN (TIMESTAMP('xyz')).
I keep getting an error saying "Cannot join on repeated field t2.f0__group.SomeDate".
I have no clue why this is happening. Also, I'm fairly new to BQ, so if there is an easier way to do this, please let me know. Thanks
SELECT SomeDate FROM
(
SELECT DATE_ADD(Day, i, "DAY") SomeDate
FROM
(
SELECT '2020-01-03' Day
) T1
CROSS JOIN
(
SELECT
POSITION(
SPLIT(
RPAD('', DATEDIFF('2020-01-30','2020-01-03') * 2, 'a,'))) i
FROM
(
SELECT NULL
)
) T2
)
WHERE SomeDate NOT IN (SELECT OtherDate FROM
(
SELECT TIMESTAMP('2020-01-04 00:00:00 UTC') AS OtherDate
),
(
SELECT TIMESTAMP('2020-01-06 00:00:00 UTC') AS OtherDate
),
(
SELECT TIMESTAMP('2020-01-08 00:00:00 UTC') AS OtherDate
)
)
I suggest starting over from scratch using the example below.
I think it does exactly what you are trying to achieve, probably with minor adjustments:
SELECT SomeDate
FROM (
SELECT
DATE(DATE_ADD(TIMESTAMP('2020-01-03'), pos - 1, "DAY")) AS SomeDate
FROM (
SELECT ROW_NUMBER() OVER() AS pos, *
FROM (FLATTEN((
SELECT SPLIT(RPAD('', 1 + DATEDIFF(TIMESTAMP('2020-01-30'), TIMESTAMP('2020-01-03')), '.'),'') AS h
FROM (SELECT NULL)),h
))
)
) a
LEFT JOIN (
SELECT OtherDate FROM
(SELECT '2020-01-04' AS OtherDate),
(SELECT '2020-01-06' AS OtherDate),
(SELECT '2020-01-08' AS OtherDate)
) b
ON b.OtherDate = a.SomeDate
WHERE b.OtherDate IS NULL
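
As a cross-check on what the query above is expected to return, the same result (every date from 2020-01-03 to 2020-01-30 inclusive, minus the three excluded dates) can be sketched in a few lines outside BigQuery. The helper name date_range is made up for this illustration.

```python
from datetime import date, timedelta


def date_range(start, end):
    """Yield every date from start to end, inclusive."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


# The three OtherDate values from the query.
excluded = {date(2020, 1, 4), date(2020, 1, 6), date(2020, 1, 8)}

some_dates = [
    d for d in date_range(date(2020, 1, 3), date(2020, 1, 30))
    if d not in excluded
]

for d in some_dates:
    print(d.isoformat())
```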

SQL stored procedure to add up values and stop once the maximum has been reached

I would like to write a SQL query (SQL Server) that will return rows (in a given order) but only up to a given total. My client has paid me a given amount, and I want to return only those rows whose running total is <= that amount.
For example, if the client paid me $370, and the data in the table is
id amount
1 100
2 122
3 134
4 23
5 200
then I would like to return only rows 1, 2 and 3
This needs to be efficient, since there will be thousands of rows, so a for loop would not be ideal, I guess. Or is SQL Server efficient enough to optimise a stored proc with for loops?
Thanks in advance. Jim.
A couple of options are:
1) Triangular Join
SELECT *
FROM YourTable Y1
WHERE (SELECT SUM(amount)
FROM YourTable Y2
WHERE Y1.id >= Y2.id ) <= 370
2) Recursive CTE
WITH RecursiveCTE
AS (
SELECT TOP 1 id, amount, CAST(amount AS BIGINT) AS Total
FROM YourTable
ORDER BY id
UNION ALL
SELECT R.id, R.amount, R.Total
FROM (
SELECT T.*,
T.amount + Total AS Total,
rn = ROW_NUMBER() OVER (ORDER BY T.id)
FROM YourTable T
JOIN RecursiveCTE R
ON R.id < T.id
) R
WHERE R.rn = 1 AND Total <= 370
)
SELECT id, amount, Total
FROM RecursiveCTE
OPTION (MAXRECURSION 0);
The 2nd one will likely perform better.
In SQL Server 2012 you will be able to do something like
;WITH CTE AS
(
SELECT id,
amount,
SUM(amount) OVER(ORDER BY id
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
AS RunningTotal
FROM YourTable
)
SELECT *
FROM CTE
WHERE RunningTotal <=370
Though there will probably be a more efficient way (to stop the scan as soon as the total is reached)
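The "stop the scan as soon as the total is reached" idea is easiest to see procedurally. The sketch below is not SQL and not a claim about how SQL Server executes any of the queries above; it only illustrates the early exit, using the sample data from the question. It assumes rows arrive ordered by id and that amounts are non-negative, so once one row pushes the running total over the limit, no later row can qualify. The function name rows_within_budget is made up for the example.

```python
def rows_within_budget(rows, budget):
    """Return (id, amount, running_total) for the leading rows whose
    running total stays <= budget, stopping at the first row that exceeds it."""
    total = 0
    kept = []
    for row_id, amount in rows:
        if total + amount > budget:
            break  # early exit: later rows are never read
        total += amount
        kept.append((row_id, amount, total))
    return kept


# Sample data from the question, already ordered by id.
rows = [(1, 100), (2, 122), (3, 134), (4, 23), (5, 200)]
paid = rows_within_budget(rows, 370)

for row in paid:
    print(row)
```

With a payment of 370 this keeps rows 1, 2 and 3 (running totals 100, 222, 356) and never looks at row 5.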
Straightforward approach:
SELECT a.id, a.amount
FROM table1 a
INNER JOIN table1 b ON (b.id <=a.id)
GROUP BY a.id, a.amount
HAVING SUM(b.amount) <= 370
Unfortunately, its cost grows as O(N^2) with the number of rows.
Something like this:
select id from
(
select t1.id, t1.amount, sum( t2.amount ) s
from tst t1, tst t2
where t2.id <= t1.id
group by t1.id, t1.amount
) t -- SQL Server requires an alias on a derived table
where s <= 370