How can I roll up rate-of-return into net asset value in SQL?

Given a starting value @pStartingValue and a table that contains rorDate and ror, what is the most efficient way to get the NAV at each date using just T-SQL?
This is mathematically trivial, and simple in code. I currently have a naive SQL implementation that relies on cursors.
On the first date, the NAV is @pStartingValue * (1 + ror).
On every subsequent date, it is the previously calculated NAV * (1 + ror), or equivalently @pStartingValue times the product of (1 + ror) over every previous date.
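For example, with @pStartingValue = 100 and returns of 1% then 2% on the first two dates, the NAVs are 100 * 1.01 = 101 and 101 * 1.02 = 103.02 (equivalently, 100 * 1.01 * 1.02).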
How would you do this efficiently using only T-SQL on MSSQL 2005+?
DECLARE @rorDate DATE
DECLARE @getDate CURSOR
DECLARE @lastNAV AS DECIMAL(19,7)
DECLARE @datedRoR AS FLOAT

DECLARE @NAVTotals TABLE
(
    NAV DECIMAL(19,7),
    navDate DATE
)

SET @lastNAV = 100
SET @getDate = CURSOR FOR
    SELECT
        p.[DATE]
    FROM
        performance p
    ORDER BY
        p.[DATE]

OPEN @getDate
FETCH NEXT
FROM @getDate INTO @rorDate
WHILE @@FETCH_STATUS = 0
BEGIN
    SELECT
        @datedRoR = b.finalNetReturn
    FROM
        performance b
    WHERE
        b.date = @rorDate

    INSERT INTO @NAVTotals (NAV, navDate)
    VALUES (@lastNAV * (1 + @datedRoR), @rorDate)

    SELECT
        @lastNAV = c.NAV
    FROM
        @NAVTotals c
    WHERE
        c.navDate = @rorDate

    FETCH NEXT
    FROM @getDate INTO @rorDate
END
CLOSE @getDate
DEALLOCATE @getDate

SELECT * FROM @NAVTotals

You'll have to do some testing to see if the performance improves, but here is a way to do the same thing without using a cursor. It's untested, so make sure you test it. I also cast b.finalNetReturn to FLOAT; if it's already a FLOAT you can remove that part.
DECLARE @lastNAV AS DECIMAL(19,7)
SET @lastNAV = 100

DECLARE @NAVTotals TABLE
(
    NAV DECIMAL(19,7),
    navDate DATE
);

INSERT INTO @NAVTotals (navDate)
SELECT [DATE]
FROM performance
ORDER BY [DATE] ASC;

UPDATE NT
SET @lastNAV = NAV = (@lastNAV * (1.0 +
    (CAST((SELECT b.finalNetReturn
           FROM performance b
           WHERE b.date = NT.navDate) AS FLOAT))))
FROM @NAVTotals NT;

SELECT * FROM @NAVTotals ORDER BY navDate;
By assigning the lastNAV variable inside the UPDATE statement you can update both the variable and the column at once. It works similarly to:
a = a + 1
There is an example of this same approach here, including some good numbers comparing the efficiency of the approach to alternatives such as cursors.
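One caveat to be aware of: this "quirky update" pattern relies on the order in which SQL Server scans the rows, and that order is not guaranteed by the documentation. A common mitigation (still unofficial, so treat it as an assumption to verify) is to give the table variable a clustered primary key on navDate so the scan tends to follow date order:
DECLARE @NAVTotals TABLE
(
    NAV DECIMAL(19,7),
    navDate DATE PRIMARY KEY CLUSTERED  -- encourages, but does not guarantee, a date-ordered scan
);
Since each NAV depends on the previous one, ordering matters for correctness here, so compare the results against the cursor version before trusting it.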

Perhaps I'm not understanding it correctly, but you don't even need a stored proc to achieve this.
SELECT p.[DATE] AS navDate
     -- SQL Server has no PRODUCT aggregate; EXP(SUM(LOG(x))) emulates a
     -- running product (this assumes every 1 + finalNetReturn is positive)
     , @pStartingValue * EXP(SUM(LOG(1 + b.finalNetReturn))) AS NAV
FROM performance p
INNER JOIN performance b
    ON b.[DATE] <= p.[DATE]
GROUP BY p.[DATE]
ORDER BY p.[DATE]
However, there are a few oddities that I don't grasp:
How come there is no range limit for p.[DATE]?
Does the performance table really have only one asset?

Related

How to improve while loop insert performance in sql server?

Here is my SQL query. It inserts 6,500+ rows from a temp table, but it takes 15+ minutes! How can I improve this? Thanks.
ALTER PROC [dbo].[Process_bill]
    @userid varchar(10),
    @remark nvarchar(500),
    @tdate date,
    @pdate date
AS
BEGIN
    IF OBJECT_ID('tempdb..#temptbl_bill', 'U') IS NOT NULL
        DROP TABLE #temptbl_bill;

    CREATE TABLE #temptbl_bill (
        RowID int IDENTITY(1, 1),
        ------------
    )

    -- insert into temp table

    DECLARE @NumberRecords int, @RowCounter int
    DECLARE @batch INT
    SET @batch = 300
    SET @NumberRecords = (SELECT COUNT(*) FROM #temptbl_bill)
    SET @RowCounter = 1
    SET NOCOUNT ON

    BEGIN TRANSACTION
    WHILE @RowCounter <= @NumberRecords
    BEGIN
        DECLARE @clid int
        DECLARE @hlid int
        DECLARE @holdinNo nvarchar(150)
        DECLARE @clientid nvarchar(100)
        DECLARE @clientName nvarchar(50)
        DECLARE @floor int
        DECLARE @radius nvarchar(50)
        DECLARE @bill money
        DECLARE @others money
        DECLARE @frate int
        DECLARE @due money
        DECLARE @fine money
        DECLARE @rebate money

        IF @RowCounter > 0 AND ((@RowCounter % @batch = 0) OR (@RowCounter = @NumberRecords))
        BEGIN
            COMMIT TRANSACTION
            PRINT CONCAT('Transaction #', CEILING(@RowCounter / CAST(@batch AS FLOAT)), ' committed (', @RowCounter, ' rows)');
            BEGIN TRANSACTION
        END;

        -- multiple selects
        -- insert into destination table

        PRINT 'RowCount - ' + CAST(@RowCounter AS varchar(20)) + ' batch - ' + CAST(@batch AS varchar(20))
        SET @RowCounter = @RowCounter + 1;
    END
    COMMIT TRANSACTION
    PRINT CONCAT('Transaction #', CEILING(@RowCounter / CAST(@batch AS FLOAT)), ' committed (',
        @RowCounter, ' rows)');
    SET NOCOUNT OFF
    DROP TABLE #temptbl_bill
END
GO
As has been said in the comments, the loop is completely unnecessary. The way to improve the performance of any loop is to remove it completely; loops are a last resort in SQL.
As far as I can tell, your insert can be written with a single statement:
INSERT tbl_bill (clid, hlid, holdingNo, ClientID, ClientName, billno, date_month, unit, others, fine, due, bill, rebate, remark, payment_date, inserted_by, inserted_date)
SELECT clid = c.id,
       hlid = h.id,
       h.holdinNo,
       c.clientID,
       clientName = CAST(c.clientName AS NVARCHAR(50)),
       BillNo = CONCAT(h.holdinNo, MONTH(@tdate), YEAR(@tdate)),
       date_month = @tdate,
       unit = 0,
       others = CASE WHEN h.[floor] = 0 THEN rs.frate * (h.[floor] - 1) ELSE 0 END,
       fine = bs.FineRate * b.Due / 100,
       due = b.Due,
       bill = @bill, -- This is declared but never assigned
       rebate = bs.rebate,
       remark = @remark,
       payment_date = @pdate,
       inserted_by = @userid,
       inserted_date = GETDATE()
FROM ( SELECT id, clientID, ClientName
       FROM tbl_client
       WHERE status = 1
     ) AS c
INNER JOIN
     ( SELECT id, clid, holdinNo, [floor], connect_radius
       FROM tx_holding
       WHERE status = 1
       AND connect_radius <> '0'
       AND type = 'Residential'
     ) AS h
    ON c.id = h.clid
LEFT JOIN tbl_radius_setting AS rs
    ON rs.radius = CONVERT(real, h.connect_radius)
    AND rs.status = 1
    AND rs.type = 'Non-Govt.'
LEFT JOIN tbl_bill_setting AS bs
    ON bs.Status = 1
LEFT JOIN
     ( SELECT hlid,
              SUM(netbill) AS Due
       FROM tbl_bill AS b
       WHERE date_month < @tdate
       AND (b.ispay = 0 OR b.ispay IS NULL)
       GROUP BY hlid
     ) AS b
    ON b.hlid = h.id
WHERE NOT EXISTS
     ( SELECT 1
       FROM tbl_bill AS tb
       WHERE EOMONTH(@tdate) = EOMONTH(date_month)
       AND tb.holdingNo = h.holdinNo
       AND (tb.update_by IS NOT NULL OR tb.ispay = 1)
     );
Please take this with a pinch of salt; it was quite hard work trying to piece together the logic, so it may need some minor tweaks and corrections.
As well as adapting this to work as a single statement, I have made a number of modifications to your existing code:
Swapped NOT IN for NOT EXISTS to avoid any issues with NULL records. If holdingNo is not nullable they are equivalent; if it is nullable, NOT EXISTS is safer (see the short illustration after this list) - Not Exists Vs Not IN
The join syntax you are using was replaced 27 years ago, so I switched from ANSI-89 join syntax to ANSI-92. - Bad habits to kick : using old-style JOINs
Changed the predicates YEAR(date_month) = YEAR(@tdate) AND MONTH(date_month) = MONTH(@tdate) to EOMONTH(@tdate) = EOMONTH(date_month). These are logically equivalent, but EOMONTH is sargable, whereas MONTH and YEAR are not (see the EDIT below, which corrects this point).
Then a few further links/suggestions that are directly related to changes I have made:
Although I removed the while loop, don't fall into the trap of thinking this is better than a cursor. A properly declared cursor will outperform a while loop like yours - Bad Habits to Kick : Thinking a WHILE loop isn't a CURSOR
The general consensus is that prefixing object names is not a good idea. It should either be obvious from the context if an object is a table/view or function/procedure, or it should be irrelevant - i.e. There is no need to distinguish between a table or a view, and in fact, we may wish to change from one to the other, so having the prefix makes things worse, not better.
The average ratio of time spent reading code to time spent writing code is around 10:1, so it is worth the effort to format your code as you write it so that it is easy to read. This is hugely subjective with SQL, and I would not recommend any particular conventions, but I cannot believe for a second that you find your original code free-flowing and easy to read. It took me about 10 minutes just to unravel the first insert statement.
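As a short, self-contained illustration of the NULL pitfall mentioned in the first point above (not taken from your schema):
-- NOT IN returns no rows at all once the subquery yields a NULL:
SELECT 1 AS x WHERE 1 NOT IN (SELECT CAST(NULL AS INT));
-- NOT EXISTS is unaffected by NULLs in the subquery and returns one row:
SELECT 1 AS x WHERE NOT EXISTS (SELECT 1 WHERE 1 = CAST(NULL AS INT));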
EDIT
The above is not correct: EOMONTH() is not sargable, so it performs no better than YEAR(x) = YEAR(y) AND MONTH(x) = MONTH(y), although it is still a bit simpler. If you want a truly sargable predicate you will need to build a start and end date from @tdate. You can use:
DATEADD(MONTH, DATEDIFF(MONTH, '19000101', @tdate), '19000101')
to get the first day of the month for @tdate, then almost the same formula, but adding months to 1st February 1900 rather than 1st January, to get the start of the next month:
DATEADD(MONTH, DATEDIFF(MONTH, '19000201', @tdate), '19000201')
So the following:
DECLARE @tdate DATE = '2019-10-11';
SELECT DATEADD(MONTH, DATEDIFF(MONTH, '19000101', @tdate), '19000101'),
       DATEADD(MONTH, DATEDIFF(MONTH, '19000201', @tdate), '19000201');
Will return 1st October and 1st November respectively. Putting this back in your original query would give:
WHERE NOT EXISTS
     ( SELECT 1
       FROM tbl_bill AS tb
       WHERE date_month >= DATEADD(MONTH, DATEDIFF(MONTH, '19000101', @tdate), '19000101')
       AND date_month < DATEADD(MONTH, DATEDIFF(MONTH, '19000201', @tdate), '19000201')
       AND tb.holdingNo = h.holdinNo
       AND (tb.update_by IS NOT NULL OR tb.ispay = 1)
     );

Better alternative for Cursor using While loop to iterate Sql

Good day,
I am trying to fetch dates from one of my tables, which contains millions of records, and save them in a variable using a cursor. After fetching each date I insert records into the database for that particular date. For this I am using a WHILE loop. It turns out that the WHILE loop is really slowing down performance; execution takes hours to complete. I am including part of the query for clarity.
declare @tranDateCursor cursor;
declare @today as date;
begin
    set @tranDateCursor = cursor
    for
    select distinct transferDate
    from transactions
    where [type] = 'customer'

    open @tranDateCursor
    fetch next from @tranDateCursor
    into @today

    while @@fetch_status = 0
    begin
        declare @yesterday as date
        set @yesterday = (
            select top(1) transferDate
            from (
                select distinct(transferDate) as transferDate
                from transactions
                where transferDate < @today
                and [type] = 'customer'
            ) as ODC_DATES
            order by ODC_DATES.transferDate desc
        )

        insert into transactions([type], transferDate)
        select 'customer'
             , @today
        from transactions xt
        right outer join x_itransaction as y
            on y.customer_account = xt.customer_account
            and y.transferDate = @yesterday
            and xt.transferDate = @today
        where xt.transactionId is null
        and y.transferDate = @yesterday
        and y.[type] = 'customer'

        -- advance to the next date
        fetch next from @tranDateCursor
        into @today
    end

    close @tranDateCursor
    deallocate @tranDateCursor
end
I have tried using table variables with a WHILE loop instead of the cursor, but that also took a very long time to run. My concern is: is there a better alternative to a loop in this particular scenario?
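A set-based sketch of the same logic, assuming SQL Server 2012+ for LAG and the table/column names from the snippet above (untested):
;with dates as
(
    select distinct transferDate
    from transactions
    where [type] = 'customer'
),
pairs as
(
    -- pair every distinct date with the previous distinct date
    select transferDate as today,
           lag(transferDate) over (order by transferDate) as yesterday
    from dates
)
insert into transactions ([type], transferDate)
select 'customer', p.today
from pairs p
inner join x_itransaction as y
    on y.transferDate = p.yesterday
    and y.[type] = 'customer'
left join transactions xt
    on xt.customer_account = y.customer_account
    and xt.transferDate = p.today
where xt.transactionId is null;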

SQL Performance Difference of Looping vs Invoking a function

I have a stored procedure that inserts records.
I have to calculate a date value for a column with specific logic. Currently I loop over the data being inserted and do the calculation to populate the date.
The concern is that I need to avoid using the loop and insert the rows as a batch. In order to do that I'll have to move the date calculation logic into a function.
What will be the difference, performance-wise, between looping over the data (what I currently have) and using a function?
Here is my stored procedure:
WHILE @C <= @WeeklyDataCount
BEGIN
    DECLARE @PopulateDate DATE;

    SELECT
        @Value = D.Value,
        @FromDate = D.FromDate
    FROM
        #WeeklyData D
    WHERE
        D.AutoId = @C;

    -- Sample date calculation logic that needs to move to a function
    SELECT @DayCount = COUNT(*)
    FROM DayTable;

    SET @Counter2 = 1;
    WHILE @Counter2 < @DayCount
    BEGIN
        SET @PopulateDate = DATEADD(DAY, (-1 * @Counter2), @FromDate);
        SET @Counter2 = @Counter2 + 1;
    END
    -- End of day calculation logic

    INSERT INTO TABLE1 (Value, PopulateDay)
    VALUES (@Value, @PopulateDate)

    SET @C = @C + 1;
END
Your whole loop can be replaced with one statement (I assume the contents of DayTable are the same for every row of #WeeklyData). Note that, as written, your inner loop overwrites @PopulateDate on every pass, so its final value is simply DATEADD(DAY, -1 * (@DayCount - 1), @FromDate); the statement below assumes the offsets were meant to accumulate, which is what the (DayCount * (DayCount - 1)) / 2 term computes.
INSERT INTO TABLE1 (Value, PopulateDay)
SELECT
    D.Value,
    DATEADD(DAY, (-1 * ((C.DayCount * (C.DayCount - 1)) / 2)), D.FromDate)
FROM #WeeklyData D
CROSS JOIN (SELECT COUNT(*) AS DayCount FROM DayTable) C
WHERE D.AutoId <= @WeeklyDataCount

SQL query with start and end dates - what is the best option?

I am using MS SQL Server 2005 at work to build a database. I have been told that most tables will hold 1,000,000 to 500,000,000 rows of data in the near future after it is built... I have not worked with datasets this large. Most of the time I don't even know what I should be considering when deciding how to set up the schema and queries.
So... I need to know the start and end dates for something, plus a value associated with an ID during that time frame. So we can set the table up in two different ways:
create table xxx_test2 (id int identity(1,1), groupid int, dt datetime, i int)
create table xxx_test2 (id int identity(1,1), groupid int, start_dt datetime, end_dt datetime, i int)
Which is better? How do I define "better"? I filled the first table with about 100,000 rows of data, and it takes about 10-12 seconds to present it in the format of the second table, depending on the query...
select y.groupid,
y.dt as [start],
z.dt as [end],
(case when z.dt is null then 1 else 0 end) as latest,
y.i
from #x as y
outer apply (select top 1 *
from #x as x
where x.groupid = y.groupid and
x.dt > y.dt
order by x.dt asc) as z
or
http://consultingblogs.emc.com/jamiethomson/archive/2005/01/10/t-sql-deriving-start-and-end-date-from-a-single-effective-date.aspx
But with the second table, to insert a new row I have to go look for a previous row and, if there is one, update its end date. So is it a question of performance when retrieving data vs insert/update cost? It seems silly to store that end date twice, but maybe not? What things should I be looking at?
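To make that maintenance cost concrete, the insert path for the second design looks roughly like this (a sketch; the parameter names are made up):
-- close off the group's currently open row, if any
update xxx_test2
set end_dt = @newdate
where groupid = @groupid
  and end_dt is null;

-- then add the new current row, open-ended until the next insert
insert into xxx_test2 (groupid, start_dt, end_dt, i)
values (@groupid, @newdate, null, @i);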
This is what I used to generate my fake data, if you want to play with it for some reason (if you change the maximum of the random number to something higher it will generate the fake data a lot faster):
declare @dt datetime
declare @i int
declare @id int
set @id = 1
declare @rowcount int
set @rowcount = 0
declare @numrows int

while (@rowcount < 100000)
begin
    set @i = 1
    set @dt = getdate()
    set @numrows = cast(((5 + 1) - 1) * rand() + 1 as tinyint)
    while @i <= @numrows
    begin
        insert into #x values (@id, dateadd(d, @i, @dt), @i)
        set @i = @i + 1
    end
    set @rowcount = @rowcount + @numrows
    set @id = @id + 1
    print @rowcount
end
For your purposes, I think option 2 is the way to go for table design. This gives you flexibility, and will save you tons of work.
Having the effective date and end date will allow you to have a query that returns only the currently effective data by putting this in your WHERE clause:
where GETDATE() between effectivedate and enddate
You can also then use it to join with other tables in a time-sensitive way.
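For example, a point-in-time join against the second table might look like this (the orders table and its columns are invented for illustration):
select o.order_id, x.i
from orders o
inner join xxx_test2 x
    on x.groupid = o.groupid
    and o.order_date between x.start_dt and x.end_dt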
Provided you set up the key properly and provide the right indexes, performance (on this table at least) should not be a problem.
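As a starting point (an assumption about your access pattern, so verify it against your actual queries), an index that leads on the group and covers the date range keeps those lookups cheap:
create clustered index IX_xxx_test2_group_dates
    on xxx_test2 (groupid, start_dt, end_dt);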
For anyone who can use the LEAD analytic function of SQL Server 2012 (or Oracle, DB2, ...): retrieving data from the first table (which uses only one date column) would be much quicker than without this feature:
select
groupid,
dt "start",
lead(dt) over (partition by groupid order by dt) "end",
case when lead(dt) over (partition by groupid order by dt) is null
then 1 else 0 end "latest",
i
from x

How to determine when a time stamp does not exist in a table

I have a table that receives data on an hourly basis. Part of this import process writes the timestamp of the import to the table. My question is: how can I build a query to produce a result set of the periods of time when the import did not write to the table?
My first thought is to have a table of static ints and just do an outer join, looking for NULLs on the right side, but this seems kind of sloppy (a sketch of this idea follows below). Is there a more dynamic way to produce a result set for the times the import failed, based on the timestamp?
This is a MS SQL 2000 box.
Update: I think I've got it. The two answers already provided are great, but what I'm working on instead is a function that returns a table of the values I am looking for, for a given time frame. Once I get it finished I'll post the solution here.
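A minimal sketch of the static-numbers-table approach on SQL 2000 (the names import_log and import_ts, plus a helper table numbers(n) holding 0 through 23, are assumptions):
SELECT DATEADD(hour, n.n, d.day_start) AS missing_hour
FROM (
       -- one row per day that appears in the log
       SELECT CONVERT(datetime, CONVERT(varchar(10), import_ts, 120)) AS day_start
       FROM import_log
       GROUP BY CONVERT(varchar(10), import_ts, 120)
     ) AS d
CROSS JOIN numbers n
LEFT JOIN import_log i
    ON i.import_ts >= DATEADD(hour, n.n, d.day_start)
    AND i.import_ts < DATEADD(hour, n.n + 1, d.day_start)
WHERE i.import_ts IS NULL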
Here's a slightly modified solution from this article in my blog:
Flattening timespans: SQL Server
DECLARE @t TABLE
(
    q_start DATETIME NOT NULL,
    q_end DATETIME NOT NULL
)

DECLARE @qs DATETIME
DECLARE @qe DATETIME
DECLARE @ms DATETIME
DECLARE @me DATETIME

DECLARE cr_span CURSOR FAST_FORWARD
FOR
SELECT s_timestamp AS q_start,
       DATEADD(minute, 1, s_timestamp) AS q_end
FROM [20090611_timespans].t_span
ORDER BY
       q_start

OPEN cr_span

FETCH NEXT
FROM cr_span
INTO @qs, @qe

SET @ms = @qs
SET @me = @qe

WHILE @@FETCH_STATUS = 0
BEGIN
    FETCH NEXT
    FROM cr_span
    INTO @qs, @qe
    IF @qs > @me
    BEGIN
        INSERT
        INTO @t
        VALUES (@ms, @me)
        SET @ms = @qs
    END
    SET @me = CASE WHEN @qe > @me THEN @qe ELSE @me END
END

IF @ms IS NOT NULL
BEGIN
    INSERT
    INTO @t
    VALUES (@ms, @me)
END

CLOSE cr_span
DEALLOCATE cr_span

SELECT * FROM @t
This will return the consecutive ranges during which updates did happen (with a one-minute resolution).
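If you then want the gaps themselves (the periods when nothing was written), they can be read off @t with a correlated subquery; a sketch:
SELECT t1.q_end AS gap_start,
       (SELECT MIN(t2.q_start)
        FROM @t t2
        WHERE t2.q_start > t1.q_end) AS gap_end
FROM @t t1
WHERE EXISTS (SELECT 1 FROM @t t2 WHERE t2.q_start > t1.q_end)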
If you have an index on your timestamp field, you may issue the following query:
SELECT *
FROM records ro
WHERE NOT EXISTS
(
SELECT NULL
FROM records ri
WHERE ri.timestamp >= DATEADD(minute, -1, ro.timestamp)
AND ri.timestamp < ro.timestamp
)
I was thinking something like this:
select 'Start' MissingStatus, o1.LastUpdate MissingStart
from Orders o1
left join Orders o2
on o1.LastUpdate between
dateadd(ss,1,o2.LastUpdate) and dateadd(hh,1,o2.LastUpdate)
where o2.LastUpdate is null
union all
select 'End', o1.LastUpdate MissingEnd
from Orders o1
left join Orders o2
on o1.LastUpdate between
dateadd(hh,-1,o2.LastUpdate) and dateadd(ss,-1,o2.LastUpdate)
where o2.LastUpdate is null
order by 2