Sql Server select and update X rows

I'm using Sql Server 2012.
I need to select rows from a table for processing. The number of rows needs to be variable. I need to update the rows I'm selecting to a "being processed" status - I have a guid to populate for this purpose.
I've encountered several examples of using row_number() and a couple of examples of ways of using CTEs, but I'm not sure how to combine them (or if that's even the correct strategy). I would appreciate any insight.
Here is what I have so far:
DECLARE @SessionGuid uniqueidentifier, @rowcount bigint
SELECT @rowcount = 1000
SELECT @sessionguid = newid()
DECLARE @myProductChanges table (
    ProductChangeId bigint
  , ProductTypeId smallint
  , SourceSystemId tinyint
  , ChangeTypeId tinyint );
WITH NextPage AS
(
    SELECT
        ProductChangeId, ServiceSessionGuid,
        ROW_NUMBER() OVER (ORDER BY ProductChangeId) AS 'RowNum'
    FROM dbo.ProductChange
    WHERE 'RowNum' < @rowcount
)
UPDATE dbo.ProductChange
SET ServiceSessionGuid = @sessionguid, ProcessingStateId = 2, UpdatedDate = getdate()
OUTPUT
    INSERTED.ProductChangeId,
    INSERTED.ProductTypeId,
    INSERTED.SourceSystemId,
    INSERTED.ChangeTypeId
INTO @myProductChanges
FROM dbo.ProductChange as pc join NextPage on pc.ProductChangeId = NextPage.ProductChangeId
From here I will select from my temp table and return the data:
SELECT mpc.ProductChangeId
  , pt.ProductName as ProductType
  , ss.Name as SourceSystem
  , ct.ChangeDescription as ChangeType
FROM @myProductChanges as mpc
join dbo.R_ProductType pt on mpc.ProductTypeId = pt.ProductTypeId
join dbo.R_SourceSystem ss on mpc.SourceSystemId = ss.SourceSystemId
join dbo.R_ChangeType ct on mpc.ChangeTypeId = ct.ChangeTypeId
ORDER BY ProductType asc
So far this doesn't work for me. I get an error when I try to run it:
Msg 8114, Level 16, State 5, Line 20
Error converting data type varchar to bigint.
I'm not clear on what I'm doing wrong - so - any help is appreciated.
Thanks!
BTW, here are some of the questions I've used as reference to try and solve this:
https://stackoverflow.com/questions/9777178
https://stackoverflow.com/questions/3319842
https://stackoverflow.com/questions/6402103

This subquery makes no sense:
SELECT
    ProductChangeId, ServiceSessionGuid,
    ROW_NUMBER() OVER (ORDER BY ProductChangeId) AS 'RowNum'
FROM dbo.ProductChange
WHERE 'RowNum' < @rowcount
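You can't reference the alias RowNum in the WHERE clause of the same query, because WHERE is logically evaluated before the SELECT list, so the alias does not exist yet at that point. And you are not even referencing the alias: 'RowNum' in single quotes is a string literal, which SQL Server tries to convert to bigint for the comparison, hence the conversion error. What you need is either another level of nesting: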
SELECT ProductChangeId, ServiceSessionGuid, RowNum
FROM (SELECT ProductChangeId, ServiceSessionGuid,
             ROW_NUMBER() OVER (ORDER BY ProductChangeId) AS RowNum
      FROM dbo.ProductChange
     ) AS x WHERE RowNum < @rowcount
Or:
SELECT TOP (@rowcount-1) ProductChangeId, ServiceSessionGuid,
    ROW_NUMBER() OVER (ORDER BY ProductChangeId) AS RowNum
FROM dbo.ProductChange
ORDER BY ProductChangeId
Also, please stop delimiting aliases with single quotes ('alias'). When you do need to delimit an alias (you don't in this case), use [square brackets].

I'm guessing, but I think you want <= rather than < if you want to affect @rowcount rows, not one less.
Another tip is that CTEs can be updated directly*, as shown here:
WITH NextPage AS
(
    SELECT TOP(@rowcount) *
    FROM dbo.ProductChange
)
UPDATE NextPage
SET ServiceSessionGuid = @sessionguid, ProcessingStateId = 2, UpdatedDate = getdate()
OUTPUT
    INSERTED.ProductChangeId,
    INSERTED.ProductTypeId,
    INSERTED.SourceSystemId,
    INSERTED.ChangeTypeId
INTO @myProductChanges
* The updates affect the base table in the CTE, i.e. dbo.ProductChange
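For anyone who wants to try the updatable-CTE approach in isolation, here is a minimal sketch. The temp table #ProductChangeDemo, its sample rows, and the meaning of ProcessingStateId = 1 ("pending") are assumptions made up for the demo, not part of the original schema. Compared with the snippet above, it adds a WHERE filter so already-claimed rows are skipped and an ORDER BY so TOP picks a predictable set of rows.

```sql
-- Hypothetical stand-in for dbo.ProductChange, reduced to the columns the demo needs.
CREATE TABLE #ProductChangeDemo (
    ProductChangeId    bigint PRIMARY KEY,
    ProcessingStateId  tinyint NOT NULL,        -- assumption: 1 = pending, 2 = being processed
    ServiceSessionGuid uniqueidentifier NULL
);

INSERT INTO #ProductChangeDemo (ProductChangeId, ProcessingStateId)
VALUES (1, 1), (2, 1), (3, 2), (4, 1), (5, 1);

DECLARE @SessionGuid uniqueidentifier = NEWID(),
        @RowCount    bigint = 2;
DECLARE @Claimed TABLE (ProductChangeId bigint);

-- Claim the next @RowCount pending rows and capture their ids in one statement.
WITH NextPage AS
(
    SELECT TOP (@RowCount) *
    FROM #ProductChangeDemo
    WHERE ProcessingStateId = 1
    ORDER BY ProductChangeId
)
UPDATE NextPage
SET ServiceSessionGuid = @SessionGuid,
    ProcessingStateId  = 2
OUTPUT INSERTED.ProductChangeId INTO @Claimed;

-- The rows this session now owns.
SELECT ProductChangeId FROM @Claimed ORDER BY ProductChangeId;

DROP TABLE #ProductChangeDemo;
```

With the sample rows above, the pending ids are 1, 2, 4 and 5, so a @RowCount of 2 claims ids 1 and 2. Row 3 is skipped because it is already in state 2.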

Related

Two different methods of obtaining max row

The first statement below is how I have had to pull the min row, based on the needs of the organization I work for. At first I used MIN(DATEFIELD), but that caused problems when someone had two entries on the same day. Next I tried MIN(OP__DOCID), where OP__DOCID is the table's unique key. The problem there is that if someone back-dated an entry they had forgotten to create, the results would be inaccurate. So I came up with the statement below. It ensures I get the most recent result from each unique admission.
SELECT OP__DocID
FROM FD__CNSLG_BASIS24 AS PC1
WHERE (OP__DOCID =
        (SELECT TOP(1) OP__DocID
         FROM FD__CNSLG_BASIS24 AS PC2
         WHERE PC2.ClientKey = PC1.Clientkey and PC2.ProgramAdmitKey = PC1.Programadmitkey
         ORDER BY Date_Screening
        )
      )
Recently, I have learned about OVER(PARTITION BY) and have been curious about the subtle differences in how it works vs. the statement above, because I do get different results.
SELECT OP__DocID = Min(OP__DOCID) OVER (Partition BY Clientkey, Programadmitkey)
FROM FD__CNSLG_BASIS24
Any insight, or links to other pages I could read would be extremely helpful.
Thanks!

Just use window functions:
select pc.*
from (select pc.*,
             row_number() over (partition by Clientkey, ProgramAdmitKey
                                order by Date_Screening -- do you mean DESC?
                               ) as seqnum
      from FD__CNSLG_BASIS24 PC
     ) pc
where seqnum = 1;
Note: this gets the first record based on the screening date. You might want DESC to get the most recent.

My Solution, for those that were curious
I want to go back and replace the SELECT TOP(1) subqueries with the ROW_NUMBER() function, but I needed to get a report out, and this is providing what I need. Thanks for everyone's help.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
BEGIN
SET NOCOUNT ON;
Declare
    @StartDate Date,
    @EndDate Date
SET @StartDate = '1/1/2016'
SET @EndDate = '6/1/2016';
WITH CNSL_Clients AS (
SELECT PC_CNT.Clientkey, PC_Cnt.ProgramAdmitKey, PC_Cnt.OP__DOCID
FROM FD__Primary_Client as PC_Cnt
INNER JOIN VW__Cnsl_Session_Count_IndvFamOnly as cnt
ON PC_Cnt.Clientkey = CNT.Clientkey AND PC_Cnt.ProgramAdmitKey = CNT.ProgramAdmitKey
WHERE ((pc_CNT.StartDate between @StartDate AND @EndDate) OR (pc_CNT.StartDate <= @StartDate AND pc_CNT.ENDDate >= @StartDate) OR (pc_CNT.StartDate <= @StartDate AND pc_CNT.ENDDate is null))
AND CNT.SessionCount>=6
),
FIRST_BASIS AS (
SELECT CB24_1.OP__DOCID, CB24_1.Date_Screening, CB24_1.ClientKey, CB24_1.ProgramAdmitKey, CB24_1.Composite_score, CB24_1.Depression_Results,CB24_1.Emotional_Results, CB24_1.Relationships_Results
FROM FD__CNSLG_BASIS24 AS CB24_1
WHERE (CB24_1.OP__DOCID =
(Select TOP(1) CB24_2.OP__DOCID
FROM FD__CNSLG_BASIS24 AS CB24_2
Inner JOIN CNSL_Clients
ON CB24_2.ClientKey = CNSL_Clients.ClientKey AND CB24_2.ProgramAdmitKey = CNSL_Clients.ProgramAdmitKey
WHERE (CB24_1.ClientKey = CB24_2.ClientKey) AND (CB24_1.ProgramAdmitKey = CB24_2.ProgramAdmitKey)
ORDER BY CB24_2.Date_Screening))
),
RECENT_BASIS AS (
SELECT CB24_1.OP__DOCID, CB24_1.Date_Screening, CB24_1.ClientKey, CB24_1.ProgramAdmitKey, CB24_1.Composite_score, CB24_1.Depression_Results,CB24_1.Emotional_Results, CB24_1.Relationships_Results
FROM FD__CNSLG_BASIS24 AS CB24_1
WHERE (CB24_1.OP__DOCID =
(Select TOP(1) CB24_2.OP__DOCID
FROM FD__CNSLG_BASIS24 AS CB24_2
Inner JOIN CNSL_Clients
ON CB24_2.ClientKey = CNSL_Clients.ClientKey AND CB24_2.ProgramAdmitKey = CNSL_Clients.ProgramAdmitKey
WHERE (CB24_1.ClientKey = CB24_2.ClientKey) AND (CB24_1.ProgramAdmitKey = CB24_2.ProgramAdmitKey)
ORDER BY CB24_2.Date_Screening DESC))
)
SELECT F.OP__DOCID AS First_DOCID, R.OP__DOCID as Recent_DOCID,
    F.ClientKey, F.ProgramAdmitKey,
    F.Composite_Score AS FComposite_Score, R.Composite_Score as RComposite_Score,
    Composite_Change = R.Composite_Score - F.Composite_Score,
    F.Depression_Results AS FDepression_Results, R.Depression_Results AS RDepression_Resluts,
    Depression_Change = R.Depression_Results - F.Depression_Results,
    F.Emotional_Results AS FEmotional_Resluts, R.Emotional_Results AS REmotionall_Reslu,
    Emotional_Change = R.Emotional_Results - F.Emotional_Results,
    F.Relationships_Results AS FRelationships_Resluts, R.Relationships_Results AS RRelationships_Resluts,
    Relationship_Change = R.Relationships_Results - F.Relationships_Results
FROM First_basis AS F
FULL Outer JOIN RECENT_BASIS AS R
ON F.ClientKey = R.ClientKey AND F.ProgramAdmitKey = R.ProgramAdmitKey
ORDER BY F.ClientKey
END
GO
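For reference, the ROW_NUMBER() substitution mentioned at the top of this post could look roughly like the sketch below. It is not the report query itself: the table variable @Basis and its rows are invented sample data standing in for FD__CNSLG_BASIS24, only one score column is carried along, and OP__DOCID is added to the ORDER BY as a tie-breaker on the assumption that it is unique.

```sql
-- Hypothetical sample data standing in for FD__CNSLG_BASIS24.
DECLARE @Basis TABLE (
    OP__DOCID       int PRIMARY KEY,
    ClientKey       int,
    ProgramAdmitKey int,
    Date_Screening  date,
    Composite_Score decimal(5, 2)
);

INSERT INTO @Basis (OP__DOCID, ClientKey, ProgramAdmitKey, Date_Screening, Composite_Score)
VALUES (1, 100, 1, '2016-01-05', 2.50),
       (2, 100, 1, '2016-02-10', 2.00),
       (3, 100, 1, '2016-03-15', 1.25),
       (4, 200, 1, '2016-01-20', 3.00);

-- Number each admission's screenings from both ends in a single pass.
WITH Ranked AS (
    SELECT b.*,
           ROW_NUMBER() OVER (PARTITION BY ClientKey, ProgramAdmitKey
                              ORDER BY Date_Screening, OP__DOCID) AS FirstSeq,
           ROW_NUMBER() OVER (PARTITION BY ClientKey, ProgramAdmitKey
                              ORDER BY Date_Screening DESC, OP__DOCID DESC) AS RecentSeq
    FROM @Basis AS b
)
SELECT F.ClientKey, F.ProgramAdmitKey,
       F.OP__DOCID AS First_DOCID,
       R.OP__DOCID AS Recent_DOCID,
       R.Composite_Score - F.Composite_Score AS Composite_Change
FROM Ranked AS F
INNER JOIN Ranked AS R
    ON  F.ClientKey = R.ClientKey
    AND F.ProgramAdmitKey = R.ProgramAdmitKey
WHERE F.FirstSeq = 1
  AND R.RecentSeq = 1
ORDER BY F.ClientKey;
```

With these rows, client 100 comes back with First_DOCID 1, Recent_DOCID 3 and a change of -1.25, and client 200 comes back with document 4 as both first and most recent and a change of 0.00. An inner join is enough here because every admission that has a first row also has a most recent row.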

Rank & find difference of value in the same column

I have the below table -
Here, I have created the "Order" column by using the rank function partitioned by case_identifier ordered by audit_date.
Now, I want to create a new column as below -
The logic for the new column would be -
select *,
case when [order] = '1' then [days_diff]
else (val of [days_diff] in rank 2) - (val of [days_diff] in rank 1) ...
end as '[New_Col]'
from TABLE
Can you please help me with the syntax? Thanks.

Take a look at the LAG function. It will provide you with what you want.
Something like:
declare @temptable TABLE (case_id varchar(2), row_order int, days_diff float)
INSERT INTO @temptable values ('A',1,5)
INSERT INTO @temptable values ('A',2,3)
INSERT INTO @temptable values ('A',3,2)
INSERT INTO @temptable values ('B',1,5)
INSERT INTO @temptable values ('B',2,1)
--select * from @temptable
SELECT case_id, row_order, LAG(days_diff,1) OVER (PARTITION BY case_id ORDER BY row_order) AS prev_row, days_diff,
    CASE
        WHEN row_order = 1 THEN days_diff
        ELSE LAG(days_diff,1) OVER (PARTITION BY case_id ORDER BY row_order) - days_diff
    END AS newcolumn
FROM @temptable
order by case_id, row_order asc
SELECT case_id, row_order, LAG(days_diff,1) OVER (PARTITION BY case_id ORDER BY row_order) AS prev_row, days_diff,
    COALESCE(LAG(days_diff,1) OVER (PARTITION BY case_id ORDER BY row_order) - days_diff, days_diff)
FROM @temptable
order by case_id, row_order asc
Other answers will use a coalesce in place of the CASE statement. It's probably faster, but I feel like this is clearer.
If you run both and look at the execution plans they are the same.

I believe the following query gets you what you want.
SELECT a.*,
    'NEW DAYS DIFF' =
        CASE
            WHEN a.[order] = 1 THEN a.days_diff
            ELSE a.days_diff - b.days_diff
        END
FROM dbo.tblCaseDaysDiff a
INNER JOIN dbo.tblCaseDaysDiff b
    ON (b.CASE_ID = a.CASE_ID AND b.[order] + 1 = a.[order]) -- Pair the current row with the row of the previous order
    OR (b.CASE_ID = a.CASE_ID AND b.[order] = 1 AND a.[order] = 1) -- When order = 1, pair the row with itself to keep its days_diff value
ORDER BY a.CASE_ID, a.[order]

As it happens, you're already hip-deep in window functions, and as others have pointed out, LAG will do the trick. In general, though, you can always get the difference of two rows by making one row: by joining the table to itself.
with T (CASE_IDENTIFIER, AUDIT_DATE, [order], days_diff)
as (
    ... your query ...
)
select a.*,
    a.days_diff - coalesce(b.days_diff, 0) as delta_days_diff
from T as a left join T as b
    on a.CASE_IDENTIFIER = b.CASE_IDENTIFIER
    and b.[order] = a.[order] - 1

LAG METHOD
SELECT
CASE_IDENTIFIER
,AUDIT_DATE
,[order]
,days_diff
,days_diff - ISNULL(LAG(days_diff,1) OVER (PARTITION BY CASE_IDENTIFIER ORDER BY [order]),0) AS New_Column
FROM #Table
SELF JOIN METHOD
SELECT
    t1.CASE_IDENTIFIER
    ,t1.AUDIT_DATE
    ,t1.[order]
    ,t1.days_diff
    ,t1.days_diff - ISNULL(t2.days_diff,0) AS New_Column
FROM
    #Table t1
    LEFT JOIN #Table t2
    ON t1.CASE_IDENTIFIER = t2.CASE_IDENTIFIER
    AND t1.[order] - 1 = t2.[order]
I feel like a lot of the other answers are on the right track, but there are some nuances or easier ways of writing some of them. Some of the answers also point in the right direction but have something wrong with their join or syntax. Anyway, you don't need the CASE expression whether you use the LAG or the SELF JOIN method. Also, COALESCE() is great, but you are only comparing 2 values, so ISNULL() works fine too for sql-server; either will do.
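A small self-contained way to check the LAG method is sketched below. The table variable @CaseDays and its rows are made-up sample data (the same values another answer used), and the sketch relies on LAG's optional third argument, a default for rows that have no previous row, instead of wrapping the call in ISNULL().

```sql
-- Hypothetical sample data: two cases with a few ranked rows each.
DECLARE @CaseDays TABLE (case_id varchar(2), row_order int, days_diff float);

INSERT INTO @CaseDays (case_id, row_order, days_diff)
VALUES ('A', 1, 5), ('A', 2, 3), ('A', 3, 2),
       ('B', 1, 5), ('B', 2, 1);

-- Current value minus the previous row's value; the first row of each case
-- has no previous row, so LAG falls back to the default of 0.
SELECT case_id,
       row_order,
       days_diff,
       days_diff - LAG(days_diff, 1, 0)
                   OVER (PARTITION BY case_id ORDER BY row_order) AS new_col
FROM @CaseDays
ORDER BY case_id, row_order;
```

For case A this returns 5, -2, -1 and for case B it returns 5, -4, so the first row of each case keeps its own days_diff and every later row shows the change from the row before it.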

Update table with another column in the same table

I have a table like this
Test_order
Order Num    Order ID    Prev Order ID
987Y7OP89    919325      0
987Y7OP90    1006626     919325
987Y7OP91    1029350     1006626
987Y7OP92    1756689     0
987Y7OP93    1756690     0
987Y7OP94    1950100     1756690
987Y7OP95    1977570     1950100
987Y7OP96    2160462     1977570
987Y7OP97    2288982     2160462
Target table should be like below,
Order Num    Order ID    Prev Order ID
987Y7OP89    919325      0
987Y7OP90    1006626     919325
987Y7OP91    1029350     1006626
987Y7OP92    1756689     1029350
987Y7OP93    1756690     1756689
987Y7OP94    1950100     1756690
987Y7OP95    1977570     1950100
987Y7OP96    2160462     1977570
987Y7OP97    2288982     2160462
987Y7OP97    2288900     2288982
Prev Order ID should be updated with the Order ID from the previous record from the same table.
I'm trying to create a dummy data set and update, but it's not working:
WITH A AS
    (SELECT ORDER_NUM, ORDER_ID, PRIOR_ORDER_ID, ROWNUM RID1 FROM TEST_ORDER),
B AS
    (SELECT ORDER_NUM, ORDER_ID, PRIOR_ORDER_ID, ROWNUM+1 RID2 FROM TEST_ORDER)
SELECT A.ORDER_NUM, B.ORDER_ID, A.PRIOR_ORDER_ID, B.PRIOR_ORDER_ID
FROM A, B
WHERE RID1 = RID2

You could use Oracle's analytic functions (also called window functions) to pick up the value from the previous order:
UPDATE Test_Order
SET ORDERID = LAG(ORDERID, 1, 0) OVER (ORDER BY ORDERNUM ASC)
WHERE PrevOrderId = 0
See here for the documentation on LAG()

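In sql-server you cannot use a window function directly in an UPDATE statement, since window functions are only allowed in the SELECT and ORDER BY clauses. I'm not positive, but I don't think Oracle allows it either. Anyway, to get around that, you can just update a CTE as follows.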
WITH cte AS (
SELECT
*
,NewPreviousOrderId = LAG(OrderId,1,0) OVER (ORDER BY OrderNum)
FROM
TableName
)
UPDATE cte
SET PrevOrderId = NewPreviousOrderId
And if you want to stick with the ROW_NUMBER route you were going this would be the way of doing it.
;WITH cte AS (
SELECT
*
,ROW_NUMBER() OVER (ORDER BY OrderNum) AS RowNum
FROM
TableName
)
UPDATE c1
SET PrevOrderId = c2.OrderId
FROM
cte c1
INNER JOIN cte c2
ON (c1.RowNum - 1) = c2.RowNum

Stored procedure order of execution

CREATE PROCEDURE [dbo].[uspGetLogs]
(
    @StartDate DATETIME,
    @EndDate DATETIME
)
AS
SELECT
    sl.ID,
    LOG10(sl.Value)
FROM
    dbo.SampleList sl
INNER JOIN
(
    SELECT
        ID,
        RANK() OVER(PARTITION BY Codec ORDER BY TimeStampUTC DESC, ID DESC) ranked
    FROM
        dbo.SampleList
    WHERE
        ListDate BETWEEN @StartDate AND @EndDate
) r
ON
    r.ID = sl.ID AND
    r.ranked = 1
I ran this stored procedure with @StartDate = 2014-01-29 and @EndDate = 2015-03-14, and got this error:
An invalid floating point operation occurred
This error is raised when a mathematical function is given an invalid argument. For example, both of these statements fail with it:
SELECT LOG10(-3);
SELECT LOG10(0);
I was able to find a single row in the whole table where the value is less than one. But the ListDate for that row is 2015-03-14, so it should not be included, because it is not covered by the date range passed to the stored procedure.
So it seems that the stored procedure evaluates the function over the whole set first, before joining and filtering the data set by the date range.
Is this expected?

I think there could be a data issue here as the basic logic of your code doesn't cause an issue, see the below sample:
CREATE TABLE #temp1 ( id INT, val INT )
CREATE TABLE #temp2 ( id INT, val INT )
INSERT INTO #temp1 ( id, val )
VALUES ( 1, 1 ), ( 2, 10 ), ( 3, -1 ) -- Negative value for id=3 is excluded by the subquery
INSERT INTO #temp2 ( id, val )
VALUES ( 1, 1 ), ( 2, 10 ), ( 3, 20 )
SELECT t1.id ,
LOG10(t1.val) AS Val
FROM #temp1 t1
INNER JOIN ( SELECT * ,
RANK() OVER ( PARTITION BY id ORDER BY val ) ranked
FROM #temp2
WHERE id BETWEEN 1 AND 2 -- excludes id 3
) t2 ON t2.id = t1.id
AND t2.ranked = 1
DROP TABLE #temp1
DROP TABLE #temp2
Produces:
id val
1 0
2 1
If you modify the BETWEEN clause to WHERE id BETWEEN 1 AND 3, you do see the error as the negative value is included.
So I'd triple check the data and if there's still an issue, try to post a small sample that recreates the issue.

I don't think there is a guaranteed point at which the function gets evaluated, so my answer is: it depends. It depends on the query plan, on how "early" the optimizer decides to evaluate the function, before or after the join. To make sure the function is only evaluated for valid values, you can change it to:
CASE WHEN sl.Value < 1 THEN 0 ELSE LOG10(sl.Value) END
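Here is a tiny sketch of the same guard that can be run on its own. The table variable @Samples and its values are invented for the demo. Note that it is a variation on the expression above rather than a copy: it only guards against values that LOG10 cannot handle (zero and negatives) and returns NULL for them instead of 0, so values between 0 and 1 still get their (negative) logarithm.

```sql
-- Hypothetical sample values, including two that LOG10 rejects.
DECLARE @Samples TABLE (ID int, Value float);

INSERT INTO @Samples (ID, Value)
VALUES (1, 100), (2, 0), (3, -3);

-- The CASE guard keeps LOG10 from being evaluated for non-positive values;
-- with no ELSE branch, those rows yield NULL.
SELECT ID,
       CASE WHEN Value > 0 THEN LOG10(Value) END AS LogValue
FROM @Samples
ORDER BY ID;
```

This returns 2 for ID 1 and NULL for IDs 2 and 3, with no floating point error. Whether NULL or 0 is the better placeholder depends on what the caller expects.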

Reuse subquery result in WHERE-Clause for INSERT

I am using Microsoft SQL Server 2008.
I would like to save the result of a subquery so I can reuse it in a following subquery.
Is this possible?
What is the best practice for doing this? (I am very new to SQL.)
My query looks like:
INSERT INTO [dbo].[TestTable]
(
[a]
,[b]
)
SELECT
(
SELECT TOP 1 MAT_WS_ID
FROM #TempTableX AS X_ALIAS
WHERE OUTERBASETABLE.LT_ALL_MATERIAL = X_ALIAS.MAT_RM_NAME
)
,(
SELECT TOP 1 MAT_WS_NAME
FROM #TempTableY AS Y_ALIAS
WHERE Y_ALIAS.MAT_WS_ID = MAT_WS_ID
--(
--SELECT TOP 1 MAT_WS_ID
--FROM #TempTableX AS X_ALIAS
--WHERE OUTERBASETABLE.LT_ALL_MATERIAL = X_ALIAS.MAT_RM_NAME
--)
)
FROM [dbo].[LASERTECHNO] AS OUTERBASETABLE
My question is: is what I did correct?
I replaced the second SELECT statement in the WHERE clause for [b] (which is commented out and exactly the same as the one for [a]) with the result of the first SELECT statement for [a] (= MAT_WS_ID).
It seems to give the right results.
But I don't understand why!
I mean, MAT_WS_ID is a column of both temporary tables, X_ALIAS and Y_ALIAS.
So in the SELECT statement for [b], within the scope of the [b] select query, MAT_WS_ID could only be known from the Y_ALIAS table. (Or am I wrong? I am more of a C++ person; maybe scoping in SQL and C++ is totally different.)
I just want to know the best way in SQL Server to reuse a scalar select result.
Or should I just not care, copy the select for every column, and let SQL Server optimize it on its own?

One approach would be outer apply:
SELECT mat.MAT_WS_ID
, (
SELECT TOP 1 MAT_WS_NAME
FROM #TempTableY AS Y_ALIAS
WHERE Y_ALIAS.MAT_WS_ID = mat.MAT_WS_ID
)
FROM [dbo].[LASERTECHNO] AS OUTERBASETABLE
OUTER APPLY
(
SELECT TOP 1 MAT_WS_ID
FROM #TempTableX AS X_ALIAS
WHERE OUTERBASETABLE.LT_ALL_MATERIAL = X_ALIAS.MAT_RM_NAME
) as mat
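To make the idea concrete, here is a self-contained sketch that chains two OUTER APPLYs, so the id found by the first lookup is reused by name in the second. The table variables @Base, @X and @Y and their rows are made-up stand-ins for LASERTECHNO, #TempTableX and #TempTableY, and the ORDER BY clauses inside the TOP (1) lookups are an added assumption to make the picked row predictable.

```sql
-- Hypothetical stand-ins for the base table and the two temp tables.
DECLARE @Base TABLE (LT_ALL_MATERIAL varchar(20));
DECLARE @X    TABLE (MAT_RM_NAME varchar(20), MAT_WS_ID int);
DECLARE @Y    TABLE (MAT_WS_ID int, MAT_WS_NAME varchar(20));

INSERT INTO @Base (LT_ALL_MATERIAL) VALUES ('steel'), ('brass'), ('wood');
INSERT INTO @X (MAT_RM_NAME, MAT_WS_ID) VALUES ('steel', 10), ('brass', 20);
INSERT INTO @Y (MAT_WS_ID, MAT_WS_NAME) VALUES (10, 'WS-Steel');

SELECT b.LT_ALL_MATERIAL,
       mat.MAT_WS_ID,
       ws.MAT_WS_NAME
FROM @Base AS b
OUTER APPLY
(
    -- First lookup: the id for this material, if any.
    SELECT TOP (1) x.MAT_WS_ID
    FROM @X AS x
    WHERE x.MAT_RM_NAME = b.LT_ALL_MATERIAL
    ORDER BY x.MAT_WS_ID
) AS mat
OUTER APPLY
(
    -- Second lookup: reuses the id from the first one by its qualified name.
    SELECT TOP (1) y.MAT_WS_NAME
    FROM @Y AS y
    WHERE y.MAT_WS_ID = mat.MAT_WS_ID
    ORDER BY y.MAT_WS_NAME
) AS ws
ORDER BY b.LT_ALL_MATERIAL;
```

With these rows, 'steel' returns id 10 and name 'WS-Steel', 'brass' returns id 20 with a NULL name, and 'wood' returns NULL for both, which mirrors what the scalar subqueries would give. Qualifying every column with its alias (x., y., mat.) also removes any doubt about which table a name like MAT_WS_ID refers to.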

You could rank rows in #TempTableX and #TempTableY partitioning them by MAT_RM_NAME in the former and by MAT_WS_ID in the latter, then use normal joins with filtering by rownum = 1 in both tables (rownum being the column containing the ranking numbers in each of the two tables):
WITH x_ranked AS (
SELECT
*,
rownum = ROW_NUMBER() OVER (PARTITION BY MAT_RM_NAME ORDER BY (SELECT 1))
FROM #TempTableX
),
y_ranked AS (
SELECT
*,
rownum = ROW_NUMBER() OVER (PARTITION BY MAT_WS_ID ORDER BY (SELECT 1))
FROM #TempTableY
)
INSERT INTO dbo.TestTable (a, b)
SELECT
x.MAT_WS_ID,
y.MAT_WS_NAME
FROM dbo.LASERTECHNO t
LEFT JOIN x_ranked x ON t.LT_ALL_MATERIAL = x.MAT_RM_NAME AND x.rownum = 1
LEFT JOIN y_ranked y ON x.MAT_WS_ID = y.MAT_WS_ID AND y.rownum = 1
;
The ORDER BY (SELECT 1) bit is a trick to specify an indeterminate ordering, which, accordingly, results in indeterminate rownum = 1 rows being picked by the query. That more or less duplicates your TOP 1 without an explicit order, but I would recommend specifying a more sensible ORDER BY clause to make the results more predictable.
