SQL Grouping with condition - sql

I want to sum rows in a table. The algorithm is rather simple in theory but hard (at least for me) to express as a query.
Generally, I want to sum the "values" of a "sub-group". A sub-group is defined as a range of elements starting with the first row where type=0 and finishing with the last row where type=1. The sub-group should contain only one (the first) row with type=0.
The sample below presents correct (left) and incorrect (right) behavior.
I tried several approaches, including grouping and partitioning, unfortunately without any success. Has anybody had a similar problem?
I am using MS SQL Server (so T-SQL 'magic' is allowed).
EDIT:
The results I want:
"ab",6
"cdef",20
"ghi",10
"kl",8

You can identify the groups by doing a cumulative sum of zeros. Then use aggregation or window functions.
Note that SQL tables represent unordered sets, so you need a column to specify the ordering. The code below assumes that this column is id.
select min(id), max(id), sum(value)
from (select t.*,
sum(case when type = 0 then 1 else 0 end) over (order by id) as grp
from t
) t
group by grp
order by min(id);
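To see the cumulative-sum idea run end to end, here is a minimal sketch that executes the same query shape through Python's built-in sqlite3 module (SQLite 3.25 or later is assumed, for window-function support). The table t, its column types, and the sample rows are made up for illustration; the asker's actual data is not shown in the post.

```python
import sqlite3

# Hypothetical rows (id, value, type); a row with type = 0 opens a new sub-group.
ROWS = [
    ("a", 1, 0), ("b", 5, 1),
    ("c", 2, 0), ("d", 4, 1), ("e", 6, 1), ("f", 8, 1),
    ("g", 3, 0), ("h", 3, 1), ("i", 4, 1),
]

# Same shape as the query above: the running count of type = 0 rows labels each sub-group.
QUERY = """
SELECT MIN(id), MAX(id), SUM(value)
FROM (SELECT id, value,
             SUM(CASE WHEN type = 0 THEN 1 ELSE 0 END) OVER (ORDER BY id) AS grp
      FROM t) AS s
GROUP BY grp
ORDER BY MIN(id)
"""

def sum_by_group(rows):
    """Load rows into an in-memory table and return (first id, last id, total) per sub-group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id TEXT PRIMARY KEY, value INTEGER, type INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

print(sum_by_group(ROWS))
```

With these rows the query returns one row per sub-group: (a, b, 6), (c, f, 20) and (g, i, 10). Note that every type = 0 row starts a new group here, so two consecutive type = 0 rows would each produce their own group.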

You can use a window function with the cumulative approach:
select t.*, sum(value) over (partition by grp)
from (select t.*, sum(case when type = 0 then 1 else 0 end) over (order by id) as grp
      from t
     ) t
where grp > 0;

Solution with a cursor and output-table.
As Gordon wrote it is not defined how the set will be ordered, so ID is also used here.
declare @output as table (
    ID_sum nvarchar(max)
    ,value_sum int
)
DECLARE @ID as nvarchar(1)
    ,@value as int
    ,@type as int
    ,@ID_sum as nvarchar(max)
    ,@value_sum as int
    ,@last_type as int
DECLARE group_cursor CURSOR FOR
    SELECT [ID],[value],[type]
    FROM [t]
    ORDER BY ID
OPEN group_cursor
FETCH NEXT FROM group_cursor
INTO @ID, @value,@type
WHILE @@FETCH_STATUS = 0
BEGIN
    if (@last_type is null and @type = 0)
    begin
        set @ID_sum = @ID
        set @value_sum = @value
    end
    if (@last_type in(0,1) and @type = 1)
    begin
        set @ID_sum += @ID
        set @value_sum += @value
    end
    if (@last_type = 1 and @type = 0)
    begin
        insert into @output values (@ID_sum, @value_sum)
        set @ID_sum = @ID
        set @value_sum = @value
    end
    if (@last_type = 0 and @type = 0)
    begin
        set @ID_sum = @ID
        set @value_sum = @value
    end
    set @last_type = @type
    FETCH NEXT FROM group_cursor
    INTO @ID, @value,@type
END
CLOSE group_cursor;
DEALLOCATE group_cursor;
if (@last_type = 1)
begin
    insert into @output values (@ID_sum, @value_sum)
end
select *
from @output
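The branching in the cursor loop is easier to follow as a small state machine. Below is a rough Python port, offered as a sketch rather than a replacement for the T-SQL. The sample rows are invented, including the guess that a type = 0 row (j) is immediately followed by another type = 0 row (k). One deliberate difference: rows with type = 1 that appear before any type = 0 row are ignored here, whereas the T-SQL version would carry NULLs.

```python
def sum_subgroups(rows):
    """Port of the cursor loop; rows are (id, value, type) tuples already sorted by id."""
    output = []
    current = None      # [concatenated ids, running total] of the open sub-group
    last_type = None
    for row_id, value, row_type in rows:
        if row_type == 0:
            # A 0 after a 1 closes the previous sub-group; a 0 after a 0
            # discards the earlier 0 and starts over from the newer row.
            if last_type == 1 and current is not None:
                output.append(tuple(current))
            current = [row_id, value]
        elif row_type == 1 and current is not None:
            current[0] += row_id
            current[1] += value
        last_type = row_type
    # Flush the final sub-group only if it ended on a type = 1 row.
    if last_type == 1 and current is not None:
        output.append(tuple(current))
    return output


# Hypothetical data: j and k are both type = 0, so j is dropped.
SAMPLE = [
    ("a", 1, 0), ("b", 5, 1),
    ("c", 2, 0), ("d", 4, 1), ("e", 6, 1), ("f", 8, 1),
    ("g", 3, 0), ("h", 3, 1), ("i", 4, 1),
    ("j", 9, 0), ("k", 1, 0), ("l", 7, 1),
]
print(sum_subgroups(SAMPLE))
```

For this made-up input the function returns ("ab", 6), ("cdef", 20), ("ghi", 10), ("kl", 8), which has the same shape as the results requested in the question.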

Related

How to generate row number based on certain condition?

I have a column named type with the values 1 and 2, and I want to generate the expected result column.
In that column, the value 2 should be converted to 0, and consecutive 1's should get a running row number.
Please refer to this image...
Any advice on how we can achieve this?
I am running out of ideas. :(
Please try this:
select *,0 as RowNo into #tmp from YourTable
declare @id int
set @id=0
update #tmp
set @id = case typeId when 1 then @id+1 else 0 end,
    RowNo = case typeId when 1 then @id else 0 end
select * from #tmp
drop table #tmp
The best way to do this is to use a cursor:
CREATE TABLE [dbo].[Table_1](
[Type] [int] NULL)
Code
Declare @Type int = 0
DECLARE @Test TABLE(
    [Type] INT ,
    [ExpectedResult] INT
)
DECLARE vendor_cursor CURSOR FOR
    SELECT [Type]
    FROM [dbo].[Table_1]
OPEN vendor_cursor
FETCH NEXT FROM vendor_cursor
INTO @Type
Declare @ExpectedResult INT = 0
WHILE @@FETCH_STATUS = 0
BEGIN
    IF @Type = 2
        SET @ExpectedResult = 0
    ELSE
        SET @ExpectedResult+= 1
    INSERT INTO @Test ([Type] ,[ExpectedResult] ) VALUES(@Type , @ExpectedResult)
    FETCH NEXT FROM vendor_cursor
    INTO @Type
END
CLOSE vendor_cursor;
DEALLOCATE vendor_cursor;
SELECT * FROM @Test
First, you are supposing that your rows have an ordering, but no ordering column is specified. SQL tables represent unordered sets. There is no ordering without such a column.
Let me assume you have one.
Then this is a gaps and islands problem. You want the islands of type = 1 so you can enumerate them. You can identify them by doing a cumulative sum of type = 2 -- this cumulative sum defines the grouping of adjacent type = 1 records. The rest is just row_number():
select t.*,
(case when type = 2 then 0
else row_number() over (partition by type, grp order by <ordering col>)
end) as expected_result
from (select t.*,
sum(case when type = 2 then 1 else 0 end) over (order by <ordering col>) as grp
from t
) t;
Here is a db<>fiddle.
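As a quick check of the gaps-and-islands query, the sketch below runs the same statement through Python's sqlite3 module (SQLite 3.25 or later assumed). The id column stands in for the unspecified ordering column and is an assumption, as is the sample sequence of types.

```python
import sqlite3

# Same query as above, with the hypothetical column id as the ordering column.
QUERY = """
SELECT id, type,
       CASE WHEN type = 2 THEN 0
            ELSE ROW_NUMBER() OVER (PARTITION BY type, grp ORDER BY id)
       END AS expected_result
FROM (SELECT id, type,
             SUM(CASE WHEN type = 2 THEN 1 ELSE 0 END) OVER (ORDER BY id) AS grp
      FROM t) AS s
ORDER BY id
"""

def number_islands(types):
    """Return the expected_result column for a sequence of type values (1 or 2)."""
    conn = sqlite3.connect(":memory:")
    # INTEGER PRIMARY KEY assigns 1, 2, 3, ... in insertion order.
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, type INTEGER)")
    conn.executemany("INSERT INTO t (type) VALUES (?)", [(x,) for x in types])
    result = [row[2] for row in conn.execute(QUERY)]
    conn.close()
    return result

print(number_islands([1, 1, 2, 1, 1, 1, 2, 2, 1]))
```

For the sequence 1, 1, 2, 1, 1, 1, 2, 2, 1 this yields 1, 2, 0, 1, 2, 3, 0, 0, 1: each type = 2 row bumps grp, so the numbering of the following run of 1's restarts.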

Cursor in SQL Server: Use loop / condition to find and replace a value

I have an issue with my table in SQL Server. Sometimes during an insert, a normal value (20-50-80) is replaced by 1000000. It's really rare, but to protect the average I need a fix until I find a proper solution.
I want to take the values that exceed 1000000 and replace them with the average of the neighbouring values.
This picture shows the problem.
I'm looking at cursors in SQL.
Here is an example of my code. There are some issues with the result.
CREATE PROCEDURE [dbo].[Avg_Kwh_TagValuesArchive]
AS
BEGIN
    DECLARE @tagId INT
    DECLARE @localTime DATE
    DECLARE @tagValue FLOAT
    DECLARE @limit FLOAT
    DECLARE @temp FLOAT
    DECLARE @tagValueBefore FLOAT
    DECLARE @tagValueAfter FLOAT
    SET @limit = 999999.9
    DECLARE Cursor_FalseValues CURSOR
    FOR
        SELECT TagID, LocalTime, TagValue
        FROM TagValuesArchive
        ORDER BY LocalTime DESC
    OPEN Cursor_FalseValues
    FETCH Cursor_FalseValues
    INTO @tagId, @localTime, @tagValue
    WHILE(@@FETCH_STATUS = 0)
    BEGIN
        IF (@tagValue>=@limit)
        BEGIN
            SET @tagValueBefore =
            (
                SELECT TOP 1 TagValue
                FROM TagValuesArchive
                WHERE LocalTime < @localTime
                AND TagID = @tagID
                AND TagValue IS NOT NULL
                ORDER BY LocalTime DESC
            )
            SET @tagValueAfter =
            (
                SELECT TOP 1 TagValue
                FROM TagValuesArchive
                WHERE LocalTime > @localTime
                AND TagID = @tagID
                AND TagValue IS NOT NULL
                ORDER BY LocalTime DESC
            )
            UPDATE dbo.TagValuesArchive
            SET TagValue= ((SUM( @tagValueBefore + @tagValueAfter ))/2)
            FROM dbo.TagValuesArchive
            WHERE LocalTime = @localTime
            AND TagID = @tagID
            FETCH NEXT FROM Cursor_FalseValues
            INTO @tagId, @localTime, @tagValue
        END
        ELSE
        BEGIN
            -- Fetch of the Cursos increment the line
            FETCH NEXT FROM Cursor_FalseValues
            INTO @tagId, @localTime, @tagValue
        END
        -- Fetch of the Cursos increment the line
        --FETCH NEXT FROM Cursor_FalseValues
        --INTO @tagId, @localTime, @tagValue
    END
    CLOSE Cursor_FalseValues
    DEALLOCATE Cursor_FalseValues
END
I think my problem is a good use case for a cursor, but it's not very clear in my head.
I can fetch the wrong value and the values around it, but the update in the database doesn't work.
I don't know if it's a cursor problem or an update problem. Maybe it's just a syntax problem.
Thanks for any information.
You can try something like this:
DECLARE @t TABLE (
    id int,
    val float
)
INSERT INTO @t (id, val)
VALUES
(1,.5),
(2,.7),
(3,.3),
(4,.74),
(5,.2341234),
(6,10000000),
(7,.9),
(8,.8),
(9,.87123),
(10,100000000),
(11,.99)
SELECT * FROM @t
DECLARE @limit FLOAT = 1000000;
;WITH OutOfOBoundsValues AS (
    SELECT id FROM @t WHERE val >= @limit
), Neighbourvalues AS (
    SELECT O.id, (t1.val+t2.val)/2 newval FROM OutOfOBoundsValues O
    JOIN @t t1 ON t1.id = O.id-1
    JOIN @t t2 ON t2.id = O.id+1
)
UPDATE t
SET val = N.newval
FROM @t t
JOIN Neighbourvalues N ON t.id = N.Id
SELECT * FROM @t
What happens here is that we first select the rows whose value is at or above the limit.
Then we get the neighbouring values and calculate their mean.
Lastly, we update the out-of-bounds values with that mean.
This should be much faster than your cursor.
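The joins on id-1 and id+1 rely on the ids being consecutive. A variation on the same idea, swapped in here and not part of the answer above, is to read the neighbours with LAG and LEAD, which only need an ordering. The sketch below runs that through Python's sqlite3 module (SQLite 3.25 or later assumed) on the same sample values and only computes the repaired values rather than updating in place. Against the asker's table one would presumably partition by TagID and order by LocalTime, but that is an assumption.

```python
import sqlite3

ROWS = [(1, .5), (2, .7), (3, .3), (4, .74), (5, .2341234), (6, 10000000),
        (7, .9), (8, .8), (9, .87123), (10, 100000000), (11, .99)]

def repaired_values(rows, limit=1_000_000):
    """Return (id, val) pairs where out-of-bounds values become the mean of both neighbours."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val REAL)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT id,
               CASE WHEN val >= :limit THEN (prev_val + next_val) / 2
                    ELSE val
               END AS val
        FROM (SELECT id, val,
                     LAG(val)  OVER (ORDER BY id) AS prev_val,
                     LEAD(val) OVER (ORDER BY id) AS next_val
              FROM t) AS s
        ORDER BY id
        """,
        {"limit": limit},
    ).fetchall()
    conn.close()
    return result

for row in repaired_values(ROWS):
    print(row)
```

Two limits of this sketch: an out-of-bounds value in the first or last row has no neighbour on one side and comes back as NULL, and two adjacent out-of-bounds values would be averaged with each other. The original join-based version shares the second limitation.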

SQL Query with Cursor optimization

I have a query where I iterate through a table -> for each entry I iterate through another table and then compute some results. I use a cursor for iterating through the table. This query takes ages to complete. Always more than 3 minutes. If I do something similar in C# where the tables are arrays or dictionaries it doesn't even take a second. What am I doing wrong and how can I improve the efficiency?
DELETE FROM [QueryScores]
GO
INSERT INTO [QueryScores] (Id)
SELECT Id FROM [Documents]
DECLARE @Id NVARCHAR(50)
DECLARE myCursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT [Id] FROM [QueryScores]
OPEN myCursor
FETCH NEXT FROM myCursor INTO @Id
WHILE @@FETCH_STATUS = 0
BEGIN
    DECLARE @Score FLOAT = 0.0
    DECLARE @CounterMax INT = (SELECT COUNT(*) FROM [Query])
    DECLARE @Counter INT = 0
    PRINT 'Document: ' + CAST(@Id AS VARCHAR)
    PRINT 'Score: ' + CAST(@Score AS VARCHAR)
    WHILE @Counter < @CounterMax
    BEGIN
        DECLARE @StemId INT = (SELECT [Query].[StemId] FROM [Query] WHERE [Query].[Id] = @Counter)
        DECLARE @Weight FLOAT = (SELECT [tfidf].[Weight] FROM [TfidfWeights] AS [tfidf] WHERE [tfidf].[StemId] = @StemId AND [tfidf].[DocumentId] = @Id)
        PRINT 'WEIGHT: ' + CAST(@Weight AS VARCHAR)
        IF(@Weight > 0.0)
        BEGIN
            DECLARE @QWeight FLOAT = (SELECT [Query].[Weight] FROM [Query] WHERE [Query].[StemId] = @StemId)
            SET @Score = @Score + (@QWeight * @Weight)
            PRINT 'Score: ' + CAST(@Score AS VARCHAR)
        END
        SET @Counter = @Counter + 1
    END
    UPDATE [QueryScores] SET Score = @Score WHERE Id = @Id
    FETCH NEXT FROM myCursor INTO @Id
END
CLOSE myCursor
DEALLOCATE myCursor
The logic is that I have a list of docs and a question/query. I iterate through every doc and then run a nested iteration through the query terms/words to find out whether the doc contains these terms. If it does, I multiply the pre-calculated weights and add the product to the score.
The problem is that you're trying to use a set-based language to iterate through things like a procedural language. SQL requires a different mindset. You should almost never be thinking in terms of loops in SQL.
From what I can gather from your code, this should do what you're trying to do in all of those loops, but it does it in a single statement in a set-based manner, which is what SQL is good at.
INSERT INTO QueryScores (id, score)
SELECT
D.id,
SUM(CASE WHEN W.[Weight] > 0 THEN W.[Weight] * Q.[Weight] ELSE NULL END)
FROM
Documents D
CROSS JOIN Query Q
LEFT OUTER JOIN TfidfWeights W ON W.StemId = Q.StemId AND W.DocumentId = D.id
GROUP BY
D.id
Of course, without a description of your requirements or sample data with expected output I don't know if this is actually what you're looking to get, but it's my best guess given your code.
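To make the set-based version concrete, here is a small sketch that runs the same SELECT through Python's sqlite3 module. The table layouts are inferred from the question's code and the rows are invented, so treat both as assumptions.

```python
import sqlite3

def score_documents():
    """Build tiny hypothetical tables and return {document id: score}."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Documents (Id TEXT PRIMARY KEY);
        CREATE TABLE [Query] (Id INTEGER PRIMARY KEY, StemId INTEGER, Weight REAL);
        CREATE TABLE TfidfWeights (StemId INTEGER, DocumentId TEXT, Weight REAL);
        INSERT INTO Documents VALUES ('doc1'), ('doc2');
        INSERT INTO [Query] VALUES (0, 10, 0.5), (1, 20, 2.0);
        INSERT INTO TfidfWeights VALUES (10, 'doc1', 0.4), (20, 'doc1', 0.1), (10, 'doc2', 0.0);
    """)
    rows = conn.execute("""
        SELECT D.Id,
               SUM(CASE WHEN W.Weight > 0 THEN W.Weight * Q.Weight ELSE NULL END)
        FROM Documents D
        CROSS JOIN [Query] Q
        LEFT OUTER JOIN TfidfWeights W
               ON W.StemId = Q.StemId AND W.DocumentId = D.Id
        GROUP BY D.Id
        ORDER BY D.Id
    """).fetchall()
    conn.close()
    return dict(rows)

print(score_documents())
```

Here doc1 scores 0.4 * 0.5 + 0.1 * 2.0, while doc2 has no positive weight and therefore gets NULL (None in Python) instead of the 0.0 the original loop would have stored. Wrapping the SUM in COALESCE(..., 0) would restore that behaviour if it matters.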
You should read: https://stackoverflow.com/help/how-to-ask
The query I came up with is very similar to the one from Tom H.
There's a lot of unknowns about the problem OP code is trying to solve. Is there a particular reason the code only checks for rows in the Query table where the Id value is between 0 and one less than the number of rows in the table? Or is the intent really just to get all of the rows from Query?
Here's my version:
INSERT INTO QueryScores (Id, Score)
SELECT d.Id
, SUM(CASE WHEN w.Weight > 0 THEN w.Weight * q.Weight ELSE NULL END) AS Score
FROM [Documents] d
CROSS
JOIN [Query] q
LEFT
JOIN [TfidfWeights] w
ON w.StemId = q.StemId
AND w.DocumentId = d.Id
GROUP BY d.Id
Processing RBAR (row by agonizing row) is almost always going to be slower than processing as a set. SQL is designed to operate on sets of data. There is overhead for each individual SQL statement, and for each context switch between the procedure and the SQL engine. Sure, there might be room to improve performance of individual parts of the procedure, but the big gain is going to be doing an operation on the entire set, in a single SQL statement.
If there's some reason you need to process one document at a time, using a cursor, then get rid of the loops and individual selects and all those PRINT, and just use a single query to get the score for the document.
OPEN myCursor
FETCH NEXT FROM myCursor INTO @Id
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE [QueryScores]
    SET Score
      = ( SELECT SUM( CASE WHEN w.Weight > 0
                           THEN w.Weight * q.Weight
                           ELSE NULL END
                    )
          FROM [Query] q
          JOIN [TfidfWeights] w
            ON w.StemId = q.StemId
          WHERE w.DocumentId = @Id
        )
    WHERE Id = @Id
    FETCH NEXT FROM myCursor INTO @Id
END
CLOSE myCursor
DEALLOCATE myCursor
You might not even need documents
INSERT INTO QueryScores (id, score)
SELECT W.DocumentId as [id]
, SUM(W.[Weight] * Q.[Weight]) as [score]
FROM Query Q
JOIN TfidfWeights W
ON W.StemId = Q.StemId
AND W.[Weight] > 0
GROUP BY W.DocumentId

Loop through all the rows of a temp table and call a stored procedure for each row

I have declared a temp table to hold all the required values as follows:
DECLARE @temp TABLE
(
    Password INT,
    IdTran INT,
    Kind VARCHAR(16)
)
INSERT INTO @temp
SELECT s.Password, s.IdTran, 'test'
from signal s inner join vefify v
    on s.Password = v.Password
    and s.IdTran = v.IdTran
    and v.type = 'DEV'
where s.[Type] = 'start'
AND NOT EXISTS (SELECT * FROM signal s2
                WHERE s.Password = s2.Password
                and s.IdTran = s2.IdTran
                and s2.[Type] = 'progress' )
INSERT INTO @temp
SELECT s.Password, s.IdTran, 'test'
FROM signal s inner join vefify v
    on s.Password = v.Password
    and s.IdTran = v.IdTran
    and v.type = 'PROD'
where s.[Type] = 'progress'
AND NOT EXISTS (SELECT * FROM signal s2
                WHERE s.Password = s2.Password
                and s.IdTran = s2.IdTran
                and s2.[Type] = 'finish' )
Now I need to loop through the rows in the @temp table and, for each row, call a stored procedure that takes all the columns of the @temp table as input.
How can I achieve this?
You could use a cursor:
DECLARE @id int
DECLARE @pass varchar(100)
DECLARE cur CURSOR FOR SELECT Id, Password FROM @temp
OPEN cur
FETCH NEXT FROM cur INTO @id, @pass
WHILE @@FETCH_STATUS = 0 BEGIN
    EXEC mysp @id, @pass ... -- call your sp here
    FETCH NEXT FROM cur INTO @id, @pass
END
CLOSE cur
DEALLOCATE cur
Try returning the dataset from your stored procedure to your datatable in C# or VB.Net. Then the large amount of data in your datatable can be copied to your destination table using a Bulk Copy. I have used BulkCopy for loading large datatables with thousands of rows, into Sql tables with great success in terms of performance.
You may want to experiment with BulkCopy in your C# or VB.Net code.
something like this?
DECLARE maxval, val, ind INT;
SELECT MAX(ID) as maxval FROM table;
while (ind <= maxval ) DO
select `value` as val from `table` where `ID`=ind;
CALL fn(val);
SET ind = ind+1;
end while;
You can do something like this
Declare @min int=0, @max int =0 --Initialize variable here which will be use in loop
Declare @Recordid int,@TO nvarchar(30),@Subject nvarchar(250),@Body nvarchar(max) --Initialize variable here which are useful for your
select ROW_NUMBER() OVER(ORDER BY [Recordid] ) AS Rownumber, Recordid, [To], [Subject], [Body], [Flag]
into #temp_Mail_Mstr FROM Mail_Mstr where Flag='1' --select your condition with row number & get into a temp table
set @min = (select MIN(Rownumber) from #temp_Mail_Mstr); --Get minimum row number from temp table
set @max = (select Max(Rownumber) from #temp_Mail_Mstr); --Get maximum row number from temp table
while(@min <= @max)
BEGIN
    select @Recordid=Recordid, @To=[To], @Subject=[Subject], @Body=Body from #temp_Mail_Mstr where Rownumber=@min
    -- You can use your variables (like @Recordid,@To,@Subject,@Body) here
    -- Do your work here
    set @min=@min+1 --Increment of current row number
END
You don't always need a cursor for this. You can do it with a WHILE loop, and you should avoid cursors whenever possible. A WHILE loop can be faster than a cursor.

Need to incorporate if else statement within my code

I am trying to add an IF/ELSE or CASE statement to my code below. I want to use one of these statements to check whether my RT_Ch_Pres_PX1 values are within certain min and max specs. If a value is below the min, I want to display that value and indicate by how much it is off, and the same for a value exceeding the max. For example, say my RT_Ch_Pres_PX1 value is 5 and I want to use a min of 6 and a max of 10. My RT_Ch_Pres_PX1 value is then off by 1, so I would like to display it and say that this value is off by 1. If RT_Ch_Pres_PX1 is within the min and max values, do nothing. Please see the code below.
DECLARE @Result TABLE
(
    RT_DateTime datetime,
    RT_Phase_Name varchar(30),
    RT_PhaseChangeCount int,
    RT_Phase_Type int,
    RT_Ch_Pres_PX1 float
);
/* Variables used to track changes to Phase Name */
DECLARE @RT_DateTime datetime;
DECLARE @RT_Phase_Name varchar(30);
DECLARE @RT_PhaseChangeCount int;
DECLARE @RT_Phase_Type int;
DECLARE @RT_Ch_Pres_PX1 float;
DECLARE @PhaseNameHold varchar(30);
DECLARE @PhaseChangeCount int;
SELECT @PhaseNameHold = ' ';
SELECT @PhaseChangeCount = 0;
SELECT @RT_PhaseChangeCount = 0;
/* Declare a cursor for determining when Phases change */
DECLARE ImportCursor CURSOR FAST_FORWARD FOR
SELECT
    CONVERT(datetime, dbo.CycleData.Date_Time) as TimeConvert,
    [dbo].[LookupPhases].[Phase_Name],
    [dbo].[cycledata].[phase_type],
    [dbo].[cycledata].[Ch_Pres_PX1]
FROM
    CycleData INNER JOIN
    CycleDataHeader ON CycleData.Unit_Number = CycleDataHeader.Unit_Number AND CycleData.Cycle_Counter_No = CycleDataHeader.Cycle_Counter_No INNER JOIN
    LookupPhases ON CycleData.Phase_Type = LookupPhases.Phase_Type INNER JOIN
    LookupEvent ON CycleData.Event_Type = LookupEvent.Event_Id LEFT OUTER JOIN
    LookupAlarm ON CycleData.Alarm_Type = LookupAlarm.Alarm_Id
WHERE
    [dbo].[CycleDataHeader].[Entered_Load_No1] = 'T14-0008'
ORDER BY
    /* Appears to be the order that needs to be reported on */
    Cycle_Time
    -- dbo.CycleData.Unit_Number,
    -- TimeConvert;
OPEN ImportCursor;
FETCH NEXT FROM ImportCursor INTO @RT_DateTime,
    @RT_Phase_Name,
    @RT_Phase_Type,
    @RT_Ch_Pres_PX1
WHILE @@FETCH_STATUS = 0
BEGIN
    IF (@RT_Phase_Name <> @PhaseNameHold)
    BEGIN
        SET @PhaseNameHold = @RT_Phase_Name;
        SET @RT_PhaseChangeCount = @RT_PhaseChangeCount + 1;
    END
    INSERT INTO @Result VALUES(@RT_DateTime, @RT_Phase_Name,@RT_PhaseChangeCount,@RT_Phase_Type,@RT_Ch_Pres_PX1);
    FETCH NEXT FROM ImportCursor INTO @RT_DateTime, @RT_Phase_Name,@RT_Phase_Type,@RT_Ch_Pres_PX1;
END
CLOSE ImportCursor;
DEALLOCATE ImportCursor;
SELECT
    RT_DateTime,
    RT_Phase_Name,
    RT_PhaseChangeCount,
    RT_Phase_Type,
    RT_Ch_Pres_PX1
FROM @Result;
This case will generate the value you want:
case
when RT_Ch_Pres_PX1 < some_min then RT_Ch_Pres_PX1 - some_min
when RT_Ch_Pres_PX1 > some_max then RT_Ch_Pres_PX1 - some_max
else 0
end
The value created for the undershoot is negative (a good idea I think). If you want it to be positive, flip the calculation.
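As a worked instance of that CASE expression, the sketch below evaluates it through Python's sqlite3 module using the question's example of a minimum of 6 and a maximum of 10. The single-column table r and the sample readings are hypothetical, and the bounds are passed as parameters rather than taken from any real spec table.

```python
import sqlite3

def deviations(values, some_min, some_max):
    """Return (reading, deviation) pairs; deviation is 0 inside [some_min, some_max]."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE r (RT_Ch_Pres_PX1 REAL)")
    conn.executemany("INSERT INTO r VALUES (?)", [(v,) for v in values])
    rows = conn.execute(
        """
        SELECT RT_Ch_Pres_PX1,
               CASE
                   WHEN RT_Ch_Pres_PX1 < :lo THEN RT_Ch_Pres_PX1 - :lo
                   WHEN RT_Ch_Pres_PX1 > :hi THEN RT_Ch_Pres_PX1 - :hi
                   ELSE 0
               END AS deviation
        FROM r
        ORDER BY rowid
        """,
        {"lo": some_min, "hi": some_max},
    ).fetchall()
    conn.close()
    return rows

print(deviations([5, 8, 12.5], 6, 10))
```

A reading of 5 gives -1 (one below the minimum), 8 gives 0, and 12.5 gives 2.5 (above the maximum). In the original query the same CASE could be added as an extra column of the final SELECT from the result table.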