Is "select count(*) from..." more reliable than @@ROWCOUNT? - sql

I have a proc that inserts records to a temp table. In pseudocode it looks like this:
Create temp table
a. Insert rows into temp table based on stringent criteria
b. if no rows were inserted, insert based on less stringent criteria
c. if there are still no rows, try again with even less stringent criteria
select from temp table
There are a lot of IF @@ROWCOUNT = 0 checks in the code to see whether the table has any rows. The issue is that the proc isn't doing what it looks like it should: it's inserting the same row twice (steps a and c are both being executed). However, if I change the check to IF ( (SELECT COUNT(*) FROM #temp) = 0), the proc works exactly as expected.
Which makes me think that @@ROWCOUNT isn't the best solution to this problem. But I'm adding extra work via the SELECT COUNT(*). Which is the better solution?

@@ROWCOUNT is the better solution. The work is already done; SELECT COUNT(*) forces the database to do more work.
You need to make sure you are not doing anything that affects the value of @@ROWCOUNT before checking it. It is usually best to check @@ROWCOUNT immediately after performing the insert statement. If necessary, assign the value to a variable so you can check it later:
DECLARE @rows int
...
[insert or update a table]
SET @rows = @@ROWCOUNT
Storing the row count immediately after any operation that changes it lets you use the value more than once.
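The "save it immediately" rule can be seen outside T-SQL too. As a minimal sketch, Python's sqlite3 exposes an analogous `cursor.rowcount`, which reflects only the last DML statement and is clobbered by the next one (the schema and values here are invented for the demo):

```python
import sqlite3

# cursor.rowcount plays the role of @@ROWCOUNT: read it right after the DML.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, None), (2, None), (3, "x")])

cur = conn.cursor()
cur.execute("UPDATE t SET name = 'filled' WHERE name IS NULL")
rows = cur.rowcount                    # capture immediately, like SET @rows = @@ROWCOUNT
cur.execute("SELECT COUNT(*) FROM t")  # a later statement overwrites the live counter
print(rows)                            # 2 -- the saved value survives
print(cur.rowcount)                    # -1 -- SELECT reset the live counter
```

The saved variable keeps working after later statements; the live counter does not.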

@@ROWCOUNT just reports the "affected rows" from the previous statement. That reach is wider than it looks: many things create or "affect" rows, and some of them don't return a value to the client.
Microsoft says:
Statements that make an assignment in a query or use RETURN in a query set the @@ROWCOUNT value to the number of rows affected or read by the query, for example: SELECT @local_variable = c1 FROM t1.
Data manipulation language (DML) statements set the @@ROWCOUNT value to the number of rows affected by the query and return that value to the client. The DML statements may not send any rows to the client.
DECLARE CURSOR and FETCH set the @@ROWCOUNT value to 1.
EXECUTE statements preserve the previous @@ROWCOUNT.
Statements such as USE, SET <option>, DEALLOCATE CURSOR, CLOSE CURSOR, BEGIN TRANSACTION or COMMIT TRANSACTION reset the @@ROWCOUNT value to 0.
If you are not getting what you expect from @@ROWCOUNT (which probably means your query is more complex than your example), I would definitely look at using the SELECT COUNT(*) option, or, if you're worried about the performance hit, do something like this:
INSERT INTO temptable (cols...)
SELECT COL = @VAL
FROM sourcetableorquery
LEFT JOIN temptable ON [check for existing row]
WHERE temptable.id IS NULL
This will be faster than the count option if you are looping over a big recordset.
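The LEFT JOIN guard above can be sketched with Python's sqlite3 (table and column names here are invented); the second run shows that the anti-join makes re-execution a no-op instead of inserting duplicates:

```python
import sqlite3

# "Insert only if absent" via LEFT JOIN ... IS NULL, as in the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temptable (id INTEGER)")
conn.execute("CREATE TABLE source (id INTEGER)")
conn.executemany("INSERT INTO source VALUES (?)", [(1,), (2,)])

sql = """
    INSERT INTO temptable (id)
    SELECT s.id
    FROM source s
    LEFT JOIN temptable t ON t.id = s.id
    WHERE t.id IS NULL
"""
cur = conn.cursor()
cur.execute(sql)
first_run = cur.rowcount   # 2: both rows were missing
cur.execute(sql)
second_run = cur.rowcount  # 0: nothing to insert, no duplicates
print(first_run, second_run)
```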

If @@ROWCOUNT is ever read twice in a row, that can be a problem:
http://beyondrelational.com/blogs/madhivanan/archive/2010/08/09/proper-usage-of-rowcount.aspx

Related

simple UPDATE query on large table has bad performance

I need to do the following update query through a stored procedure:
UPDATE table1
SET name = @name -- @name is the stored procedure input parameter
WHERE name IS NULL
Table1 has no indexes or keys and 5 columns: 4 integers and 1 varchar (the updatable column 'name' is the varchar).
About 15,000,000 rows are NULL and need updating. This takes about 50 minutes, which I think is too long.
I'm running an Azure SQL DB Standard S6 (400 DTUs).
Can anyone give me advice to improve performance?
As you don't have any keys or indexes, I can suggest the following approach.
1- Create a new table with SELECT ... INTO (which copies the data), like the following query.
SELECT
    CASE
        WHEN NAME IS NULL THEN @name
        ELSE NAME
    END AS NAME,
    <other columns >
INTO dbo.newtable
FROM table1
2- Drop the old table
drop table table1
3- Rename the new table to table1
exec sp_rename 'dbo.newtable', 'table1'
Another approach is a batched update; sometimes it gives better performance than a single bulk update (you need to test by adjusting the batch size).
WHILE EXISTS (SELECT 1 FROM table1 WHERE name IS NULL)
BEGIN
    UPDATE TOP (10000) table1
    SET name = @name
    WHERE name IS NULL
END
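As a rough sketch of the batching idea in Python's sqlite3 (SQLite has no UPDATE TOP, so a rowid subquery with LIMIT stands in; the batch size and data are invented for the demo):

```python
import sqlite3

# Batched update: keep updating small chunks until a batch touches 0 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(i, None) for i in range(25)])

new_name, batch_size, batches = "default", 10, 0
cur = conn.cursor()
while True:
    cur.execute(
        """UPDATE table1 SET name = ?
           WHERE rowid IN (SELECT rowid FROM table1
                           WHERE name IS NULL LIMIT ?)""",
        (new_name, batch_size),
    )
    conn.commit()          # each batch commits, keeping transactions short
    if cur.rowcount == 0:  # nothing left to update
        break
    batches += 1

print(batches)  # 3 batches: 10, 10, then 5 rows
```

Shorter transactions mean less lock pressure and log growth per statement, at the cost of re-scanning for NULLs each pass (which is why an index on the filtered column helps).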
Can you do it with the following method?
UPDATE table1
SET name = ISNULL(name, @name)
For NULL values it will update with @name, and the rest will be updated with their existing value.
No. You are updating 15,000,000 rows which is going to take a long time. Each update has overhead for finding the row and logging the value.
With so many rows to update, it is unlikely that the overhead is finding the rows. If you add an index on name, the update is going to actually have to update the index as well as updating the original values.
If your concern is locking the database, you can set up a loop where you do something like this over and over:
UPDATE TOP (100000) table1
SET name = @name -- the stored procedure input parameter
WHERE name IS NULL;
100,000 rows should be about 30 seconds or so.
In this case, an index on name does help. Otherwise, each iteration of the loop would in essence be reading the entire table.

Safe solutions for INSERT OR UPDATE on SQL Server 2016

Assume a table structure of MyTable(MyTableId NVARCHAR(MAX) PRIMARY KEY, NumberOfInserts INTEGER).
I often need to either update (i.e. increment) a counter of an existing record, or insert a new record with a value of 0 for NumberOfInserts if it doesn't exist.
Essentially:
IF (MyTableId exists)
run UPDATE command
ELSE
run INSERT command
My concern is losing data due to race conditions, etc.
What's the safest way to write this?
I need it to be 100% accurate if possible, and willing to sacrifice speed where necessary.
The MERGE statement can perform both the UPDATE and the INSERT (and a DELETE if needed).
Even though it is a single atomic statement, it is important to use the HOLDLOCK query hint to prevent a race condition. There is a blog post, “UPSERT” Race Condition With MERGE by Dan Guzman, where he explains in great detail how it works and provides a test script to verify it.
The query itself is straightforward:
DECLARE @NewKey NVARCHAR(MAX) = ...;

MERGE INTO dbo.MyTable WITH (HOLDLOCK) AS Dst
USING
(
    SELECT @NewKey AS NewKey
) AS Src
ON Src.NewKey = Dst.[Key]
WHEN MATCHED THEN
    UPDATE
    SET NumberOfInserts = NumberOfInserts + 1
WHEN NOT MATCHED THEN
    INSERT
    (
        [Key]
        ,NumberOfInserts
    )
    VALUES
    (
        @NewKey
        ,0
    );
Of course, you can also use explicit two-step approach with a separate check if a row exists and separate UPDATE and INSERT statements. Just make sure to wrap them all in a transaction with appropriate table locking hints.
See Conditional INSERT/UPDATE Race Condition by Dan Guzman for details.
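MERGE and HOLDLOCK are SQL Server-specific, but the atomic-upsert idea can be sketched with SQLite's INSERT ... ON CONFLICT DO UPDATE via Python (the key value is invented; this is an analogy for the insert-or-increment logic, not the same locking mechanism):

```python
import sqlite3

# Schema mirrors the question's MyTable: insert with counter 0, or increment.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (MyTableId TEXT PRIMARY KEY, NumberOfInserts INTEGER)"
)

def upsert(key):
    # One atomic statement: no separate existence check, so no race window
    # between "check" and "insert" within this statement.
    conn.execute(
        """INSERT INTO MyTable (MyTableId, NumberOfInserts) VALUES (?, 0)
           ON CONFLICT (MyTableId)
           DO UPDATE SET NumberOfInserts = NumberOfInserts + 1""",
        (key,),
    )

upsert("abc")   # first call inserts with counter 0
upsert("abc")   # later calls increment
upsert("abc")
row = conn.execute(
    "SELECT NumberOfInserts FROM MyTable WHERE MyTableId = 'abc'"
).fetchone()
print(row[0])   # 2
```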

Deleting two rows based on successful deletion of another row which is based on a column condition - Sybase

I am trying to delete two rows only if a third row is deleted. The catch is that the third row (the one with NAME = 'MAN') should only be deleted when its VAL column has the value 'A'.
I am doing this in Sybase (SQL Anywhere, using the Interactive SQL IDE).
I am unable to use @@ROWCOUNT.
Could someone please give me the query for this, or explain how to achieve it?
ALGORITHM:
DELETE FROM EFS
WHERE NAME='MAN' AND VAL='A'
If (DeleteSuccessful)
{
DELETE FROM EFS
WHERE NAME='MAN_1' AND NAME='MAN_2'
}
I want to achieve this in a single query in Sybase.
ALTERNATIVE APPROACH:
I can also think of achieving this by first checking the value of the VAL column, and if it is XX, writing a query to delete all 3 rows using WHERE NAME='MAN' OR NAME='MAN_1' OR NAME='MAN_2'. This is also one approach, but I don't know the Sybase syntax to do it in a single query.
I think you should check whether these values exist and do it in one query:
DELETE FROM EFS
WHERE (NAME='MAN' AND VAL='A')
OR
(
    (NAME='MAN_1' OR NAME='MAN_2')
    AND
    EXISTS (SELECT * FROM EFS WHERE NAME='MAN' AND VAL='A')
)
I believe both you and valex have the right idea, but watch the syntax: NAME='MAN_1' AND NAME='MAN_2' can never be true for a single row, so you need OR. Also, it is better to use IN vs. OR for performance.
IF EXISTS (SELECT * FROM EFS WHERE NAME='MAN' AND VAL='A')
BEGIN
DELETE FROM EFS WHERE NAME IN ('MAN','MAN_1','MAN_2')
END
And I think you can use @@rowcount. Perhaps you don't consider this a single-statement approach, but it also works, e.g.:
select * from tempdb..test
irecord
1
2
3
99
declare @deleted INT

delete from tempdb..test where irecord = 3
select @deleted = @@rowcount

if @deleted > 0
begin
    print 'deletion detected'
    delete from tempdb..test where irecord IN (1,2)
end
else
    print 'no deletion detected'
1 row(s) affected.
1 row(s) affected.
deletion detected
2 row(s) affected.
select * from tempdb..test
irecord
99

how to know how many rows will be affected before running a query in microsoft sql server 2008

I've read a bit about ROWCOUNT, but it's not exactly what I'm looking for. From my understanding, ROWCOUNT states the number of rows affected AFTER you run the query. What I'm looking for is knowing BEFORE you run the query. Is this possible?
You can also use BEGIN TRANSACTION before the operation is executed. You can see the number of rows affected. From there, either COMMIT the results or use ROLLBACK to put the data back in the original state.
BEGIN TRANSACTION;
UPDATE table
SET col = 'something'
WHERE col2 = 'something else';
Review changed data and then:
COMMIT;
or
ROLLBACK;
Short answer is no.
You cannot get the number of rows before executing the query, at least in SQL Server.
The best way to do it is to use
SELECT COUNT(*) FROM <table> WHERE <condition>
and then execute your actual query:
[DELETE] or [UPDATE ... SET col='val']
FROM <table> WHERE <condition>
The estimated execution plan is going to give you rows affected based on statistics, so it won't really help you in this case.
What I would recommend is copying your UPDATE statement or DELETE statement and turning it into a SELECT. Run that to see how many rows come back and you have your answer to how many rows would have been updated or deleted.
Eg:
UPDATE t
SET t.Value = 'Something'
FROM MyTable t
WHERE t.OtherValue = 'Something Else'
becomes:
SELECT COUNT(*)
FROM MyTable t
WHERE t.OtherValue = 'Something Else'
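That rewrite can be sanity-checked with a small Python sqlite3 sketch (table, columns, and data are invented; in a live system other sessions may change rows between the count and the update, so the preview is only an estimate):

```python
import sqlite3

# Preview an UPDATE's impact: count with the same WHERE clause first.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Value TEXT, OtherValue TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("a", "Something Else"), ("b", "Something Else"), ("c", "other")],
)

where = "OtherValue = 'Something Else'"
predicted = conn.execute(f"SELECT COUNT(*) FROM MyTable WHERE {where}").fetchone()[0]

cur = conn.cursor()
cur.execute(f"UPDATE MyTable SET Value = 'Something' WHERE {where}")
print(predicted, cur.rowcount)  # 2 2 -- prediction matches rows actually updated
```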
The simplest solution is to replace the column list in your SELECT * FROM ... with SELECT COUNT(*) FROM ... before you run it (the WHERE clause needs to stay the same). This tells you how many rows will be affected.
The simplest solution doesn't seem to work in a case where there's a subquery. How would you SELECT COUNT(*) of this update?
BEGIN TRANSACTION;

UPDATE t1
SET t1.[column] = AA.[column]
FROM Table1 t1
JOIN
(
    SELECT [column], [Identity] FROM Table2
) AA ON t1.[Identity] = AA.[Identity];

COMMIT;
Here I think you need the BEGIN TRANSACTION

select the rows affected by an update

If I have a table with these fields:
int:id_account
int:session
string:password
Now for a login statement I run this sql UPDATE command:
UPDATE tbl_name
SET session = session + 1
WHERE id_account = 17 AND password = 'apple'
Then I check if a row was affected, and if one indeed was affected I know that the password was correct.
Next what I want to do is retrieve all the info of this affected row so I'll have the rest of the fields info.
I can use a simple SELECT statement, but I'm sure I'm missing something here; there must be a neater way you gurus know about and are going to tell me about (:
Besides, this has bothered me since the first login SQL statement I ever wrote.
Is there any performance-wise way to combine a SELECT into an UPDATE when the UPDATE did update a row?
Or am I better leaving it simple with two statements? Atomicity isn't needed, so I might better stay away from table locks for example, no?
You should use the same WHERE clause for the SELECT. It will return the modified rows, because your UPDATE did not change any of the columns used for the lookup:
UPDATE tbl_name
SET session = session + 1
WHERE id_account = 17 AND password = 'apple';
SELECT *
FROM tbl_name
WHERE id_account = 17 AND password = 'apple';
An advice: never store passwords as plain text! Use a hash function, like this:
MD5('apple')
There is ROW_COUNT() (do read about the details in the docs).
Following up with a second SQL statement is OK and simple (which is always good), but it might unnecessarily stress the system.
This won't work for statements such as...
Update Table
Set Value = 'Something Else'
Where Value is Null
Select Value From Table
Where Value is Null
You would have changed the value with the update and would be unable to recover the affected records unless you stored them beforehand.
Select * Into #TempTable
From Table
Where Value is Null

Update Table
Set Value = 'Something Else'
Where Value is Null

Select Value, UniqueValue
From #TempTable TT
Join Table T On TT.UniqueValue = T.UniqueValue
If you're lucky, you may be able to join the temp table's records to a unique field within Table to verify the update. This is just one small example of why it is important to be able to uniquely identify records.
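The snapshot-then-join pattern above can be sketched in Python's sqlite3 (column names follow the answer; the table name and data are invented):

```python
import sqlite3

# Snapshot the keys of the rows about to change, update, then join back.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (UniqueValue INTEGER PRIMARY KEY, Value TEXT)")
conn.executemany("INSERT INTO T VALUES (?, ?)", [(1, None), (2, "kept"), (3, None)])

# 1. Snapshot the rows the WHERE clause currently matches
conn.execute("CREATE TEMP TABLE snap AS SELECT UniqueValue FROM T WHERE Value IS NULL")

# 2. Perform the update (after this, the WHERE clause no longer matches them)
conn.execute("UPDATE T SET Value = 'Something Else' WHERE Value IS NULL")

# 3. Join back through the snapshot to see exactly which rows were changed
changed = conn.execute(
    """SELECT T.UniqueValue, T.Value
       FROM snap JOIN T ON snap.UniqueValue = T.UniqueValue
       ORDER BY T.UniqueValue"""
).fetchall()
print(changed)  # [(1, 'Something Else'), (3, 'Something Else')]
```

The untouched row (2, 'kept') never enters the snapshot, so it is excluded from the verification query even though the UPDATE's WHERE clause can no longer identify the changed rows.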
You can get the affected rows by just using @@ROWCOUNT:
SELECT TOP (SELECT @@ROWCOUNT) * FROM YourTable ORDER BY 1 DESC