Select query using Entity Framework stresses the SQL Server

I have an insanely simple query: it pulls an ID from one table. The implementation is done using EF 3.5.
This query is repeated in a loop: I collect an ID from a file and look it up in the database. When this program runs, the SQL Server is stressed like crazy (processor utilization soars to 100% on all 16 cores).
It looks like the table behind this query gets completely locked and nobody else gets in anymore. I've read about the need to use DbTransaction (begin transaction, commit) or TransactionScope, but the thing is, I'm only selecting/reading.
Also, it's a single query, which is atomic in itself, so the use of Transaction(Scope) seems shady at best.
I did try an implementation, but that doesn't seem to help.
My (LINQ) query: Image image = context.Images.First(i => i.ImageUid == identifier);
Any thoughts on why this is happening? Again, I'd like to stress that I'm only selecting/reading records; I don't delete or update anything in the database. This is so insanely straightforward that it's frustrating!
For the sake of completeness, my attempt at a fix:
// This defaults the isolation level to READ COMMITTED, which
// shouldn't lock the table for a plain read.
// (The connection must already be open at this point.)
DbTransaction trx = context.Connection.BeginTransaction();
string isolationLevel = trx.IsolationLevel.ToString(); // confirms "ReadCommitted"
Image image = context.Images.First(i => i.ImageUid == identifier);
trx.Commit();
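If locking really were the culprit, the usual quick test (a diagnostic, not a fix) is to read past locks with the READ UNCOMMITTED level or a NOLOCK hint. A minimal T-SQL sketch, with the table and column names assumed from the LINQ query above:

-- Dirty-read test only: tells you whether blocking is the problem.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT TOP (1) *
FROM dbo.Images WITH (NOLOCK)  -- redundant with the SET above; shown for completeness
WHERE ImageUid = 'some-identifier';

As the update below shows, blocking turned out not to be the cause here, so treat this purely as a way to rule it out.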
NEW: The profiler shows that Entity Framework issues a SELECT TOP (1) against the image table. This amounts to a MASSIVE number of reads, hundreds of thousands!
That would suggest there is no index, but I've looked it up (see comments) and there is one! Also very weird: on logout, again hundreds of thousands of reads.
I decided to throw out Entity Framework and run the query using a plain SqlConnection and SqlCommand, but the result is the same!
Next we copied the sp_executesql call into Management Studio and found it took an amazing 4 seconds to execute. Running the query 'directly', with the parameter values inlined, gives an instant result.
Something in the sp_executesql call slows things to a crawl. Any ideas?

I think I got it... After finding out that sp_executesql was the culprit, it became clear.
See http://yasirbam.blogspot.nl/2009/06/spexecutesql-may-cause-slow-perfomance.html
Due to the implicit type conversion, the index on the table is NOT used!
That explains everything visible in SQL Profiler.
Right now the tool is being tested, and it's as fast as lightning!!
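For readers hitting the same wall: the classic trap behind that article is an nvarchar parameter compared against a varchar column. The column side then has to be converted, which (depending on collation) turns an index seek into a full scan. A minimal T-SQL sketch of the difference; the table, column, and lengths are assumptions, not the poster's actual schema:

-- Assumed: dbo.Images.ImageUid is varchar(40) with an index on it.

-- nvarchar parameter: the varchar column gets converted, so the index can't be seeked.
EXEC sp_executesql
    N'SELECT TOP (1) * FROM dbo.Images WHERE ImageUid = @id',
    N'@id nvarchar(40)',
    @id = N'some-identifier';   -- scan: the "hundreds of thousands of reads" symptom

-- Parameter typed to match the column: ordinary index seek, instant result.
EXEC sp_executesql
    N'SELECT TOP (1) * FROM dbo.Images WHERE ImageUid = @id',
    N'@id varchar(40)',
    @id = 'some-identifier';    -- seek

From ADO.NET, the equivalent fix is to set SqlDbType.VarChar on the SqlParameter instead of letting a .NET string default to nvarchar.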

Related

How does query execution on SQL Server from .NET differ from Management Studio?

I investigated a problem running a certain set of searches (from a .NET 3.5 application) against a full-text search DB on SQL Server 2008 R2. Using Profiler, I extracted the long-running query (120 seconds until the command timeout was reached) and ran it in SQL Server Management Studio. The duration was "0 seconds", and depending on which search I tried, 0 to 6 rows were returned.
The query looks like follows:
exec sp_executesql
N'SELECT TOP 1000 [DBNAME].[dbo].[FTSTABLE].[ID] AS [Id], [DBNAME].[dbo].[FTSTABLE].[Title], [DBNAME].[dbo].[FTSTABLE].[FirstName], [ABOUT 20 OTHERS]
FROM [DBNAME].[dbo].[FTSTABLE]
WHERE ( (
( Contains(([DBNAME].[dbo].[FTSTABLE].[Title], [DBNAME].[dbo].[FTSTABLE].[FirstName], [ABOUT 10 OTHERS]), @FieldsList1))
AND ( Contains(([DBNAME].[dbo].[FTSTABLE].[Title], [DBNAME].[dbo].[FTSTABLE].[FirstName], [ABOUT 10 OTHERS]), @FieldsList2))
AND ( Contains(([DBNAME].[dbo].[FTSTABLE].[Title], [DBNAME].[dbo].[FTSTABLE].[FirstName], [ABOUT 10 OTHERS]), @FieldsList3))
))'
,N'@FieldsList1 nvarchar(10),@FieldsList2 nvarchar(10),@FieldsList3 nvarchar(16)'
,@FieldsList1=N'"SomeString1*"'
,@FieldsList2=N'"SomeString2*"'
,@FieldsList3=N'"SomeString3*"'
The query looks a little weird, as it is generated by an OR mapper, but right now I don't want to optimize it: in SSMS it runs in less than a second, which shows the query itself is not the problem.
I wrote a small test program:
SqlConnection conn = new SqlConnection("EXACTSAMECONNECTIONSTRING_USING_SAME_USER_ETC");
conn.Open();
SqlCommand command = conn.CreateCommand();
command.CommandText = "EXACTLY SAME STRING, LITERALLY, AS ABOVE IN SSMS - exec sp_executesql.....";
command.CommandTimeout = 120;
var reader = command.ExecuteReader();
while (reader.Read())   // Read() advances row by row; NextResult() would skip to the next result set
{
    Console.WriteLine(reader[0]);
}
From my local PC I also got a SqlException after 120 seconds, when the command timeout was exceeded.
The SQL Server was at no point under more load than a few percent, and there were no blocks on that table at any time during my tests.
I solved it after some time: I reduced the TOP 1000 to TOP 200, and suddenly the query from .NET code also executed in less than a second.
The questions I have:
Why, in general, is there such a huge difference between SSMS and the simplest SqlCommand .NET code?
Why did reducing to TOP 200 have any effect, especially considering there were at most 6 rows in the result?
This is tied to how query plans are built. When you run the query in SSMS, you probably replace the variables manually, so it's not the same query.
You can read a full explanation here: http://www.sommarskog.se/query-plan-mysteries.html
Edit: maybe start with the paragraph "The Default Settings" and look at the results of manually enabling or disabling ARITHABORT. This is the most common cause.
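For background: SSMS connects with SET ARITHABORT ON by default, while ADO.NET connects with it OFF, and the SET options are part of the plan cache key, so the two clients compile and cache separate plans for the exact same sp_executesql text. A quick way to test this theory from SSMS (a diagnostic sketch; the statement itself is the one from the question):

-- Match the application's SET options, then rerun the exact statement.
SET ARITHABORT OFF;
-- exec sp_executesql N'SELECT TOP 1000 ...'   (paste the full call from above)
-- If it is now slow in SSMS too, you are looking at two differently
-- compiled plans for the same text, i.e. classic parameter sniffing.
SET ARITHABORT ON;   -- restore the SSMS default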
So the preliminary answer (not yet fully verified, due to its complexity) can be derived from Keorl's answer, or mostly from the link provided there.
To explain the different symptoms, here is what happens:
SQL Server cached the query against the full-text indexed table, including its execution plan. This means that if the first query to run (the one that puts the plan into the cache) is a rare query with an absurd execution plan, that plan is cached and used for all subsequent queries, ruining performance for most runs.
One thing I could reproduce in the end: rerunning the full-text indexer/gatherer solved the problem (this time). The explanation is simple: an index update throws away the precompiled/cached queries, so a better query than the previously cached one could run first and store a much better plan in the cache.
Answer to Q1: Why, in general, is there such a huge difference between SSMS and the simplest SqlCommand .NET code?
So why didn't this happen in SSMS? This too can be extracted from Keorl's answer: SSMS sets the ARITHABORT option, which results in its own newly compiled query plan being cached separately. Hence the different observations for the very same query between SSMS and code.
Answer to Q2: Why did reducing to TOP 200 have any effect, especially considering there were at most 6 rows in the result?
For dynamic SQL as used in the example above, the cache is keyed on a hash of the complete query text. As the text differs between TOP 200 and TOP 1000, two different plans are compiled and cached. Parameter values are not part of the hash, though, so queries that differ only in their parameter values still reuse the same cache entry.
Concluding this: Thanks Keorl for providing the means to find an answer.
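If you hit a poisoned plan like this and rerunning the full-text indexer is not an option, there are two standard ways to evict or sidestep it (a sketch; the table name is taken from the question):

-- Mark every cached plan that references the table for recompilation:
EXEC sp_recompile N'dbo.FTSTABLE';

-- Or force a fresh plan for this one statement on every execution
-- (@FieldsList1 declared and bound as in the original sp_executesql call):
SELECT TOP 1000 [ID], [Title]  -- plus the other columns
FROM [dbo].[FTSTABLE]
WHERE CONTAINS(([Title], [FirstName]), @FieldsList1)
OPTION (RECOMPILE);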

What problems may occur while querying SQL databases with a big amount of data over the internet

I have a big database on an MSSQL server that contains data indexed by a web crawler.
Every day I want to update a SOLR search-engine index using the DataImportHandler, which is situated on another server in another network.
Solr's DataImportHandler uses a query to get the data from SQL, for example this query:
SELECT * FROM DB.Table WHERE DateModified > Config.LastUpdateDate
The ImportHandler does 8 selects of this type. Each select fetches around 1000 rows from the database.
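One thing worth checking up front for a daily incremental pull like this: make sure the watermark column is indexed, otherwise each of the 8 selects scans the whole crawled table every day. A hedged T-SQL sketch, reusing the names from the query above:

-- Assumed names taken from the query above; adjust to the real schema.
CREATE INDEX IX_Table_DateModified ON DB.Table (DateModified);

-- Each daily pull is then a cheap range seek:
SELECT *
FROM DB.Table
WHERE DateModified > @LastUpdateDate;  -- watermark kept by the importer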
To connect to SQL Server I am using com.microsoft.sqlserver.jdbc.SQLServerDriver.
The parameters I can add to the connection are:
responseBuffering="adaptive/all"
batchSize="integer"
So my questions are:
What can go wrong while doing these queries every day (except network errors)?
I want to know how SQL Server behaves in this context.
Furthermore, I have to make a decision regarding how I will implement this import and how to handle errors, but first I need to know what errors can arise.
Thanks!
Later edit
My problem is that I don't know how these SQL queries can fail. When I call this importer every day, it does 10 queries against the database. If the 5th query fails, I have two options:
roll back the entire transaction and do it all again, or commit the data I got from the first 4 queries and somehow redo queries 5 to 10. But if these queries always fail because of some other problem, I need to think of another way to import the data.
Can these SQL queries over the internet fail because of timeouts or something like that?
The only problem I identified after working with this type of import is:
Network problems. If the network connection fails, SOLR rolls back any changes and the commit doesn't take place. In my program I identify this as an error and don't log the changes in the database.
Thanks @GuidEmpty for providing his comment and clarifying this for me.
There could be issues with permissions (not sure if you control those).
It might be a good idea to catch the exceptions you can think of and include a catch-all (Exception exp).
Then treat the catch-all as the worst case: roll back (where you can) and log the exception to examine later.
You don't say what column types you are selecting, either; keep in mind that text/blob values take a lot more space and could cause issues internally if you buffer the data.
Though on a quick re-read, you don't need to roll back if you are only selecting.
I think you would be better off having a think about what you are hoping to achieve, and whether knowing all possible problems will really help.
HTH

SQL error:8152, but not over max?

I'm part of a team writing an ERP using Seam and JBoss, and on one of my pages I keep getting SQL error 8152 whenever I try to input something. SQL error 8152, for those of you who don't know, occurs when you try to insert a value that exceeds the column's maximum length.
I've double-checked my entity and the database, and their maximum lengths are the same (nvarchar(50)). In addition, I'm pretty sure we're not using audit tables. I then put System.out.println() calls all over the place, and found that the error happens between these two println calls:
System.out.println("Flushing");
entityManager.flush();
System.out.println("Flushing complete");
These are part of a method that processes all changes to the table. But I'm pretty new to programming and not sure what's going on.
Any help would be appreciated. Thanks in advance, Jeff.
P.S. Code on request, but I didn't post it because there is a lot of it all over the place.
I would verify the SQL that is executed when the flush() is performed. That way you can see the actual length of your data and verify that it really is too big, as the DB error claims.
If you are using Hibernate, you can configure it to output the SQL to the console. You don't say what your DB is, but if it's SQL Server, you can use the Profiler to see what SQL is being executed.
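For reference, error 8152 ("String or binary data would be truncated") is easy to reproduce server-side, which makes a throwaway table a handy sanity check while hunting the offending column (a T-SQL sketch):

CREATE TABLE #t (Name nvarchar(50));

INSERT INTO #t VALUES (REPLICATE(N'x', 50));  -- 50 characters: fits
INSERT INTO #t VALUES (REPLICATE(N'x', 51));  -- 51 characters: fails with Msg 8152

DROP TABLE #t;

If the entity and the column you checked both agree on 50, the truncation often comes from a different column in the same INSERT/UPDATE, which is why capturing the exact SQL (show_sql or Profiler) is the fastest route.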

All of a Sudden, SQL Server Timeout

We have a legacy VB.NET application that worked for years,
but all of a sudden it stopped working yesterday and gives SQL Server timeouts.
Most parts of the application give the timeout error; one part, for example, is the code below:
command2 = New SqlCommand("select * from Acc order by AccDate,AccNo,AccSeq", SBSConnection2)
reader2 = command2.ExecuteReader()
If reader2.HasRows() Then
    While reader2.Read()
        If IndiAccNo <> reader2("AccNo") Then
            CAccNo = CAccNo + 1
            CAccSeq = 10001
            IndiAccNo = reader2("AccNo")
        Else
            CAccSeq = CAccSeq + 1
        End If
        command3 = New SqlCommand("update Acc Set AccNo=@NewAccNo,AccSeq=@NewAccSeq where AccNo=@AccNo and AccSeq=@AccSeq", SBSConnection3)
        command3.Parameters.Add("@AccNo", SqlDbType.Int).Value = reader2("AccNo")
        command3.Parameters.Add("@AccSeq", SqlDbType.Int).Value = reader2("AccSeq")
        command3.Parameters.Add("@NewAccNo", SqlDbType.Int).Value = CAccNo
        command3.Parameters.Add("@NewAccSeq", SqlDbType.Int).Value = CAccSeq
        command3.ExecuteNonQuery()
    End While
End If
It used to work, and now it times out in command3.ExecuteNonQuery().
Any ideas?
~~~~~~~~~~~
Some information:
Nothing has been changed on the network, and the app uses a local database.
The main issue is that it doesn't work anymore even in the development environment.
I'll state the obvious - something changed. It could be an upgrade that isn't having the desired effect - it could be a network component going south - it could be a flakey disk - it could be many things - but something in the access path has changed. What other problem indications are you seeing, including problems not directly related to this application? Where is the database stored (local disk, network storage box, written by angels on the head of a pin, other)? Has your system administrator "helped" or "improved" things somehow? The code has not worn out - something else has happened.
Is it possible that this query has been getting slower over time and has now just exceeded the default timeout?
How many records are in the Acc table, and are there indexes on AccNo and AccSeq?
Also, what version of SQL Server are you using?
How long since you updated statistics and rebuilt indexes? (See the sketch after these questions.)
How much has your data grown? Queries that work fine for small datasets can be bad for large ones.
Are you getting locking issues? Have you checked Activity Monitor to see if there are locks when the timeout occurs?
Have you run Profiler to grab the query that is timing out and then run it directly on the server? Is it faster then? It could also be a network issue in moving the information from the database server to the application. That would at least tell you whether it's a SQL Server issue or a network issue.
And like Bob Jarvis said, what has recently changed on the server? Has something changed in the database structure itself? Has someone added a trigger?
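On the statistics and index question above, a minimal T-SQL sketch, assuming the table is dbo.Acc as in the code:

-- Refresh the optimizer statistics for the table:
UPDATE STATISTICS dbo.Acc;

-- Rebuild all indexes on the table (this also refreshes their statistics):
ALTER INDEX ALL ON dbo.Acc REBUILD;

-- Check when each statistics object was last updated:
SELECT name, STATS_DATE(object_id, stats_id) AS last_updated
FROM sys.stats
WHERE object_id = OBJECT_ID('dbo.Acc');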
I would suggest that there is a lock on one of the records that you are trying to update, or there are transactions that haven't been completed.
I know this is not part of your question, but after seeing your sample code I have to make this comment: is there any chance you could change your method of executing SQL against your database? It is bad on so many levels.
Perhaps you should set the CommandTimeout property to a higher value?
Doing so will allow your command to wait a little longer for the underlying database to respond. As I see it, you may not be leaving enough time for your database engine to finish what is required before the next command performs your update.
Bear in mind that the SqlDataReader keeps its SELECT open while it feeds your in-memory objects. While you are still reading, your code asks the engine to update the very table being read; the engine may block on that, so the update exceeds the time your SqlCommand allows and times out.
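A quick way to confirm that kind of reader-versus-writer blocking while the application is hanging (a diagnostic sketch; requires SQL Server 2005 or later):

-- Sessions that are currently blocked, and who is blocking them:
SELECT session_id,
       blocking_session_id,
       wait_type,
       wait_time,   -- milliseconds spent waiting so far
       command
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;

If the updating session shows up here blocked by the reading session, buffering the SELECT results first, or using the single set-based UPDATE shown further down, removes the contention.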
Any chance there are quotes in the strings you are passing to the queries?
Any chance there are date-dependent queries where a special condition no longer holds?
Have you tested the obvious?
Have you run the "update Acc Set AccNo=@NewAccNo,AccSeq=@NewAccSeq where AccNo=@AccNo and AccSeq=@AccSeq" query directly in SQL Server Management Studio? (Please replace the variables with some hard-coded values.)
Have you run the same test on a colleague's PC?
Can we make sure the SqlConnection is working fine? It could be that the SQL login credentials changed and the connection is hitting a timeout. It would probably help if you posted the error message here.
You can rewrite the update as a single set-based query. This will run much faster than the original row-by-row loop.
UPDATE subquery
SET AccNo = NewAccNo, AccSeq = NewAccSeq
FROM
    (SELECT AccNo, AccSeq,
            -- One new account number per distinct AccNo:
            DENSE_RANK() OVER (ORDER BY AccNo) AS NewAccNo,
            -- Sequence restarting at 10001 within each account:
            ROW_NUMBER() OVER (PARTITION BY AccNo ORDER BY AccDate, AccSeq)
                + 10000 AS NewAccSeq
     FROM Acc) subquery
After HLGEM's suggestions, I would check the data and make sure it is okay. In cases like this, 95% of the time it is the data.
Make sure the disk is defragged. Yes, I know, but it does make a difference - not the built-in defragger, but one that defrags and optimizes, like PerfectDisk.
This may be a bit of a long shot, but if your entire application has stopped working, have you run out of space for the transaction log in your database? Either it's been specified to an absolute size, and that has been reached, or your disk is just full.
Maybe your tables hold more data now, and the timeout value defined in your config file (the SqlConnection.ConnectionTimeout property) is too small, so it no longer leaves enough time for your queries to execute.
You can try optimizing your queries, and also rebuilding indexes.

How to recover old data from a table

I ran an update statement on a table in SQL Server 2008 which updated the table with some wrong data.
I didn't have a backup for the DB.
It's some important dates that got updated.
Is there any way I can recover the old data from the table?
Thanks
SNA
Basically no, unless you want to use a commercial log reader and try to go through the log with a fine-tooth comb. Having no backup of the database can be an "update resume, leave town" scenario - harsh, but it just should not happen.
Andrew has basically called it. I just want to add a few ideas you can consider if you are desperate:
Are there any reports or printouts lying around? Perhaps you can reconstruct the data from there.
Was this data entered via a web application? If so, there is a remote chance you can find the original data in the web server logs, depending on how the app was constructed, etc.
Does this app interface with (pass data to) any other applications? They may have a buffered copy of the data...
Can the data be derived from any other existing data? Is there an audit log table, or another date in your schema based on this one, from which you can reconstruct the original dates?
Edit:
Some commenters mention that it is a good idea to test your update/delete statements before running them. For this to become a habit, it helps to have an easy method. I usually write my DELETE statements like this:
--delete --select *
from [User]
where UserID=27
To run the SELECT in order to test your query, highlight everything from select onwards. To then run the DELETE, once you are satisfied with the filter criteria, highlight everything from delete onwards. The two dashes in front of delete ensure that if the query is accidentally run as a whole, it simply fails with invalid syntax.
You can use a similar construct for UPDATE statements, although it is not quite as clean; for example:
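A sketch of the UPDATE variant, with a hypothetical column and value:

--update [User] set UserName = 'NewName' --select UserName, *
from [User]
where UserID = 27

Highlight from select onwards to preview the affected rows; highlight from update onwards to perform the change (the trailing --select comments itself out of the UPDATE); run it accidentally as a whole and, as with the DELETE version, it fails on invalid syntax.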
SQL Server keeps a log of every transaction, so you can recover your modified data from the log even without a backup.
SELECT [PAGE ID], [Slot ID], [AllocUnitId], [Transaction ID],
       [RowLog Contents 0], [RowLog Contents 1], [RowLog Contents 3], [RowLog Contents 4],
       [Log Record]
FROM sys.fn_dblog(NULL, NULL)
WHERE AllocUnitId IN
      (SELECT [Allocation_unit_id]
       FROM sys.allocation_units allocunits
       INNER JOIN sys.partitions partitions
           ON (allocunits.type IN (1, 3) AND partitions.hobt_id = allocunits.container_id)
           OR (allocunits.type = 2 AND partitions.partition_id = allocunits.container_id)
       WHERE object_id = OBJECT_ID('dbo.student'))
  AND Operation IN ('LOP_MODIFY_ROW', 'LOP_MODIFY_COLUMNS')
  AND [Context] IN ('LCX_HEAP', 'LCX_CLUSTERED')
Here is the article that explains, step by step, how to do it:
http://raresql.com/2012/02/01/how-to-recover-modified-records-from-sql-server-part-1/
Imran
Thanks for all the responses.
The problem happened accidentally: I missed selecting the WHERE condition when running the update statement.
It was a quick 5-minute task (just changing a date to test one customer's data), so we didn't think of taking a backup.
Yes, of course, you are right. This is a lesson.
From now on I will be careful to write my update statements in a transaction, or to test my update statements first.
Thanks once again for spending your time giving some insight rather than ignoring the question just because the only answer is "no".
Thanks
SNA
Always take a backup before major UPDATE statements; even if it's never used, there's the peace of mind.
Especially with Red Gate's Object Level Restore, one can now restore an individual table or row from a backup file.
Good luck; I'd also suggest looking for an old copy elsewhere (DEV/QA, etc.).
Isn't it possible to do a rollback on an UPDATE statement?
Late one, but hopefully useful...
If the database is in the full recovery model, then all transactions are logged in the transaction log and can be retrieved. The problem is that this is not natively supported, because it is not the main purpose of the transaction log.
Options are:
Commercial tools such as ApexSQL Log (more expensive, more options) or Quest Toad (less expensive, fewer options for this purpose; its main focus is SQL Server management).
Trying to do it yourself, as user1059637 pointed out. The problem with this approach is that it can't read transaction log backups, and it is more tedious.
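Before going down either route, it's worth confirming the database really is in the full recovery model; a quick check (hypothetical database name):

SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'YourDatabase';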
It comes down to how much your data is worth to you in terms of time and $.