Extremely slow query, no blocking, no bad execution plan: why?

I have a table-valued function that consists of 8 INSERT INTO statements. Each INSERT INTO statement follows the same structure:
DECLARE @TABLE1 TABLE (Column1, Column2, ...);
INSERT INTO @TABLE1 (Column1, Column2, ...)
SELECT * FROM BASETABLE1
The function returns a table that joins those 8 table variables (using LEFT HASH JOIN). In the past, this function took about 7 minutes to complete (the base tables are large, with over 1 million rows), but recently it has become very slow and takes forever.
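For context, here is a minimal sketch of the structure described above (all names are hypothetical, and the real function populates 8 table variables rather than 2):
CREATE FUNCTION dbo.fn_Example ()
RETURNS @Result TABLE (Key1 INT, Col1 INT, Col2 INT)
AS
BEGIN
    -- Two of the 8 table-variable loads
    DECLARE @Table1 TABLE (Key1 INT, Col1 INT);
    DECLARE @Table2 TABLE (Key1 INT, Col2 INT);

    INSERT INTO @Table1 (Key1, Col1) SELECT Key1, Col1 FROM dbo.BaseTable1;
    INSERT INTO @Table2 (Key1, Col2) SELECT Key1, Col2 FROM dbo.BaseTable2;

    -- The returned table joins the table variables with a hash join hint
    INSERT INTO @Result (Key1, Col1, Col2)
    SELECT t1.Key1, t1.Col1, t2.Col2
    FROM @Table1 AS t1
    LEFT HASH JOIN @Table2 AS t2 ON t2.Key1 = t1.Key1;

    RETURN;
END;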
Here is what I've done to get an idea of the slowness:
I've checked for blocking; there are no blocks on the server.
I've checked the execution plan; there have been no major changes recently, and the base tables are properly indexed with up-to-date statistics.
I've looked at sp_who2. I have to admit this server is pretty busy, with a lot of Agent jobs and Tableau connections going on at the same time, and quite a few processes involve this function. However, I would say the level of activity is the same now as before, so why has the slowness only started recently?
I've also checked the active expensive queries, and unsurprisingly the most expensive ones are those INSERT INTO queries.
We patched Windows (on the host of this SQL Server) about 10 days ago, so I wondered whether that patch had any impact on this slowness. We also rebooted the SQL Server 4 days ago to try to fix the issue. The situation was better after the reboot (though not as good as before the issue appeared), and today it became worse again.
Anything else that I might have missed?

Your
INSERT INTO ... (Column1, Column2, ...)
SELECT * FROM BaseTable
Is that your actual query? Given the *, I would expect you to explicitly list the columns you want to pull. I have seen weird things when a table structure was altered: are you getting what you really think are the correct columns, in the same sequence the insert expects? Also, you have no WHERE clause, so you are pulling an entire table into a memory table, and for what benefit/purpose?
You mention a table-valued function. Is that what this overly broad insert statement is, or is there some other context to it that you are just masking for us?

Related

Inserting records into a heap takes a really long time

I have a pre-defined table of about 35 fields, mostly defined as nvarchar(255) and nullable.
There are no constraints, indexes or triggers of any kind on this table.
I have a query that returns roughly 32k records. If I insert this directly into a new permanent table on the fly, i.e.
SELECT <<fields>> INTO dbo.MadeUpTable01 FROM <<my query goes here>>
it takes about 3 seconds.
When I try to insert into my real, pre-defined table, either directly from my query or from my new permanent table, like this:
INSERT dbo.MyRealTable SELECT * FROM dbo.MadeUpTable01
then it takes upwards of 25 seconds.
I'm the only person working on this server, and the DBA has confirmed there is no issue with I/O, CPU, temp table, etc. I'm stumped. What is the problem?
EDIT: the execution plan shows that the table insert is 65% of the query cost, though there should be nothing on this table to inhibit an insert.
The difference in time is caused by logging.
In the case of
SELECT <<fields>> INTO dbo.MadeUpTable01 FROM <<my query goes here>>
you get minimal logging under the SIMPLE and BULK_LOGGED recovery models.
I guess your database is in SIMPLE recovery, because if it were not, your SELECT ... INTO would take nearly the same time as the INSERT ... INTO.
INSERT dbo.MyRealTable SELECT * FROM dbo.MadeUpTable01
is fully logged in your case because you did not use the TABLOCK hint on the destination table.
In The Data Loading Performance Guide you can find more information on logging and the conditions for minimal logging.
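As an illustration, here is a minimal sketch of a minimally logged version of that INSERT, under the conditions the guide describes (SIMPLE or BULK_LOGGED recovery model and a heap destination; the table names are the ones from the question):
-- TABLOCK on the destination heap enables minimal logging
-- (requires the SIMPLE or BULK_LOGGED recovery model)
INSERT INTO dbo.MyRealTable WITH (TABLOCK)
SELECT * FROM dbo.MadeUpTable01;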

stored proc time outs

We are currently having difficulties with a SQL Server procedure timing out on queries. Nine times out of ten the query will run within 5 seconds at most; however, on occasion the proc can run in excess of 2 minutes, causing timeouts on the front end (a .NET MVC application).
They have been investigating this for over a week now, checking jobs and server performance, and all seems to be OK.
The DBAs have narrowed it down to a particular table that is being bombarded with inserts/updates from different applications. This, in combination with the complex SELECT query that joins on that table, is (I'm being told) causing the timeouts.
Are there any suggestions at all on how to get around these timeouts?
e.g.
Replicate the table and query the new table?
Any additional debugging that can prove this is actually the issue?
Perhaps cache the data on the front end and, on a timeout, serve the data from the cache?
A table being bombarded with updates is a table being bombarded with locks. And yes, this can affect performance.
First, copy the table and run the query multiple times. There are other possibilities for the performance issue.
One cause of unstable stored procedure performance in SQL Server is compilation. The code in the stored procedure is compiled the first time it is executed; the resulting execution plan might work well for some inputs and not others. This is readily fixed by using the option to recompile the queries each time (although this adds overhead), as sketched below.
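As a hypothetical illustration (the procedure, table, and parameter names here are invented):
CREATE PROCEDURE dbo.GetOrders
    @CustomerId INT
AS
BEGIN
    -- OPTION (RECOMPILE) builds a fresh plan on every execution,
    -- avoiding a cached plan that suits only some parameter values
    SELECT OrderId, OrderDate, Amount
    FROM dbo.Orders
    WHERE CustomerId = @CustomerId
    OPTION (RECOMPILE);
END;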
Then, think about the query. Does it need the most up-to-date data? If not, perhaps you can just copy the table once per hour or once per day.
If the most recent data is needed, you might need to re-think the architecture. A table that does insert-only using a clustered identity column always inserts at the end of the table. This is less likely to interfere with queries on the table.
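A minimal sketch of that pattern (a hypothetical table; because the clustered key is an ascending IDENTITY, new rows always land on the last page, away from reads of older data):
CREATE TABLE dbo.AppLog (
    LogId    BIGINT IDENTITY(1,1) NOT NULL,
    LoggedAt DATETIME NOT NULL DEFAULT GETDATE(),
    Message  NVARCHAR(255) NULL,
    CONSTRAINT PK_AppLog PRIMARY KEY CLUSTERED (LogId) -- inserts append at the end
);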
Replication may or may not help the problem. After all, full replication will be doing the updates on the replicated copy. You don't solve the "bombardment" problem by bombarding two tables.
If your queries involve a lot of historical data, then partitioning might help. Only the most recent partition would be "bombarded", leaving the others more responsive to queries.
The DBAs have narrowed it down to a particular table that is being bombarded with inserts/updates from different applications. This, in combination with the complex SELECT query that joins on that table, is (I'm being told) causing the timeouts.
We used to face many timeouts and get a lot of escalations. This is the approach we followed to reduce the timeouts.
Some of it may be applicable in your case, some may not, but the following will not cause any harm.
Change the SQL Server settings below (an sp_configure sketch follows this list):
1. Remote login timeout: 60
2. Remote query timeout: 0
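A sketch of applying those two settings with sp_configure (the option names below are, to the best of my knowledge, the ones SQL Server uses; verify them on your build with EXEC sp_configure):
EXEC sys.sp_configure 'remote login timeout (s)', 60;
EXEC sys.sp_configure 'remote query timeout (s)', 0;  -- 0 means no timeout
RECONFIGURE;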
Also, if your Windows server is set to use dynamic RAM, try changing it to static RAM.
You may also have to tune some of the Windows server settings:
TCP Offloading/Chimney & RSS…What is it and should I disable it?
Following the above steps reduced our timeouts by 99%.
For the remaining 1%, we dealt with each case separately:
1. Update statistics for the tables involved in the query
2. Try fine-tuning the query further
This helped us eliminate the timeouts entirely.

Select ... for update vs select on large amount of data

Assume a huge table with several hundred million records whose columns are well indexed. Is there any performance difference between
SELECT * from HUGE_TABLE where ... AND ... FOR UPDATE
and
SELECT * from HUGE_TABLE where ... AND ...
The main reason for the FOR UPDATE clause is that we may have several instances of the application running the same query at the same time, but we need to avoid update conflicts.
At this point I am concerned about two performance issues: 1. If there is no other query running, is SELECT ... FOR UPDATE slower? 2. If there are many other active queries selecting from or updating the huge table, what will the performance of the whole situation on this table be (including the updates in question)?
SELECT ... FOR UPDATE creates at least two performance issues compared to a regular SELECT:
Blocking other sessions. This is a bit obvious and you already understand it, but it's worth mentioning that creating more locks can of course cause performance issues. The good news is that Oracle never escalates locks, so it will only lock exactly what you ask it to.
Writing lock data. FOR UPDATE works by making a small update to each relevant block. It acts like a regular change in some ways and will create redo and undo records, as you can see by running queries like select used_urec from gv$transaction;. Depending on how many rows are locked, this could be significantly more expensive than a regular SELECT, even if no other sessions are involved.
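A small sketch of observing that cost (the table and predicate are hypothetical; run the second query from another session while the first still holds its locks):
-- Session 1: take row locks on the matching rows
SELECT * FROM huge_table WHERE status = 'PENDING' FOR UPDATE;

-- Session 2: see the undo records generated just by taking the locks
SELECT used_urec FROM gv$transaction;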

TOP 100 causing SQL Server 2008 hang?

I have inherited a VERY poorly designed and maintained database and have been using my knowledge of SQL Server, and a little luck, to keep this HIGH-availability server up and not completely coming down in flames (the previous developer, who quit, basically just kept the system up for 4 years).
I have come across a very strange problem today. I hope someone can explain this to me so if this happens again there is a way to fix it.
Anyway, there is a stored proc that is pretty simple. It joins two tables over a SHORT date/time range (5 minutes) and passes back the results (this query runs every 5 minutes via a Windows service). The largest table has 100k rows, the smallest table has 10k rows. The stored proc is very simple and does:
NOTE: The table and column names have been changed to protect the innocent.
SELECT TOP 100 m.*
FROM dbo.mytable1 m WITH (nolock)
INNER JOIN dbo.mytable2 s WITH (nolock) ON m.Table2ID = s.Table2ID
WHERE m.RowActive = 1
AND s.DateStarted <= DATEADD(minute, -5, getdate())
ORDER BY m.DateStarted
Now, if I keep "TOP 100" in the query, the query hangs until I stop it (whether run in SSMS or in the stored proc). If I remove the TOP 100, the query works as planned and returns around 50 rows, as it should (we don't want it to return more than 100 rows if we can help it).
So I did some investigating, using sp_who and sp_who2, looking at master..sysprocesses, and using DBCC INPUTBUFFER to look for any SPIDs that might be locking or blocking. No blocks and no locking.
This JUST STARTED today, with no changes to these two tables' designs, and from what I gather the last time this query and these tables were touched was 3 years ago; it has been running without error ever since.
Now, a side note, and I don't know whether this has anything to do with it: I reindexed both these tables about 24 hours earlier because they were 99% fragmented (remember, I said this was a poorly designed and poorly maintained server).
Can anyone explain why SQL Server 2008 would do this?
The ORDER BY is the killer. It has to read all the rows, sort them by the ORDER BY column, and then give you the first 100 rows.
The absolute first thing I would do is a side-by-side comparison of the query plans of the full and TOP 100 queries to see whether the TOP 100 plan is not performant. You might need to update stats, or you might even have missing indexes.
I'd presume there's no index on mytable1.DateStarted. I think something might be deciding to perform the sort earlier in the query process when you use SELECT TOP 100.
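If that is the case, an index along these lines might help (a sketch only, using the sanitized names from the question; check the actual plan before adding it):
-- An index on the ORDER BY column lets TOP 100 read the first rows
-- in order instead of sorting the whole set; the INCLUDE covers the
-- join key and the filter column
CREATE NONCLUSTERED INDEX IX_mytable1_DateStarted
    ON dbo.mytable1 (DateStarted)
    INCLUDE (Table2ID, RowActive);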

Select million+ records while huge insert is running

I am trying to extract application logs from a single table. The SELECT query statement is pretty straightforward.
select top 200000 *
from dbo.transactionlog
where rowid > 7
  and rowid < 700000
  and Project = 'AmWINS'
The query time for the above SELECT is over 5 minutes. Is that considered long? While the SELECT is running, a bulk insert is also running.
[EDIT]
Actually, I am having a serious problem with my current Production logging database.
Basically, we only have one table (transactionlog); all application logs are inserted into this table. For a Project like AmWINS, based on SELECT COUNT results, we have about 800K+ records inserted per day. Record insertion runs 24 hours a day in the Production environment. Users would like to extract data from the table whenever they want to check the transaction logs, so we need to be able to select records out of the table when necessary.
I tried to simulate this in a UAT environment, pumping in volume as per Production, which by today has already grown to 10 million records. While extracting records, I simultaneously ran a bulk insert to make it look like the Production environment. It took about 5 minutes just to extract 200k records.
While the extraction was running, I monitored the physical SQL server; CPU spiked up to 95%.
The table has 13 fields and an identity column (rowid) of type bigint; rowid is the PK.
Indexes are created on Date, Project, Module and RefNumber.
The table is created with row locks and page locks enabled.
I am using SQL Server 2005.
Hope you guys can give me some professional advice to enlighten me. Thanks.
It may be possible for you to use the "Nolock" table hint, as described here:
Table Hints MSDN
Your SQL would become something like this:
select top 200000 * from dbo.transactionlog with (nolock) ...
This would achieve better performance if you aren't concerned about the complete accuracy of the data returned.
What are you doing with the 200,000 rows? Are you running this over a network? Depending on the width of your table, just getting that amount of data across the network could be the bulk of the time spent.
It depends on your hardware. Pulling 200,000 rows out while data is being inserted requires some serious I/O, so unless you have a 30+ disk system, it will be slow.
Also, is your rowID column indexed? This will help with the select, but could slow down the bulk insert.
I am not sure, but doesn't bulk insert in MS SQL lock the whole table?
As ck already said, indexing is important, so make sure you have an appropriate index ready. I would set an index not only on rowid but also on Project. I would also rewrite the WHERE clause to:
WHERE Project = 'AmWINS' AND rowid BETWEEN 8 AND 699999
Reason: I guess Project is more restrictive than rowid and (correct me if I'm wrong) BETWEEN is faster than separate < and > comparisons.
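To go with that, a hypothetical composite index matching the rewritten WHERE clause (Project first, on the assumption that it is the more selective column):
CREATE NONCLUSTERED INDEX IX_transactionlog_Project_rowid
    ON dbo.transactionlog (Project, rowid);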
You could also export this to a local .dat or .sql file.
No amount of indexing will help here, because it's a SELECT * query, so it's most likely a PK scan or a horrendous bookmark lookup.
And the TOP is meaningless because there is no ORDER BY.
The simultaneous insert is probably misleading as far as I can tell, unless the table only has 2 columns and the bulk insert is locking the whole table. With a simple int IDENTITY column, the insert and the select may not interfere with each other much.
Especially if the bulk insert is only a few thousand rows (or even tens of thousands).
Edit: the TOP and rowid values do not imply a million-plus rows.