Select Query from web application times out but completes with an error? - sql

Background:
Two nights ago the old-as-hell and very poorly designed website for the company I work for got attacked by a bot that submitted 5000+ phony orders. While I was deleting all of those false orders from the database, SQL Server Management Studio crashed, and the application had to be stopped via Task Manager and restarted. After that I was getting optimistic concurrency control errors when trying to delete some of the fake records, and had to complete the cleanup via a DELETE statement.
(yes, I KNOW it's generally bad practice to delete records from the results pane, but for people like me who aren't actually programmers but get stuck with the IT work because we're the only ones who know how to find the on switch, it makes me less paranoid that I won't delete a record I didn't mean to)
Ever since then, there is a specific page in the admin section of the site that takes a VERY long time to perform a SELECT query for a specific range. The query will complete if you sit there long enough, but here's a screenshot of the ColdFusion error box that comes up with it:
ColdFusion error message
I suspect that between the bot attack and Studio Express crashing in the middle of a DELETE query, part of the table is corrupted, which is why the query exceeds the allowable time limit. I don't know if our webhost has a backup of the database (I've been in contact with them the last couple days).
What tools can I use to check for and repair errors on that table?

Related

Current session is no longer available due to structural changes in the database - Tabular

We are using a SQL Server Tabular model for self-service BI purposes. On a monthly basis, about 90 distinct people use the model. Recently we encountered some issues/errors in the client tools (Excel and Power BI) that connect to the Tabular model. See screenshots. We did not make any significant changes to the model in the past period.
We noticed that the errors keep showing up after our incremental load, i.e. a full process of a number of partitions, which we run every 15 minutes. The process is kicked off by an SSIS job that is scheduled every 15 minutes and processes 5 partitions in 3 tables.
Edit: After some research I figured out that the problem lies in the perspectives. Every time I do a full process on any object, the error appears. This does not happen on the default model view. I still haven't found a solution, though.
The error occurs when you make a change to the Power BI report or the Excel file, for example when you do a refresh or click a filter. If you press refresh multiple times, the connection comes back and everything works as it is supposed to. It seems like the clients lose their connection to the model. After 15 minutes the problem occurs again.
This is very aggravating for the users, especially when they are in the middle of a presentation.
This is what we tried:
We tried searching Google for a solution
Checked that we have the latest SQL Server 2016 update (13.0.5149.0)
Tried SSAS builds from Visual Studio (2015 and 2017)
No full process on tables, only on partitions.
Upgrading the server from 4 to 8 cpu cores.
I hope somebody can help us.
You shouldn't get the error that you are seeing from just a full process of a partition, or even of the full table. We do this every hour for a number of core tables and we do not see any issues like this (and we would notice if they occurred).
I am starting from the hypothesis that
Your 15 minute process is doing more than just processing the partitions with a refresh command
Something else is happening on the environment (either scheduled or not). Who has permissions to change the schema? Could it be users / developers deliberately or not making changes?
The only things that should cause that kind of error would be Alter, Delete or CreateOrReplace TMSL commands
So unless that triggers your own ideas on a diagnostic process, I would do the following steps.
Note: I presume that your users also see this issue on your test environment when you run your 15 minute processing routine there. You should do the following on that test environment, where nothing else is running, to eliminate the possibility of someone else interfering with the experiment. If you don't have a representative test environment then you will have to do this on live, but I would do it out of hours or under some kind of change control process, with your 15 minute refresh turned off and admin permissions to the cube heavily locked down, to ensure that nothing can interfere with your experiment.
First prove that you can reproduce this issue with the 15 minute routine
Get your sample Power BI report that is known to present the error (I'd prefer Power BI for a repro as it is slightly simpler than Excel)
Refresh your Power BI report and explore the data to prove that the error doesn't occur
Run your 15 minute process
You should now see the problem reported. If you do, great, you have a reproducible issue! If you don't, then it is not quite as you thought, and you need to find a way of reliably reproducing these errors (perhaps something else is happening that isn't the 15 minute process).
So now you are sure how you can reproduce the issue, you need to isolate whether it is really the processing that is causing the problem
Refresh your Power BI report and explore the data to prove that the error doesn't occur
Execute (via SSMS) a TMSL command that fully processes the entire database
It should look something like this:
{
  "refresh": {
    "type": "full",
    "objects": [
      {
        "database": "yourdbname"
      }
    ]
  }
}
Do the thing that your users do when they see the issue.
If you see the issue too, then I would raise it with Microsoft Support, as this shouldn't happen
If you don't see the issue, then you can refine this processing to just the partition for a single table. But as we have done a process for the entire db above, it shouldn't change the result
If you still don't see the issue, then it isn't the processing that is causing this issue (which is what I suspect) and it is something else in the 15 minute routine that is causing it. Look deeper into that process and understand what else it is doing.
Alongside this, checking the logs should show whether there are any other processing tasks or types of XMLA happening.
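To narrow the refresh from the whole database down to a single table or partition, the TMSL object reference takes extra "table" and "partition" keys. A minimal Python sketch that builds such a command is below; the function name and the yourtable / yourpartition values are placeholders invented for illustration, not names from the original setup. The printed JSON is what you would paste into an XMLA query window in SSMS.

```python
import json


def build_refresh_command(database, table=None, partition=None, refresh_type="full"):
    """Return a TMSL refresh command as a JSON string.

    With only `database` set, the whole database is refreshed; adding
    `table` (and optionally `partition`) narrows the scope.
    """
    if partition is not None and table is None:
        raise ValueError("a partition can only be named together with its table")

    # Build the object reference from the widest scope to the narrowest.
    target = {"database": database}
    if table is not None:
        target["table"] = table
    if partition is not None:
        target["partition"] = partition

    command = {"refresh": {"type": refresh_type, "objects": [target]}}
    return json.dumps(command, indent=2)


# Placeholder names: substitute your own database, table, and partition.
print(build_refresh_command("yourdbname", "yourtable", "yourpartition"))
```

Running the database-wide, table-level, and partition-level variants one at a time makes it easier to see which scope (if any) triggers the error for connected clients.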
I hope these ideas get you closer to finding the actual activity that is causing this experience for your users. It would be great if you could post with how you got on and what you found.
I have the same problem here if I install the latest CU on my SQL Server 2017. My production environment is still running with CU3 (Jan/2018) due to this problem.
Knowing that, I would suggest reverting your installation to a previous release, maybe 13.0.5026.0 (SP2) or even 13.0.4466.4 (Jan/2018).
I am facing the same issue with SQL Server 2017 CU 11 installed.
The issue indeed occurs in case of a 'full refresh' in combination with the use of a 'perspective' in an existing connection. The workaround to use the default 'Model' in the connection does indeed 'solve' the issue.

SQL Server Locking

I have an application that connects with a SQL Server database and cycles through batches of records to perform various tasks and then updates the database accordingly (i.e. "success", "error", etc...).
The potential problem I'm running into is that since it takes roughly a minute or so to get through each record (long story), if I have more than one user running the application there's a high chance of "data collisions", i.e. users trying to process the same records at the same time, which cannot happen if the application is to execute properly.
Initially, I thought about adding a LOCKED column to help the application determine if the record was already opened by another user, however if the app were to crash or to be exited without completing the record it was currently on, then it would show that record as opened by another user indefinitely... right? Or am I missing an easy solution here?
Anyway, what would be ideal is if it were possible to have the application SELECT 100 records at a time, and "lock them out" on the database while the application processes them AND so that other users can run the application and SELECT a different set of 100 so as not to overlap. Is that possible? I've tried to do some research on the matter, but to be honest my experience in SQL Server is very limited. Thanks for any and all help!
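One possible shape for this is a claim with a lease: each batch is marked with who claimed it and when the claim expires, so a crashed application cannot hold records forever. The sketch below uses Python with SQLite purely as a stand-in so it can run anywhere; the table, the column names, and the 300-second LEASE_SECONDS value are all assumptions made for illustration. On SQL Server the same idea would more likely be written as a single UPDATE TOP (100) ... OUTPUT statement with a READPAST hint, which is not shown here.

```python
import sqlite3
import time
import uuid

LEASE_SECONDS = 300  # assumed: comfortably longer than one batch takes


def claim_batch(conn, worker, batch_size=100, now=None):
    """Claim up to batch_size pending rows for `worker` and return their ids.

    A row is claimable if nobody holds it or if its lease has expired,
    so a crashed worker cannot block a record indefinitely.
    """
    now = time.time() if now is None else now
    token = uuid.uuid4().hex  # identifies exactly the rows claimed by this call
    with conn:  # commits on success, rolls back on error
        conn.execute(
            """
            UPDATE records
               SET claimed_by = ?, claim_token = ?, lease_expires = ?
             WHERE id IN (SELECT id FROM records
                           WHERE status = 'pending'
                             AND (claimed_by IS NULL OR lease_expires < ?)
                           ORDER BY id
                           LIMIT ?)
            """,
            (worker, token, now + LEASE_SECONDS, now, batch_size),
        )
        rows = conn.execute(
            "SELECT id FROM records WHERE claim_token = ? ORDER BY id", (token,)
        ).fetchall()
    return [row[0] for row in rows]


def finish(conn, record_id, outcome):
    """Record the outcome ("success", "error", ...) and release the claim."""
    with conn:
        conn.execute(
            "UPDATE records SET status = ?, claimed_by = NULL, claim_token = NULL "
            "WHERE id = ?",
            (outcome, record_id),
        )


# Demo with a hypothetical table of 250 pending records.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE records (
           id INTEGER PRIMARY KEY,
           status TEXT NOT NULL DEFAULT 'pending',
           claimed_by TEXT,
           claim_token TEXT,
           lease_expires REAL)"""
)
conn.executemany("INSERT INTO records (id) VALUES (?)", [(i,) for i in range(1, 251)])
conn.commit()

first = claim_batch(conn, "alice", now=1000.0)
second = claim_batch(conn, "bob", now=1000.0)
print(first[0], first[-1], second[0], second[-1])
```

Because the claim is a single UPDATE, two users cannot end up with overlapping batches, and a record left behind by a crash becomes claimable again once its lease runs out.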

RavenDB taking forever to show updates

I'm starting to assess our company using RavenDB for storing some stuff that doesn't really belong in a relational database (we're traditionally a SQL Server shop). I installed RavenDB locally on my machine, created a database, added a document. Nice!
Being a DBA, I decided to see how backups/restores work. I backed up my database, deleted it, then restored it from the backup. After refreshing my admin screen, I saw my database. I clicked on it, and got a message that the database doesn't exist.
After a couple of hours, I tried again. Still doesn't exist. A full day later, I walk into work and try again. This time the database works. I've had similar situations with updating documents. An update seems to take anywhere from 1 second to several hours to show up...
Is this normal for RavenDB?? Am I completely misconfigured?? I run SQL Server on my local machine and it's lightning-fast, so I can't imagine updating a single document could take that long. As-is, I can't imagine recommending we use RavenDB for anything.
Are you querying using indexes or getting documents by ID? Documents should be updated immediately (ACID). If indexes are slow to update (check their status using RavenDB Studio), it could be a configuration problem, or something external like antivirus software could be causing them to update slowly.
Apparently, at least for the document-update latency, the default for caching in queries is enabled, so I was getting cached results.
Jeffery,
No, that isn't normal by a long shot. You should be able to immediately see what was changed.
Note that certain AV products will interfere with the HTTP pipeline and can affect RavenDB's usage. The studio will also auto update things only every 5 seconds (to reduce UI jitter), but that is about it.
Restoring a database (from the same machine) should take only as long as it takes to copy the files (a pure I/O bound operation).
If this is from another machine using a different version of Windows, we might need to run a check on the file, which can take a bit of time, but that doesn't sound like your scenario.

SQL timeouts at 3 minutes past full hour

I have a problem that has puzzled me for a while now. Once in a while, say 4-5 times a week, we get timeouts from the database at HH:03 (or sometimes HH:02, I think).
I've been digging into the scheduled tasks on the server to see whether something brings the server to its knees performance-wise, but found nothing.
I've also gone so far as to make a watchdog for the application, so that when the query has only 1 second left of its maximum query time, it checks the process list for the database and emails it to me. The process list always contains just one entry, and that's the entry that is about to get a timeout exception.
To further add to the complexity, we have many customers on this application, but only one of them gets this timeout. All customers run the same code but have different databases and different application pools with different application pool identities.
The application is an ASP.NET application. The database is Microsoft SQL Server 2008 R2 Express Edition.
Has anyone heard of something like this? Can anyone give me any pointers about what to investigate in order to resolve this issue?
Kind regards

Query causes mysql server to go away

We have an application that has been deployed to 50+ websites. Across these sites we have noticed a piece of strange behaviour, which we have now tracked to one specific query. Very occasionally, usually once or twice a day, one of our debugging scripts reports
2006 : MySQL server has gone away
I know there are a number of reasons this error can be thrown but the thing that is most strange is that every single time it is thrown it happens from the same SQL query being run. There is nothing strange or complex about this query, it looks like this:
SELECT `advert_only` FROM `products` WHERE `id` = '6197'
This query must run tens of thousands of times a day, for various different product IDs, so it certainly doesn't fail each time. It fails randomly on seemingly random sites across our 4 servers. There is seemingly no commonality. One small thing we have noticed is that it will sometimes happen on 2 or 3 page loads in a row for 1 specific person, as we also track the IP of the person it has happened to.
This is on CentOS 5 servers running MySQL 5.0.81
This is kind of out of left field, but you should check your hard disk's SMART data for any errors. If there are issues reading from "that" sector then there may be issues. If you have a RAID unit I wouldn't worry too much about this. I wouldn't give a high probability to this being the problem, but if you are really stumped then it might be worth it.
On http://bugs.mysql.com/bug.php?id=1011 the second comment says that: "the 'MySQL server has gone away' error is caused by a query longer than the max_allowed_packet."
There is some more information on fixing it here: http://bogdan.org.ua/2008/12/25/how-to-fix-mysql-server-has-gone-away-error-2006.html
That can mean that the SQL connection was idle for too long (past the server's wait_timeout). Check whether some slow operations are performed before your SQL query.
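If the idle-connection explanation turns out to apply, one common mitigation is to reconnect and retry once when error 2006 is raised. The Python sketch below shows only that control flow: DatabaseError, ReconnectingClient and FakeConnection are hypothetical stand-ins written for this example, not part of any real MySQL driver, and a real driver will expose the error code in its own way.

```python
MYSQL_SERVER_GONE = 2006  # "MySQL server has gone away"


class DatabaseError(Exception):
    """Stand-in for a driver exception that carries a MySQL error code."""

    def __init__(self, code, message):
        super().__init__(code, message)
        self.code = code


class ReconnectingClient:
    """Holds one connection and replaces it when the server has gone away."""

    def __init__(self, connect):
        self._connect = connect  # callable that returns a fresh connection
        self._conn = connect()

    def run(self, query_fn, retries=1):
        """Call query_fn(conn); on error 2006, reconnect and try again."""
        for attempt in range(retries + 1):
            try:
                return query_fn(self._conn)
            except DatabaseError as exc:
                # Anything other than 2006, or running out of retries, is fatal.
                if exc.code != MYSQL_SERVER_GONE or attempt == retries:
                    raise
                self._conn = self._connect()  # old connection is dead


class FakeConnection:
    """Test double: the first connection opened is stale, later ones work."""

    opened = 0

    def __init__(self):
        FakeConnection.opened += 1
        self.stale = FakeConnection.opened == 1

    def query(self, sql):
        if self.stale:
            raise DatabaseError(MYSQL_SERVER_GONE, "MySQL server has gone away")
        return "ok: " + sql


client = ReconnectingClient(FakeConnection)
print(client.run(lambda conn: conn.query("SELECT 1")))
```

A retry like this hides the symptom rather than the cause, so it is still worth logging each reconnect to see how long the connection sat idle before the failing query.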