Where are the query results written to during runtime - SSMS

I installed SSMS on my Windows 10 laptop for school, and while I was on a break my mates thought it would be fun to try and crash it. They did this by executing a cartesian product of the database:
select * from CAMPAIGN, COUNTRY, INVENTORY_LEVELS, ORDER_DETAILS,
ORDER_HEADER, ORDER_METHOD, PRODUCT, PRODUCT_FORECAST, PRODUCT_LINE,
PRODUCT_TYPE, PROMOTION, RETAILER, RETAILER_SITE, RETAILER_TYPE,
RETURN_REASON, RETURNED_ITEM, SALES_BRANCH, SALES_STAFF, SALES_TARGET
As I thought it would be fun to see how long it would take my PC to come up with an answer, I left it running.
33 million rows (and about 20 minutes) later it stopped with the message "insufficient disk space available". After forcing SSMS to close, I checked my files, only to see that my C:\ drive had 0 bytes left. When I opened SSMS again it asked whether I wanted to open the last recovered version or delete it. Naturally I clicked delete, but that only gave me back 4 GB on my C:\, which previously had around 32 GB free.
So my question: where is the output saved on my PC? I've been looking for over an hour and I can't seem to find it.

https://dba.stackexchange.com/questions/21895/system-disk-run-out-of-space-when-running-heavy-sql-queries-on-sql-server-2012
Found it a couple of seconds after posting this, so quick answer:
C:\Users\Vecro\AppData\Local\Temp
It turned out to be a 31.7 million KB file (see picture).
I guess I learned my lesson: always, ALWAYS lock your PC when you leave it with your friends.
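For what it's worth, a hedged sketch of how to estimate the size of a cross join before running it (using just two of the tables from the query above as an example); multiplying the row counts of all 19 tables is what made this blow up:

-- Estimate the result size of a cross join by multiplying row counts,
-- instead of materializing the product. COUNT_BIG avoids int overflow.
SELECT (SELECT COUNT_BIG(*) FROM CAMPAIGN)
     * (SELECT COUNT_BIG(*) FROM COUNTRY) AS rows_if_cross_joined;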

Related

Problems with TempDb on the SQL Server

I have some problems with my SQL Server. Some external queries write into tempdb, and every 2-3 days it is full and we have to restart the SQL Server. I have Who Is Active running on it, and we can also monitor it over Grafana, so I get the exact time when a query starts to write a lot of data into tempdb. Can someone give me a tip on how I can find the responsible user when I have the exact time?
select top 40 User_Account, start_date, tempdb_allocations
from Whoisactive
where start_date between '15-02-2023 14:12:14.13' and '15-02-2023 15:12:14.13'
order by tempdb_allocations desc
User_Account  Start_Date               tempdb_allocations
kkarla1       15-02-2023 14:12:14.13   12
bbert2        11-02-2023 12:12:14.13   0
ubert5        15-02-2023 15:12:14.13   888889
I would add this as a comment but I don’t have the necessary reputation points.
At any rate - you might find this helpful.
https://dba.stackexchange.com/questions/182596/temp-tables-in-tempdb-are-not-cleaned-up-by-the-system
It isn’t without its own drawbacks but I think that if the alternative is restarting the server every 2 or 3 days this may be good enough.
It might also be helpful if you add some more details about the jobs that are blowing up your tempdb.
Is this problematic job calling your database once a day? Once a minute? More?
I ask because if it’s more like once a day then I think the answer in the link is more likely to be helpful.
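If it also helps to see who is consuming tempdb right now (as opposed to searching the logged Who Is Active data after the fact), here is a minimal sketch against the standard DMVs, assuming you have VIEW SERVER STATE permission:

-- Sessions currently holding the most tempdb pages, with the login that owns them.
SELECT TOP (40)
       es.session_id,
       es.login_name,
       es.program_name,
       su.user_objects_alloc_page_count + su.internal_objects_alloc_page_count AS pages_allocated
FROM sys.dm_db_session_space_usage AS su
JOIN sys.dm_exec_sessions AS es ON es.session_id = su.session_id
WHERE es.is_user_process = 1
ORDER BY pages_allocated DESC;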

Finding statistical outliers in timestamp intervals with SQL Server

We have a bunch of devices in the field (various customer sites) that "call home" at regular intervals, configurable at the device but defaulting to 4 hours.
I have a view in SQL Server that displays the following information in descending chronological order:
DeviceInstanceId uniqueidentifier not null
AccountId int not null
CheckinTimestamp datetimeoffset(7) not null
SoftwareVersion string not null
Each time the device checks in, it will report its id and current software version which we store in a SQL Server db.
Some of these devices are in places with flaky network connectivity, which obviously prevents them from operating properly. There are also a bunch in datacenters where administrators regularly forget about it and change firewall/proxy settings, accidentally preventing outbound communication for the device. We need to proactively identify this bad connectivity so we can start investigating the issue before finding out from an unhappy customer... because even if the problem is 99% certainly on their end, they tend to feel (and as far as we are concerned, correctly) that we should know about it and be bringing it to their attention rather than vice-versa.
I am trying to come up with a way to query all distinct DeviceInstanceId that have currently not checked in for a period of 150% their normal check-in interval. For example, let's say device 87C92D22-6C31-4091-8985-AA6877AD9B40 has, for the last 1000 checkins, checked in every 4 hours or so (give or take a few seconds)... but the last time it checked in was just a little over 6 hours ago now. This is information I would like to highlight for immediate review, along with device E117C276-9DF8-431F-A1D2-7EB7812A8350 which normally checks in every 2 hours, but it's been a little over 3 hours since the last check-in.
It seems relatively straightforward to brute-force this: looping through all the devices, examining the average interval between check-ins, seeing what the last check-in was, comparing that to the current time, etc. But there are thousands of these, and the device count grows larger every day. I need an efficient query to quickly generate this list of uncommunicative devices at least every hour... I just can't picture how to write that query.
Can someone help me with this? Maybe point me in the right direction? Thanks.
I am trying to come up with a way to query all distinct DeviceInstanceId that have currently not checked in for a period of 150% their normal check-in interval.
I think you can do:
select *
from (
    select DeviceInstanceId,
           datediff(second, min(CheckinTimestamp), max(CheckinTimestamp)) / nullif(count(*) - 1, 0) as avg_secs,
           max(CheckinTimestamp) as max_CheckinTimestamp
    from t
    group by DeviceInstanceId
) t
where max_CheckinTimestamp < dateadd(second, - avg_secs * 1.5, getdate());
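If it helps, a hedged variation of the same idea that also shows how far overdue each device is; it assumes the view is called DeviceCheckins (swap in your actual view name) and compares against SYSDATETIMEOFFSET() because CheckinTimestamp is a datetimeoffset(7):

select d.DeviceInstanceId,
       d.avg_secs,
       d.last_checkin,
       datediff(minute, d.last_checkin, sysdatetimeoffset()) / 60.0 as hours_since_last_checkin
from (select DeviceInstanceId,
             -- average seconds between check-ins = total span / number of gaps
             datediff(second, min(CheckinTimestamp), max(CheckinTimestamp)) / nullif(count(*) - 1, 0) as avg_secs,
             max(CheckinTimestamp) as last_checkin
      from DeviceCheckins   -- hypothetical view name
      group by DeviceInstanceId) as d
where d.last_checkin < dateadd(second, -d.avg_secs * 1.5, sysdatetimeoffset());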

R: openxlsx and sqldf

I have a question about using R to read in a file from Excel. I am reading in a few tabs from an Excel worksheet and performing some basic sql commands and merging them using sqldf. My problem is my RAM gets bogged down a lot after reading in the Excel data. I can run the program but had to install 8GB of RAM to not use like 80% of my available RAM.
I know if I have a text file, I can read it in directly using read.csv.sql() and performing the sql in the "read" command so my RAM doesn't get bogged down. I also know you can save the table as a tempfile() so it doesn't take up RAM space. The summarized data using sqldf does not have very many rows so does not bog the memory down.
The only solution I've been able to come up with is to set up an R program that just reads in the data and creates text files, then close R down and run a second program that reads the data back in from the text files using sqldf, performs the SQL commands, and merges the data. I don't like this solution as much because it still involves using a lot of RAM in the initial read-in program, and it uses two programs where I would like to use just one.
I could also manually create the text files from the Excel tabs, but there are some updates being made on a regular basis at the moment, so I'd rather not have to do that. Also, I'd like something more automated to create the text files.
For reference, the tables are of the following sizes:
3k rows x 9 columns
200K x 20
4k x 16
80k x 13
100K x 12
My read-in's look like this:
table<-read.xlsx(filename, sheet="Sheet")
summary<-sqldf("SQL code")
rm(table)
gc()
I have tried running the rm(table) and gc() commands after each read-in and sql manipulation (after which I no longer need the entire table) but these commands do not seem to free up much RAM. Only by closing the R session do I get the 1-2 GB back.
So is there any way to read an Excel file into R without taking up RAM in the process? I also want to note this is on a work computer for which I do not have admin rights, so anything I'd want to install that requires such rights would have to be requested from IT, which is a barrier I'd like to avoid.

Out of memory SQL execution

I have the following script:
SELECT
DEPT.F03 AS F03, DEPT.F238 AS F238, SDP.F04 AS F04, SDP.F1022 AS F1022,
CAT.F17 AS F17, CAT.F1023 AS F1023, CAT.F1946 AS F1946
FROM
DEPT_TAB DEPT
LEFT OUTER JOIN
SDP_TAB SDP ON SDP.F03 = DEPT.F03,
CAT_TAB CAT
ORDER BY
DEPT.F03
The tables are huge. When I execute the script in SQL Server directly it takes around 4 minutes to execute, but when I run it in the third-party program (SMS LOC, based on Delphi) it gives me the error
<msg> out of memory</msg> <sql> the code </sql>
Is there any way I can lighten the script so it can be executed? Or has anyone had the same problem and solved it somehow?
I remember having had to resort to the ROBUST PLAN query hint once on a query where the query-optimizer kind of lost track and tried to work it out in a way that the hardware couldn't handle.
=> http://technet.microsoft.com/en-us/library/ms181714.aspx
But I'm not sure I understand why it would work for one 'technology' and not another.
Then again, the error message might not be from SQL but rather from the 3rd-party program that gathers the output and does so in a 'less than ideal' way.
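For reference, a hedged sketch of what applying that hint to the query from the question would look like; the hint only changes how the optimizer builds the plan, not the results:

SELECT
    DEPT.F03 AS F03, DEPT.F238 AS F238, SDP.F04 AS F04, SDP.F1022 AS F1022,
    CAT.F17 AS F17, CAT.F1023 AS F1023, CAT.F1946 AS F1946
FROM
    DEPT_TAB DEPT
LEFT OUTER JOIN
    SDP_TAB SDP ON SDP.F03 = DEPT.F03,
    CAT_TAB CAT
ORDER BY
    DEPT.F03
OPTION (ROBUST PLAN);   -- query hint documented in the link above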
Consider adding paging to the user edit screen and the underlying data call. The point being that you don't need to see all the rows at one time, but they are available to the user upon request.
This will alleviate much of your performance problem.
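A hedged sketch of what that paging could look like on the SQL side, assuming hypothetical @PageNumber and @PageSize parameters passed from the edit screen (OFFSET/FETCH needs SQL Server 2012 or later):

DECLARE @PageNumber int = 1, @PageSize int = 500;   -- hypothetical paging parameters

SELECT
    DEPT.F03, DEPT.F238, SDP.F04, SDP.F1022,
    CAT.F17, CAT.F1023, CAT.F1946
FROM
    DEPT_TAB DEPT
LEFT OUTER JOIN
    SDP_TAB SDP ON SDP.F03 = DEPT.F03,
    CAT_TAB CAT
ORDER BY
    DEPT.F03
OFFSET (@PageNumber - 1) * @PageSize ROWS
FETCH NEXT @PageSize ROWS ONLY;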
I had a project where I had to add over 7 million individual lines of T-SQL code via batch (I couldn't figure out how to programmatically leverage the new SEQUENCE command). The problem was that there was a limited amount of memory available on my VM (I was allocated the maximum amount of memory for that VM). Because of the large number of lines of T-SQL code, I first had to test how many lines it could take before the server crashed. For whatever reason, SQL Server (2012) doesn't release the memory it uses for large batch jobs such as mine (we're talking around 12 GB of memory), so I had to reboot the server every million or so lines. This is what you may have to do if resources are limited for your project.

SSRS - copying report to a new folder increases speed 10x

I've got a SQL 2008R2 report that runs 12,000 times a month. It averages 60-90 seconds per execution.
I've been using SQL for 12 years, but I just started this job 2-3 weeks ago, and am still trying to get my head around some of these SSRS performance problems. It goes without saying I've been re-indexing everything in order to help this report.
Here is a picture / dump of my execution log:
SELECT ReportPath, TimeDataRetrieval, TimeProcessing, TimeRendering, Source, [RowCount]
FROM ReportServer.dbo.ExecutionLog2
WHERE UserName = '_________' AND ReportAction = 'Render'
ORDER BY timeStart desc
http://accessadp.com/?attachment_id=562
ReportPath                       TimeDataRetrieval  TimeProcessing  TimeRendering  Source  RowCount
/CubeReports/Freight Allocation  2954               4402            2039           Live    2348
/RS Reports/Freight Allocation   39954              4087            2380           Live    2348
/RS Reports/Freight Allocation   37718              3948            1888           Live    2348
/RS Reports/Freight Allocation   39534              4317            1937           Live    2348
/CubeReports/Freight Allocation  3257               4206            2422           Live    2348
/RS Reports/Freight Allocation   37517              4164            2402           Live    2348
/RS Reports/Freight Allocation   36127              4151            1986           Live    2348
/RS Reports/Freight Allocation   36415              39888           2569           Live    19048
/RS Reports/Freight Allocation   37544              41644           2071           Live    19048
/RS Reports/Freight Allocation   37970              41003           2187           Live    19048
/RS Reports/Freight Allocation   38057              48085           1885           Live    19048
/CubeReports/Freight Allocation  3030               4558            2056           Live    2348
/CubeReports/Freight Allocation  3534               5232            2422           Live    2348
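As a hedged side note, averaging the timing columns per report path makes the gap between the two folders easier to see at a glance:

-- Average the ExecutionLog2 timing components per report path.
SELECT ReportPath,
       COUNT(*)               AS Executions,
       AVG(TimeDataRetrieval) AS AvgDataRetrieval,
       AVG(TimeProcessing)    AS AvgProcessing,
       AVG(TimeRendering)     AS AvgRendering
FROM ReportServer.dbo.ExecutionLog2
WHERE ReportAction = 'Render'
  AND ReportPath LIKE '%/Freight Allocation'
GROUP BY ReportPath;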
Please note, I do believe I know what the difference in 'RowCount' is. I had a subreport that was running a dataset (that wasn't important) and I removed it.
I thought that this was the reason for the increase in performance... but I've double-checked and triple-checked that the subreports no longer have the other dataset (and this is reflected in the decrease in RowCount). Unfortunately, that didn't translate into a decrease in processing time.
I downloaded the report from 'RS Reports', and deployed it to 'CubeReports'.. and I didn't change anything else on this version of the report.
I run it with the same parameters.. and now the copy of the report 'CubeReports' literally runs 10x faster.
I just can't figure out WHY this is happening?
I REALLY need to find the solution and move it into production.
I've checked snapshots, history, and execution caching... none of that is turned on; it all looks like the default settings for both reports. I've checked all the other options, and I just can't find anything that would explain this.
The only three options I see:
1. Report Builder 3.0 isn't 'compiling the report' as well as BIDS does.
2. Having 3-4 people running the primary report at the same time I'm doing the test is causing this problem. (We have 300 employees; I really can't test anywhere else, because people run this all day every day.)
3. Dropping the report and re-deploying the report, and crossing my fingers that this is going to make it run 10x faster.
Unfortunately, I've been able to duplicate the 10x speed increase consistently; I've run it about 10 times with the same parameters, with the same result each time. Keep in mind, there is only 1 SSRS server, going against 1 database server. Same sprocs, same parameters.
10x worse performance in the production copy of this report.
10x better performance when I copy it to a new folder.
Primary ERP database is ~100gb, only 4 cores, only 16gb RAM. SSRS Server is on a VM, it is only 2 cores, only 8gb RAM.
There is one additional database that lives on the SSRS Server; it's actually a fairly large database- but not a TON of activity. The other database (Bartender) is only 9gb data / 3gb log.
You might have a problem with the two-word folder name. Try it in another folder with a space in its name and check there.
Just my 2 cents... but I lived to see the holy "i" declared global.
I encountered a similar performance problem recently. Using SQL Server Profiler, I tracked it down to the exact same query executing. It would sometimes cause ~1000 times the reads as other queries. The differences appeared to be whether the query was called as SQL or through RPC.
Digging into this further, and by some trial and error, I found the key difference in my case was that the option for ARITHABORT was set differently for the different connections or users.
Unfortunately, I don't remember which setting was the fast one in my case. I wasn't getting a failed query, but the state of this option caused different execution plans to be used. Placing the statement SET ARITHABORT ON or SET ARITHABORT OFF at the beginning of my query brought everything into alignment. ARITHIGNORE and ANSI_WARNINGS are similar settings, so you might look at those as well.
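A hedged way to verify that theory here: sys.dm_exec_sessions exposes the SET options per connection, so you can compare the SSRS connection's settings with your own SSMS session's:

-- Compare SET options (arithabort etc.) across current user connections.
SELECT session_id, login_name, program_name,
       arithabort, ansi_warnings, quoted_identifier
FROM sys.dm_exec_sessions
WHERE is_user_process = 1;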
I've discovered that the report that runs faster says "Credentials used to run this report are not stored" and the report that runs slow says "Default report parameter values are missing". I'm going to go back and double-check my defaults on the parameters.
I DO have a discrepancy in the default parameters. I'm probably going to go forward with dropping the report and re-deploying it.