How data will load in to memory in sql server - sql

I am a beginner in sql server.I have read about buffer cache in sql server.I think it is the location in system RAM where the data will be stored once the query is executed.If that is correct,i have a few questions about query execution.
1)If my RAM size is 2GB, and i have data in my sql server of 10GB size and if i execute the sqlstatment for retreiving all the data(10 GB) from database,what will happen(Work/Not work)?
2)In the same case, if multiple user execute queries for retrieving 5 GB each(total 10 GB),what will happen?

When you issue a SELECT * FROM [MyTable] and your table has 10Gb on a system that has only 2Gb of RAM a database does not have to read the entire 10 Gb at once in memory. The select will start a scan of the data, starting with the first data page. For this it only needs that page (which is 8Kb) in memory, so it reads the page and consumes 8Kb of RAM. As it scans this page, it produces output that you see as the result set. As soon as is done with this page, it needs the next page so it read is in memory. It scans the records in it and produces output for your result. Then next page, then next page. The key point is that once is done with a page, its no longer needed. As it keeps adding these 8kb pages into RAM, they will eventually add up and consume all free RAM. In that moment SQL will free the old, unused, pages in RAM and thus make room for new ones. It will keep doing so until the entire 10Gb of your table are read.
If there are two users reading a table of 5GB each, things work exactly the same. Each user's query is scanning only one page at a time, and as they make progress and keep reading pages, they will fill up the RAM. When all the available RAM is used, SQL will start discarding old pages from RAM to make room for new ones.
In the real world things are fancier because of considerations like read-ahead.
And as a side note, you should never scan 10Gb of data. Your application should always request only the data it needs, and the data should be retrievable fast by using an index, exactly to avoid such large scan that needs to examine the entire table.

Thankfully you don't have to worry about this. Sure, it is important to tune your queries, minimize resultsets for network transfer, etc. etc. but SQL Server has been around a long time and it's very good at its own memory management. Unless you encounter a specific query that misbehaves, I'd say don't worry about it.

As you noted, data retrieved goes into the buffer cache. Some of that is real memory, some is spooled off to disk.
You can observe whether or not you're reusing memory by watching the Page Life Expectancy performance counter (there other indicators too, but this one is a quick shorthand). If you run a query that returns huge amounts of data and pushes other data out of the cache, you can watch the page life expectancy drop. As you run lots & lots of small queries, you can see the page life expectancy grow, sometimes to a lenght of days & days.
It's not a great measure for memory pressure on the system, but it can give you an indication of how well you're seeing the data in cache get reused.

the full resultset of a query is not stored on the RAM by sql server for reusing. What can be stored is the execution plan used on a query.
SQL Server stores data to manipulate it, like on an update statement, it reads the data from the DB, stores on the RAM, edit it and then another process writes it back to the DB. BUt as #n8wrl said, you dont need to worry about it.

Related

Is there any logic in just maxing tempdb and never having it change size?

The reason I ask it we have a dedicated RAID10 array with ~150GB for the tempdb (the "t" drive). It is only used for storing tempdb. The t drive isn't used by by SQL Server or any other process for anything else.
Our DBA has tempdb setup with 15GB initial size and autogrow 20% increments. Everytime the server starts it resized to 15GB and then over the course of the day grows to ~80GB (on average). Now IT is looking into making initial size larger say 30 or 40GB but given the drive is ONLY used for tempdb my thinking is why not "max it" right away.
Is the any negative effect to simply create 4 data files in the primary group for tempdb give them each an initial size of 30GB (120GB total), turn autogrow off and be done with it?
Are there any limits on SQL Server ability to span multiple tempdb data files in one query? i.e. will it cause problems if the tempdb has say 70GB total free but the file used by one process is full (30 of 30GB used)?
I would size them to about 100GB and leave autogrow on, this way you don't have to wait for it to grow every time, I would also add multiple files
Is the any negative effect to simply
create 4 data files in the primary
group for tempdb give them each an
initial size of 30GB, turn autogrow
off and be done with it?
Sounds like a good plan to me, however I would leave autogrow on just in case someone decides to do a sort operation on a big table which doesn't have an index on that column
See also here: http://technet.microsoft.com/en-us/library/cc966534.aspx
It is recommended to have .25 to 1
data files (per filegroup) for each
CPU on the host server.
This is especially true for TEMPDB
where the recommendation is 1 data
file per CPU.
Dual core counts as 2 CPUs; logical
procs (hyperthreading) do not.
We have found it very useful to create large TempDB data and log files. Any actions that limit server OS activities such as resizing TempDB increase server efficiencies. We have a 16 processor machine with 113 GB dedicated to TempDB data space. This machine is dedicated to large SSIS ETL processes, thus resulting in mass data operations.
The bulk of our ETL operations spawn up to 4 SQL threads. After initially configuring a TempDB file for each processor (16), we quickly realized via performance monitoring that our configuration was forcing SQL\windows to unnecessarily span the multiple TempDB files. We settled on 5 larger TempDB data files and realized performance improvements. We have since moved on to a 24 processor box and are using 8 TempDB files.
Please note that this is a large data migration server; I’m sure transaction-oriented systems would still benefit from the recommended 1-1 processor to TempDB file configuration. It should also be noted that having a large increase % on a TempDB file may force a critical transaction to take the windows operation hit and thus may not be appropriate for your specific application.

SQL Server 2005 "Pin" data in Memory

We're running our application's database on dedicated box running only SQL Server 2005.
This DB server has 32 Gb of RAM... and the database file itself is only 6 Gb.
I'd like to force several of the heavily read/queried tables into the SQL Memory buffer to increase speed.
I understand that SQL server is really good about keeping necessary data cached in memory once it's read from disk... But our clients would probably prefer their query running quickly the FIRST time.
"Fastest Performance the Second Time" isn't exactly a product highlight.
Short of the old "Pin Table" DBCC command.. any thoughts?
I've written a "CacheTableToSQLMemory" Proc which Loops through all of a table's Indexes (Clustered & Non) , performing a "Select *" into a Temp table. I've scheduled SQL Agent to run a "cache lots of tables" Proc every 15 minutes in an attempt to keep pages in Memory.
It works to a large extent.. but even after I cache all of a query's relevant tables, running a query still increased the Count of Cached pages for that table. then it's faster the 2nd time.
thoughts?
We're running PAE & AWE. SQL is set to use between 8 & 20 GB of RAM.
The x86 bottleneck is your real issue. AWE can serve only data pages, as they can be mapped in and out of the AWE areas, but every other memory allocation has to cram in the 2GB of the process virtual address space. That would include every thread stack, all the code, all the data currently mappen 'in use' from AWE and, most importantly, every single cached plan, execution plan, cached security token, cached metadata and so on and so forth. and I'm not even counting CLR, I hope you don't use it.
Given that the system has 32GB of RAM, you can't even try /3GB and see if that helps, because of the total PAE reduction to 16GB in that case that would make half your RAM invisible...
You realy, really, have to move to x64. AWE can help only that much. You could collect performance counters from the Buffer Manager and Memory Manager objects and monitor sys.dm_os_memory_clerks so you could get a better picture of how is the instance memory behaving (where does the memory in use go, who is consuming it etc). I don't expect that will help you solve the issue really, but I do expect it will give you enough information to make a case for the upgrade to x64.
There is no way to pin tables in memory in SQL Server 2005. If SQL Server is dropping the tables from memory, it's because there is memory pressure from other parts of the system. Since your database is only 6GB, the database should stay in memory... provided that there are no other databases on the server.
There are a few things you can do to try to keep data in memory, though. Depending on the patch level and edition of your SQL Server installation, you might be able to make use of the lock pages in memory functionality to ensure that SQL Server's memory never gets paged out.
You can also change the memory allocation on the server to be a fixed size. Unless there's something else on your database server, you can set SQL Server's min and max memory to the same value. This won't necessarily prevent this from happening in the future (it's a function of how SQL Server is supposed to work) but it certainly won't hurt to set your SQL Server to use a fixed amount of memory (if you have no other memory concerns).

Is having multiple data/log files a good thing even on the same LUN?

I have read that it is a good idea to have one file per CPU/CPU Core so that SQL can more efficiently stream data to and from the disks. Ok, I can see the benefit if they are on different spindles, but what if I only have one spindle (4 drives in Raid 10) for my data files (.mdf and .ndf), will I still benefit from splitting the data files (from just the .mdf file to a .mdf and several .ndf files)? Same goes for the log file, although I see no benefit to it as the data has to be written serially and you're limited by the spindle's sequential write speed...
FYI, this is in regards to SQL Server 2005/2008...
Thanks.
The recommendation for multiple tempdb data files is definitely not about IOPS. It is about contention on the allocation pages (GAM, SGAM, PFS) in tempdb. SQL 2005+ doesn't require as big of a load on these pages, but contention still occurs. Not all system require a 1 file to 1 core mapping. Most sytems will perform well with 1 file to 2 or 4 cores. Having too many files adds overhead for managing the files. A good recommendation is to start with 1:4 or 1:2 and increasing if contention continues. Don't go above 1:1.
For other databases, this is not recommended.
And yes, only 1 log file ... always.
8 Steps to better Transaction Log throughput:
Create only ONE transaction log file.
Even though you can create multiple
transaction log files, you only need
one... SQL Server DOES not "stripe"
across multiple transaction log files.
Instead, SQL Server uses the
transaction log files sequentially.
Misconceptions around TF 1118:
Why is the trace flag not required so
much in 2005 and 2008? In SQL Server
2005, my team changed the allocation
system for tempdb to reduce the
possibility of contention. There is
now a cache of temp tables. When a new
temp table is created on a cold system
(just after startup) it uses the same
mechanism as for SQL 2000. When it is
dropped though, instead of all the
pages being deallocated completely,
one IAM page and one data page are
left allocated, and the temp table is
put into a special cache. Subsequent
temp table creations will look in the
cache to see if they can just grab a
pre-created temp table 'off the
shelf'. If so, this avoids accessing
the allocation bitmaps completely. The
temp table cache isn't huge (I think
it's 32 tables), but this can still
lead to a big drop in latch
contention in tempdb.
So the answer is NO to both questions. Log striping was never an issue, and one-NDF-per-CPU is largely a myth, one that will take a very long time to die out. Multiple files IMHO make sense only if you can stripe IO (separate LUNs). Multiple filegroups though make sense, but not for IO reasons, for administrative purposes: piecemeal restores and archive read-only filegroups.
Still good. This is not about IOPS - it is about SQL Server BLOCKING a file for certain operations. mostly when file extends are allocated to a table / index. If you do a lot of inserts / updates, multiple files basically mean another thread will block another file, not wait on the first one.
So, this is not really about IOPS loads, it is about a blocking behavior.

mysql slow on first query, then fast for related queries

I have been struggling with a problem that only happens when the database has been idle for a period of time for the data queried. The first query will be extremely slow, on the order of 30 seconds and then related queries will be fast like 0.1 seconds. I am assuming this is related to caching, but I have been unable to find the cause of it.
Changing the mysql variables tmp_table_size, max_heap_table_size to a larger size had no effect except to create the temp tables in memory.
I do not think this is related to the query itself as it is well indexed and after the first slow query, variants of the same query do not show up in the slow query log. I am most interested in trying to determine the cause of this or a way to reset the offending cache so I can troubleshoot the issue.
Pages of the innodb data files get cached in the innodb buffer pool. This is what you'd expect. Reading files is slow, even on good hard drives, especially random reads which is mostly what databases see.
It may be that your first query is doing some kind of table scan which pulls a lot of pages into the buffer pool, then accessing them is fast. Or something similar.
This is what I'd expect.
Ideally, use the same engine for all tables (exceptions: system tables, temporary tables (perhaps) and very small tables or short-lived ones). If you don't do this then they have to fight for ram.
Assuming all your tables are innodb, make the buffer pool use up to 75% of the server's physical ram (assuming you don't run too many other tasks on the machine).
Then you will be able to fit around 12G of your database into ram, so once it's "warmed up", the "most used" 12G of your database will be in ram, where accessing it is nice and fast.
Some users of mysql tend to "warm up" production servers following a restart by sending them queries copied from another machine for a while (these will be replication slaves) until they add them into their production pool. This avoids the extreme slowness seen while the cache is cold. For example, Youtube does this (or at least it used to; Google bought them and they may now use Google-fu)
MySQL Workbench:
The below isn't 100% related to this SO question, but the symptoms are very related and this is the first SO result when searching for "mysql workbench slow" or related terms, so hopefully it's useful for others.
Clear the query history! - following the process at MySql workbench query history ( last executed query / queries ) i.e. create / alter table, select, insert update queries to clear MySQL Workbench's query history really sped up the program for me.
In summary: change the Output pane to History Output, right click on a Date and select Delete All Logs.
The issue I was experiencing was "slow first query" in that it would take a few seconds to load the results even though the duration/fetch were well under 1 second. After clearing my query history, the duration/fetch times stayed the same (well under 1 second, so no DB behavior actually changed), but now the results loaded instantly rather than after a few second delay.
Is anything else running on your mysql server? My thought is that maybe after the first query, your table is still cached in memory. Once it's idle, another process is causing it to be de-cached. Just a guess though.
How much memory do you have any what else is running?
I had an SSIS package that was timing out. The query was very simple, from a single MySQL table, but it sometimes returned a lot of records and would sometimes take a few minutes initially to execute, then only a few milliseconds afterwards if I wanted to query it again. We were stuck with the ADO connection, which meant it would time out after 30 seconds, so about half the databases we were trying to load were failing.
After beating my head against the wall I tried performing an initial query first; very simple and only returning a few rows. Since it was so simple it executed fast and set the table in the cache for faster querying. In the next step of the package I would do the more complex query which returned the large data set that kept timing out. Problem solved - all tables loaded. I may start doing this on a regular basis, the complex queries execute much faster by doing a simple query first.
Ttry and compare the output of "vmstat 1" on the linux command line when running the query after a period of time, vs when you re-run it and get results fast. Specifically check the "bi" column (that's the kb read from disk per second).
You may find the operating system is caching the disk blocks in the fast case (and thus a low "bi" figure), but not in the slow case (and hence a large "bi" figure).
You might also find that vmstat shows high/low cpu usage in either case. If it's low when fast, and disk throughput is also low, then your system may still be returning a cached query, even though you've indicated the relevant config value is set to zero. Perhaps check the output of show engine innodb status and SHOW VARIABLES and confirm.
innodb_buffer_pool_size may also be set high (it should be...), which would cache the blocks even before the OS can return them.
You might also find that "key_buffer" is set high - this would cache the keys in the indexes, which could make your select blindingly fast.
Try check the mysql performance blog site for lots of useful info.
I had issue when MySQL 5.6 was slow on first query after idle period. This was a connection problem, not MySQL instance problem, e.g. if you run MYSQL Query Browser execute "select * from some_queue", leave it alone for a couple of hours, then execute any query, it runs slow, while at the same time processes on server or new instance of Browser will select from same tables instantly.
Adding skip-host-cache, skip-name-resolve to my.ini file solved this problem.
I don't know why is that. Why I tried this: MySQL 5.1 without those options was slowly establishing connections from other networks (e.g. server is 192.168.1.100, 192.168.1.101 connects fast, 192.168.2.100 connects slow), MySQL 5.6 didn't have such problem to start with so we didn't add these to my.ini initially.
UPD: Solved half the cases, actually. Setting wait_timeout to maximum integer fixed the other half. Maybe I even now can remove skip-host-cache, skip-name-resolve and it won't slow down in 100% of the cases

High PF Usage on SQL Server

We are running SQL Server 2005 on 64 bit. The PF Usage reaches close to 25 GB every week. Queries that normally take less than a second to run become very slow during this time. What could be causing this?
After running PerfMon, the two counters, Total Server Memory and Target Server Memory show 20 GB and 29 GB respectively. Processor Queue Length and Disk Queue Length are zero.
Sounds like you don't have enough memory, how much is on the Server? More than your page file?
Also could mean Sql Server has "paged out" meaning Windows decided to swap all the info it stored in memory onto the disk.
Open Perfmon ( Goto a command prompt, and type perfmon ) and add these counters:
SQLServer:Buffer Manager - Buffer cache hit ratio
SQLServer:Buffer Manager - Page life expectancy
SQLServer:Memory Manager - Memory Grants Pending
If Buffer Cache Hit Ratio is < 95% it means Sql is using the disk instead of memory a lot you need more memory.
If your page life expectancy is < 500 it mean SqlServer is not keeping results cached in memory, you need more memory.
If you have a lot of Memory Grants Pending, you need more memory.
There are also two stats which let you know how much memory SqlServer wants and how much its actually using. They are called something like "Total Memory Used" and "Total Memory Requested". If Requested > Used, guess what, you need more memory.
There are also some dmv's you can use to determine if your queries are being held up while waiting for memory to free up. I think its sys_dmv.os_wait_stats, something like that.
I community wikied this so a real dba can come in here and clean this up. Don't know the stats off the top of my head.
This is a great read on how to use DMV's to determine memory consumption on sql: http://technet.microsoft.com/en-us/library/cc966540.aspx - look for the 'memory bottlenecks' section.
One big thing is to determine if the memory pressure is internal to SQL (usually the case) or external (rare, but possible). For example, perhaps nothing is wrong with your server, but a driver installed by your antivirus program is leaking memory.