Performance questions for SQL Cache Dependency - sql

I'm working on a project where we are thinking of using SQLCacheDependency with SQL Server 2005/2008 and we are wondering how this will affect the performance of the system.
Specifically, we have the following questions:
Can the number of SQLCacheDependency objects (query notifications) have a negative effect on SQL Server performance, i.e. on insert, update and delete operations on the affected tables?
What effect (performance-wise) would, for example, 50,000 different query notifications on a single table have in SQL Server 2005/2008 on inserts and deletes on that table?
Are there any recommendations on how to use SQLCacheDependency? Any official dos and don'ts? We have found some information on the internet, but nothing on the performance implications.
If anyone here has answers to these questions, that would be great.

SQL cache dependency using the polling mechanism should not put a significant load on either the SQL Server or the application server.
Let's look at the steps needed for SQL cache dependency to work and analyze them:
1. The database is enabled for SQL cache dependency.
2. A table, say 'Employee', is enabled for SQL cache dependency (this can be any number of tables).
3. Web.config is updated to enable SQL cache dependency.
4. The page where you are using SQL cache dependency is configured.
That's it.
Internally:
Step 1 creates a table 'AspNet_SqlCacheTablesForChangeNotification' in the database, which stores the names of the tables (here 'Employee') for which SQL cache dependency is enabled. It also adds some stored procedures.
Step 2 inserts an entry for the 'Employee' table into the 'AspNet_SqlCacheTablesForChangeNotification' table. It also creates an insert/update/delete trigger on the 'Employee' table.
Step 3 enables the application for SQL cache dependency by providing the connection string and poll time.
Whenever there is a change in the 'Employee' table, the trigger fires, which in turn updates the 'AspNet_SqlCacheTablesForChangeNotification' table.
The application then polls the database, say every 5000 ms, and checks for any changes to the 'AspNet_SqlCacheTablesForChangeNotification' table. If there are any changes, the corresponding cache entries are removed from memory.
The great benefit is caching combined with freshness of data (with a 5000 ms poll time, data can be at most about 5 seconds stale). The polling is handled by a background process, which should not be a performance hurdle because, as you can see from the list above, the tasks place very little demand on the CPU.
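To make the trigger-plus-polling idea above concrete, here is a minimal sketch in Python. It is not ASP.NET's implementation: SQLite stands in for SQL Server, and the ChangeNotification table, the trigger names and the PolledCache class are hypothetical names made up for illustration. It only shows why each poll is cheap: one single-row lookup per tracked table.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server. ChangeNotification plays the
# role of the change-notification table; all names here are illustrative.
TRACKING_DDL = """
CREATE TABLE Employee (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE ChangeNotification (tableName TEXT PRIMARY KEY, changeId INTEGER NOT NULL);
INSERT INTO ChangeNotification VALUES ('Employee', 0);
CREATE TRIGGER Employee_ins AFTER INSERT ON Employee BEGIN
    UPDATE ChangeNotification SET changeId = changeId + 1 WHERE tableName = 'Employee';
END;
CREATE TRIGGER Employee_upd AFTER UPDATE ON Employee BEGIN
    UPDATE ChangeNotification SET changeId = changeId + 1 WHERE tableName = 'Employee';
END;
CREATE TRIGGER Employee_del AFTER DELETE ON Employee BEGIN
    UPDATE ChangeNotification SET changeId = changeId + 1 WHERE tableName = 'Employee';
END;
"""


class PolledCache:
    """Cache whose entries are dropped when their source table's counter moves."""

    def __init__(self, conn):
        self.conn = conn
        self.entries = {}    # cache key -> (table name, cached value)
        self.last_seen = {}  # table name -> changeId observed at the last poll

    def _change_id(self, table):
        row = self.conn.execute(
            "SELECT changeId FROM ChangeNotification WHERE tableName = ?", (table,)
        ).fetchone()
        return row[0]

    def put(self, key, table, value):
        # Remember the counter the first time a table is depended on.
        self.last_seen.setdefault(table, self._change_id(table))
        self.entries[key] = (table, value)

    def get(self, key):
        entry = self.entries.get(key)
        return entry[1] if entry is not None else None

    def poll(self):
        """One polling pass; a real poller would repeat this every poll interval."""
        for table, seen in list(self.last_seen.items()):
            current = self._change_id(table)
            if current != seen:
                self.last_seen[table] = current
                # Evict every entry that depends on the changed table.
                self.entries = {k: v for k, v in self.entries.items() if v[0] != table}


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(TRACKING_DDL)
    cache = PolledCache(conn)
    cache.put("employee-list", "Employee", ["cached rows"])
    cache.poll()  # nothing changed, the entry stays
    conn.execute("INSERT INTO Employee (name) VALUES ('Ann')")
    cache.poll()  # the trigger bumped the counter, the entry is evicted
    print(cache.get("employee-list"))
```

Note that the cost on the write side is one extra single-row UPDATE per modifying statement on a tracked table, which is the part that matters for insert/update/delete performance.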

SQLCacheDependency is implemented as an indexed view, and every time the table is modified the view's index changes. So many views (SQLCacheDependency objects) on the same table mean quite a performance hit for modifications. However, if you have one view (SQLCacheDependency object) per table, you should have no problems.
The cache-changed notification is asynchronous and is triggered when the server has resources.

You're right, not much information on this is provided, but there's a sentence related to your question on this page: http://msdn.microsoft.com/en-us/library/ms178604%28VS.80%29.aspx
"The database operations associated with SQL cache dependency are simple and therefore do not incur a heavy processing cost on the server."
Hope this helps, although your question is already a little old.

This page appears to have some good info on setup and on which technique to use as well (granted, I only skimmed it).

All I can provide is anecdotal evidence for performance, but we use SqlCacheDependency as a sort of "messaging solution" for a large enterprise application that processes on the order of ten thousand messages per hour.
The basic architecture is that our company uses Perforce for source control, and we have a "subscription service" that receives messages from a trigger web service call that gets called on every p4 commit and inserts a record into a SQL database. Our application has the dependency set up to send subscription notifications for every changelist that affects a branch or path that you are monitoring.
The performance is fine. The trigger runs on the order of 200 ms, and we have never had a complaint about the latency of relaying the messages to end users.
As always, your mileage may vary.

SQL Server sp_clean_db_free_space

I just want to ask if I can use the stored procedure sp_clean_db_free_space as part of preventive maintenance?
What are the pros and cons of this built-in stored procedure? I would like to hear from someone who already uses it.
Thank you
You would typically not need to explicitly run this, unless you have specific reasons.
Quoting from SQL Server Books Online's page for sp_clean_db_free_space:
"Delete operations from a table or update operations that cause a row to move can immediately free up space on a page by removing references to the row. However, under certain circumstances, the row can physically remain on the data page as a ghost record. Ghost records are periodically removed by a background process. This residual data is not returned by the Database Engine in response to queries. However, in environments in which the physical security of the data or backup files is at risk, you can use sp_clean_db_free_space to clean these ghost records."
Notice that SQL Server already has a background process to achieve the same result as sp_clean_db_free_space.
The only reason you might wish to explicitly run sp_clean_db_free_space is if there is a risk that the underlying database files or backups can be compromised and analysed. In such cases, any data that has not yet been swept up by the background process can be exposed. Of course, if your system has been compromised in such a way, you probably also have bigger problems on your hands!
Another reason might be that you have a time-bound requirement that deleted data should not be retained in any readable form. If you have only a general requirement that is not time-bound, then it would be acceptable to wait for the regular background process to perform this automatically.
The same page also mentions:
"Because running sp_clean_db_free_space can significantly affect I/O activity, we recommend that you run this procedure outside usual operation hours. Before you run sp_clean_db_free_space, we recommend that you create a full database backup."
To summarize:
You'd use the sp_clean_db_free_space stored procedure only if you have a time-bound requirement that deleted data should not be retained in any readable form, or if the database files or backups are at risk of being compromised.
Running sp_clean_db_free_space is I/O-intensive.
Microsoft recommends taking a full database backup before running it, which has its own I/O and space requirements.
Take a look at this related question on dba.stackexchange.com: https://dba.stackexchange.com/questions/11280/how-can-i-truly-delete-data-from-a-sql-server-table-still-shows-up-in-notepad/11281

SQL or Code: difference between table snapshot and current table; which is more efficient?

I need to take a snapshot of a table at a given time and determine the difference between the snapshot and the current data. What is the most efficient way to do that? Can it be done in pure SQL (MS SQL), or should my app server do it in Delphi code?
I'm using an app server that keeps track of these changes and transmits them over a Telnet protocol to any number of clients, on the same machine or not.
Because of the text protocol, I have to send only the differences between the tables, because it is impractical to send all the data (~10k records) every time something changes.
The apps involved are Swordfish (an Automatic Trading System, or ATS), not written by me, plus the app server (Chef) and the client (Diner), both written by me. The ATS uses MS SQL as a layer for its API, so Chef sends data to and receives data from the MS SQL server, essentially controlling the ATS. The client communicates what it wants done to Chef, and Chef then talks to Swordfish through the DBMS and to the other Diners through Telnet.
Code is the most efficient way to do this, according to all the info that I could find on the web.
It may be possible to know with pure SQL which rows were added, but I could find nothing (in SQL) to detect changes to existing rows or row deletions, both of which I need to know about to keep my app server (which is aware of and synced with the SQL table) and my app clients synchronized.
Keeping an in-memory table of 10-15k records isn't that serious. A different error in my code (to do with TFDQuery) made me think that my "offline" or "in-memory" snapshot of the tables needed A LOT of memory: every SQL add command created its own instance of TFDQuery, requiring 30 MB per record, which leaked when the TFDQuery was destroyed. Now I create the TFDQuery instance once and reuse it for every record added, and my total memory usage stays at ~50 MB, which I have no problem with.
So, every time Service Broker detects a change in the dataset of the SQL table, I save the old dataset to an in-memory table and do three compares between the saved (old) dataset and the current dataset (the newest version of the SQL table): 1. Scan for additions. 2. Scan for changes. 3. Scan for deletions. DONE :-)
Then it's a simple task of encoding the text for the Telnet protocol, and all my clients, my SQL server and my app server are happily synced!
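The three scans described above can be sketched as follows. This is an illustration in Python rather than the author's Delphi code, and it assumes each snapshot is held as a dictionary keyed by the table's primary key; the names Diff and diff_snapshots are hypothetical.

```python
from typing import Dict, Hashable, List, NamedTuple, Tuple

# Assumption: a snapshot maps primary key -> tuple of the remaining column values.
Snapshot = Dict[Hashable, Tuple]


class Diff(NamedTuple):
    added: Dict[Hashable, Tuple]    # rows present only in the new snapshot
    changed: Dict[Hashable, Tuple]  # rows in both snapshots whose values differ (new values)
    deleted: List[Hashable]         # keys present only in the old snapshot


def diff_snapshots(old: Snapshot, new: Snapshot) -> Diff:
    """Compare two snapshots with one scan each for additions, changes and deletions."""
    added = {key: row for key, row in new.items() if key not in old}
    changed = {key: row for key, row in new.items() if key in old and old[key] != row}
    deleted = [key for key in old if key not in new]
    return Diff(added, changed, deleted)


if __name__ == "__main__":
    old = {1: ("AAA", 10), 2: ("BBB", 5), 3: ("CCC", 7)}
    new = {1: ("AAA", 10), 2: ("BBB", 6), 4: ("DDD", 1)}
    print(diff_snapshots(old, new))
```

Because each scan is a dictionary lookup per row, the cost is linear in the number of rows, so 10-15k records per change notification is a small amount of work.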

Should I create separate SQL Server database for each user?

I am working on an ASP.NET MVC web application; the back end is SQL Server 2012.
This application will provide billing, accounting, and inventory management. Users will create an account by signing up, just like http://www.quickbooks.in. Each user will create some master records and various transactions. There is no limit; a user can create an unlimited number of records in the database.
I want to keep database performance stable under heavy data load. I am maintaining proper indexes and primary keys, but there will be a heavy load on the database per user.
So, should I create a separate database for each user, or should I maintain one database, adding a UserID column to each table and partitioning on UserID?
I am not an expert in SQL Server, so please provide suggestions with clear specifications.
Please let me know if any information is missing.
A DB per user is what happens when customers need to be able to pack up and leave, taking the actual database with them. Think of a self-hosted WordPress website. Or when there are incredible risks to one user accidentally seeing another user's data, so that it's safer to rely on the server's security model than on remembering to add the UserId filter to all your queries. I can't imagine a scenario like that, but who knows: maybe if the privacy laws allowed for jail time, I would rather have data partitioned by security rules than rely on carefully written WHERE clauses.
If you did do database-per-user, creating a new user would be 10x more effort. While INSERT, UPDATE and so on stay the same from version to version, the syntax for database creation, user creation, permission granting and so on will evolve enough to break those scripts with each SQL Server version upgrade.
Also, this will multiply your migration headaches by the number of users. Let's say you have 5000 users and you need to add some new columns, change a column's data type, update a trigger, and so on. Instead of needing to run that change script once, you need to run it 5000 times.
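As a rough illustration of what "run it 5000 times" means in practice, here is a sketch in Python. SQLite in-memory databases stand in for per-user SQL Server databases, and apply_change_script, make_tenant_db and the Invoice table are hypothetical names. The point is that the loop itself is trivial, but every database can fail independently (for example because its schema has drifted), so you also need to track and retry failures.

```python
import sqlite3


def make_tenant_db(extra_column=False):
    """Create one stand-in tenant database (assumption: SQLite, in memory)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Invoice (id INTEGER PRIMARY KEY, amount REAL)")
    if extra_column:
        # Simulate schema drift: this tenant already has the column.
        conn.execute("ALTER TABLE Invoice ADD COLUMN notes TEXT")
    return conn


def apply_change_script(databases, script):
    """Run `script` against every database; return (name, error) for each failure."""
    failed = []
    for name, conn in databases.items():
        try:
            conn.executescript(script)
        except sqlite3.Error as exc:
            # One tenant failing must not stop the rollout for the others.
            failed.append((name, str(exc)))
    return failed


if __name__ == "__main__":
    tenants = {f"tenant_{i}": make_tenant_db(extra_column=(i == 3)) for i in range(5)}
    failures = apply_change_script(tenants, "ALTER TABLE Invoice ADD COLUMN notes TEXT;")
    print(failures)
```

With a single shared database, the same change is one statement and one possible failure.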
Per-user DBs also probably waste disk space. Each of those databases is going to have a transaction log, sitting idle and taking up the minimum log space.
As for load, if collectively your 5000 users are doing 1 billion inserts, updates and so on per day, my intuition tells me that it's going to be faster on one database, unless there is some sort of contention issue (everyone reading and writing the same pages of the same table at the same time). Each database consumes machine resources (probably threads and memory) for housekeeping, so these extra DBs can't be free.
Anyhow, the best thing to do is to simulate the two architectures and use a random data generator to simulate load and see how they perform.
It's not an easy answer to give.
First, there is logical design to be considered. Then you have integrity, security, management and performance (in this very order).
A database is a logical unit of data, self contained. Ideally, you should be able to take a database, move it to another instance, probably change the connection strings and be running again.
All the constraints are database-level. No foreign keys can exist referencing some object outside the database.
So, try thinking in these terms first.
How would you reliably prevent one user from messing up another user's data? Keep in mind that it's just a matter of time before someone opens an Excel sheet and fires off queries against the database, bypassing your application. Row-level security in SQL Server is something you don't want to deal with.
Multiple databases mean that all management tasks should be scripted out and executed on all databases. Yes, there is some overhead to it, but once you set it up it's just a matter of monitoring. If a database goes suspect, it's a single customer down, not all of them. You can even have different versions for different customers if each customer has its own database. Additionally, if you roll out an upgrade, you can do it per customer, so the impact will be much smaller.
Performance is the least relevant factor here. Of course, it really depends on how many customers and how much data, but proper indexing will solve these issues. Scale-out is much easier with multiple databases.
BTW, partitioning, as you mentioned it, is never a performance booster, it's simply a management feature, allowing for faster loading and evicting of data from a table.
I'd probably put each customer in separate database, but it's up to you eventually to make a decision for yourself. Hope I've helped some with this.

Single user mode "missing behavior" on SQL Server

I ran a script provided by a Microsoft employee to find out which indexes need to be rebuilt or reorganized depending on their average fragmentation. I got back a reasonable list, but while trying to rebuild some of them on a specific database I kept receiving errors:
The first idea I had was to set the database to single-user mode, rebuild the indexes and then bring it back to life. Well, that did not help, because the database is being populated by a Windows service that, ironically, uses the same user I am connected with, which is the only one available to me with enough permissions to do so. I am working in a corporate environment, so the moon is a bit closer than getting another user's credentials. I also cannot stop the service while executing my tasks because it is used for many other things.
My question is simple: how can I force single-user mode to accept only a single connection source? In other words, how can I hide the database, or even the SQL Server, from the service? The service will correctly handle the absence as a network issue, so I don't have to worry about that part.
I found a good solution that might help others. I start by getting the list of sessions holding locks on the table in question using:
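USE [Your DB Name];
-- List the sessions holding or requesting object-level locks on the table.
SELECT REQUEST_MODE, REQUEST_TYPE, REQUEST_SESSION_ID
FROM sys.dm_tran_locks
WHERE RESOURCE_TYPE = 'OBJECT'
  AND RESOURCE_DATABASE_ID = DB_ID()  -- the DMV is server-wide, so restrict it to the current database
  AND RESOURCE_ASSOCIATED_ENTITY_ID = OBJECT_ID('YourTableName');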
The REQUEST_SESSION_ID is the ID of the session that holds the lock on the table. Then I run EXEC sp_who2 to make sure that the SPID is the one for the expected service. All I needed to do at the end was KILL <SPID> and rebuild the index. You might need to do this multiple times if you are rebuilding more than one index, as the lock could be taken again.
There is an ONLINE = ON/OFF option available when rebuilding indexes in SQL Server 2005 and above, which controls how users can access the underlying table during the rebuild and may solve your problem.
http://msdn.microsoft.com/en-us/library/ms188388(v=sql.110).aspx
Your problem is that the interface will only wait a certain amount of time before deciding to fail. I run into this all the time.
You can try scripting the change and then running it manually; this will allow you to just wait until all of the locks are released by the users currently using the index. You will have to be careful, though: an index rebuild locks the index for as long as it is running (unless, of course, you have Enterprise Edition, where rebuilds can be done online, and everything is made of money).

SQL 2005 Transactional Replication: Behavior during snapshot processing?

So, I've got SQL (2005) Transactional Replication generally working well with a single publisher and single (read-only) subscriber. Data changes and updates flow perfectly, with about 5 second latency, which is just fine.
My one nagging problem, that I've spent a couple days trying to solve (and Googling everywhere for answers) is that new sprocs/tables/etc. do not get propagated to the read-only subscriber, even though I've added them as "articles" to the "publication". The publication has "transmit schema changes" set to ON, and stored procedures are set to transfer their definitions. But, for some reason, they don't.
My "snapshot agent" process is set to NOT SCHEDULED. (In other words, it only happens once, when I initiate it manually.) Should I be putting this on a schedule to enable the transfer of new or modified tables and sprocs?
I thought the mere act of adding the object as an article to the publication would do it, but it's still not sending it unless I do a snapshot. The WAN connecting these is totally fast and reliable, so that's not the issue, and table-data-updates transfer relatively fast and flawlessly.
While I could put my snapshot agent on a schedule, does this have any real-time production impacts for users of the main publication database or the read-only copy? (My site currently gets 4+ million unique-users a month, so I'd like to have minimal disruption...) Thanks!
Transactional replication only distributes (and then subsequently publishes) the DML (Data Manipulation Language) statements from the transaction log of the source (publication) database.
New tables and stored procedures are not replicated to the subscriber. Schema changes in this particular context (although I have to admit this is a little unclear in some of the Books Online documentation) refer to the existing schema, i.e. if you were to add a column to an existing table, this change would be propagated to the subscribers.
For clarification here is a Microsoft article that details the schema changes that you can make.
http://msdn.microsoft.com/en-us/library/ms151870(SQL.90).aspx
I hope this helps. Replication is a big subject area so please let me know if I can be of further assistance.
Oh yes, you are correct, if you add new articles to your publication you will need to create an updated snapshot.
Cheers,