Is there downside to creating and dropping too many tables on SQL Server - sql

I would like to know if there is an inherent flaw with the following way of using a database...
I want to create a reporting system with a web front end, whereby I query a database for the relevant data, and send the results of the query to a new data table using "SELECT INTO". Then the program would make a query from that table to show a "page" of the report. This has the advantage that if there is a lot of data, this can be presented a little at a time to the user as pages. The same data table can be accessed over and over while the user requests different pages of the report. When the web session ends, the tables can be dropped.
I am prepared to program around issues such as tracking the tables and ensuring they are dropped when needed.
I have a vague concern that over a long period of time, the database itself might have some form of maintenance problems, due to having created and dropped so many tables over time. Even day by day, lets say perhaps 1000 such tables are created and dropped.
Does anyone see any cause for concern?
Thanks for any suggestions/concerns.

Before you start implementing your solution consider using SSAS or simply SQL Server with a good model and properly indexed tables. SQL Server, IIS and the OS all perform caching operations that will be hard to beat.
The cause for concern is that you're trying to write code that will try and outperform SQL Server and IIS... This is a classic example of premature optimization. Thousands and thousands of programmer hours have been spent on making sure that SQL Server and IIS are as fast and efficient as possible and it's not likely that your strategy will get better performance.

First of all: +1 to #Paul Sasik's answer.
Now, to answer your question (if you still want to go with your approach).
Possible causes of concern if you use VARBINARY(MAX) columns (from the MSDN)
If you drop a table that contains a VARBINARY(MAX) column with the
FILESTREAM attribute, any data stored in the file system will not be
removed.
If you do decide to go with your approach, I would use global temporary tables. They should get DROPped automatically when there are no more connections using them, but you can still DROP them explicitly.
In your query you can check if they exist or not and create them if they don't exist (any longer).
IF OBJECT_ID('mydb..##temp') IS NULL
-- create temp table and perform your query
this way, you have most of the logic to perform your queries and manage the temporary tables together, which should make it more maintainable. Plus they're built to be created and dropped, so it's quite safe to think SQL Server would not be impacted in any way by creating and dropping a lot of them.

1000 per day should not be a concern if you talk about small tables.
I don't know sql-server, but in Oracle you have the concept of temporary table(small article and another) . The data inserted in this type of table is available only on the current session. when the session ends, the data "disapear". In this case you don't need to drop anything. Every user insert in the same table, and his data is not visible to others. Advantage: less maintenance.
You may check if you have something simmilar in sql-server.

Related

Method for updating tables that users are looking at?

I'm looking for a method or solution to allow for a table to be updated that others are running select queries on?
We have an MS SQL Database storing tables which are linked through ODBC to an Access Database front-end.
We're trying to have a query run an update on one of these linked tables but often it is interrupted by users running select statements on the table to look at data though forms inside access.
Is there a way to maybe create a copy of this database table for the users to look at so that the table can still be updated?
I was thinking maybe a transaction but can you perform transactions for select statements? Do they work that way?
The error we get from inside access when we try to run the update while a user has the table open is:
Any help is much appreciated,
Cheers
As a general rule, this should not be occurring. Those reports should not lock nor prevent the sql system from not allowing inserts.
For a quick fix, you can (should) link the reports to some sql server views for their source. And use this for the view:
SELECT * from tblHotels WITH (NOLOCK)
In fact in MOST cases this locking occurs due to combo boxes being driven by a larger table in from SQL server - if the query does not complete (and access has the nasty ability to STOP the flow of data, then you get a sql server table lock).
You also can see the above "holding" of a lock when you launch a form with a LARGE dataset If access does not finish pulling the table/query from SQL server - again a holding lock on the table can remain.
However, I as a general rule NOT seen this occur for reports.
However, it not all clear how the reports are being used and how their data sources are setup.
But, as noted, the quick fix is to create some views for the reports, and use the no-lock hint as per above. That will prevent the tables from holding locks.
Another HUGE idea? For the reports, if they often use some date range or other critera? MAKE 100% sure that sql server has index on the filter or critera. If you don't, then SQL server will scan/lock the whole table. This advice ALSO applies VERY much to say a form in which you filter - put indexing (sql server side) on those common used columns.
And in fact, the notes about the combo box above? We found that JUST adding a indexing to the sort column used in the combo box made most if not all locking issues go away.
Another fix that often works - and requires ZERO changes to the ms-access client side software?
You can change this on the server:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
The above also will in most cases fix the locking issue.

Inserting into the same db table from two different programs at the same time

What are the consequences of inserting new rows (not updating existing) into the same table (SQL Server), from two different programs?
I have a Windows Service, which does a lot of inserts into db (1 insert per second, sometimes faster). I want to make this service scalable (I want to run the same service on many computers). I'm afraid that it can cause problems during those inserts.
If this is the issue, What's the way of handling it? (I'm not asking for "The best" way, so it's not an opinionated question).
My first idea, is to create a new service - "data access service", with the queue. It will be the only service that talks with the database. Other services will connect to that service when they want to insert something. I'm not sure if this is an overkill though. Is there a better way? Or maybe I don't have an issue at all and it's handled by SQL Server (which would be ideal)?
SQL-Server has been designed for concurrent use. So, no, inserting from different programs is not a problem.
The problem arises if inserts, updates, deletes and selects are interleaved. This can result in inconsistent results. A wise use of transactions is required in such cases.
The Transactions Per Second (TPS) count highly depends on the hardware used, the DB schema, server configuration etc. So I cannot give you exact number on this, but you can expect it to handle several hundreds or even thousands of TPS.
Create a stored procedure to insert data and return the scope_identity so you can re-select the data if needed. Within this stored procedure you use transactions.
You can lock the table as you can find in this answer, but you should be fine without. You can block from within a transaction but there is no need to do this.
Here some more tips and advanced tricks, but this can be overwhelming.

What is the best way to log all user request operations: (inserts, updates, deletes in Sql Server 2008?

I have a database with 50 tables and I want to log users requests, such as inserts, updates or deletes on all the tables in the database. I can also create a trigger for this for each request type.
What is the best way to do this from a performance perspective or is there a better way to track this?
You can also create audit tables which are populated by triggers (and which allow much more flexibility than change data capture). The critical component is to capture sets of data not try to work row-by-row. It does add some overhead yes, but if you write the triggers correctly, it isn't that much. Be sure to capture who (including which application if you have multiple applications hitting the database) and when as well as the old and new values. Set up one audit table per table you want audited (too much locking if you use only one audit table). And at the time you set up your system, write the code to get data back from a bad transaction or set of transactions. That makes it easier to recover when you do have something go wrong and you need to revert. We use two tables per table audited, one contains the info about the process that did the changes (name of the application, date, user, etc. and an auditid), the other contains the details about what was changed (old and new values, ID of the record being affected and column affected). Our structure enables us to use the same structure for each table being audited, and allows the tables to change without having to change the audit table and allows us to easily script the audit tables for a new tables. It is also easy for us to see what records were changed at the same time or in the same process or to find out which of the many applications which touch our database was responsible for the bad data as well as telling us who in particular was responsible for the bad data. This helps us track down application bugs and find out why the data was changed the way it was in some cases. It also makes it easier for us to track down all the data that was affected by a broken process rather than just the one we knew about.
If you have Enterprise Edition, look into Change Data Capture. If you don't have Enterprise and aren't interested in capturing the historical values of the columns that change, look into Change Tracking.
See Comparing Change Data Capture and Change Tracking to understand the differences between the two.
Assuming all requests to insert, update and/or delete data goes through some middle-tier data access layer, I would suggest you do your logging there. This is where we do all of ours. It is much simpler than trying to extract the actual insert / delete / update statements out of SQL Server.
If you want to do auditing of data, you can look into Change Data Capture (CDC). But this requires the Enterprise Edition.

What is alternative way to store temp data in a temp table?

In our web app, we are creating session table in database to store temporary data. So the temp table will be created and destroyed for every user. I have some 300 users for this web app. So for every user these table will be created and destroyed.
i heard that this way of design is not good due to performance issues.
I am using MS Sql server 2005. Is there any way to store a result set temporarily without creating any table.
Please suggest me some solution.
Thanks.
Either:
use a single permanent database table for all users, with a UserID column to filter on
or
just use the session handling ability of your web platform to store the info
It sounds as if you are creating and dropping permanent tables. Have you tried using real temp tables (those with table names beginning with #). OR table variables if you havea small data set. Either of these can work quite well. If you use real temp tables, you need to make sure your tempdb is sized large enough to accomodate the usual amount of users, growing tempdb can cause delays.
I think, a solution at GenerateData is what you are looking for. You can create test/sample databases their and delete them when needed.
Depending on what you're actually doing (and whether you can refactor it) it may be more appropriate to use table variables which are highly performant in general.
There is a question of whther the DB is really an appropriate place to even be trying to persist data sets if it's for your applications benefit - if the question isn't just academic perhaps it would be better to keep the object representation of the data in your app memory?

Ideas for Combining Thousand Databases into One Database

We have a SQL server that has a database for each client, and we have hundreds of clients. So imagine the following: database001, database002, database003, ..., database999. We want to combine all of these databases into one database.
Our thoughts are to add a siteId column, 001, 002, 003, ..., 999.
We are exploring options to make this transition as smoothly as possible. And we would LOVE to hear any ideas you have. It's proving to be a VERY challenging problem.
I've heard of a technique that would create a view that would match and then filter.
Any ideas guys?
Create a client database id for each of the client databases. You will use this id to keep the data logically separated. This is the "site id" concept, but you can use a derived key (identity field) instead of manually creating these numbers. Create a table that has database name and id, with any other metadata you need.
The next step would be to create an SSIS package that gets the ID for the database in question and adds it to the tables that have to have their data separated out logically. You then can run that same package over each database with the lookup for ID for the database in question.
After you have a unique id for the data that is unique, and have imported the data, you will have to alter your apps to fit the new schema (actually before, or you are pretty much screwed).
If you want to do this in steps, you can create views or functions in the different "databases" so the old client can still hit the client's data, even though it has been moved. This step may not be necessary if you deploy with some downtime.
The method I propose is fairly flexible and can be applied to one client at a time, depending on your client application deployment methodology.
Why do you want to do that?
You can read about Multi-Tenant Data Architecture and also listen to SO #19 (around 40-50 min) about this design.
The "site-id" solution is what's done.
Another possibility that may not work out as well (but is still appealing) is multiple schemas within a single database. You can pull common tables into a "common" schema, and leave the customer-specific stuff in customer-specific schema. In some database products, however, the each schema is -- effectively -- a separate database. In other products (Oracle, DB2, for example) you can easily write queries that work in multiple schemas.
Also note that -- as an optimization -- you may not need to add siteId column to EVERY table.
Sometimes you have a "contains" relationship. It's a master-detail FK, often defined with a cascade delete so that detail cannot exist without the parent. In this case, the children don't need siteId because they don't have an independent existence.
Your first step will be to determine if these databases even have the same structure. Even if you think they do, you need to compare them to make sure they do. Chances are there will be some that are customized or missed an upgrade cycle or two.
Now depending on the number of clients and the number of records per client, your tables may get huge. Are you sure this will not create a performance problem? At any rate you may need to take a fresh look at indexing. You may need a much more powerful set of servers and may also need to partion by client anyway for performance.
Next, yes each table will need a site id of some sort. Further, depending on your design, you may have primary keys that are now no longer unique. You may need to redefine all primary keys to include the siteid. Always index this field when you add it.
Now all your queries, stored procs, views, udfs will need to be rewritten to ensure that the siteid is part of them. PAy particular attention to any dynamic SQL. Otherwise you could be showing client A's information to client B. Clients don't tend to like that. We brought a client from a separate database into the main application one time (when they decided they didn't still want to pay for a separate server). The developer missed just one place where client_id had to be added. Unfortunately, that sent emails to every client concerning this client's proprietary information and to make matters worse, it was a nightly process that ran in the middle of the night, so it wasn't known about until the next day. (the developer was very lucky not to get fired.) The point is be very very careful when you do this and test, test, test, and test some more. Make sure to test all automated behind the scenes stuff as well as the UI stuff.
what I was explaining in Florence towards the end of last year is if you had to keep the database names and the logical layer of the database the same for the application. In that case you'd do the following:
Collapse all the data into consolidated tables into one master, consolidated database (hereafter referred to as the consolidated DB).
Those tables would have to have an identifier like SiteID.
Create the new databases with the existing names.
Create views with the old table names which use row-level security to query the tables in the consolidated DB, but using the SiteID to filter.
Set up the databases for cross-database ownership chaining so that the service accounts can't "accidentally" query the base tables in the consolidated DB. Access must happen through the views or through stored procedures and other constructs that will enforce row-level security. Now, if it's the same service account for all sites, you can avoid the cross DB ownership chaining and assign the rights on the objects in the consolidated DB.
Rewrite the stored procedures to either handle the change (since they are now referring to views and they don't know to hit the base tables and include SiteID) or use InsteadOf Triggers on the views to intercept update requests and put the appropriate site specific information into the base tables.
If the data is large you could look at using a partioned view. This would simplify your access code as all you'd have to maintain is the view; however, if the data is not large, just add a column to identify the customer.
Depending on what the data is and your security requirements the threat of cross contamination may be a show stopper.
Assuming you have considered this and deem it "safe enough". You may need/want to create VIEWS or impose some other access control to prevent customers from seeing each-other's data.
IIRC a product called "Trusted Oracle" had the ability to partition data based on such a key (about the time Oracle 7 or 8 was out). The idea was that any given query would automagically have "and sourceKey = #userSecurityKey" (or some such) appended. The feature may have been rolled into later versions of the popular commercial product.
To expand on Gregory's answer, you can also make a parent ssis that calls the package doing the actual moving within a foreach loop container.
The parent package queries a config table and puts this in an object variable. The foreach loop then uses this recordset to pass variables to the package, such as your database name and any other details the package might need.
You table could list all of your client databases and have a flag to mark when you are ready to move them. This way you are not sitting around running the ssis package on 32,767 databases. I'm hooked on the foreach loop in ssis.