SSIS or replication for copying parts of a database to the same server? - sql

I have a client who has been promised that he will get a regular copy of the database behind the application we are hosting for him.
The copy is in fact a backup (DB_EXPORT.BAK) that he downloads through SFTP.
(I know, I did not make that promise). I do not want to give him the whole database with all the proprietary stored procedures, functions, users and other stuff. I want to give him a slimmed-down version of that database with most tables, only selected SPs, some functions, no users, and so on.
As I see there are two ways to do this:
an SSIS job that copies certain objects (using the Import/Export Wizard)
replication (snapshot or transactional)
The thing is: the original (DB1) AND the copy (DB_EXPORT) will be hosted on the same server. So using replication feels a bit awkward (publishing to yourself?), but it does give an easy interface for configuring which articles to replicate. Using an SSIS package feels more logical but is actually hard to change.
What can you say about this? Is there a better way of doing this? I am looking for an approach that people who only just understand SQL Server will be able to follow.
Thanks for thinking with me!

Not sure if this is the best answer, but I would go with snapshot replication, per our discussion and your avoidance of T-SQL scripting.
Replication is relatively simple to set up, but can be a nightmare to troubleshoot (especially transactional). Oftentimes it's easier to completely delete the publication/subscription and rebuild.
On that note, you can fully script replication configurations -- if someone else has to maintain this, it may be as simple as you scripting out the replication removal (pub and sub removal), and scripting out the replication build-out. All they'd have to do is run the drop/build scripts and it's done.
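For illustration, a minimal sketch of what such a build-out script might look like, assuming the Distributor is already configured, the source database is DB1, the copy is DB_EXPORT, and using a made-up publication name and article names:

```sql
USE DB1;
GO

-- Enable the source database for publishing.
EXEC sp_replicationdboption
    @dbname = N'DB1', @optname = N'publish', @value = N'true';

-- Snapshot publication.
EXEC sp_addpublication
    @publication = N'DB_EXPORT_pub',
    @repl_freq   = N'snapshot',
    @status      = N'active';

-- Snapshot Agent job for the publication.
EXEC sp_addpublication_snapshot @publication = N'DB_EXPORT_pub';

-- Only the articles the client is allowed to see.
EXEC sp_addarticle
    @publication   = N'DB_EXPORT_pub',
    @article       = N'Customers',
    @source_owner  = N'dbo',
    @source_object = N'Customers';

EXEC sp_addarticle
    @publication   = N'DB_EXPORT_pub',
    @article       = N'usp_GetOrders',
    @source_owner  = N'dbo',
    @source_object = N'usp_GetOrders',
    @type          = N'proc schema only';

-- Push subscription to DB_EXPORT on the same instance.
DECLARE @server sysname = @@SERVERNAME;
EXEC sp_addsubscription
    @publication       = N'DB_EXPORT_pub',
    @subscriber        = @server,
    @destination_db    = N'DB_EXPORT',
    @subscription_type = N'Push';
-- (sp_addpushsubscription_agent, which creates the Distribution Agent job,
--  is omitted here for brevity.)
```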
You can also alter the scheduled job to run the backup immediately following the snapshot generation.
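The extra step itself can be a plain T-SQL backup to the SFTP pickup folder, something like this (the path is hypothetical):

```sql
-- Final job step, run after the snapshot has been applied to DB_EXPORT.
BACKUP DATABASE DB_EXPORT
TO DISK = N'D:\SFTP\DB_EXPORT.BAK'
WITH INIT;  -- overwrite the previous export file
```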

Related

Updating SQL Schema in Multiple SQL Databases at One Time

Our issue is we have an online application with personally identifiable data. We have sold this application to multiple customers and the law in their States says that the data MUST be physically in their State. So this is why we have the identical database (not identical data) in different locations.
Right now we use RedGate SQL Compare, but as we continue to grow, doing this eight, nine, ten times for every update (be it a small stored procedure bug fix or a larger change creating a new table) is becoming more and more inefficient. Marketing is telling us five more states are on the way.
We've looked into a RedGate method, but it's more coding and troubleshooting than it's worth.
So...any ideas how to update the SCHEMA from one to many databases?
There is a feature in SQL Server Management Studio that works. In SSMS, press Ctrl+Alt+G. This brings up 'Registered Servers'. Under 'Local Server Groups' you can create groups, say one for testing and one for live. You then right-click on the group you created and choose "New Server Registration". On the General tab you give it a name, and on the Connection Properties tab you select just one database. Keep adding "New Server Registrations" for each database you want in the group. When done, just right-click on your group and choose 'New Query'. Anything you put in there will run on ALL the databases in the group.
So, if all your databases are identical and you need to make an update, use Redgate to do a compare. Choose 'Create a Deployment Script' instead of 'Deploy Using SQL Compare' and copy the SQL. Right-click on the group, choose "New Query", paste, and execute.
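For example, the pasted script might look something like this (object names here are invented for illustration); because it is guarded, it is safe to run against databases that already have the change:

```sql
-- Add a column only where it is missing.
IF COL_LENGTH('dbo.Customers', 'MiddleName') IS NULL
    ALTER TABLE dbo.Customers ADD MiddleName nvarchar(50) NULL;
GO

-- Create a stub if the procedure does not exist yet, then ALTER to the
-- current definition, so the same script works everywhere.
IF OBJECT_ID('dbo.usp_GetCustomer', 'P') IS NULL
    EXEC ('CREATE PROCEDURE dbo.usp_GetCustomer AS RETURN 0;');
GO
ALTER PROCEDURE dbo.usp_GetCustomer
    @CustomerID int
AS
    SELECT CustomerID, FirstName, MiddleName, LastName
    FROM dbo.Customers
    WHERE CustomerID = @CustomerID;
GO
```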
I'm assuming this is SQL Server since you specified "RedGate SQL Compare" and not "MySQL Compare". If it's not SQL Server, ignore this.
Without having to adopt a new toolset (or even pay RedGate for something) and since the database (not the data) is identical, you could set up a Central Management Server (Microsoft documentation on that here), register each individual SQL Server instance, build your deploy script (you can still use SQL Compare for this), and then use the CMS to simultaneously push the schema changes as you need them to all of the instances or to defined groups as you like.
This assumes you're using Windows authentication for all the servers and that whoever does the deployments has the same access across all of them, but it's a pretty decent solution for multi-server administration of this type in general, and it's a solid feature that's been around for a while (since SQL Server 2008).
I work for Redgate, so I'd love to promote it even more; however, let's ignore it for the moment.
If you want to automate deployments to lots of servers at once, I'd suggest you look at tooling like Azure DevOps Pipelines or the AWS developer tools, or even a third-party product like Octopus or Jenkins. The idea is simple: use any tool you like, right up to just your keyboard, to create the artifacts needed for deployment (your T-SQL scripts for SQL Server). Then the agents in one of these flow-control tools do the heavy lifting of ensuring that script gets deployed to multiple locations. Because you can configure these agents with independent security, you don't have to have the same levels of security yourself that you'd need to control things through SSMS or the Central Management Server. Further, this method allows for very easy parallel execution. The only way you can do that yourself is through some pretty extensive PowerShell (or Python) work.
As much as I'd like to promote Redgate as part of this solution, it's actually not necessary (it's just better). You can generate the necessary artifacts any way you want. The important point is being able to control exactly how they get deployed: tracking successful and failed deployments, varying levels of necessary security, all of that. That's exactly what tools like those I mentioned above are intended to do.
Also, yeah, this is a ton of work. Automating deployments is absolutely the way to go. However, it's not without labor. Instead of spending your time doing manual processes, prone to error, repetitive, boring and slow, you spend time, and effort, automating stuff. It's not so much that work gets eliminated, rather it gets reoriented. Then, you get all the benefits of that automation. However, you do have to maintain it, grow it, expand it, and deal with issues within it. All work.

Progress DB: backup restore and query individual tables

Here is the use case: we need to back up some of the tables from a client server, copy the backup to our servers, restore it, and then run some queries using ODBC.
I managed to do this process for the entire database by using probkup for backup, prorest for restore and proserve to make it accessible for SQL queries.
However, some of the databases are big (> 8 GB), so we are looking for a way to back up only the tables we need. I didn't find anything in the probkup documentation about how this can be done.
Progress only supports full database backups.
To get the effect that you are looking for you could dump (export) the tables that you want and then load them into an empty database.
"proutil dump" and "proutil load" are where you want to start digging.
The details will vary depending on exactly what you want to do and what resources and capabilities you have available to you.
Another option would be to replicate the tables in question to a partial database. Progress has a product called "pro2" that can help with that. It is usually pointed at SQL targets but you could also point it at a Progress database.
Or, if you have programming skills, you could put together a solution using replication triggers (under the covers that's what pro2 does...)
probkup and prorest are block-level programs and can't do a backup or restore by table.
To do what you're asking for, you'll need to dump the data from the source db's tables and then load it into the target db.
If your objective is simply to maintain a copy of the DB, you might also try incremental backups. Depending upon your situation, that might speed things up a bit.
Other options include various forms of DB replication, which allow you to keep real- or near-real-time copies of your database.
OpenEdge Replication. With the correct license, you can do query-only access on the replication target, which is good for reporting and analysis.
Third-party replication products. These can be more flexible in terms of both target DBs and limiting the tables to be replicated.
Home-grown replication (by copying and applying AI files). This is not terribly complicated, but you have to factor in the cost of doing the work and maintaining the system. There are some scripts out there that can get you started.
Or, as Tom said, you can get clever with replication via triggers.

What is the best way to design, generate, and version a database schema script for MS SQL Server?

I have never really seen any questions (with answers) as general as this, so I'm hoping to get some useful feedback. The reason I'm asking is because I've done all of this before and I have my own ways, but sometimes I feel it's not the best practice.
Let's say, for example, that I can't afford better DB modeling tools and I only have SQL Server and SQL Server Management Studio (SSMS). What I do is:
I design all of the entities in my DB (tables, primary keys, foreign keys, indexes, etc.) with SSMS,
then I just generate the schema script using the 'Generate Scripts...' command in SSMS. The generated script is rather large (I'm using SQL Server Express 2012) and doesn't seem very well organized for maintenance.
Example: after all the table creation scripts are set up, there's a bunch of ALTER TABLE commands to add all the constraints. This kind of thing seems like it would be better in the table creation script section, but maybe not. Also, for upgradeability, I normally add an 'IF NOT EXISTS' guard to each table creation section so that it doesn't throw an error when I need to re-run the SQL script after the DB is updated with new tables, columns, etc.
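As a minimal sketch of that guard pattern (the table and column names are just examples):

```sql
IF NOT EXISTS (SELECT 1 FROM sys.tables
               WHERE name = 'Orders' AND schema_id = SCHEMA_ID('dbo'))
BEGIN
    CREATE TABLE dbo.Orders
    (
        OrderID    int IDENTITY(1,1) NOT NULL
            CONSTRAINT PK_Orders PRIMARY KEY,
        CustomerID int NOT NULL,
        OrderDate  datetime NOT NULL
            CONSTRAINT DF_Orders_OrderDate DEFAULT (GETDATE())
    );
END
GO
```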
Then for versioning, I generally have a separate script that I run to add the schema version in a VERSION table in the db itself (with just one row).
This allows me to do incremental upgrades when I run the script by adding 'if new-version > current-version' sort of thing.
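A sketch of that single-row VERSION table and the 'only apply if newer' guard might look like this (the version numbers and the example change are made up):

```sql
IF NOT EXISTS (SELECT 1 FROM sys.tables WHERE name = 'VERSION')
BEGIN
    CREATE TABLE dbo.VERSION (SchemaVersion int NOT NULL);
    INSERT INTO dbo.VERSION (SchemaVersion) VALUES (0);
END
GO

-- Upgrade block for schema version 2: only runs if the DB is older.
IF (SELECT SchemaVersion FROM dbo.VERSION) < 2
BEGIN
    ALTER TABLE dbo.Orders ADD ShippedDate datetime NULL;  -- example change
    UPDATE dbo.VERSION SET SchemaVersion = 2;
END
GO
```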
It seems to have worked out for me in the past, but it just seems kind of, I don't know, not very sophisticated. Can a sql expert shed some light on this subject? It's something we all do for every data driven web app we create, over and over again. I'd like to see how other developers do it.
To recap,
how do you go about designing your DB model and generating scripts (do you do it with a design tool, write them from scratch, etc.)?
how do you manage incremental DB changes over time?
How do you version control your database?
SQL Server Data Tools is ideal for this. It has all the design features you require and configurable scripting. It will also diff two databases and generate the change script for you. Oh - and it's free!

SQL Server database change workflow best practices

The Background
My group has 4 SQL Server Databases:
Production
UAT
Test
Dev
I work in the Dev environment. When the time comes to promote the objects I've been working on (tables, views, functions, stored procs) I make a request of my manager, who promotes to Test. After testing, she submits a request to an Admin who promotes to UAT. After successful user testing, the same Admin promotes to Production.
The Problem
The entire process is awkward for a few reasons.
Each person must manually track their changes. If I update, add, or remove any objects, I need to track them so that my promotion request contains everything I've done. In theory, if I miss something, testing or UAT should catch it, but this isn't certain and it's a waste of the testers' time anyway.
Lots of changes I make are iterative and done in a GUI, which means there's no record of what changes I made, only the end result (at least as far as I know).
We're in the fairly early stages of building out a data mart, so the majority of the changes made, at least count-wise, are minor things: changing the data type for a column, altering the names of tables as we crystallize what they'll be used for, tweaking functions and stored procs, etc.
The Question
People have been doing this kind of work for decades, so I imagine there has got to be a much better way to manage the process. What I would love is to run a diff between two databases to see how their structures differ, use that diff to generate a change script, and use that change script as my promotion request. Is this possible? If not, are there any other ways to organize this process?
For the record, we're a 100% Microsoft shop, just now updating everything to SQL Server 2008, so any tools available in that package would be fair game.
I should clarify I'm not necessarily looking for diff tools. If that's the best way to sync our environments then it's fine, but if there's a better way I'm looking for that.
An example doing what I want really well are migrations in Ruby on Rails. Dead simple syntax, all changes are well documented automatically and by default, determining what migrations need to run is almost trivially easy. I'd love if there was something similar to this for SQL Server.
My ideal solution is 1) easy and 2) hard to mess up. Rails Migrations are both; everything I've done so far on SQL Server is neither.
Within our team, we handle database changes like this:
We (re-)generate a script which creates the complete database and check it into version control together with the other changes. We have 4 files: tables, user defined functions and views, stored procedures, and permissions. This is completely automated - only a double-click is needed to generate the script.
If a developer has to make changes to the database, she does so on her local db.
For every change, we create update scripts. Those are easy to create: the developer regenerates the DB script of their local db. All the changes are now easy to identify thanks to version control. Most changes (new tables, new views, etc.) can simply be copied to the update script; other changes (adding columns, for example) need to be created manually (see the sketch after this list).
The update script is tested either on our common dev database, or by rolling back the local db to the last backup - which was created before starting to change the database. If it passes, it's time to commit the changes.
The update scripts follow a naming convention so everybody knows in which order to execute them.
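A hand-written update script in this scheme might look something like the following (the file name, table, and columns are invented for the example):

```sql
-- 20140312_042_add_invoice_duedate.sql
-- Adds the DueDate column that was introduced on a developer's local DB.
IF COL_LENGTH('dbo.Invoice', 'DueDate') IS NULL
BEGIN
    ALTER TABLE dbo.Invoice ADD DueDate date NULL;
END
GO

-- Backfill existing rows, then tighten the column.
UPDATE dbo.Invoice
SET DueDate = DATEADD(DAY, 30, InvoiceDate)
WHERE DueDate IS NULL;

ALTER TABLE dbo.Invoice ALTER COLUMN DueDate date NOT NULL;
GO
```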
This works fairly well for us, but it still needs some coordination if several developers heavily modify the same tables and views. This doesn't happen often, though.
The important points are:
database structure is only modified by scripts, except for the local developer's db. This is important.
SQL scripts are versioned by source control - the db can be created as it was at any point in the past
database backups are created regularly - at least before making changes to the db
changes to the db can be done quickly - because the scripts for those changes are created relatively easily.
However, if you have a lot of long lasting development branches for your projects, this may not work well.
It is far from a perfect solution, and some special precautions need to be taken. For example, if there are updates which may fail depending on the data present in a database, the update should be tested on a copy of the production database.
In contrast to rails migrations, we do not create scripts to reverse the changes of an update. But this isn't always possible anyway, at least in respect to the data (the content of a dropped column is lost even if you recreate the column).
Version Control and your Database
The root of all things evil is making changes in the UI. SSMS is a DBA tool, not a developer one. Developers must use scripts to make any sort of change to the database model/schema. Versioning your metadata and having an upgrade script from every version N to version N+1 is the only way that is proven to work reliably. It is the solution SQL Server itself deploys to keep track of metadata changes (resource db changes).
Comparison tools like SQL Compare or vsdbcmd and .dbschema files from VS Database projects are just last resorts for shops that fail to take a properly versioned approach. They work in simple scenarios, but I've seen them all fail spectacularly in serious deployments. One just does not trust a tool to make a change to a 5+ TB table if the tool tries to copy the data...
RedGate sells SQL Compare, an excellent tool to generate change scripts.
Visual Studio also has editions which support database compares. This was formerly called Database Edition.
Where I work, we abolished the Dev/Test/UAT/Prod separation long ago in favor of a very quick release cycle. If we put something broken in production, we will fix it quickly. Our customers are certainly happier, but in the risk-averse corporate enterprise, it can be a hard sell.
There are several tools available for you. One is from Red-Gate called SQL Compare. Awesome and highly recommended. SQL Compare will let you do a diff in schemas between two databases and even build the sql change scripts for you.
Note they have been working on a SQL Server source control product for a while now as well.
Another option (if you're a Visual Studio shop) is the schema and data compare features that are part of Visual Studio (not sure which versions).
Agree that SQL Compare is an amazing tool.
However, we do not make any changes to the database structure or objects that are not scripted and saved in source control just like all other code. Then you know exactly what belongs in the version you are promoting because you have the scripts for that particular version.
It is a bad idea anyway to make structural changes through the GUI. If you have a lot of data, it is far slower than using ALTER TABLE, at least in SQL Server. You only want to use tested scripts to make changes to prod as well.
I agree with the comments made by marapet, where each change must be scripted.
The problem that you may be experiencing, however, is creating, testing and tracking these scripts.
Have a look at the patching engine used in DBSourceTools.
http://dbsourcetools.codeplex.com
It's been specifically designed to help developers get SQL server databases under source-code control.
This tool will allow you to baseline your database at a specific point, and create a named version (v1).
Then, create a deployment target - and increment the named version to v2.
Add patch scripts to the Patches directory for any changes to schema or data.
Finally, check the database and all patches into source-code control, to distribute with devs.
What this gives you is a repeatable process to test all patches to be applied from v1 to v2.
DBSourceTools also has functionality to help you create these scripts, i.e. schema compare or script data tools.
Once you are done, simply send all of the files in the patches directory to your DBA to upgrade from v1 to v2.
Have fun.
Another "Diff" tool for databases:
http://www.xsqlsoftware.com/Product/Sql_Data_Compare.aspx
Keep the database version in a versioning table (a sketch of such a tracking table follows this list).
Keep the file name of each script that was successfully applied.
Keep the MD5 sum of each SQL script that has been applied; whitespace should be ignored when calculating the sum, and the calculation should be efficient.
Keep info about who applied a script and when it was applied.
The database should be verified on application start-up.
New SQL scripts should be applied automatically.
If the MD5 sum of a script that was already applied has changed, an error should be thrown (in production mode).
Once a script has been released it must not be changed; it must be immutable in a production environment.
Scripts should be written in a way that lets them be applied to different types of database (see Liquibase).
Since most DDL statements auto-commit on most databases, it is best to have a single DDL statement per SQL script.
Each DDL statement should be written so it can be executed several times without errors. This really helps in dev mode, when you may edit a script several times: for instance, create a new table only if it does not exist, or even drop the table before creating a new one. For a script that has not yet been released, you can change it, clear its MD5 sum, and rerun it.
Each SQL script should be run in its own transaction.
Triggers/procedures should be dropped and recreated after each DB update.
SQL scripts are kept in a version control system like SVN.
The name of a script contains the date it was committed, the related (Jira) issue id, and a small description.
Avoid adding rollback functionality to scripts (Liquibase allows this). It makes them more complicated to write and support. If you use exactly one DDL statement per script and DML statements run within a transaction, even a failing script will not be much trouble to resolve.
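A sketch of the kind of tracking table these rules imply (the column layout is just one possibility, shown in T-SQL):

```sql
CREATE TABLE dbo.SchemaChangeLog
(
    ScriptName nvarchar(260) NOT NULL
        CONSTRAINT PK_SchemaChangeLog PRIMARY KEY,  -- date + issue id + description
    Md5Hash    char(32)      NOT NULL,              -- checksum of the script body
    AppliedBy  sysname       NOT NULL
        CONSTRAINT DF_SchemaChangeLog_By DEFAULT (SUSER_SNAME()),
    AppliedAt  datetime2(0)  NOT NULL
        CONSTRAINT DF_SchemaChangeLog_At DEFAULT (SYSUTCDATETIME())
);

-- On application start-up: compare the rows in this table against the scripts
-- shipped with the build, apply any that are missing, and raise an error if a
-- stored hash no longer matches the script on disk.
```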
This is the workflow we have been using successfully:
Development instance: SQL objects are created/updated/deleted in the DB using SQL Server Management Studio, and all operations are saved to scripts we include in each version of our code.
Moving to production: We compare schema between dev and prod db using SQL Schema Compare in Microsoft Visual Studio. We update prod using the same tool.

How can I maintain a consistent DB schema across 18 databases (sql server)?

We have 18 databases that should have identical schemas, but don't. In certain scenarios, a table was added to one, but not the rest. Or, certain stored procedures were required in a handful of databases, but not the others. Or, our DBA forgot to run a script to add views on all of the databases.
What is the best way to keep database schemas in sync?
For legacy fixes/cleanup, there are tools, like SQLCompare, that can generate scripts to sync databases.
For .NET shops running SQL Server, there is also the Visual Studio Database Edition, which can create change scripts for schema changes that can be checked into source control, and automatically built using your CI/build process.
SQL Compare by Red Gate is a great tool for this.
SQLCompare is the best tool that I have used for finding differences between databases and getting them synced.
To keep the databases synced up, you need to have several things in place:
1) You need policies about who can make changes to production. Generally this should only be the DBA (or DBA team for larger orgs) and 1 or 2 backups. The backups should only make changes when the DBA is out, or in an emergency. The backups should NOT be deploying on a regular basis. Set database rights according to this policy.
2) A process and tools to manage deployment requests. Ideally you will have a development environment, a test environment, and a production environment. Developers should do initial development in the dev environment, and have changes pushed to test and production as appropriate. You will need some way of letting the DBA know when to push changes. I would NOT recommend a process where you holler to the next cube. Large orgs may have a change control committee and changes only get made once a month. Smaller companies may just have the developer request testing, and after testing is passed a request for deployment to production. One smaller company I worked for used Problem Tracker for these requests.
Use whatever works in your situation and budget, just have a process, and have tools that work for that process.
3) You said that sometimes objects only need to go to a handful of databases. With only 18 databases, probably on one server, I would recommend making each database match objects exactly. Only 5 DBs need usp_DoSomething? So what? Put it in every database. This will be much easier to manage. We did it this way on a 6-server system with around 250-300 DBs. There were exceptions, but they were grouped: databases on server C got this extra set of objects, databases on server L got this other set.
4) You said that sometimes the DBA forgets to deploy change scripts to all the DBs. This tells me that s/he needs tools for deploying changes. S/he is probably taking a SQL script, opening it in Query Analyzer or Management Studio (or whatever you use) and manually going to each database and executing the SQL. This is not a good long-term (or short-term) solution. Red Gate (makers of SQLCompare above) have many great tools; MultiScript looks like it may work for deployment purposes. I worked with a DBA who wrote his own tool for SQL Server 2000 using osql. It would take a SQL file and execute it on each database on the server (a minimal T-SQL sketch of that per-database loop follows this list). He had to execute it on each server, but it beat executing on each DB. I also helped write a VB.NET tool that would do the same thing, except it would also go through a list of servers, so it only had to be executed once.
5) Source Control. My current team doesn't use source control, and I don't have enough time to tell you how many problems this causes. If you don't have some kind of source control system, get one.
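Such a per-database loop can be sketched in plain T-SQL with a cursor over sys.databases; the change script in @sql and the database-name filter below are placeholders:

```sql
DECLARE @sql nvarchar(max) = N'
    IF COL_LENGTH(''dbo.Customers'', ''MiddleName'') IS NULL
        ALTER TABLE dbo.Customers ADD MiddleName nvarchar(50) NULL;';

DECLARE @db sysname, @cmd nvarchar(max);

DECLARE db_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT name
    FROM sys.databases
    WHERE name LIKE N'ClientDB%'     -- placeholder filter for the 18 databases
      AND state_desc = N'ONLINE';

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @db;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Switch context to each database and run the same change script there.
    SET @cmd = N'USE ' + QUOTENAME(@db) + N'; ' + @sql;
    EXEC (@cmd);
    FETCH NEXT FROM db_cursor INTO @db;
END
CLOSE db_cursor;
DEALLOCATE db_cursor;
```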
I haven't got enough reputation to comment on the above answer but the pro version of SQL Compare has a scriptable API. Given that you have to replicate stuff to all of these databases you could use this to make an automated job to either generate the change scripts or to validate that the databases are all in sync. It's also not much more expensive than the standard version.
Aside from using database comparison tools, with 18 databases you should have a DBA, so enforce a policy that only the DBA can change tables at the database level by restricting access to CREATE and ALTER to the DBA only. On both your test and live databases. The dev database shouldn't have this, of course! Make the developers who have been creating or altering the schemas willy-nilly go via the DBA.
Create a single source-controlled DDL/SQL script for each release and only use it to update the databases. The diff tools can be useful but mainly for checking that you haven't made a mistake and getting out of trouble when the policies fail. Combine the DDL, SQL, and stored procedure scripts into a single script so that it's not easy to "forget" to run one of the scripts.
We have got a tool called DB Schema Difftective that can compare and sync database schemas. With our other tool, DB MultiRun, you can easily deploy the generated (sync) scripts to multiple DB servers (project-based).
I realize this post is old, but TurnKey is correct. If you are a developer working in a team environment, the best way to maintain a database schema for a large application is to make updates to a master schema in whatever source control you use. Simply write your own scripting class and your database will be perfect every time.