Using SSIS and SSMS - Attempting to create an ETL design document that details source and destination columns and their relations

I am attempting to create an ETL design document that details source and destination columns and their relations, by looking at metadata for the columns. The format should be similar to below:
(Screenshot: final document format)
I have utilised something similar to the site below to get data for the output/destination database:
MS SQL TIPS - list-columns-and-attributes-for-every-table-in-a-sql-server-database
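Concretely, it was something along these lines (a sketch only; the linked article's version pulls many more attributes from the catalog views):

    SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE,
           CHARACTER_MAXIMUM_LENGTH, IS_NULLABLE
    FROM INFORMATION_SCHEMA.COLUMNS
    ORDER BY TABLE_SCHEMA, TABLE_NAME, ORDINAL_POSITION;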
Now I am trying to get the same information for the source database, but I am not sure if I am allowed to run such a query directly on the source, as it holds important data.
Is there a way I can use SSIS or look at the source in SSMS to see all relationships I need?
I have the SSIS packages that detail, via SQL queries, the transformations I will apply to the source. I've tried looking at the packages individually, but there are a lot of them, and there should be an easier way that I am missing.

Depends on your sources. If they are all queries from tables, you can probably parse them from the .dtsx files. If any of them are stored procs or views, then there's probably nothing you can do without querying the source database.
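For the table-query case, here is a hedged sketch of pulling the source statements out of a package with T-SQL's XML support. It assumes the SSIS 2012+ .dtsx layout, where source components expose their query in a SqlCommand property; the file path is hypothetical:

    DECLARE @pkg XML;
    SELECT @pkg = CAST(BulkColumn AS XML)
    FROM OPENROWSET(BULK 'C:\Packages\LoadCustomers.dtsx', SINGLE_BLOB) AS f;

    -- Pipeline elements in a .dtsx carry no namespace prefix, so a plain path works.
    SELECT n.value('(../../@name)[1]', 'NVARCHAR(200)') AS ComponentName,
           n.value('.', 'NVARCHAR(MAX)')                AS SourceQuery
    FROM @pkg.nodes('//property[@name="SqlCommand"]') AS t(n);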

Related

SQL Server: Scripts to perform Object Movement from a database.schema to another

I am new to SQL Server. Database: SQL Server 2012, size 2 TB.
We are planning to consolidate multiple databases' dbo.* objects under a single database as different schemas (databaseN.schemaN.*). Thus we need to prepare scripts that move database1.dbo.* objects into another database under a different schema (e.g. database2.schema2.*), including all the dependent objects (we need an exact replica). This needs to be done without using any tools (SSMS, ApexSQL, etc.).
How should I go about scripting this? I was thinking along the lines of the approach below:
1. Extract the complete metadata (including all constraints/triggers/indexes/keys/partitions, etc.)
2. Extract the data
3. Execute the metadata scripts on the target
4. Disable all relational constraints and triggers
5. Insert all the extracted data
6. Re-enable all the relational constraints and triggers
If this approach is sound, can I get some assistance with how to script it? Please also suggest any other approach. Some tables are partitioned and 50-100 GB in size.
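For steps 4 through 6, a minimal sketch using the undocumented (but widely used) sp_MSforeachtable; a cursor over sys.tables is the fully supported alternative:

    USE database2;
    EXEC sp_MSforeachtable 'ALTER TABLE ? NOCHECK CONSTRAINT ALL';
    EXEC sp_MSforeachtable 'ALTER TABLE ? DISABLE TRIGGER ALL';

    -- ... insert the extracted data here ...

    EXEC sp_MSforeachtable 'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT ALL';
    EXEC sp_MSforeachtable 'ALTER TABLE ? ENABLE TRIGGER ALL';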

How to load from multiple sources to multiple destinations dynamically

I have more than a hundred tables on a linked server (let's say on SQL Server 1). I have to perform an initial load, basically a simple dump, by creating duplicate copies of those hundred-plus tables in the SQL Server 2 destination. I know how to build a data flow task in SSIS to extract data from a source and load it into a destination (creating the table in the destination as well). With more than a hundred tables, though, I would need to create more than a hundred data flow tasks, which is very time consuming. I have heard about copying tables from source to destination dynamically, by looping through them and creating variables. How do I do this? Remember, those hundred tables do not share a similar structure. How can I perform this initial load faster without using multiple data flow tasks in SSIS? Please help! Thank you!
I would use the Import and Export Wizard - it can do all those tasks you described in one pass. You just tell it your Source and Destination and check the tables you want. It generates an SSIS package, which you can save and customize if you want.
It's easiest to find via the Windows Start Menu, under SQL Server [version] / Import and Export Data.
If you would like to automate transferring data from 100+ tables, I would consider using Biml. Biml is a scripting language that lets you generate SSIS packages from a template you define. In your case the template might include creating the tables (if they do not exist) and the mapping/copying of the source. You can then wrap the resulting SSIS packages inside another Biml master package.
It can be a little clunky if you are not using Mist, but it's incredibly powerful once you get into it. A good starting point for you would be Andy Leonard's Stairway to Biml series, as it provides a step-by-step walkthrough of moving data from source to target. After the stairway guide, check out BimlScript.
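As a sketch of the looping idea from the question, the statements to run (one per table) can themselves be generated from the linked server's catalog; the server and database names here are hypothetical:

    -- Emits one SELECT ... INTO per source table. SELECT INTO creates the
    -- destination table with a matching structure, which suits an initial dump.
    SELECT 'SELECT * INTO ' + QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME)
         + ' FROM SQLSERVER1.SourceDb.' + QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME) + ';'
    FROM SQLSERVER1.SourceDb.INFORMATION_SCHEMA.TABLES
    WHERE TABLE_TYPE = 'BASE TABLE';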

How to create a SQL database from a strongly typed dataset

I'm looking for an easy way to transfer a database schema I have developed inside Visual Studio as a strongly typed dataset (.xsd file) into a corresponding SQL Server database. Silly me, I assumed the process would be straightforward, but I can't find out how to do it. I assume I could duplicate the tables column by column, but that seems error prone. Does anyone know of a way to perform a schema transfer like this? Maybe a tool to translate the .xsd file into a corresponding SQL Server DDL file?
A final thought: once I have the schema transferred, moving data around between the two data stores will be straightforward; it's just getting the schemas synced that has me stumped...
Thanks,
Keith
Why didn't you implement your data model directly in SQL Server? That is the more common and better-engineered approach, and I think it is why Microsoft has not provided any wizard or tool for this case. You can also keep your data model as scripts (.sql files), manage them via SVN, and run them whenever you need to implement the model.
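For instance, a model kept as plain .sql files (a trivial, hypothetical sketch) can be versioned in SVN and re-run on any server:

    -- Customer.sql, kept in source control alongside the rest of the model
    CREATE TABLE dbo.Customer (
        CustomerId INT IDENTITY(1,1) PRIMARY KEY,
        Name       NVARCHAR(100) NOT NULL
    );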

Will SSIS work well for importing to multiple tables?

I won't have access to SSIS until tomorrow so I thought I'd ask for advice before I start work on this project.
We currently use Access to store our data. It's not stored in a relational format so it's an awful mess. We want to move to a centralized database (SQL Server 2008 R2), which would require rewriting much of our codebase (which, incidentally, is also an awful mess.) Due to a time constraint, well before that can be done we are going to need to get a centralized database set up solely for the purpose of on-demand report generation for a client. So, our applications will still be running on Access. Instead of:
Receive data -> Import to Access initial file with one table -> Data processing -> Access result file with one table -> Report generation
The goal is:
Receive data -> Import to Access initial file with one table -> Import initial data to multiple tables in SQL Server -> Export Access working file with one table -> Data processing -> Access result file -> Import result to multiple tables in SQL Server -> Report generation whenever
We're going to use SSRS for the reporting component, which seems straightforward enough. I'm not sure whether SSIS alone would work well for splitting the Access data up into numerous tables, or whether everything should be imported into a staging table with SSIS and then split up with stored procedures, or whether I'll need to write a standalone application for this.
Haven't done much of any work with SQL Server before, so any advice is appreciated.
In an SSIS package, you can write code (e.g. C#) to do your own custom data transformations. However, SSIS comes with built-in transformations that may well cover your needs. SSIS is very powerful and flexible; you can do pretty much anything you want with the data in it.
The high-level workflow for your task could look like the following:
1. Connect to the data source and pull the data
2. Transform the data
3. Output data to the destination data source
You certainly can split a data flow into two separate branches and send it to two destinations. All you need to do is put a Multicast transformation in the data flow; the bulk of the transformations then happen after that.
From what you've said, however, a better solution might be to use the Access tables as a staging database and then grab the data from there and send it to SQL Server. That would mean two data flows but it will be a cleaner implementation.
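For the staging-table-plus-stored-procedure variant the question mentions, the split itself is plain set-based T-SQL; the table and column names below are hypothetical:

    -- SSIS lands everything in one wide staging table; a procedure then normalizes it.
    INSERT INTO dbo.Customer (CustomerName)
    SELECT DISTINCT s.CustomerName
    FROM dbo.Staging AS s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.Customer AS c WHERE c.CustomerName = s.CustomerName);

    INSERT INTO dbo.[Order] (CustomerId, OrderDate, Amount)
    SELECT c.CustomerId, s.OrderDate, s.Amount
    FROM dbo.Staging AS s
    JOIN dbo.Customer AS c ON c.CustomerName = s.CustomerName;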

Methods of maintaining sample data in a database

Firstly, let me apologize for the title, as it probably isn't as clear as I think it is.
What I'm looking for is a way to keep sample data in a database (SQL Server 2005, 2008, and Express) that gets modified every so often. At present I have a handful of scripts to populate the database with a specific set of data, but every time the database changes, all the scripts have to be more or less rewritten, and I was looking for some alternatives.
I've seen a number of tools and other software for creating sample data in a database, some free and some not. Are there any other methods I haven’t considered?
Thanks in advance for any input.
Edit: Also, if anyone has any advice at all in dealing with keeping data in sync with a changing application or database, that would be of some help as well.
If you are looking for tools for SQL Server, go visit Red Gate Software; they have the best tools. They have a data compare tool that you can use to keep lookup-type tables up to date, and a SQL compare tool that you can use to keep tables synced between two databases. So, using SQL Data Compare, create a database with all the sample data you want; then periodically refresh your testing db (or your prod db, if these are strictly lookup-type tables) using the compare tool.
I also like the alternative of having a script (you can use Red Gate's tools to create scripts), because it means you can store this info in your source control and use it as part of a deployment package to other servers.
You could save them in another database, or in the same db in different tables distinguished by name, like employee_test.
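For instance (a one-line sketch):

    SELECT * INTO dbo.employee_test FROM dbo.employee;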
Joseph,
Do you need to keep just the data in sync, or the schema as well?
One solution to the data question would be SQL Server snapshots. You create a snapshot of your initial configuration, so any changes to the "real" database don't show up in the snapshot. Then, when you need to reset the table, select from the snapshot into a new table. I'm not sure how it will work if the schema changes, but it might be worth a try.
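A minimal sketch of that idea (all names hypothetical; the logical NAME must match the source database's data file, and FILENAME points at the sparse file the snapshot uses):

    CREATE DATABASE SampleDb_Snap
    ON (NAME = SampleDb_Data, FILENAME = 'C:\Snaps\SampleDb_Snap.ss')
    AS SNAPSHOT OF SampleDb;

    -- Later, recover a table's initial rows from the snapshot.
    SELECT * INTO dbo.Employee_Reset
    FROM SampleDb_Snap.dbo.Employee;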
For generation of sample data, the Database project in Visual Studio has functionality that will create fake/random data.
Let me know if this makes sense.
Erick