How to use Master Data Services data then? The MDS lifecycle - sql

I understand that we create a Master Data Services (MDS) database to make data clean, correct, consistent, etc. We can import some data there and then process it... and then what? Do we then export it to another database? So MDS is like a set of tools for cleaning and correcting your data, and it is for one-time use only, right? I mean: we have our data -> we load it into the MDS db with SSIS -> we process it, apply business rules, etc. -> we export it to our SQL Server db with SSIS -> then we can use it as we like.
Am I right?
I want to understand how MDS is used in practice: where the data comes from, and where it goes after MDS has processed it.
Thanks, and sorry if this is a dumb question.

Using Master Data Services in conjunction with Data Quality Services and Integration Services enables you to clean and normalize a data set, store it as your master (trusted) set of data, and let others in your organization view and share this trusted data set.
You may find this tutorial, last updated for SQL Server 2014, helpful in better understanding how to use these three technologies together -- https://technet.microsoft.com/en-us/library/jj819782(v=sql.120).aspx

Related

erasing all data and populate with dummy data

What is the best way to hand over a copy of a database as a backup file for outside maintenance of the application? The copy must not contain any sensitive data; it may only contain dummy data. What is an efficient, best-practice way to erase all data in the tables and populate them with dummy data? (SQL Server 2019)
This is not a trivial task. A 3rd party solution would probably be easiest.
There are several answers available here that discuss copying objects in SQL Server Management Studio. Example: Backup SQL Schema Only?.
If you have access to SQL Server Integration Services, you can copy selected objects using the Transfer SQL Server Objects Task. I tried this only once, a long time ago, so I can't describe in detail how it works.
Another option is to create a job that runs a copy-only backup, restores the database, and then runs a manual series of SQL queries to clear or mask sensitive data.
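The masking step in that last option can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for the restored SQL Server copy, and the customers table and its columns are hypothetical. On a real restored copy the same UPDATE pattern would be run for every table that holds sensitive data, before the masked copy is backed up again and handed over.

```python
import sqlite3


def mask_sensitive_data(conn: sqlite3.Connection) -> None:
    """Overwrite sensitive columns with dummy values derived from the row id."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            """
            UPDATE customers
            SET name  = 'Customer ' || id,
                email = 'user' || id || '@example.com',
                phone = NULL
            """
        )


# SQLite in-memory database standing in for the restored copy (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT, email TEXT, phone TEXT)"
)
conn.executemany(
    "INSERT INTO customers (name, email, phone) VALUES (?, ?, ?)",
    [
        ("Alice Real", "alice@corp.test", "555-0100"),
        ("Bob Real", "bob@corp.test", "555-0101"),
    ],
)
conn.commit()

mask_sensitive_data(conn)
masked_rows = conn.execute(
    "SELECT id, name, email, phone FROM customers ORDER BY id"
).fetchall()
```

Deriving the dummy values from the primary key keeps rows distinguishable and leaves keys and relationships intact, which tends to matter when the copy is used for maintenance or debugging.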

how to insert data in master data services programmatically

I'm trying out Microsoft Master Data Services and I would like to add data to the database programmatically. I'm starting to get the model/entity/member structure but I'm not yet sure. If you have a nice explanation for this structure, please share.
Say somebody added a new employee in an ERP system and I would like to send that to the MDS. How would I do that? Is the data that I want to add a new member? Because if I look at the following information (http://technet.microsoft.com/en-us/library/hh230995), the only way to import data is through entities?
Thanks in advance for any useful information about this!
Let's start with the basics.
Entities in Master Data Services (MDS) are roughly analogous to tables in a regular database.
Every entity must live in a model.
A model can contain any number of entities.
The Metadata* methods you see on that page can be used to create, read and update models and entities. Once you have modeled your ERP tables as an MDS model, you can use the EntityMembersCreate API (with the relevant model/entity information) to create a member (roughly analogous to a row in a table). You can use EntityMembersUpdate to update members and EntityMembersDelete to delete them.
Another way to get large amounts of data into MDS is by using Entity Based Staging. Entity Based Staging allows you to use tools like SSIS to get bulk data into MDS. A good primer here: http://msdn.microsoft.com/en-us/sqlserver/hh802433.aspx.
I hope this helps. Feel free to ask more questions.
I like using a generic data-access-object that classes in my model inherit from. Each class has a one to one relationship with tables in the database.
We're using SSIS to replicate data from our CRM (as well as other data sources) into our MDS (for the time being). If you're not familiar with the tool, I'd recommend it for moving data around; it's relatively easy to pick up the basics. If you go this route, here's a great resource I followed to push data into our MDS system:
http://www.sqlchick.com/entries/2013/2/16/importing-data-into-master-data-services-2012-part-2.html

Will SSIS work well for importing to multiple tables?

I won't have access to SSIS until tomorrow so I thought I'd ask for advice before I start work on this project.
We currently use Access to store our data. It's not stored in a relational format so it's an awful mess. We want to move to a centralized database (SQL Server 2008 R2), which would require rewriting much of our codebase (which, incidentally, is also an awful mess.) Due to a time constraint, well before that can be done we are going to need to get a centralized database set up solely for the purpose of on-demand report generation for a client. So, our applications will still be running on Access. Instead of:
Receive data -> Import to Access initial file with one table -> Data processing -> Access result file with one table -> Report generation
The goal is:
Receive data -> Import to Access initial file with one table -> Import initial data to multiple tables in SQL Server -> Export Access working file with one table -> Data processing -> Access result file -> Import result to multiple tables in SQL Server -> Report generation whenever
We're going to use SSRS for the reporting component, which seems like it'll be straightforward enough. I'm not sure if SSIS alone would work well for splitting the Access data up into numerous tables, or if everything should be imported into a staging table with SSIS and then split up with stored procedures, or if I'll need to be writing a standalone application for this.
Haven't done much of any work with SQL Server before, so any advice is appreciated.
In an SSIS package, you can write code (e.g. C#) to do your own custom data transformations. However, SSIS also comes with built-in transformations that may be good enough for your needs. SSIS is very powerful and flexible; you can do pretty much anything you want with the data.
The high-level workflow for your task could look like the following:
1. Connect to the data source and pull the data
2. Transform the data
3. Output data to the destination data source
You certainly can split a data flow into two separate branches and send it to two destinations. All you need to do is put a Multicast transformation in the data flow; the bulk of the transformations will then happen after it.
From what you've said, however, a better solution might be to use the Access tables as a staging database and then grab the data from there and send it to SQL Server. That would mean two data flows but it will be a cleaner implementation.
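The "stage first, then split" idea can be sketched as follows. This is a minimal illustration, not an SSIS package: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table and column names (staging_results, clients, results) are hypothetical. The flat rows land in a single staging table, and set-based INSERT ... SELECT statements then distribute them into the relational tables, which is roughly what a stored procedure would do after SSIS has loaded the staging table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Flat table mirroring the single Access table (hypothetical columns).
    CREATE TABLE staging_results (client_name TEXT, report_date TEXT, amount REAL);

    -- Normalized target tables.
    CREATE TABLE clients (
        client_id INTEGER PRIMARY KEY,
        name      TEXT UNIQUE NOT NULL
    );
    CREATE TABLE results (
        result_id   INTEGER PRIMARY KEY,
        client_id   INTEGER NOT NULL REFERENCES clients(client_id),
        report_date TEXT NOT NULL,
        amount      REAL NOT NULL
    );
    """
)


def load_staging(conn, rows):
    """Step 1: bulk-load the flat rows as-is."""
    with conn:
        conn.executemany("INSERT INTO staging_results VALUES (?, ?, ?)", rows)


def split_staging(conn):
    """Step 2: split staged rows into the relational tables, then clear staging."""
    with conn:
        # Parent rows first, skipping clients that already exist.
        conn.execute(
            """
            INSERT INTO clients (name)
            SELECT DISTINCT s.client_name
            FROM staging_results s
            WHERE NOT EXISTS (SELECT 1 FROM clients c WHERE c.name = s.client_name)
            """
        )
        # Child rows, resolving the foreign key by joining on the natural key.
        conn.execute(
            """
            INSERT INTO results (client_id, report_date, amount)
            SELECT c.client_id, s.report_date, s.amount
            FROM staging_results s
            JOIN clients c ON c.name = s.client_name
            """
        )
        conn.execute("DELETE FROM staging_results")


load_staging(
    conn,
    [
        ("Acme", "2024-01-01", 10.0),
        ("Acme", "2024-01-02", 20.0),
        ("Globex", "2024-01-01", 5.0),
    ],
)
split_staging(conn)

client_count = conn.execute("SELECT COUNT(*) FROM clients").fetchone()[0]
result_count = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
staging_count = conn.execute("SELECT COUNT(*) FROM staging_results").fetchone()[0]
```

Loading parents before children and clearing the staging table inside the same transaction means a failed run leaves the target tables untouched and the staged rows available for a retry.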

Methods of maintaining sample data in a database

Firstly, let me apologize for the title, as it probably isn't as clear as I think it is.
What I'm looking for is a way to keep sample data in a database (SQL Server 2005, 2008 and Express) that gets modified every so often. At present I have a handful of scripts to populate the database with a specific set of data, but every time the database is changed all the scripts have to be more or less rewritten, and I was looking for some alternatives.
I've seen a number of tools and other software for creating sample data in a database, some free and some not. Are there any other methods I haven’t considered?
Thanks in advance for any input.
Edit: Also, if anyone has any advice at all in dealing with keeping data in sync with a changing application or database, that would be of some help as well.
If you are looking for tools for SQL Server, go visit Red Gate Software; they have the best tools. They have a data compare tool that you can use to keep lookup-type tables up to date and a SQL compare tool that you can use to keep the tables synced between two databases. So, using SQL Data Compare, create a database with all the sample data you want. Then periodically refresh your testing db (or your prod db, if these are strictly lookup-type tables) using the compare tool.
I also like the alternative of having a script (you can use Red Gate's tool to create scripts) because that means you can store this info in your source control and use it as part of a deployment package to other servers.
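As a rough sketch of what such a source-controlled script could look like: the example below keeps the sample rows as plain data keyed by column name and applies them idempotently, so adding a column with a default to a table does not force a rewrite of the script. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the order_status table, the SEED structure, and apply_seed are all hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical sample data kept in source control: table name -> list of rows.
# Table and column names come from this trusted file, never from user input.
SEED = {
    "order_status": [
        {"code": "NEW", "label": "New"},
        {"code": "SHIPPED", "label": "Shipped"},
    ],
}


def apply_seed(conn, seed):
    """Insert or overwrite each seed row, matching on the table's primary key."""
    with conn:
        for table, rows in seed.items():
            for row in rows:
                columns = ", ".join(row)
                placeholders = ", ".join("?" for _ in row)
                conn.execute(
                    f"INSERT OR REPLACE INTO {table} ({columns}) "
                    f"VALUES ({placeholders})",
                    tuple(row.values()),
                )


conn = sqlite3.connect(":memory:")
# sort_order plays the role of a column added after the seed data was written.
conn.execute(
    "CREATE TABLE order_status ("
    "code TEXT PRIMARY KEY, label TEXT, sort_order INTEGER DEFAULT 0)"
)

apply_seed(conn, SEED)
apply_seed(conn, SEED)  # re-running does not create duplicates

status_rows = conn.execute(
    "SELECT code, label, sort_order FROM order_status ORDER BY code"
).fetchall()
```

Because each row names only the columns it cares about, schema changes that add defaulted columns leave the seed file untouched; renamed or dropped columns still need a matching edit to the data.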
You could save them in another database or the same db in different tables distinguished by the name, like employee_test
Joseph,
Do you need to keep just the data in sync, or the schema as well?
One solution to the data question would be SQL Server snapshots. You create a snapshot of your initial configuration, so any changes to the "real" database don't show up in the snapshot. Then, when you need to reset the table, select from the snapshot into a new table. I'm not sure how it will work if the schema changes, but it might be worth a try.
For generation of sample data, the Database project in Visual Studio has functionality that will create fake/random data.
Let me know if this makes sense.
Erick

Create a database from another database?

Is there an automatic way in SQL Server 2005 to create a database from several tables in another database? I need to work on a project and I only need a few tables to run it locally, and I don't want to make a backup of a 50 gig DB.
UPDATE
I tried Tasks -> Export Data in Management Studio, and while it created a new database with the tables I wanted, it did not copy over any table metadata, i.e. no PK/FK constraints and no identity data (even with Preserve Identity checked).
I obviously need these for it to work, so I'm open to other suggestions. I'll try that database publishing tool.
I don't have Integration Services available, and the two SQL Servers cannot directly connect to each other, so those are out.
Update of the Update
The Database Publishing Tool worked. The SQL it generated was slightly buggy (it tried to reference nonexistent triggers), so a little hand editing was needed, but once I did that I was good to go.
You can use the Database Publishing Wizard for this. It will let you select a set of tables with or without the data and export it into a .sql script file that you can then run against your other db to recreate the tables and/or the data.
Create your new database first. Then right-click on it and go to the Tasks sub-menu in the context menu. You should have some kind of import/export functionality in there. I can't remember exactly since I'm not at work right now! :)
From there, you will get to choose your origin and destination data sources and which tables you want to transfer. When you select your tables, click on the advanced (or options) button and select the check box called "preserve primary keys". Otherwise, new primary key values will be created for you.
I know this method can hardly be called automatic but why don't you use a few simple SELECT INTO statements?
Because I'd have to reconstruct the schema, constraints and indexes first. That's the part I want to automate... Getting the data is the easy part.
Thanks for your suggestions everyone, looks like this is easy.
Integration Services can help accomplish this task. This tool provides advanced data transformation capabilities, so you will be able to get the exact subset of data that you need from a large database.
Assuming that such data is needed for testing/debugging, you may consider applying Row Sampling to reduce the amount of data exported.
1. Create a new database
2. Right-click on it
3. Tasks -> Import Data
4. Follow the instructions