Allow a newly created table to inherit constraints from the parent table - sql

I have recently been given development tasks at my job after a couple of years providing customer training/service/support (yay, the long run paying off!). I pointed out to my supervisor, and he agreed, that we need to add a couple of tables to implement a new feature and stay consistent with the way our front-end application handles the information in the database.
Currently, information is stored all together in one table until it is approved by an end user to be permanently added to the database. The information can be one of three types; two of the three are special cases that require a second level of finalization from an end user before being moved to their final location. For those two types, all of the other bits and pieces of information that supplement the primary information are stored in tables separate from the primary piece.
My question is this: these secondary tables where the information temporarily resides before being finalized are basically all foreign keys. I was looking at creating the new set of tables from the existing initial tables. Is there a way, during that creation, for the new secondary table to inherit the constraints that come with the initial table's columns?
While both of these particular tables are going to be small and I don't mind manually writing the script to add the constraints after the table is created, it seems like this would be valuable information to know in the future. I've looked through Stack Overflow, but all of the questions that are remotely similar are about a different version of SQL.
Additionally, this would have to work all the way back through SQL Server 2008, as we have not stopped supporting 2008 yet.

If you go into Management Studio, right-click the table and choose "Script Table as >" then "CREATE To >". The generated script includes all of the constraints.
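For the specific case of building the new table from an existing one: as far as I know, SELECT INTO copies only the column definitions (not keys, defaults, or foreign keys), so the constraints have to be re-created afterwards - either from the script SSMS generates as above, or by hand. A minimal sketch with made-up table names, which works back to SQL Server 2008:

    -- Structure-only copy: column definitions come across, constraints do not.
    SELECT *
    INTO dbo.SecondaryInfo_New
    FROM dbo.SecondaryInfo
    WHERE 1 = 0;   -- no rows, just the shape of the table

    -- Re-create the constraints the original table carried.
    ALTER TABLE dbo.SecondaryInfo_New
        ADD CONSTRAINT FK_SecondaryInfoNew_PrimaryInfo
        FOREIGN KEY (PrimaryInfoId) REFERENCES dbo.PrimaryInfo (PrimaryInfoId);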

Related

I'm a new CDS/Dataverse user and am wondering why there are so many columns in new tables?

I'm new to CDS/Dataverse, coming from the SQL Server world. I created a new Dataverse table and there are over a dozen columns in my "new" table (e.g. "status", "version number"). Apparently these are added automatically. Why is this?
Also, there doesn't seem to be a way to view a grid of data (like I can with SQL Server) for quick review/modification of the data. Is there a way to view data visually like this?
Any tips for a new user, coming from SQL Server, would be appreciated. Thanks.
Edit: clarified the main question with examples (column names). (thanks David)
I am also new to CDS/Dataverse, so the following is a limited understanding from what I have explored so far.
The idea behind Dataverse is that it gives you a pre-built schema that follows best practice for you to build off of, so that you spend less time worrying about designing a comprehensive data schema, creating tables, and relating them all together, and more time building applications in Power Apps.
For example, amongst the several dozen tables it generates from the get-go is Account and Contact. The former is for organisational entities and the latter is for single-person entities. You can go straight into adding your user records in one of these tables and take advantage of bits of Power Apps functionality already hooked up to these tables. You do not have to spend time thinking up column names, creating the table, making sure it hooks up to all the other Dataverse tables, testing whether the Power Apps functionality works with it correctly etc.
It is much the same story with the automatically generated columns for new tables: they are all there to maintain a best-practice schema and functionality for Power Apps. For example, the extra columns give you good auditing of the data you add, including when a row was created or modified, who created it, etc. The important thing is to start from what you want to build, and not get too caught up in the extra tables/columns. After a bit of research, you'll probably find you can utilise some more of those tables/columns in your design.
Viewing and adding data is very tedious -- it seems to take 5 clicks and several seconds to load the bit of data you want, which is eons compared to doing it in SQL Server. I believe it is this way because of Microsoft's attempt to make it "user friendly".
Anyhow, the standard way to view data, starting from the main Power Apps view is:
From the right-hand side pane, click Data
Click Tables
From the list of tables, click your table
Along the top row, click Data
There is an alternative method that allows you to view the Dataverse tables in SSMS – see link below:
https://www.strategy365.co.uk/using-sql-to-query-the-common-data-service/
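For example, once the read-only SQL endpoint described in that link is set up for your environment, a query from SSMS might look something like this (the table and column names here are the standard Dataverse logical names for the built-in Account table - adjust for your own tables):

    SELECT TOP (100) accountid, name, createdon
    FROM account
    ORDER BY createdon DESC;

As far as I can tell this access is read-only, so changes still have to be made through the Power Apps UI or the API.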
To import data in bulk:
Click on Data from the top drop-down menu > Get data.
Importing data from Excel is free. Importing from other sources, including SQL Server, is, I believe, a paid service (although I think you may be able to do this on the free Community Plan).

Design a process to archive data (SQL Server 2005)

We're designing a process to archive a set of records based on different criteria like date, status, etc...
Simplified set of tables: Claim, ClaimDetails, StatusHistory (tracks changes in Claim status), Comments & Files
Environment: SQL Server 2005, ASP.Net MVC (.NET Framework v3.5 SP1)
Claim is the main entity and it has child row(s) in the sub-tables mentioned. Some hold details and others are used to track changes. Eventually, based on some criteria, a Claim becomes "ready to archive" as explained above. In simple terms, archived Claims will be identified in the database and treated differently in the web app.
Here's the simplest version (top-level view):
Create a script which marks a Claim "archived" in the database.
Archived row and its child row(s) can either be kept in the same table (set a flag) or moved to a different table which will be a replica of the original one.
In the web app we'll let the user filter Claims based on the archive status.
There might be a need to "unarchive" a Claim in the future.
What do I need to know?
We need to keep this as simple and easy as possible, and also flexible enough to adapt to future changes - please suggest an approach.
Is a regularly scheduled SQL script the best option for this scenario?
Should I consider using separate tables or just add an "Archived" flag to each table?
Regarding performance of the selected approach - what difference would it make if we plan to have about 10,000 Claims, and what if it's 1 million? In short, please mention both a light-load and a heavy-load approach.
We store uploaded files physically on our server - I believe this can stay as it is.
Note: a Claim can have any number of child records in all of the mentioned tables, so the volume grows n-fold.
Is there a standard framework or pattern I should refer to in order to learn about archiving processes? Any sample reference article or tool would help me learn more.
You can find some good discussion on this topic here and this is another one.
Having the archived data in a separate table gives you more flexibility, for example if you want to track the user who marked a claim as archived, the date when it was archived, or all the changes made to a claim after it was created.
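For the separate-table option, a rough T-SQL sketch of the move could look like the following, deleting children before the parent to respect the foreign keys (the archive table names and the hard-coded id are just illustrations):

    -- Assumes Claim_Archive etc. are replicas of the originals; add ArchivedOn /
    -- ArchivedBy columns to them if you want to track who archived what and when.
    DECLARE @ClaimID int;
    SET @ClaimID = 12345;   -- the claim identified as "ready to archive"

    BEGIN TRANSACTION;

    INSERT INTO dbo.Claim_Archive
    SELECT * FROM dbo.Claim WHERE ClaimID = @ClaimID;

    INSERT INTO dbo.ClaimDetails_Archive
    SELECT * FROM dbo.ClaimDetails WHERE ClaimID = @ClaimID;

    -- ...repeat for StatusHistory, Comments and Files...

    -- Remove the originals, children first so the foreign keys are not violated.
    DELETE FROM dbo.ClaimDetails WHERE ClaimID = @ClaimID;
    DELETE FROM dbo.Claim WHERE ClaimID = @ClaimID;

    COMMIT TRANSACTION;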

Database and application design - removing constraints?

I'm working on a Compact Framework app running on Windows Mobile. It's to be used by delivery drivers to tell them their next job and track spending etc. I have a SQL CE database on the mobile devices and SQL Server on the server. After struggling with major performance and configuration problems with the Sync Framework I ended up writing my own sync code using WCF. This works well and is a lot faster than the Sync Framework but I've been asked to speed it up further. Now we get into the details of the problem. Hopefully I can explain this clearly.
The synchronisation works one table at a time and is only one-way. Updates are sent from the server to the PDA only. Data travelling back to the server is handled in a completely different way. First of all I delete any records on the PDA that have been removed from the server. Because of database constraints I have to delete from 'child' tables before deleting from 'parent' tables, so I work up the hierarchy from the bottom, e.g. I delete records from the invoice table before deleting from the products table.
Next I add new records to the PDA that have been added on the server. In this case I have to update the parent tables first, working down the hierarchy and updating child tables later.
The problem is that my boss doesn't like the fact that my app keeps a large table like the products table synchronised with the server when the delivery driver only needs the invoiceProduct table. The invoiceProduct table links the invoice and products tables together and contains some information about the product. By that I mean their database design is not normalised: the product name is duplicated, stored in the invoiceProduct table as well as the products table. Of course we all know this is poor design, but it seems they have done this to improve performance in this type of situation.
The obvious solution is to just remove the products table completely from the PDA database. However I can't do this because it is sometimes needed. Drivers have the ability to add a new product to an invoice on the fly. My boss suggests that they could just synchronise the large products table occasionally or when they try to add a product and find that it's not there.
This won't work with the current design because if an invoice is downloaded containing a new product that is not on the PDA, it will throw a database foreign key error.
Sorry about posting such a large message. Hopefully it makes sense. I don't want to remove my database constraints and mess up my nice data structure :(
You seem to be running into an architecture problem. I work on a product with a somewhat similar situation: a client-server application where the client loaded too much data that wasn't needed.
We used ADO.NET (DataSet) to reflect what the database has on the client side. The DataSet class is like an in-memory SQL Server CE.
Our company started getting bigger clients and our architecture wasn't fast enough to handle all the data.
In the past, we did the following. These are not quick solutions:
Remove the "most" of the constraints
on the client side
all the frequently used data still have constraint in the
dataset.
Create logic to load a subset of data instead of loading everything to the client. For example, we only load 7 days of work data instead of all of it (which is what we did in the past) - see the sketch after this list.
Denormalize certain data by adding new columns, so that we don't have to load extra data we don't need.
Certain data is only loaded when it is needed based on the client modules.
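For instance, the "subset of data" item above boiled down to queries of roughly this shape (the table and column names are made up):

    -- Only pull the last 7 days of work data down to the client,
    -- rather than the entire table.
    SELECT *
    FROM dbo.WorkOrders
    WHERE CreatedOn >= DATEADD(DAY, -7, GETDATE());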
As long as you keep your database constraints on SQL Server, you should have no data integrity issues. However, on the PDA side, you will need to do more testing to ensure your application runs properly.
This isn't an easy problem to solve when you already have an existing architecture. Hopefully these suggestions help you.
Add a created_on field to your products and keep track of when each PDA last synced. When an invoice is downloaded, check whether the product is newer than the last sync, and if it is, re-sync the PDA. It does not seem like it would screw up the DB too much?
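On the server side that check could be as simple as the following sketch (the names are illustrative, and in practice @LastSyncedOn would be passed in by the sync service rather than set by hand):

    -- Return only the products this PDA has not seen yet.
    DECLARE @LastSyncedOn datetime;
    SET @LastSyncedOn = '2010-01-01';   -- the stored timestamp of the PDA's last sync

    SELECT p.ProductID, p.ProductName, p.created_on
    FROM dbo.Products AS p
    WHERE p.created_on > @LastSyncedOn;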

Purging SQL Tables from large DB?

The site I am working on as a student will be redesigned and released in the near future, and I have been assigned the task of manually searching through every table in the DB the site uses to find tables we can consider for deletion. I'm doing the search through every HTML file's source code in Dreamweaver, but I was hoping there is an automated way to check my work. Does anyone have any suggestions as to how this is done in the business world?
If you search through the code, you may find SQL that is never used, because the users never choose those options in the application.
Instead, I would suggest that you turn on auditing on the database and log what SQL is actually used. For example in Oracle you would do it like this. Other major database servers have similar capabilities.
From the log data you can identify not only what tables are being used, but their frequency of use. If there are any tables in the schema that do not show up during a week of auditing, or show up only rarely, then you could investigate this in the code using text search tools.
Once you have candidate tables to remove from the database, and approval from your manager, don't just drop the tables. Re-create them as empty tables, or put one dummy record in each table with mostly null (or zero or blank) values in the fields, except for name and descriptive fields where you can put something like "DELETED" or "Report error DELE to support center". That way, the application won't fail with a hard error, and you have a chance of finding out what users are doing when they end up hitting these supposedly unused tables.
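If the database happens to be SQL Server rather than Oracle, a rough equivalent of this usage check is the index-usage DMV, which records reads and writes per table since the last service restart (so let it accumulate over a representative period first). Something along these lines, which I believe is correct, floats never-touched tables to the top:

    SELECT t.name AS TableName,
           SUM(ISNULL(s.user_seeks, 0) + ISNULL(s.user_scans, 0)
               + ISNULL(s.user_lookups, 0)) AS Reads,
           SUM(ISNULL(s.user_updates, 0))   AS Writes
    FROM sys.tables AS t
    LEFT JOIN sys.dm_db_index_usage_stats AS s
           ON s.object_id = t.object_id
          AND s.database_id = DB_ID()
    GROUP BY t.name
    ORDER BY Reads, Writes;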
Reverse engineer the DB (Visio, Toad, etc...), document the structure and ask designers of the new site what they need -- then refactor.
I would start by combing through the HTML source for keywords:
SELECT
INSERT
UPDATE
DELETE
...using grep/etc. None of these are HTML entities, and you can't reliably use table names because you could be dealing with views (assuming any exist in the system). Then you have to pore over the statements themselves to determine what is being used.
If [hopefully] functions and/or stored procedures were used in the system, most DBs have a reference feature to check for dependencies.
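On SQL Server, for example, the dependency catalog can be queried directly. Something like the following (with 'Customers' standing in for whichever table you're checking) lists the procedures, functions, and views that reference it; note that references made through dynamic SQL will not show up here:

    SELECT OBJECT_SCHEMA_NAME(d.referencing_id) AS SchemaName,
           OBJECT_NAME(d.referencing_id)        AS ReferencingObject
    FROM sys.sql_expression_dependencies AS d
    WHERE d.referenced_entity_name = 'Customers';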
This would be a good time to create a Design Document on a screen by screen basis, listing the attributes on screen & where the value(s) come from in the database at the table.column level.
Compile your list of tables used, and compare to what's actually in the database.
If the table names are specified in the HTML source (and if that's the only place they are ever specified!), you can do a Search in Files for the name of each table in the DB. If there are a lot of tables, consider using a tool like grep and creating a script that runs grep against the source code base (HTML files plus any others that can reference the table by name) for each table name.
Having said that, I would still follow Damir's advice and take a list of deletion candidates to the data designers for validation.
I'm guessing you don't have any tests in place around the data access or the UI, so there's no way to verify what is and isn't used. Provided that the data access is consistent, scripting will be your best bet. Have it search out the tables/views/stored procedures that are being called and dump those to a file to analyze further. That will at least give you a list of everything that is actually called from some place. As for if those pages are actually used anywhere, that's another story.
Once you have the list of the database elements that are being called, compare that with a list of the user-defined elements that are in the database. That will give you the ones that could potentially be deleted.
All that being said, if the site is being redesigned then a fresh database schema may actually be a better approach. It's usually less intensive to start fresh and import the old data than it is to find dead tables and fields.

Ideas for Combining Thousand Databases into One Database

We have a SQL server that has a database for each client, and we have hundreds of clients. So imagine the following: database001, database002, database003, ..., database999. We want to combine all of these databases into one database.
Our thoughts are to add a siteId column, 001, 002, 003, ..., 999.
We are exploring options to make this transition as smoothly as possible. And we would LOVE to hear any ideas you have. It's proving to be a VERY challenging problem.
I've heard of a technique that would create a view that would match and then filter.
Any ideas guys?
Create a client database id for each of the client databases. You will use this id to keep the data logically separated. This is the "site id" concept, but you can use a derived key (identity field) instead of manually creating these numbers. Create a table that has database name and id, with any other metadata you need.
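That lookup table might be as simple as this (the names are just a suggestion):

    CREATE TABLE dbo.ClientDatabase
    (
        SiteID       int IDENTITY(1, 1) NOT NULL PRIMARY KEY,  -- derived key used as the site id
        DatabaseName sysname NOT NULL UNIQUE,
        MigratedOn   datetime NULL                             -- any other metadata you need
    );

    INSERT INTO dbo.ClientDatabase (DatabaseName) VALUES ('database001');
    INSERT INTO dbo.ClientDatabase (DatabaseName) VALUES ('database002');
    -- ...and so on through database999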
The next step would be to create an SSIS package that gets the ID for the database in question and adds it to the tables that have to have their data separated out logically. You then can run that same package over each database with the lookup for ID for the database in question.
Once the data has a unique id per site and has been imported, you will have to alter your apps to fit the new schema (actually before, or you are pretty much screwed).
If you want to do this in steps, you can create views or functions in the different "databases" so the old client can still hit the client's data, even though it has been moved. This step may not be necessary if you deploy with some downtime.
The method I propose is fairly flexible and can be applied to one client at a time, depending on your client application deployment methodology.
Why do you want to do that?
You can read about Multi-Tenant Data Architecture and also listen to SO #19 (around 40-50 min) about this design.
The "site-id" solution is what's done.
Another possibility that may not work out as well (but is still appealing) is multiple schemas within a single database. You can pull common tables into a "common" schema, and leave the customer-specific stuff in customer-specific schemas. In some database products, however, each schema is -- effectively -- a separate database. In other products (Oracle, DB2, for example) you can easily write queries that work across multiple schemas.
Also note that -- as an optimization -- you may not need to add siteId column to EVERY table.
Sometimes you have a "contains" relationship. It's a master-detail FK, often defined with a cascade delete so that detail cannot exist without the parent. In this case, the children don't need siteId because they don't have an independent existence.
Your first step will be to determine if these databases even have the same structure. Even if you think they do, you need to compare them to make sure they do. Chances are there will be some that are customized or missed an upgrade cycle or two.
Now depending on the number of clients and the number of records per client, your tables may get huge. Are you sure this will not create a performance problem? At any rate you may need to take a fresh look at indexing. You may need a much more powerful set of servers, and may also need to partition by client anyway for performance.
Next, yes each table will need a site id of some sort. Further, depending on your design, you may have primary keys that are now no longer unique. You may need to redefine all primary keys to include the siteid. Always index this field when you add it.
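As a rough illustration of that step (the table, column, and constraint names are placeholders, and any foreign keys pointing at the old primary key would need to be dropped and re-created as well):

    -- Add the site id, index it, and widen the primary key so rows that were
    -- unique per client database stay unique in the merged database.
    ALTER TABLE dbo.Orders ADD SiteID int NOT NULL DEFAULT (0);

    CREATE INDEX IX_Orders_SiteID ON dbo.Orders (SiteID);

    ALTER TABLE dbo.Orders DROP CONSTRAINT PK_Orders;
    ALTER TABLE dbo.Orders ADD CONSTRAINT PK_Orders PRIMARY KEY (SiteID, OrderID);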
Now all your queries, stored procs, views, and UDFs will need to be rewritten to ensure that the siteid is part of them. Pay particular attention to any dynamic SQL. Otherwise you could be showing client A's information to client B. Clients don't tend to like that. We brought a client from a separate database into the main application one time (when they decided they no longer wanted to pay for a separate server). The developer missed just one place where client_id had to be added. Unfortunately, that sent emails to every client concerning this client's proprietary information, and to make matters worse, it was a nightly process that ran in the middle of the night, so it wasn't known about until the next day. (The developer was very lucky not to get fired.) The point is: be very, very careful when you do this, and test, test, test, and then test some more. Make sure to test all the automated behind-the-scenes stuff as well as the UI.
What I was explaining in Florence towards the end of last year applies if you have to keep the database names and the logical layer of the database the same for the application. In that case you'd do the following:
Collapse all the data into consolidated tables in one master, consolidated database (hereafter referred to as the consolidated DB).
Those tables would have to have an identifier like SiteID.
Create the new databases with the existing names.
Create views with the old table names which use row-level security to query the tables in the consolidated DB, but using the SiteID to filter.
Set up the databases for cross-database ownership chaining so that the service accounts can't "accidentally" query the base tables in the consolidated DB. Access must happen through the views or through stored procedures and other constructs that will enforce row-level security. Now, if it's the same service account for all sites, you can avoid the cross DB ownership chaining and assign the rights on the objects in the consolidated DB.
Rewrite the stored procedures to either handle the change (since they now refer to views and don't know to hit the base tables and include SiteID) or use INSTEAD OF triggers on the views to intercept update requests and put the appropriate site-specific information into the base tables.
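A sketch of what one of those per-client views might look like (all names here are illustrative - the point is that the old table name survives as a view that filters the consolidated table on SiteID):

    -- Runs inside the re-created client database, e.g. database001.
    CREATE VIEW dbo.Orders
    AS
    SELECT o.OrderID, o.CustomerID, o.OrderDate   -- the original columns
    FROM ConsolidatedDB.dbo.Orders AS o
    WHERE o.SiteID = 1;   -- this client's id from the lookup table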
If the data is large you could look at using a partitioned view. This would simplify your access code, as all you'd have to maintain is the view; however, if the data is not large, just add a column to identify the customer.
Depending on what the data is and your security requirements the threat of cross contamination may be a show stopper.
Assuming you have considered this and deem it "safe enough", you may need/want to create VIEWS or impose some other access control to prevent customers from seeing each other's data.
IIRC a product called "Trusted Oracle" had the ability to partition data based on such a key (about the time Oracle 7 or 8 was out). The idea was that any given query would automagically have "and sourceKey = #userSecurityKey" (or some such) appended. The feature may have been rolled into later versions of the popular commercial product.
To expand on Gregory's answer, you can also create a parent SSIS package that calls the package doing the actual moving inside a Foreach Loop container.
The parent package queries a config table and puts the result in an object variable. The Foreach Loop then uses this recordset to pass variables to the child package, such as the database name and any other details the package might need.
Your table could list all of your client databases and have a flag to mark when you are ready to move them. That way you are not sitting around running the SSIS package on 32,767 databases. I'm hooked on the Foreach Loop in SSIS.
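The config query feeding the Foreach Loop could be as simple as the following sketch (it assumes the lookup table from Gregory's answer with a ReadyToMove flag column added - both names are just illustrations):

    -- Only clients that have been cleared for migration get picked up by the loop.
    SELECT DatabaseName, SiteID
    FROM dbo.ClientDatabase
    WHERE ReadyToMove = 1;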