What is the danger of leaving constraint errors in a database? - sql

I had to remove a few thousand records from a table which had many FK constraints.
I disabled constraint enforcement for all tables, deleted the records, fixed some
obvious violations which the DBCC check showed me, and then re-enabled the constraints.
Unfortunately, after that the DBCC check still shows some errors, but at this point I have no time
to investigate why. My question is then as in the title, maybe a stupid one: what can happen if I leave the database with constraint errors for a while? Will the app which uses this DB be affected?
Can I defer fixing the constraints? (I use SQL Server 2008)
Thanks

You'll have data that is lacking integrity (has some invalid information). This will not make the database crash or produce errors, but how the application will react to this depends entirely on the application. You might see some error pages, or even worse, the invalid data will propagate to produce even more invalid data or actions (that could be even harder to track down and fix afterwards).
The point of these constraints is to make the database validate (or reject) the data, so that the application can rely on certain invalid patterns not occurring in the data. Many applications are built to not use these assurances and handle data integrity themselves, but if your application did depend on them, and then you pull them out from under it, that sounds a bit dangerous.
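As a side note: when constraints are re-enabled without revalidating the existing rows, SQL Server marks them as "not trusted". A minimal sketch of how you might find and revalidate them once you do have time (the table and constraint names are placeholders, not taken from the question):

    -- List FK constraints that were re-enabled without revalidation
    SELECT name, is_not_trusted
    FROM sys.foreign_keys
    WHERE is_not_trusted = 1;

    -- Re-enable and revalidate one constraint against the existing data
    -- (this fails if rows violating the FK are still present)
    ALTER TABLE dbo.SomeTable
        WITH CHECK CHECK CONSTRAINT FK_SomeTable_Parent;

    -- Report any remaining violations for the whole database
    DBCC CHECKCONSTRAINTS WITH ALL_CONSTRAINTS;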

Related

Do you need to fully validate data both in Database and Application?

For example, if I need to store a valid phone number in a database, should I fully validate the number in SQL, or is it enough if I fully validate it in the app, before inserting it in the db, and just add some light validation in SQL constraints (like having the correct number of digits).
There is no correct answer to this question.
In general, you want the database to maintain data integrity -- and that includes valid values in columns. You want this for multiple reasons:
Databases are usually more efficient, because they are on multi-threaded servers.
Databases can handle concurrent threads (not an issue for a check constraint, but an issue for other types of constraints).
Databases ensure the data integrity regardless of how the data is changed.
A check constraint (presumably what you want) is part of the data definition and applies to all inserts and updates. Such operations might occur in multiple places in the application.
The third piece is important. If you want to ensure that a phone number looks like a phone number, then you don't want someone to change it accidentally using update.
However, there might be checks that are simpler in the application. Or that might only apply when a new row is inserted, but not later updated. Or, that you want only to apply to data that comes in from the application (as opposed to manual changes). So, there are reasons why you might not want to do all checks in the database.
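As an illustration of the check-constraint route, a minimal sketch for the phone-number case, assuming for the example a plain 10-digit format (the table, column, and constraint names are invented):

    -- Light validation in the database: digits only, correct length.
    -- The application can still do richer validation (formatting, area codes, etc.).
    ALTER TABLE dbo.Customers
        ADD CONSTRAINT CK_Customers_Phone
        CHECK (Phone NOT LIKE '%[^0-9]%' AND LEN(Phone) = 10);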
You definitely have to validate incoming data at your backend before e.g. doing CRUD operations on your database, since client-side validation could be omitted or even faked. It is considered good practice to validate input data at the client as well, but you should never, ever trust the client.

Create trigger upon each table creation in SQL Server 2008 R2

I need to create an Audit table that is going to track the actions (insert, update, delete) on my tables in the database and add a new row with the date, row ID, table name and a few more details, so I will know what action happened and when.
So basically, from my understanding, I need a trigger for each table which is going to track insert/update/delete, and a trigger on the database which is going to track new table creation.
My main problem is understanding how to connect those things, so that when a new table is created, a trigger is created for that table which tracks the actions and adds new rows to the Audit table as needed.
Is it possible to make a DDL trigger for CREATE_TABLE and, inside of it, create another trigger for insert/update/delete?
What you're hoping for is not possible. And I'd strongly advise that you'd be better off thinking about what you really want to achieve at a business level with auditing. It will yield a much simpler and more practical solution.
First up
...trigger on the database which is going to track new table creation.
I cannot stress enough how terrible this idea is. Who exactly has such unfettered access to your database that they can create tables without going through code review and QA, which should of course be the gated pathway towards production? Once you realise that schema changes should not happen ad hoc, it's patently obvious that you don't need triggers (which are by their very nature reactive) to do something because the schema changed.
Even if you could write such triggers: it's at a meta-programming level that simply isn't worth the effort of trying to foresee all possible permutations.
Better options include:
Requirements assessment and acceptance: This is new information in the system. What are the audit requirements?
Design review: New table; does it need auditing?
Test design: How do you test the audit requirements?
Code Review: You've added a new table. Does it need auditing?
Not to mention features provided by tools such as:
Source Control.
Db deployment utilities (whether home-grown or third party).
Part two
... a trigger will be created for that table which is going to track the actions and add new rows for the Audit table as needed.
I've already pointed out why doing the above automatically is a terrible idea. Now I'm going a step further to point out that doing the above at all is also a bad idea.
It's a popular approach, and I'm sure to get some flak from people who've nicely compartmentalised their particular flavour of it, swearing blind how much time it "saves" them. (There may even be claims to it being a "business requirement", which I can assure you is more likely a misstated version of the real requirement.)
There are fundamental problems with this approach:
It's reactive instead of proactive. So it usually lacks context.
You'll struggle to audit attempted changes that get rolled back. (Which can be a nightmare for debugging and usually violates real business audit requirements.)
Interpreting the audit data will be a nightmare because it's just raw data. The information is lost in the detail.
As columns are added/renamed/deleted your audit data loses cohesion. (This is usually the least of problems though.)
These extra tables that always get updated as part of other updates can wreak havoc on performance.
Usually this style of auditing involves: every time a column is added to the "base" table, it's also added to the "audit" table. (This ultimately makes the "audit" table very much like a poorly architected persistent transaction log.)
Most people following this approach overlook the significance of NULLable columns in the "base" tables.
I can tell you from first-hand experience, interpreting such audit trails in any but the simplest of cases is not easy. The amount of time wasted is ridiculous: investigating issues, training others to be able to interpret them correctly, writing utilities to try to make working with these audit trails less painful, painstakingly documenting findings (because the information is not immediately apparent in the raw data).
If you have any sense of self-preservation you'll heed my advice.
Make it great
(Sorry, couldn't resist.)
A better approach is to proactively plan for what needs auditing. Push for specific business requirements. Note that different cases may need different auditing techniques:
If user performs action X, record A details about the action for legal traceability.
If user attempts to do Y but is prevented by system rules, record B details to track rule system integrity.
If user fails to log in, record C details for security purposes.
If system is upgraded, record D details for troubleshooting.
If certain system events occur, record E details ...
The important thing is that once you know the real business requirements, you won't be saying: "Uh, let's just track everything. It might be useful." Instead you'll:
Be able to produce a cleaner more appropriate and reliable design for each distinct kind of auditing.
Be able to test that it behaves as required!
Be able to use the audit data more easily whenever it's needed.
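To make that concrete, here is a minimal sketch of what purpose-built auditing for the first case might look like (the table, columns and procedure are invented for the example, not prescribed):

    -- Audit table designed around one specific requirement:
    -- "if user performs action X, record details for legal traceability".
    CREATE TABLE dbo.ActionXAudit (
        AuditId     INT IDENTITY(1,1) PRIMARY KEY,
        PerformedAt DATETIME2     NOT NULL DEFAULT SYSUTCDATETIME(),
        PerformedBy SYSNAME       NOT NULL DEFAULT SUSER_SNAME(),
        TargetId    INT           NOT NULL,   -- the business entity acted on
        Details     NVARCHAR(400) NULL        -- context captured at the point of action
    );
    GO

    -- The procedure that performs action X writes the audit row itself, so the
    -- entry carries full business context rather than raw column changes.
    CREATE PROCEDURE dbo.PerformActionX
        @TargetId INT,
        @Details  NVARCHAR(400)
    AS
    BEGIN
        SET NOCOUNT ON;
        BEGIN TRANSACTION;
            -- ... the actual work of action X goes here ...
            INSERT INTO dbo.ActionXAudit (TargetId, Details)
            VALUES (@TargetId, @Details);
        COMMIT TRANSACTION;
    END;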

Multiple application on network with same SQL database

I will have multiple computers on the same network with the same C# application running, connecting to a SQL database.
I am wondering if I need to use Service Broker to ensure that if I update record A in table B on Machine 1, the change is pushed to Machine 2. I have seen applications that need messaging servers to accomplish this before, but I was wondering why this is necessary; surely, if they connect to the same database, any changes from one machine will be reflected on the other?
Thanks :)
This is mostly about consistency and latency.
If your applications always perform atomic operations on the database, and they always read whatever they need with no caching, everything will be consistent.
In practice, this is seldom the case. There are plenty of hidden opportunities for caching, like when you have an edit form - it holds the values the entity had before you started the edit process, but what if someone modified those in the meantime? You'd just overwrite their changes with your data.
Solving this is a bunch of architectural decisions. Different scenarios require different approaches.
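One common way of handling that stale-edit-form case, sketched here purely as an illustration (the table, columns and rowversion approach are assumptions, not something from the question), is optimistic concurrency:

    -- One rowversion column per table; SQL Server bumps it automatically on every change.
    ALTER TABLE dbo.Widgets ADD RowVer ROWVERSION;
    GO

    -- The save only succeeds if the row is still at the version the edit form loaded.
    CREATE PROCEDURE dbo.SaveWidget
        @WidgetId       INT,
        @Name           NVARCHAR(100),
        @OriginalRowVer BINARY(8)          -- value read when the form was opened
    AS
    BEGIN
        SET NOCOUNT ON;

        UPDATE dbo.Widgets
        SET    Name = @Name
        WHERE  WidgetId = @WidgetId
          AND  RowVer   = @OriginalRowVer;

        IF @@ROWCOUNT = 0
            RAISERROR('Row was changed by another user; reload and retry.', 16, 1);
    END;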
Once data is committed in the database, everyone reading it will see the same thing - but only if they actually get around to reading it, and the two reads aren't separated by another commit.
Update notifications are mostly concerned with invalidating caches, and perhaps some push-style processing (e.g. IM client might show you a popup saying you got a new message). However, SQL Server notifications are not reliable - there is no guarantee that you'll get the notification, and even less so that you'll get it in time. This means that to ensure consistency, you must not depend on the cached data, and you have to force an invalidation once in a while anyway, even if you didn't get a change notification.
Remember, even if you're actually using a database that's close enough to ACID, it's usually not the default setting (for performance and availability, mostly). You need to understand what kind of guarantees you're getting, and how to write code to handle this. Even the most perfect ACID database isn't going to help your consistency if your application introduces those inconsistencies :)

SQL FK and Pre-Checking for Existence

I was wondering what everyone's opinion was with regard to pre-checking foreign key lookups before INSERTs and UPDATEs versus letting the database handle it. As you know, the server will throw an exception if the corresponding row does not exist.
Within .NET we always try to avoid Exception coding in the sense of not using raised exceptions to drive code flow. This means we attempt to detect potential errors before the run-time does.
With SQL I see two opposite points
1) Whether you check or not the database always will. This means that you could be wasting (how much is subjective) CPU cycles doing the same check twice. This makes one lean towards letting the database do it only.
2) Pre-checking allows the developer to raise more informative exceptions back to the calling application. Instead of receiving the generic "foreign key violation" one could return different error codes for each check that needs to be done.
What are your thoughts?
Don't test before:
the DB engine will check anyway on INSERT (you have 2 reads of the index, not one)
it won't scale without lock hints or semaphores, which reduce concurrency and performance (a 2nd overlapping concurrent call can pass the EXISTS before the first call does an INSERT)
What you can do is to wrap the INSERT in its own TRY/CATCH and ignore error xxxx (foreign key violation, sorry don't know it). I've mentioned this before (for unique keys, error 2627)
Only inserting a row if it's not already there
Select / Insert version of an Upsert: is there a design pattern for high concurrency?
SQL Server 2008: INSERT if not exits, maintain unique column
This scales very well to high volumes.
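A rough sketch of that pattern (constraint violations, including FK violations, surface as error 547 in SQL Server; the table, procedure and messages below are invented for the example):

    -- Let the FK do the check; translate only that specific failure into a
    -- friendlier error, and re-raise anything else unchanged.
    CREATE PROCEDURE dbo.AddOrderLine
        @OrderId   INT,
        @ProductId INT,
        @Qty       INT
    AS
    BEGIN
        SET NOCOUNT ON;
        BEGIN TRY
            INSERT INTO dbo.OrderLines (OrderId, ProductId, Qty)
            VALUES (@OrderId, @ProductId, @Qty);
        END TRY
        BEGIN CATCH
            IF ERROR_NUMBER() = 547   -- constraint conflict (includes FK violations)
                RAISERROR('Unknown OrderId or ProductId supplied.', 16, 1);
            ELSE
            BEGIN
                -- SQL Server 2008 has no THROW; re-raise the original message text.
                DECLARE @msg NVARCHAR(2048);
                SET @msg = ERROR_MESSAGE();
                RAISERROR(@msg, 16, 1);
            END
        END CATCH;
    END;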
Data integrity maintenance is the database's job, so I would say you let the DB handle it. A raised exception in this case is valid; even though it could be avoided, it is a correctly raised exception, because it means something in the code didn't work right - it is sending an orphaned record for insert (or something failed in the earlier insert, however you are inserting it). Besides, you should have try/catch anyway, so you can implement a meaningful way to handle this...
I don't see the benefit of pre-checking for FK violations.
If you want more informative error statements, you can simply wrap your insert in a try-catch block and return custom error messages at that point. That way you're only running the extra queries on failure rather than every time.

How to rollback a database deployment without losing new data?

My company uses virtual machines for our web/app servers. This allows for very easy rollbacks of a deployment if something goes wrong. However, if an app server deployment also requires a database deployment and we have to rollback I'm kind of at a loss. How can you rollback database schema changes without losing data? The only thing that I can think of is to write a script that will drop/revert tables/columns back to their original state. Is this really the best way?
But if you do drop columns then you will lose data, since those columns/tables will (presumably) contain some data by then. And since rollbacks are usually temporary - a bug is found, a rollback is made to keep things going while it's fixed, and then more or less the same changes are re-installed - the users could get quite upset if you lost that data and they had to re-enter it once the system was fixed.
I'd suggest that you only allow additions of tables and columns, no alterations or deletions; then you can roll back just the code and leave the data as is. If you have a lot of rollbacks you might end up with some unused columns, but it shouldn't happen that often that someone adds a table/column by mistake, and in that case the DBA can remove them manually.
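For illustration, an additive-only change might look like the following (the table and column names are invented; the point is that old code keeps working against the new schema, so rolling back the application requires no schema rollback):

    -- Additive change: a new nullable column plus a new table.
    -- Existing code that knows nothing about these keeps working, so rolling
    -- the application back does not require touching the schema.
    ALTER TABLE dbo.Orders
        ADD DeliveryNote NVARCHAR(200) NULL;

    CREATE TABLE dbo.OrderAttachments (
        AttachmentId INT IDENTITY(1,1) PRIMARY KEY,
        OrderId      INT NOT NULL
            REFERENCES dbo.Orders (OrderId),
        FileName     NVARCHAR(260) NOT NULL
    );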
Generally speaking, you cannot do this.
However, assuming that such a rollback makes sense, it implies that the data you are trying to retain is independent of the schema changes you'd like to revert.
One way to deal with it would be to:
back up only the data (as a script),
revert the schema to the old one and
restore the data
The above works well only if the schema changes do not invalidate the generated script (for example, changing the number of columns would be tricky).
This question has details on tools available in MS SQL for generating scripts.
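A minimal sketch of that sequence for a single table, with invented names and an assumed old table definition, purely to show the shape of the steps:

    -- 1. Back up only the data, using column names that are valid in the old schema.
    SELECT OrderId, CustomerId, OrderDate, Total
    INTO   dbo.Orders_DataBackup
    FROM   dbo.Orders;

    -- 2. Revert the schema: drop the new version and recreate the old definition
    --    (in practice by running the previous release's schema script).
    DROP TABLE dbo.Orders;
    CREATE TABLE dbo.Orders (
        OrderId    INT            NOT NULL PRIMARY KEY,
        CustomerId INT            NOT NULL,
        OrderDate  DATETIME       NOT NULL,
        Total      DECIMAL(10, 2) NOT NULL
    );

    -- 3. Restore the preserved data into the old schema.
    INSERT INTO dbo.Orders (OrderId, CustomerId, OrderDate, Total)
    SELECT OrderId, CustomerId, OrderDate, Total
    FROM   dbo.Orders_DataBackup;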