Securing a SQL Server database against clever admins?

I want to secure events stored in one table, which has relations to others.
Events are inserted through a Windows service that connects to the hardware and reads data from it.
The events table has a PK, a date and time, and 3 different values.
The problem is that any admin can log in and insert/update/delete data in this table, e.g. using SQL Management Studio. I created triggers to prevent UPDATE and DELETE, so an admin who doesn't know about the triggers will fail to change the data, but one who does can easily disable the triggers and do whatever he wants.
So after long thinking I have one idea: add a new column (field) to the table and store something like a checksum in it, calculated from the other values. This checksum would be generated in the INSERT/UPDATE statements.
If someone inserts/updates something manually I will know it, because when I check the data against the checksum there will be mismatches.
My question is: if you have had a similar problem, how did you solve it?
What algorithm should I use for the checksum? How do I secure against DELETE statements (I know about gaps in the PK numbering, but that is not enough)?
I'm using SQL Server 2005.
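Something like this is what I have in mind -- a simplified sketch with made-up table and column names (on 2005, HASHBYTES only gives me MD5/SHA1):

    -- The Windows service builds the hash itself before inserting, so the
    -- formula never has to exist on the server where an admin could read it.
    DECLARE @d datetime, @v1 int, @v2 int, @v3 int, @rowHash varbinary(20);
    SELECT @d = GETDATE(), @v1 = 10, @v2 = 20, @v3 = 30;

    SET @rowHash = HASHBYTES('SHA1',
          CONVERT(varchar(23), @d, 121)
        + '|' + CONVERT(varchar(12), @v1)
        + '|' + CONVERT(varchar(12), @v2)
        + '|' + CONVERT(varchar(12), @v3));

    INSERT INTO dbo.Events (EventTime, Value1, Value2, Value3, RowHash)
    VALUES (@d, @v1, @v2, @v3, @rowHash);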

As admins have permissions to do everything on your SQL Server, I recommend a tamper-evident auditing solution. In this scenario, everything that happens on a database or SQL Server instance is captured and saved in a tamper-evident repository. If someone with the privileges (like an admin) modifies or deletes audited data from the repository, it will be reported.
ApexSQL Comply is such a solution, and it has a built-in integrity check option.
There are several anti-tampering measures that provide different integrity checks and detect tampering even when it's done by a trusted party. To ensure data integrity, the solution uses hash values. A hash value is a numeric value created using a specific algorithm that uniquely identifies the data.
Every table in the central repository database has a RowVersion and a RowHash column. RowVersion contains the row timestamp, i.e. the last time the row was modified. RowHash contains the unique row identifier, calculated from the values of the other table columns.
When an original record in the auditing repository is modified, ApexSQL Comply automatically updates the RowVersion value to reflect the time of the last change. To verify data integrity, ApexSQL Comply calculates the RowHash value for the row based on the existing row values. Because those values have been updated, the newly calculated RowHash will differ from the RowHash stored in the central repository database, and this is reported as suspected tampering.
To hide the tampering, I would have to calculate a new value for RowHash and update it. This is not easy, as the formula used for the calculation is complex and not disclosed. But that's not all: the RowHash value is calculated using the RowHash value from the previous row. So, to cover up tampering, I would also have to recalculate and modify the RowHash values in all following rows.
For some tables in the ApexSQL Comply central repository database, the RowHash values are calculated based on rows in other tables, so to cover the tracks of tampering in one table, the admin would have to modify records in several central repository tables.
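This is not ApexSQL's actual formula (which is not disclosed), but the general hash-chaining idea looks roughly like this -- each row's hash includes the previous row's hash, so editing one row invalidates every row after it (table and column names here are made up):

    -- Hypothetical repository table: AuditId (identity), EventData, RowHash.
    DECLARE @EventData nvarchar(400), @prevHash varbinary(20), @newHash varbinary(20);
    SET @EventData = N'serialized audited event goes here';

    -- Take the hash of the most recent row as the chaining input.
    SELECT TOP 1 @prevHash = RowHash
    FROM dbo.AuditRepository
    ORDER BY AuditId DESC;

    -- The new hash covers the previous hash plus this row's data.
    SET @newHash = HASHBYTES('SHA1',
        ISNULL(@prevHash, 0x00) + CAST(@EventData AS varbinary(800)));

    INSERT INTO dbo.AuditRepository (EventData, RowHash)
    VALUES (@EventData, @newHash);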
This solution is not tamper-proof, but it definitely makes covering tampering tracks quite difficult.
Disclaimer: I work for ApexSQL as a Support Engineer

Security through obscurity is a bad idea. If there's a formula to calculate a checksum, someone can do it manually.
If you can't trust your DB admins, you have bigger problems.

Anything you do at the server level the admin can undo. That is the very definition of their role, and there is nothing you can do to prevent it.
In SQL Server 2008 you can set up auditing of said SQL Server with XEvents (Extended Events); see http://msdn.microsoft.com/en-us/library/cc280386.aspx. This is a Common Criteria compliant solution that is tamper evident. That means the admin can stop the audit and do his mischievous actions, but the stopping of the audit is itself recorded.
In SQL 2005 the recommended auditing solution uses the Profiler (trace) infrastructure. This can be made tamper evident when correctly deployed. You would prevent data changes with triggers and constraints and audit DDL changes. If the admin changes the triggers, this is visible in the audit. If the admin stops the audit, this is also visible in the audit.
Do you plan this as a one-time action against a rogue admin, or as a feature to be added to your product? Using digital signatures to sign all your application data can be very costly in app cycles. You also have to design a secure scheme to show that records were not deleted, including the last records (i.e. not just a gap in an identity column). E.g. you could compute CHECKSUM_AGG over BINARY_CHECKSUM(*), sign the result in the app, and store the signed value for each table after each update. Needless to say, this will slow down your application, as you basically serialize every operation. For individual row checksums/hashes you would have to compute the entire signature in your app, and that may require values your app does not yet have (e.g. the identity value to be assigned to your insert).
And how far do you want to go? A simple hash can be broken if the admin gets hold of your app and monitors what you hash, and in what order (this is trivial to achieve); he can then recompute the same hash. An HMAC requires you to store a secret in the application, which is basically impossible to protect against a determined hacker. These concerns may seem overkill, but if this is an application you sell, for instance, then all it takes is one hacker to break your hash sequence or HMAC secret. Google will make sure everyone else finds out about it, eventually.
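For reference, the aggregate checksum mentioned above is a one-liner in T-SQL -- though BINARY_CHECKSUM/CHECKSUM_AGG are weak, non-cryptographic functions, so the value still has to be signed by the application (dbo.Events is a made-up table name):

    -- Exclusive table lock so nothing changes mid-scan; this is exactly the
    -- serialization cost mentioned above.
    DECLARE @tableChecksum int;
    SELECT @tableChecksum = CHECKSUM_AGG(BINARY_CHECKSUM(*))
    FROM dbo.Events WITH (TABLOCKX);
    -- The application would now sign @tableChecksum and store the signature
    -- somewhere the admin cannot silently rewrite.
    SELECT @tableChecksum AS TableChecksum;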
My point is that you are fighting an uphill, losing battle if you try to deter the admin via technology. The admin is a person you trust, and if that trust is broken in your case, the problem is trust, not technology.

Ultimately, even if admins do not have delete rights, they can grant themselves access, remove the deny on deletes, delete the row, restore the permission, and then revoke their own ability to change permissions.
If you are auditing that, then when they give themselves access, you fire them.
As far as an effective tamper-resistant checksum goes, it is possible to use public/private key signing. This means that if the signature matches the message, no one except the party the record says created/modified it could have done so. Anyone can change and sign a record with their own key, but not as someone else.
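As a rough illustration only: SQL Server has had signing primitives (SignByCert / VerifySignedByCert) since 2005, but a certificate whose private key sits in the server is readable by a sysadmin, so in practice the signing would have to happen in the application and only verification data would live in the database. Object names below are made up:

    -- Self-signed certificate, private key protected by a password.
    CREATE CERTIFICATE EventSigningCert
        ENCRYPTION BY PASSWORD = 'Str0ng!Passphrase'
        WITH SUBJECT = 'Event row signing';

    DECLARE @payload nvarchar(500), @sig varbinary(8000);

    -- Serialize the row being protected into a canonical string.
    SELECT @payload = CONVERT(nvarchar(30), EventId)
                    + '|' + CONVERT(nvarchar(30), EventTime, 126)
                    + '|' + CONVERT(nvarchar(30), Value1)
    FROM dbo.Events
    WHERE EventId = 42;

    -- Sign, then verify later: 1 = intact, 0 = the row (or signature) was altered.
    SET @sig = SignByCert(Cert_ID('EventSigningCert'), @payload, N'Str0ng!Passphrase');
    SELECT VerifySignedByCert(Cert_ID('EventSigningCert'), @payload, @sig) AS IsValid;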

I'll just point to Protect sensitive information from the DBA in SQL Server 2008

The idea of a checksum computed by the application is a good one. I would suggest that you research Message Authentication Codes, or MACs, for a more secure method.
Briefly, some MAC algorithms (HMAC) use a hash function, and include a secret key as part of the hash input. Thus, even if the admin knows the hash function that is used, he can't reproduce the hash, because he doesn't know all of the input.
Also, in your case, a sequential number should be part of the hash input, to prevent deletion of entire entries.
Ideally, you should use a strong cryptographic hash function from the SHA-2 family. MD5 has known vulnerabilities, and similar problems are suspected in SHA-1.
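A simplified sketch of the idea -- this is a keyed hash rather than a full HMAC construction, and it is shown in T-SQL only to illustrate the input layout; the real computation (and the secret) should live in the application, especially since HASHBYTES on SQL Server 2005 only supports MD5/SHA1 (SHA-2 support arrived in 2012):

    -- secret + sequence number + row data: the secret stops the admin from
    -- recomputing the hash, and the sequence number makes deletions detectable.
    DECLARE @secret varbinary(32), @seq int, @data nvarchar(200), @mac varbinary(20);
    SELECT @secret = 0x00112233445566778899AABBCCDDEEFF,  -- held by the app, not the DB
           @seq    = 1001,
           @data   = N'2009-05-12T10:31:00|10|20|30';

    SET @mac = HASHBYTES('SHA1',
        @secret + CAST(@seq AS varbinary(4)) + CAST(@data AS varbinary(400)));
    SELECT @mac AS RowMac;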

It might be more effective to try to lock down permissions on the table. With the checksum, it seems like a malicious user might be able to spoof it, or insert data that appears to be valid.
http://www.databasejournal.com/features/mssql/article.php/2246271/Managing-Users-Permissions-on-SQL-Server.htm
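A minimal sketch of locking the table down for ordinary logins (role and user names made up) -- though, as other answers note, this cannot stop someone with sysadmin rights:

    -- Ordinary users get read-only access; only the service account may insert.
    CREATE ROLE EventReaders;
    GRANT SELECT ON dbo.Events TO EventReaders;
    DENY INSERT, UPDATE, DELETE ON dbo.Events TO EventReaders;
    GRANT INSERT ON dbo.Events TO ServiceAccountUser;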

If you are concerned about people modifying the data, you should also be concerned about them modifying the checksum.
Can you not simply password protect certain permissions on that database?

Related

Do you need to fully validate data both in Database and Application?

For example, if I need to store a valid phone number in a database, should I fully validate the number in SQL, or is it enough if I fully validate it in the app, before inserting it in the db, and just add some light validation in SQL constraints (like having the correct number of digits).
There is no correct answer to this question.
In general, you want the database to maintain data integrity -- and that includes valid values in columns. You want this for multiple reasons:
Databases are usually more efficient, because they are on multi-threaded servers.
Databases can handle concurrent threads (not an issue for a check constraint, but an issue for other types of constraints).
Databases ensure the data integrity regardless of how the data is changed.
A check constraint (presumably what you want) is part of the data definition and applies to all inserts and updates. Such operations might occur in multiple places in the application.
The third piece is important. If you want to ensure that a phone number looks like a phone number, then you don't want someone to change it accidentally using update.
However, there might be checks that are simpler in the application. Or that might only apply when a new row is inserted, but not later updated. Or, that you want only to apply to data that comes in from the application (as opposed to manual changes). So, there are reasons why you might not want to do all checks in the database.
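For example, a light check kept in the database (digit count only, with the fuller formatting rules left to the application) might look like this, using hypothetical table/column names:

    -- Coarse rule only: exactly 10 characters, all digits.
    ALTER TABLE dbo.Customers
    ADD CONSTRAINT CK_Customers_Phone
    CHECK (Phone NOT LIKE '%[^0-9]%' AND LEN(Phone) = 10);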
You definitely have to validate incoming data at your backend before e.g. doing CRUD operations on your database, since client-side validation could be omitted or even faked. It is considered good practice to validate input data at the client, but you should never, ever trust the client.

Table with multiple foreign keys -- only one not null

I'm trying to design a system where an administrator will have to approve changes to the data and other various administrative tasks -- add a user, add an admin etc.
My idea is to have a notification table that contains these notifications, but the problem is that a notification can be any of the previously mentioned types, i.e. its data is stored in one of many tables. Here is a picture to describe my current plan -- note that I'm sure it's not a proper ER diagram.
Also, the data goes into a pending table that reflects the table it will eventually wind up in, provided the data is approved -- it's a staging ground of sorts. So, a pending_user is a user that is not yet in the user table. And as you can see, the user table, amongst others, is not shown here, but one can use their imagination.
I'm concerned that the multiple null values in the pending table will have adverse effects that I'm not totally aware of, such as increased space usage and possibly increased query time. Also, I'm not sure how I'll implement the retrieval of these notifications. My naive approach is to select the first X notifications, analyze the rows to find the non-null column, retrieve the appropriate data, and then load all the data in a response.
Is there a more straight forward pattern for this type of problem?
Thanks in advance for any help.
I think the traditional way is to provide various levels of access/read/write rights to users. These access rights define what actions a user can and can't perform. In this traditional approach, if a user has access to a certain function, he can do it without further approval.
Also, traditionally there are some kind of audit logs that contain a trace of all important changes to the data. With such logs it would be possible to know who made a change (and when).
If you need to build a two-stage system, where a change has to go through an approval, I'd add a flag column to each important table that would indicate that values in the given row are not final and have to be approved. The table would store all historical changes to the data and with the help of this flag the system would know which variant is the latest approved version and which variant is pending and waiting for approval.
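A minimal sketch of that flag-column approach (hypothetical names):

    -- Each important table carries its own approval flag and keeps its history.
    ALTER TABLE dbo.Users
    ADD ApprovalStatus tinyint NOT NULL
        CONSTRAINT DF_Users_ApprovalStatus DEFAULT 0;  -- 0 = pending, 1 = approved

    -- The application shows only approved rows...
    SELECT UserId, UserName FROM dbo.Users WHERE ApprovalStatus = 1;
    -- ...while administrators review whatever is still pending.
    SELECT UserId, UserName FROM dbo.Users WHERE ApprovalStatus = 0;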
I would not try to make a single universal table that would hold data related to changes in many different tables. Each table is different and approval process for each table is likely to be different. I doubt that you'll have more than a dozen entities that are important enough to go through this approval process.

Should I create separate SQL Server database for each user?

I am working on Asp.Net MVC web application, back-end is SQL Server 2012.
This application will provide billing, accounting, and inventory management. Users will create an account by signing up, just like on http://www.quickbooks.in. Each user will create some masters and various transactions. There is no limit; a user can create unlimited records in the database.
I want to keep database performance stable under heavy data load. I am maintaining proper indexing and primary keys, but there will be a heavy per-user load on the database.
So, should I create a separate database for each user, or should I maintain one database with a UserID: add UserID to each table and partition based on UserID?
I am not an expert in SQL Server, so please provide suggestions with clear specifications.
Please inform me if there is any lack of information.
A DB per user is what happens when customers need to be able to pack up and leave, taking the actual database with them. Think of a self-hosted WordPress website. Or when there are incredible risks to one user accidentally seeing another user's data, so it's safer to rely on the server's security model than to rely on remembering to add the UserId filter to all your queries. I can't imagine a scenario like that, but who knows -- maybe if the privacy laws allowed for jail time, I would rather have data partitioned by security rules than depend on carefully written WHERE clauses.
If you did go user-per-database, creating a new user would be 10x more effort. While INSERT, UPDATE and so on stay the same from version to version, the syntax for database creation, user creation, permission granting and so on evolves enough to break those scripts with each SQL version upgrade.
Also, this multiplies your migration headaches by the number of users. Let's say you have 5000 users and you need to add some new columns, change a column's data type, update a trigger, and so on. Instead of needing to run that change script once, you need to run it 5000 times.
Per-user DBs also probably waste disk space. Each of those databases is going to have a transaction log sitting idle, taking up at least the minimum log space.
As for load, if collectively your 5000 users are doing 1 billion inserts, updates and so on per day, my intuition tells me that it's going to be faster on one database, unless there is some sort of contention issue (everyone reading and writing to the same pages of the same table at the same time). Each database has machine resources (probably threads and memory) doing per-database housekeeping, so these extra DBs can't be free.
Anyhow, the best thing to do is to simulate the two architectures and use a random data generator to simulate load and see how they perform.
It's not an easy answer to give.
First, there is logical design to be considered. Then you have integrity, security, management and performance (in this very order).
A database is a logical unit of data, self contained. Ideally, you should be able to take a database, move it to another instance, probably change the connection strings and be running again.
All the constraints are database-level. No foreign keys can exist referencing some object outside the database.
So, try thinking in these terms first.
How would you reliably prevent one user from messing up another user's data? Keep in mind that it's just a matter of time before someone opens an Excel sheet and fires up queries against the database, bypassing your application. Row-level security in SQL Server is something you don't want to deal with.
Multiple databases mean that all management tasks should be scripted out and executed on all databases. Yes, there is some overhead to it, but once you set it up it's just a matter of monitoring. If a database goes suspect, it's a single customer down, not all of them. You can even have different versions for different customers if each customer has its own database. Additionally, if you roll out an upgrade, you can do it per customer, so the impact will be much smaller.
Performance is the least relevant factor here. Of course, it really depends on how many customers and how much data, but proper indexing will solve these issues. Scale-out is much easier with multiple databases.
BTW, partitioning, as you mentioned it, is never a performance booster, it's simply a management feature, allowing for faster loading and evicting of data from a table.
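For reference, partitioning by UserID (as floated in the question) is set up roughly like this -- boundary values here are made up, and note that partitioning requires Enterprise Edition on SQL Server 2012:

    -- Rows are mapped to partitions by UserID range; this mainly helps with
    -- maintenance (switching/evicting whole partitions), not raw query speed.
    CREATE PARTITION FUNCTION pfUserId (int)
        AS RANGE RIGHT FOR VALUES (1000, 2000, 3000, 4000);

    CREATE PARTITION SCHEME psUserId
        AS PARTITION pfUserId ALL TO ([PRIMARY]);

    CREATE TABLE dbo.Transactions
    (
        TransactionId bigint IDENTITY(1,1) NOT NULL,
        UserId        int            NOT NULL,
        Amount        decimal(18,2)  NOT NULL,
        CreatedAt     datetime2      NOT NULL,
        CONSTRAINT PK_Transactions PRIMARY KEY (UserId, TransactionId)
    ) ON psUserId (UserId);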
I'd probably put each customer in a separate database, but it's up to you eventually to make the decision for yourself. Hope I've helped some with this.

Access sql that triggered the trigger from within trigger (Sybase)

Is there a way to access the sql that triggered a trigger from within the trigger? I've managed to get it by joining to the master..monProcessSQLText MDA table but this only works for users with the mon_role and I don't want to give that to everyone. Is there a global variable I've missed?
I'm trying to log all the updates run against a table so I can trace it back to an IP address and username.
This is with ASE 12.5.
If you are trying to
log all the updates run against a table so I can trace it back to an IP address and username
A trigger is definitely the wrong way to go about it; triggers were not designed for that, and there are other ASE facilities that were. It is not about the table, it is about security and monitoring in general.
Sybase Auditing.
It takes a bit of setting up, but has much less overhead than MDA tables; most important, it was designed for auditing (MDA was not). And there are no coding requirements such as there are for MDA. It is highly configurable; the idea is to capture only what you need, and no more.
Monitoring.
I would not recommend MDA tables, but since you have them in place, have enabled monitoring, and have accepted the 22% overhead for capturing SQL text... The info is very transient. In order to use it for any relevant purpose, such as yours, you need to write a capture-and-store mechanism, archiving all required info to an archive database. This has to be done on an ongoing basis, completely independent of a trigger, etc. You can also filter on the fly to reduce the volume of data stored (warning: it is huge), purge data over 7 days old, etc. It is a little project in itself, which is why such tools are commercially available from 3rd parties.
Once either of these facilities is in place, then, separately, whenever you wish to inquire about who updated a table, when, and from where, all you need to do is inspect the archive. Nothing to do with a trigger, or difficulties getting the info from a trigger, or giving admin privileges to ordinary users.
Also, it needs to be appreciated that you do not have normal security in place: the tables are being updated directly by users, so direct update permissions have been granted either to specific users or, worse, to all users. The consequence is that there is no way of knowing who is updating the table, and who is breaking the data or referential integrity.
The secure method is to place the entire transaction in a stored proc, thus eliminating the possibility of incomplete transactions (as well as improving execution speed); and to grant permissions on the procs, not the tables, thus eliminating direct updates. Over time, you may wish to implement security in the server, so that the consequences do not have to be chased down and closed one by one, a process with no finite end.
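A bare-bones sketch of that approach in ASE syntax (object and group names made up):

    -- All writes go through the proc; direct update permission is revoked, so
    -- every change is a complete transaction and is attributable to the proc.
    create procedure dbo.pr_update_order_status
        @order_id int,
        @status   varchar(20)
    as
    begin
        begin transaction
        update dbo.orders set status = @status where order_id = @order_id
        commit transaction
    end
    go

    revoke update on dbo.orders from public
    go
    grant execute on dbo.pr_update_order_status to order_clerks
    go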
As far as Auditing goes, if security were in place, then the auditing burden is also substantially reduced: you need to audit stored proc executions only. Otherwise, you need to audit all updates to all tables.

question about frequency of updating access

I have a table in an Access database.
This Access database is used on a regular basis, basically from 9-5.
Someone else has a copy of this exact table. Sometimes records are added, sometimes deleted, and sometimes data within the records is updated.
I need to update the Access database table with the offsite table every hour or so. What is the best algorithm for updating the data? There are about 5000 records.
Would it severely lock up the table for a few seconds every hour?
I would like to publicly apologize for my rude comment to David Fenton.
My impression is that this question ties together pieces you've been exploring with your previous questions:
a file "listener" to detect the presence of a new file and do something with it when found
list files with some extension in a folder
DoCmd.TransferText to pull file data into your database
Insert, Update, Delete records in a table based on an imported set of records
Maybe it's time to give us a more detailed picture of what you're dealing with.
Tony asked if both sites are on the same WAN (Wide Area Network). You replied they are on Windows. Elsewhere you said you're using a network. Please tell us about the network.
I'm still unsure whether you need a one-way or two-way data exchange. You've talked about importing changes from the remote table into the local master table. Do you need to do the same type of operation at the remote site: import changes made to the table at the master site?
Tell us what needs to happen regarding the issue James raised. Can local and remote users ever edit the same record? If they can, how will you resolve the conflict? Similarly, what should happen if a remote user updates a record and a local user deletes their copy of that record?
Based on what you've told us so far, this sounds like a real challenge for Access, made more challenging by the rate of record changes (5,000 per hour). I like the outline Kevin suggested. However your challenge will be more complicated since you also need to account for record deletions at both sites.
It seems like you may have to create something which duplicates Access' Replication feature. Maybe you should look at the Jet Replication Wiki to see if you can modify your design to take advantage of Replication. I can't help you there, and unfortunately you appear to have frustrated David Fenton who is a leading authority on Jet Replication.
If a few seconds of performance is critical, you'd be better off moving to a better database engine (like SQLite, MySQL, or MS SQL Server). If you want a single file, then SQLite is the best fit. All of these use record-level locks, so you can read and write simultaneously.
If you stay with Access, you will probably have to implement a timer to update only a few records at a time.
Before you do anything else you need to establish the "rules" as far as collisions go.
If a row in the local copy is updated and the same row in the remote copy is also updated, which one is the "correct" version? Ditto for deletions; inserts are even more of a pain, as you can have the "same" set of values but perhaps a different key.
After you have worked out how to handle each of these cases you can then go on to thinking about the implementation.
As other posters have suggested, the way to completely avoid these issues is to switch to SQL Server or any other "proper" database which can be updated over the network by all users and where concurrency issues are handled by the DBMS when the updates are applied.
Other users have already suggested switching to a server-based database, i.e. SQL Server etc. I would echo this and say it is the best way to go; however, if you are stuck with Access and have no choice, then I would suggest you add a field (with an index) along the lines of "Last Updated". You could then export all records that have been modified within a particular time frame, export this file as a CSV, ship it over to the remote site, and import it into the "master" Access database. With a bit of scripting you could automate this process.
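A rough sketch of the export query (Jet/Access SQL, assuming a hypothetical LastUpdated column and a stored last-sync time):

    -- Pull only the rows changed since the last successful sync; the result is
    -- then written out to CSV and shipped to the remote site for import.
    PARAMETERS [LastSyncTime] DateTime;
    SELECT *
    FROM MyTable
    WHERE LastUpdated > [LastSyncTime];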