Cast 'new' and 'old' dynamically [duplicate] - sql

I'm interested in using the following audit mechanism in an existing PostgreSQL database.
http://wiki.postgresql.org/wiki/Audit_trigger
but would like (if possible) to make one modification. I would also like to log the primary key's value so that it can be queried later. So, I would like to add a field named something like "record_id" to the "logged_actions" table. The problem is that every table in the existing database has a different primary key field name. The good news is that the database has a very consistent naming convention. It's always <table_name>_id. So, if a table is named "employee", the primary key is "employee_id".
Is there any way to do this? Basically, I need something like OLD.FieldByName(x) or OLD[x] to get the value out of the id field and put it into the record_id field in the new audit record.
I do understand that I could just create a separate, custom trigger for each table that I want to keep track of, but it would be nice to have it be generic.
edit: I also understand that the key value does get logged in either the old/new data fields. But, what I would like would be to make querying for the history easier and more efficient. In other words,
select * from audit.logged_actions where table_name = 'xxxx' and record_id = 12345;
another edit: I'm using PostgreSQL 9.1
Thanks!

You didn't mention your version of PostgreSQL, which is very important when writing answers to questions like this.
If you're running PostgreSQL 9.0 or newer (or able to upgrade) you can use this approach as documented by Pavel:
http://okbob.blogspot.com/2009/10/dynamic-access-to-record-fields-in.html
In general, what you want is to reference a dynamically named field in a record-typed PL/PgSQL variable like 'NEW' or 'OLD'. This has historically been annoyingly hard, and is still awkward but is at least possible in 9.0.
Your other alternative - which may be simpler - is to write your audit triggers in plperlu, where dynamic field references are trivial.

Related

Using auto assigned primary key or setting it on INSERT?

I just answered this question: Can I get the ID of the data that I just inserted?. I explained that in order to know the ID of the last record inserted in a table, what I would do is insert it manually, instead of using some sequence or serial field.
What I like to do is to run a Max(id) query before INSERT, add 1 to that result, and use that number as ID for the record I'm about to insert.
Now, what I would like to ask: is this a good idea? Can it give some trouble? What are the reasons to use automatically set field on IDs fields?
Note: this is not exactly a question, but looking at the help center it seems like a good question to ask. If you find it to be off-topic, please tell me and I'll remove it.
This is a bad idea and it will fail in a multi-threaded (or multi-user) environment.
Please note that the surrogate-key vs natural-key debate is still far from having a concrete definitive solution - but putting that aside for a minute - even if you do go with a surrogate key - you should never try to manually auto-increment columns. Let the database do that for you and avoid all kinds of problems that can occur if you try to do that manually - such as primary key constraint violations in the best case, or duplicate values in the worst case.
If an Entity uses an ID as the Primary-Key, it is in general a good idea to let the DB autocreate it, so you don't need to determine an unused one while creating this Entity in your code. Furthermore, a Data Access Object (DAO) does not need to operate on the ID.
Depending on which DB you use, you might not even be allowed to retrieve all IDs of that table.
I guess there might be other good reasons to let the DB manage this part.
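To make the failure mode concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name item and the helper next_id_by_max are made up for illustration, and the two "clients" are simulated on one connection by computing both ids before either insert runs, which is the interleaving that real concurrent sessions can produce.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table used only for this illustration.
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, label TEXT)")
conn.execute("INSERT INTO item (id, label) VALUES (1, 'first')")


def next_id_by_max(db):
    # The manual approach from the question: read MAX(id) and add 1.
    (current_max,) = db.execute("SELECT COALESCE(MAX(id), 0) FROM item").fetchone()
    return current_max + 1


# Two clients compute their "next" id before either one inserts.
id_for_a = next_id_by_max(conn)
id_for_b = next_id_by_max(conn)

conn.execute("INSERT INTO item (id, label) VALUES (?, 'from A')", (id_for_a,))
try:
    conn.execute("INSERT INTO item (id, label) VALUES (?, 'from B')", (id_for_b,))
    collided = False
except sqlite3.IntegrityError:
    # Both clients picked the same id, so the second insert is rejected.
    collided = True

# Letting the database assign the key avoids the read-then-write gap,
# and the generated id is still available afterwards.
cur = conn.execute("INSERT INTO item (label) VALUES ('auto')")
auto_id = cur.lastrowid
```

With a primary key constraint in place the second insert fails; without one, the same interleaving would silently store a duplicate id.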

Is there any use to duplicate column names in a table?

In sqlite3, I can force two columns to alias to the same name, as in the following query:
SELECT field_one AS overloaded_name,
field_two AS overloaded_name
FROM my_table;
It returns the following:
overloaded_name overloaded_name
--------------- ---------------
1 2
3 4
... ...
... and so on.
However, if I create a named table using the same syntax, it suffixes one of the aliases with :1:
sqlite> CREATE TABLE temp AS
SELECT field_one AS overloaded_name,
field_two AS overloaded_name
FROM my_table;
sqlite> .schema temp
CREATE TABLE temp(
overloaded_name TEXT,
"overloaded_name:1" TEXT
);
I ran the original query just to see if this was possible, and I was surprised that it was allowed. Is there any good reason to do this? Assuming there isn't, why is this allowed at all?
EDIT:
I should clarify: the question is twofold: why is the table creation allowed to succeed, and (more importantly) why is the original select allowed in the first place?
Also, see my clarification above with respect to table creation.
I can force two columns to alias to the same name...
why is [this] allowed in the first place?
This can be attributed to the shackles of compatibility. In the SQL Standards, nothing is ever deprecated. An early version of the Standard allowed the result of a table expression to include columns with duplicate names, probably because an influential vendor had allowed it, possibly due to the inclusion of a bug or the omission of a design feature, and weren't prepared to take the risk of breaking their customers' code (the shackles of compatibility again).
Is there any use to duplicate column names in a table?
In the relational model, every attribute of every relation has a name that is unique within the relevant relation. Just because SQL allows duplicate column names, that doesn't mean that as a SQL coder you should utilise such a feature; in fact I'd say you have to be vigilant not to invoke this feature in error. I can't think of any good reason to have duplicate column names in a table but I can think of many obvious bad ones. Such a table would not be a relation and that can't be a good thing!
why is the [base] table creation allowed to succeed
Undoubtedly an 'extension' to (a.k.a. purposeful violation of) the SQL Standards. I suppose it could be perceived as a reasonable feature: if I attempt to create columns with duplicate names, the system automatically disambiguates them by suffixing an ordinal number. In fact, the SQL Standard specifies that there be an implementation-dependent way to ensure the result of a table expression does not implicitly have duplicate column names (but as you point out in the question, this does not preclude the user from explicitly using duplicate AS clauses). However, I personally think the Standard behaviour of disallowing the duplicate name and raising an error is the correct one. Aside from the above reasons (i.e. that duplicate columns in the same table are of no good use), a SQL script that creates an object without knowing if the system has honoured that name will be error prone.
The table itself can't have duplicate column names because inserting and updating would be messed up. Which column gets the data?
During selects the "duplicates" are just column labels so do not hurt anything.
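The difference between result labels and stored column names can be checked with a short sketch using Python's built-in sqlite3 module. It assumes a my_table with two TEXT columns as in the question; temp_copy is a made-up table name. The exact suffix SQLite chooses for the second column is left unasserted here, since it is an implementation detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (field_one TEXT, field_two TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?, ?)", [("1", "2"), ("3", "4")])

# A plain SELECT: the aliases are only labels on the result set.
cur = conn.execute(
    "SELECT field_one AS overloaded_name, field_two AS overloaded_name FROM my_table"
)
select_labels = [col[0] for col in cur.description]
rows = cur.fetchall()

# CREATE TABLE ... AS SELECT: the aliases become real column names,
# so SQLite has to make them distinct.
conn.execute(
    "CREATE TABLE temp_copy AS "
    "SELECT field_one AS overloaded_name, field_two AS overloaded_name FROM my_table"
)
table_columns = [row[1] for row in conn.execute("PRAGMA table_info(temp_copy)")]
```

Here select_labels holds the same label twice, while table_columns holds two different names.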
I assume you're talking about the CREATE TABLE ... AS SELECT command. This looks like an SQL extension to me.
Standard SQL does not allow you to use the same column name for different columns, and SQLite appears to be allowing that in its extension, but working around it. While a simple, naked select statement simply uses AS to set the column name, create table ... as select uses it to create a brand new table with those column names.
As an aside, it would be interesting to see what the naked select does when you try to use the duplicated column, such as in an order by clause.
If you were allowed to have multiple columns with the same name, it would be a little difficult for the execution engine to figure out what you meant with:
select overloaded_name from table;
The reason why you can do it in the select is to allow things like:
select id, surname as name from users where surname is not null
union all
select id, firstname as name from users where surname is null
so that you end up with a single name column.
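As a runnable version of that query, here is a small sketch using Python's built-in sqlite3 module. The users table layout and the sample rows are assumptions made for the example, and an ORDER BY has been added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical users table with one row that has no surname.
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, firstname TEXT, surname TEXT)"
)
conn.executemany(
    "INSERT INTO users (id, firstname, surname) VALUES (?, ?, ?)",
    [(1, "Ada", "Lovelace"), (2, "Plato", None)],
)

# Both branches alias a different source column to the same label.
cur = conn.execute(
    """
    SELECT id, surname AS name FROM users WHERE surname IS NOT NULL
    UNION ALL
    SELECT id, firstname AS name FROM users WHERE surname IS NULL
    ORDER BY id
    """
)
labels = [col[0] for col in cur.description]
rows = cur.fetchall()
```

The result has a single name column whose value comes from surname where one exists and from firstname otherwise.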
As to whether there's a good reason, SQLite is probably assuming you know what you're doing when you specify the same column name for two different columns. Its essence seems to be to allow a great deal of latitude to the user (as evidenced by the fact that the columns are dynamically typed, for example).
The alternative would be to simply refuse your request, which is what I'd prefer, but the developers of SQLite are probably more liberal (or less anal-retentive) than I :-)

SQL Server - Schema/Code Analysis Rules - What would your rules include?

We're using Visual Studio Database Edition (DBPro) to manage our schema. This is a great tool that, among the many things it can do, can analyse our schema and T-SQL code based on rules (much like what FxCop does with C# code), and flag certain things as warnings and errors.
Some example rules might be that every table must have a primary key, no underscores in column names, every stored procedure must have comments, etc.
The number of rules built into DBPro is fairly small, and a bit odd. Fortunately DBPro has an API that allows the developer to create their own. I'm curious as to the types of rules you and your DB team would create (both schema rules and T-SQL rules). Looking at some of your rules might help us decide what we should consider.
Thanks - Randy
Some of mine. Not all could be tested programmatically:
No Hungarian-style prefixes (like "tbl" for table, "vw" for view)
If there is any chance this would ever be ported to Oracle, no identifiers longer than 30 characters.
All table and column names expressed in lower-case letters only
Underscores between words in column and table names--we differ on this one obviously
Table names are singular ("customer" not "customers")
Words that make up table, column, and view names are not abbreviated, concatenated, or acronym-based unless necessary.
Indexes will be prefixed with “IX_”.
Primary Keys are prefixed with “PK_”.
Foreign Keys are prefixed with “FK_”.
Unique Constraints are prefixed with “UC_”.
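A few of the naming rules above are mechanical enough to test in code. The following is only a sketch in plain Python, not the DBPro rules API: the function name check_identifier and the violation messages are invented for illustration, and the prefix test is deliberately crude (it would also flag a legitimate name that happens to start with "vw").

```python
HUNGARIAN_PREFIXES = ("tbl", "vw")  # the two examples given in the list above
MAX_ORACLE_IDENTIFIER = 30          # the limit quoted in the list above


def check_identifier(name):
    """Return a list of rule violations for one table or column name."""
    problems = []
    # Crude prefix test: any name starting with a listed prefix is flagged.
    if name.lower().startswith(HUNGARIAN_PREFIXES):
        problems.append("hungarian prefix")
    if len(name) > MAX_ORACLE_IDENTIFIER:
        problems.append("longer than 30 characters")
    if name != name.lower():
        problems.append("not lower-case")
    return problems


# Example usage against a handful of made-up identifiers.
report = {name: check_identifier(name) for name in ["customer", "tblCustomer"]}
```

Rules such as "table names are singular" or "words are not abbreviated" need a dictionary or human review, which is why not everything in the list can be checked this way.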
I suspect most of my list would be hard to put in a rules engine, but here goes:
If possible I'd have it report any tables that are defined as wider than the bytes that can be stored in a record (excluding varchar(max) and text type fields) and/or a datapage.
I want all related PK and FK columns to have the same name if at all possible. The only time it isn't possible is when you need to have two FKs in the same table relating to one PK and even then, I would name it the name of the PK and a prefix or suffix describing the difference. For instance if I had a PersonID PK and a table needed to have both the sales rep id and the customer id, they would be CustomerPersonID, and RepPersonID.
I would check to make sure all FKs have an index.
I would want to know about all fields that are required but have no default value. Depending on what it is, you may not want to define a default, but I would want to be able to easily see which ones don't, to hopefully find the ones that should have a default.
I would want all triggers checked to see that they are set-based and not designed to run for one row at a time.
No table without a defined Unique index or PK. No table where the PK is more than one field. No table where the PK is not an int.
No object names that use reserved words for the database I'm using.
No fields with the word Date as part of the name that are not defined as date or datetime.
No table without an associated audit table.
No field called SSN, SocialSecurityNumber, etc. that is not encrypted. Same for any field named CreditCardNumber.
No user defined datatypes (In SQL Server at least, these are far more trouble than they are worth.)
No views that call other views. Experience has shown me these are often a performance disaster waiting to happen. Especially if they layer more than one layer deep.
If using replication, no table without a GUID field.
All tables should have a DateInserted field and InsertedBy field (even with auditing, it is often easier to research data problems if this info is easily available.)
Consistent use of the same case in naming. It doesn't matter which as long as all use the same one.
No tables with a field called ID. Hate these with a passion. They are so useless. ID fields should be named tablenameID if a PK and with the PK name if an FK.
No spaces or special characters in object names. In other words if you need special handling for the database to recognize it in the correct context in query, don't use it.
If it is going to analyze code as well, I'd want to see any code that uses a cursor or a correlated subquery. Why create performance problems from the start?
I would want to see if a proc uses dynamic SQL and, if so, whether it has an input bit variable called Debug (and code to only print the dynamic SQL statement and not execute it if the Debug variable is set to 1).
I'd want to be able to check that if there is more than one statement causing action in the database (insert/update/delete) that there is also an explicit transaction in the proc and error trapping to roll the whole thing back if any part of it fails.
I'm sure I could think of more.

SQL, How to change column in SQL table without breaking other dependencies?

I'm sure this is quite a common question, but I couldn't find a good answer so far.
Here is my question:
I've got a table named Contacts with a varchar column Title. Now, in the middle of development, I want to replace the field Title with TitleID, which is a foreign key to the ContactTitles table. At the moment the Contacts table has over 60 dependencies (other tables, views, functions).
How can I do that the safest and easiest way?
We use: MSSQL 2005, data has already been migrated, just want to change schema.
Edit:
Thanks to all for the quick reply.
As mentioned, the Contacts table has over 60 dependents, but when the following query was run, it showed that only 5 of them use the Title column. The migration script had already been run, so no data changes were required.
/*gets all objects which use specified column */
SELECT Name
FROM syscomments sc
JOIN sysobjects so ON sc.id = so.id
WHERE TEXT LIKE '%Title%' AND TEXT LIKE '%TitleID%'
Then I went through those 5 views and updated them manually.
Use refactoring methods. Start off by creating a new field called TitleID, then copy all the titles into the ContactTitles table. Then, one by one, update each of the dependencies to use the TitleID field. Just make sure you've still got a working system after each step.
If the data is going to be changing, you'll have to be careful and make sure that any changes to the Title column also change the ContactTitles table. You'll only have to keep them in sync while you're doing the refactoring.
Edit: There's even a book about it! Refactoring Databases.
As others pointed out it depends on your RDBMS.
There are two approaches:
make a change to the table and fix all dependencies
make a view that you can use instead of direct access to the table (this can guard you against future changes in the underlying core table(s), but you might lose some update functionality, depending on your DBMS)
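The second approach can be sketched as follows, using Python's built-in sqlite3 module rather than SQL Server so that it runs anywhere. The view name ContactsWithTitle, the ContactID and Name columns, and the sample row are assumptions for illustration; only Contacts, ContactTitles, Title and TitleID come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ContactTitles (TitleID INTEGER PRIMARY KEY, Title TEXT NOT NULL);
    CREATE TABLE Contacts (
        ContactID INTEGER PRIMARY KEY,
        Name TEXT NOT NULL,
        TitleID INTEGER REFERENCES ContactTitles (TitleID)
    );
    INSERT INTO ContactTitles VALUES (1, 'Dr');
    INSERT INTO Contacts VALUES (10, 'Grace Hopper', 1);

    -- Hypothetical compatibility view that still exposes a Title column.
    CREATE VIEW ContactsWithTitle AS
    SELECT c.ContactID, c.Name, t.Title
    FROM Contacts AS c
    LEFT JOIN ContactTitles AS t ON t.TitleID = c.TitleID;
    """
)

# Dependents that only read can keep selecting Title through the view.
rows = conn.execute("SELECT ContactID, Name, Title FROM ContactsWithTitle").fetchall()
```

Readers of the old column shape keep working against the view, while code that writes Title still has to be changed to write TitleID.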
For Microsoft SQL Server Redgate have a (not free) product that can help with this refactoring http://www.red-gate.com/products/sql_refactor/index.htm
In the past I have managed to do this quite easily (if primitively) by simply getting a list of things to review
SELECT * FROM sys.objects
WHERE OBJECT_DEFINITION(OBJECT_ID) LIKE '%Contacts%'
(and possibly taking dependencies information into account and filtering by object type)
Script all the ones of interest in Management Studio, then simply go down the list, review them all, and change the CREATE to ALTER. It should be quite a simple and repetitive change even for 60 possible dependencies. Additionally, if you are referring to a non-existent column you should get an error message when you run the script to ALTER.
If you use * in your queries or adhoc SQL in your applications obviously things may be a bit more difficult.
Use sp_depends 'Table Name' to check the dependencies of the table,
and then use sp_rename to rename the column, which is very useful.
sp_rename automatically renames the associated index whenever a PRIMARY KEY or UNIQUE constraint is renamed. If a renamed index is tied to a PRIMARY KEY constraint, the PRIMARY KEY constraint is also automatically renamed by sp_rename.
Then start updating the procedures and functions one by one. There is no other good option for a change like this; if you find one, tell me too.

Is there any way to fake an ID column in NHibernate?

Say I'm mapping a simple object to a table that contains duplicate records and I want to allow duplicates in my code. I don't need to update/insert/delete on this table, only display the records.
Is there a way that I can put a fake (generated) ID column in my mapping file to trick NHibernate into thinking the rows are unique? Creating a composite key won't work because there could be duplicates across all of the columns.
If this isn't possible, what is the best way to get around this issue?
Thanks!
Edit: Query seemed to be the way to go
The NHibernate mapping makes the assumption that you're going to want to save changes, hence the requirement for an ID of some kind.
If you're allowed to modify the table, you could add an identity column (SQL Server naming - your database may differ) to autogenerate unique Ids - existing code should be unaffected.
If you're allowed to add to the database, but not to the table, you could try defining a view that includes a RowNumber synthetic (calculated) column, and using that as the data source to load from. Depending on your database vendor (and the product's handling of views and indexes) this may face some performance issues.
The other alternative, which I've not tried, would be to map your class to a SQL query instead of a table. IIRC, NHibernate supports having named SQL queries in the mapping file, and you can use those as the "data source" instead of a table or view.
If your data is read-only, one simple way we found was to wrap the query in a view, build the entity off the view, and add a generated GUID column (NEWID() in SQL Server). The result is something like
SELECT NEWID() AS ID, * FROM TABLE
ID then becomes your unique primary key. As stated above, this is only useful for read-only views, as the ID has no relevance after the query.
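The same idea can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the reading table and the reading_with_id view are made-up names, and hex(randomblob(16)) stands in for a GUID function because SQLite has none built in. It shows only the database side, not the NHibernate mapping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE reading (sensor TEXT, value INTEGER);
    INSERT INTO reading VALUES ('a', 1);
    INSERT INTO reading VALUES ('a', 1);  -- exact duplicate row

    -- hex(randomblob(16)) stands in for a GUID function here.
    CREATE VIEW reading_with_id AS
    SELECT hex(randomblob(16)) AS id, sensor, value FROM reading;
    """
)

rows = conn.execute("SELECT id, sensor, value FROM reading_with_id").fetchall()
ids = [row[0] for row in rows]
```

Each row gets its own id even though the underlying rows are identical. The ids change on every query, so they cannot be used to look a row up again later.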