If I have a legacy database with no referential integrity or keys, and it uses stored procedures for all external access, is there any point in using NHibernate to persist entities (object graphs)?
Plus, the SPs contain not only CRUD operations but business logic as well...
I'm starting to think sticking with a custom ADO.NET DAL would be easier :(
Cheers
Ollie
You most likely CAN. But you probably shouldn't :-)
Hibernate does not care about referential integrity per se; while it obviously needs to have some sort of link between associated tables, it does not matter whether an actual FK constraint exists. For example, if Product is mapped as many-to-one to Vendor, the PRODUCTS table should have some sort of VENDOR_ID column in it, but it doesn't have to be an FK.
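For example, a minimal sketch of such a schema (names invented for illustration) - the join column exists, but no FK constraint is declared, and NHibernate will still navigate the association:

CREATE TABLE VENDORS (
    VENDOR_ID int PRIMARY KEY,
    NAME      varchar(100) NOT NULL
);

CREATE TABLE PRODUCTS (
    PRODUCT_ID int PRIMARY KEY,
    NAME       varchar(100) NOT NULL,
    VENDOR_ID  int NULL  -- join column for the many-to-one; deliberately no FOREIGN KEY constraint
);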
Depending on your SP signatures, you may or may not be able to use them as custom CRUD in your mappings; if SPs indeed have business logic in them that is applied during all CRUD operations, that may be your first potential problem.
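For what it's worth, NHibernate's custom SQL support passes parameters positionally in mapping order, so any SP you plug in has to line up with that. A hedged sketch of the kind of signature that could back an insert (all names invented):

CREATE PROCEDURE dbo.Product_Insert
    @Name      varchar(100),
    @VendorId  int,
    @ProductId int
AS
BEGIN
    -- Any business logic embedded here would run on every insert NHibernate issues,
    -- which is the potential problem mentioned above.
    INSERT INTO PRODUCTS (PRODUCT_ID, NAME, VENDOR_ID)
    VALUES (@ProductId, @Name, @VendorId);
END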
Finally, if your SPs are indeed used for ALL CRUD operations (including all possible queries) it's probably just not worth it to try and introduce Hibernate to the mix - you'll gain pretty much nothing and you'll have a yet another layer to deal with.
okay, an example of the problem is this:
An SP uses a SQL statement similar to the following to select the next Id to be inserted into the 'Id' column of a table (this column is just an int column but NOT an identity column):
select @cus_id = max(id) + 1 from customers
So once the next id is calculated it's inserted into table A along with other data; then a row is inserted into table B that references table A (with no foreign key constraint) via another column from table A; and finally a row is inserted into table C using the same reference to table A.
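A hedged reconstruction of that SP pattern (table B and C names invented), which also shows why MAX(id) + 1 is race-prone - two concurrent callers can compute the same id:

CREATE PROCEDURE dbo.Customer_Create
    @name varchar(100)
AS
BEGIN
    DECLARE @cus_id int;
    SELECT @cus_id = ISNULL(MAX(id), 0) + 1 FROM customers;   -- no locking: concurrent calls can collide
    INSERT INTO customers (id, name) VALUES (@cus_id, @name); -- table A
    INSERT INTO tableB (customer_ref) VALUES (@cus_id);       -- no FK constraint backs this reference
    INSERT INTO tableC (customer_ref) VALUES (@cus_id);
END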
When I mapped this in NHibernate using Fluent NHibernate, the mapping generated a correct 'insert' SQL statement for the first table, but when the second table was mapped as a 'Reference' an 'update' SQL statement was generated where I was expecting to see an 'insert' statement...
Now, the fact that there are no identity columns, no keys and no referential integrity means to me that I can't guarantee relationships are one-to-one, one-to-many, etc...
If this is true, how can NH (Fluent) be configured for either case...
Cheers
Ollie
I have a table that will contain information for 3 other tables. The design I have is that this table will have a column that will hold the object's ID and another column that will hold the object's type (and thus the table that the row refers to).
Two questions:
a) Is that the best design or is there something else more widely accepted?
b) What is the recommended procedure to ensure that IDs are valid for the given object's type?
If I understood your question correctly, each row in your table links to exactly one of the three other tables.
Your approach (type field + one foreign key field) is a valid design, and it's useful if you want to create a general-purpose table that contains meta-information about your data (e.g. a list of records that should be retransmitted for replication).
Another approach, which might be more suitable for real application-level data, would be to have three columns, each being a foreign key to one of the three tables, and to add a constraint that requires exactly two of those fields to be null. This has the following advantages (see the sketch after this list):
The three FKs do not need to have the same data type.
The JOIN syntax becomes more natural (not involving the type field).
You can add referential integrity constraints on those FK columns.
You don't need to ensure correctness of the type field -- in fact, you don't need the type field at all. The type is determined implicitly by the one FK column which is not null.
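A hedged sketch of this design (all names invented) - the CHECK constraint counts the non-null FK columns and requires exactly one:

CREATE TABLE MetaInfo (
    MetaID int IDENTITY PRIMARY KEY,
    AID    int NULL REFERENCES TableA (AID),
    BID    int NULL REFERENCES TableB (BID),
    CID    int NULL REFERENCES TableC (CID),
    -- exactly one of the three FK columns may be non-null
    CONSTRAINT CK_MetaInfo_ExactlyOne CHECK (
        (CASE WHEN AID IS NULL THEN 0 ELSE 1 END)
      + (CASE WHEN BID IS NULL THEN 0 ELSE 1 END)
      + (CASE WHEN CID IS NULL THEN 0 ELSE 1 END) = 1
    )
);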
a) I'm supposing you have a one-to-many relationship between objects and object types. In a normal design you'd have a reference from the objecttype column in the objects table to the primary key of the object types table.
b) I would enforce referential integrity in the relationship properties (this depends on the DBMS you are using). It's also up to you to use cascading on updates and deletes. This way, an update or a delete of the primary key on the object types table would be reflected on the objects table, updating its foreign key column (object type column) or deleting the rows that have that object type.
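In SQL Server syntax, for example (names assumed), such a cascading relationship would look like:

ALTER TABLE Objects
ADD CONSTRAINT FK_Objects_ObjectTypes
    FOREIGN KEY (ObjectTypeID)
    REFERENCES ObjectTypes (ObjectTypeID)
    ON UPDATE CASCADE
    ON DELETE CASCADE;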
The basics of DB schema design are easy, but in more complicated situations it can be genuinely hard to figure out what's best. There is a lot of personal subjectivity that can come into play, and even performance can be a factor in denormalizing a design.
Disclaimer aside, my personal recommendation is to never use a column to store more than one kind of FK, i.e. a column for FKs should store FKs that point only to a single table. If you don't do this, you have to map the cascade of that column's data into multiple sub-select queries inside your code, and it can begin to get messier than you expected. Your "Problem No. 2, ensuring validity between type and FK" is just the beginning of a whole world of pain that will cascade throughout your source code.
Assuming you change the design to use one field per FK reference, I would also check whether each FK field in your main "information-holding table" will be fully valid for each record. If not, I would move the FK columns that are only applicable some of the time out to a separate table.
I have to add functionality to an existing application and I've run into a data situation that I'm not sure how to model. I am being restricted to the creation of new tables and code. If I need to alter the existing structure, I think my client may reject the proposal... although if it's the only way to get it right, this is what I will have to do.
I have an Item table that can be linked to any number of tables, and these tables may increase over time. An Item can only be linked to one other table, but a record in the other table may have many items linked to it.
Examples of the tables/entities being linked to are Person, Vehicle, Building, Office. These are all separate tables.
Examples of Items are Pen, Stapler, Cushion, Tyre, A4 Paper, Plastic Bag, Poster, Decoration.
For instance a Poster may be allocated to a Person or Office or Building. In the future if they add a Conference Room table it may also be added to that.
My initial thoughts are:
Item
{
ID,
Name
}
LinkedItem
{
ItemID,
LinkedToTableName,
LinkedToID
}
The LinkedToTableName field will then allow me to identify the correct table to link to in my code.
I'm not overly happy with this solution, but I can't quite think of anything else. Please help! :)
Thanks!
It is not a good practice to store table names as column values. This is a bad hack.
There are two standard ways of doing what you are trying to do. The first is called single-table inheritance. This is easily understood by ORM tools but trades off some normalization. The idea is that all of these entities - Person, Vehicle, whatever - are stored in the same table, often with several unused columns per entry, along with a discriminator field that identifies what type the entity is.
The discriminator field is usually an integer type, that is mapped to some enumeration in your code. It may also be a foreign key to some lookup table in your database, identifying which numbers correspond to which types (not table names, just descriptions).
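A minimal single-table sketch (column names invented) - every entity type shares one table, and columns that only apply to some types stay null for the rest:

CREATE TABLE Entities (
    EntityID     int IDENTITY PRIMARY KEY,
    EntityType   int NOT NULL,           -- discriminator, mapped to an enumeration in code
    Name         varchar(100) NOT NULL,  -- shared by all types
    DateOfBirth  date NULL,              -- only used when EntityType = Person
    LicensePlate varchar(20) NULL        -- only used when EntityType = Vehicle
);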
The other way to do this is multiple-table inheritance, which is better for your database but not as easy to map in code. You do this by having a base table which defines some common properties of all the objects - perhaps just an ID and a name - and all of your "specific" tables (Person etc.) use the base ID as a unique foreign key (usually also the primary key).
In the first case, the exclusivity is implicit, since all entities are in one table. In the second case, the relationship is between the Item and the base entity ID, which also guarantees uniqueness.
Note that with multiple-table inheritance, you have a different problem - you can't guarantee that a base ID is used by exactly one inheritance table. It could be used by several, or not used at all. That is why multiple-table inheritance schemes usually also have a discriminator column, to identify which table is "expected." Again, this discriminator doesn't hold a table name, it holds a lookup value which the consumer may (or may not) use to determine which other table to join to.
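A hedged multiple-table sketch (names invented) - the base table holds the shared columns plus the discriminator, and each sub-table reuses the base ID as both its PK and an FK:

CREATE TABLE EntityBase (
    EntityID   int IDENTITY PRIMARY KEY,
    Name       varchar(100) NOT NULL,
    EntityType int NOT NULL  -- discriminator: which sub-table is "expected"
);

CREATE TABLE Person (
    EntityID    int PRIMARY KEY REFERENCES EntityBase (EntityID),
    DateOfBirth date NULL
);

CREATE TABLE Vehicle (
    EntityID     int PRIMARY KEY REFERENCES EntityBase (EntityID),
    LicensePlate varchar(20) NULL
);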
Multiple-table inheritance is a closer match to your current schema, so I would recommend going with that unless you need to use this with Linq to SQL or a similar ORM.
See here for a good detailed tutorial: Implementing Table Inheritance in SQL Server.
Find something common to Person, Vehicle, Building and Office. For lack of a better term, I have used Entity. Then implement a super-type/sub-type relationship between the Entity and its sub-types. Note that the EntityID is a PK and an FK in all sub-type tables. Now you can link the Item table to the Entity (owner).
In this model, one item can belong to only one Entity; one Entity can have (own) many items.
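A sketch of that link (column names assumed): the Item references the Entity super-type, so the FK stays valid no matter which sub-type the owner is:

CREATE TABLE Item (
    ItemID   int IDENTITY PRIMARY KEY,
    Name     varchar(100) NOT NULL,
    EntityID int NOT NULL REFERENCES Entity (EntityID)  -- owning Entity; one Entity can own many Items
);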
Your link table is OK.
The trouble you will have is that you will need to generate dynamic SQL at runtime; parameterized SQL does not typically allow the objects in the FROM list to be parameters.
If you want to avoid this, you may be able to denormalize a little - say, by creating a table to hold the id (assuming the ids are unique across the other tables), the type_id representing which table is the source, and a generated description - e.g. the name value from the initial record.
You would trigger the creation of this denormalized list when the base info is modified, and you could use it for generalized queries - and then resort to your dynamic queries when needed at runtime.
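A hedged sketch of that denormalized list (all names invented):

CREATE TABLE LinkTargets (
    TargetID    int PRIMARY KEY,      -- assumes ids are unique across all source tables
    TypeID      int NOT NULL,         -- identifies which table the row came from
    Description varchar(200) NULL     -- e.g. the name value copied from the initial record
);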
Disclosure: I'm a 'natural key' advocate myself and averse to the IDENTITY PK approach. But I do have a 'live and let live' approach to lifestyle choices, so no religious arguments here please :)
I have inherited a table where the only key is the IDENTITY PK column; let's call it ID. There are many tables that reference ID. The intended process of creating a new entity seems to be (sketched below):
INSERT INTO the table.
Use SCOPE_IDENTITY() to grab the auto-generated ID.
Use the auto-generated ID to INSERT into related tables.
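In T-SQL that flow looks roughly like this (table names invented):

INSERT INTO Widgets (Name) VALUES ('Example');
DECLARE @newId int = SCOPE_IDENTITY();  -- the IDENTITY value generated by the insert above, in this scope
INSERT INTO WidgetDetails (WidgetID, Detail) VALUES (@newId, 'related row');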
In fact, there is a helper stored proc to create an entity and return the ID. However, I have a couple of issues:
I need to go further than the helper stored proc and create rows in related tables which themselves have IDENTITY PKs, so for each entity I need to grab several auto-generated values along the way.
I need to fabricate several hundred entities and the helper procs are coded to handle one entity at a time.
What is the best way to bulk fabricate entities using the 'IDENTITY PK' design?
When using my own 'natural key' designs, I can generate the key values in advance, so it's simply a case of loading some scratch tables and INSERTing into the tables in the order expected by the foreign keys. Therefore, I'm tempted to find a sequence of high INTEGER values (to match the type of the IDENTITY columns) which I know isn't being used now, and hope that they won't be in use when the time comes to do the INSERT. Is this a good idea?
Are you talking specifically about MS SQL Server?
It is unfortunate that IDENTITY columns disallow explicit inserts by default. In other DBMSs, being auto-increment wouldn't stop you from inserting an explicit value into that column, which would make it easy to choose the keys in advance. Unfortunately on SQL Server you have the inconvenience of SET IDENTITY_INSERT to worry about.
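For completeness, the SQL Server workaround looks like this (table name invented); note that an explicit column list is required, and only one table per session can have the setting on at a time:

SET IDENTITY_INSERT dbo.Widgets ON;
INSERT INTO dbo.Widgets (WidgetID, Name)  -- explicit column list is mandatory here
VALUES (1000001, 'Pre-assigned id');
SET IDENTITY_INSERT dbo.Widgets OFF;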
there is a helper stored proc to create an entity and return the ID.
It seems a little over-the-top to me to use an sproc for that, since it's generally as simple as selecting the SCOPE_IDENTITY(). Quite often you can avoid the explicit select by writing each insert such that it can use the last insert's SCOPE_IDENTITY() directly.
find a sequence of high value INTEGER values which I know isn't being used now and hope that they won't be being used [...] Is this a good idea?
They don't necessarily have to be very high values; in fact if you did that often you'd be making many huge gaps in the IDENTITY values, which is generally better avoided. You could even use the MAX(column)+1 values, as long as you either caught the error where someone else used those values in between times, or, better, did a select-max-then-insert in a transaction.
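A hedged sketch of that select-max-then-insert transaction (names invented); the UPDLOCK/HOLDLOCK hints are one way to stop a concurrent caller from reading the same MAX before the transaction commits:

BEGIN TRANSACTION;
DECLARE @nextId int;
SELECT @nextId = ISNULL(MAX(WidgetID), 0) + 1
FROM dbo.Widgets WITH (UPDLOCK, HOLDLOCK);  -- hold the key range until commit
-- if WidgetID is an IDENTITY column, SET IDENTITY_INSERT would also be needed here
INSERT INTO dbo.Widgets (WidgetID, Name) VALUES (@nextId, 'New row');
COMMIT TRANSACTION;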
Say I'm mapping a simple object to a table that contains duplicate records and I want to allow duplicates in my code. I don't need to update/insert/delete on this table, only display the records.
Is there a way that I can put a fake (generated) ID column in my mapping file to trick NHibernate into thinking the rows are unique? Creating a composite key won't work because there could be duplicates across all of the columns.
If this isn't possible, what is the best way to get around this issue?
Thanks!
Edit: Query seemed to be the way to go
The NHibernate mapping makes the assumption that you're going to want to save changes, hence the requirement for an ID of some kind.
If you're allowed to modify the table, you could add an identity column (SQL Server naming - your database may differ) to autogenerate unique Ids - existing code should be unaffected.
If you're allowed to add to the database, but not to the table, you could try defining a view that includes a RowNumber synthetic (calculated) column, and using that as the data source to load from. Depending on your database vendor (and the product's handling of views and indexes) this may face some performance issues.
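A hedged sketch of such a view (table and column names invented) using SQL Server's ROW_NUMBER(); the generated numbers are not stable between queries, which is fine for display-only use:

CREATE VIEW dbo.DuplicatesWithIds AS
SELECT ROW_NUMBER() OVER (ORDER BY t.SomeColumn) AS RowId,  -- synthetic per-row id
       t.*
FROM dbo.DuplicatedTable t;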
The other alternative, which I've not tried, would be to map your class to a SQL query instead of a table. IIRC, NHibernate supports having named SQL queries in the mapping file, and you can use those as the "data source" instead of a table or view.
If your data is read-only, one simple way we found was to wrap the query in a view, build the entity off the view, and add a NEWID() column; the result is something like
SELECT NEWID() AS ID, * FROM TABLE
ID then becomes your unique primary key. As stated above, this is only useful for read-only views, as the ID has no relevance after the query.
I'm designing this collection of classes and abstract (MustInherit) classes…
This is the database table where I'm going to store all this…
As far as the Microsoft SQL Server database knows, those are all nullable ("Allow Nulls") columns.
But really, that depends on the class stored there: LinkNode, HtmlPageNode, or CodePageNode.
Rules might look like this...
How do I enforce such data integrity rules within my database?
UPDATE: Regarding this single-table design...
I'm still trying to zero in on a final architecture.
I initially started with many small tables with almost zero nullable fields.
Which is the best database schema for my navigation?
And I learned about the LINQ to SQL IsDiscriminator property.
What’s the best way to handle one-to-one relationships in SQL?
But then I learned that LINQ to SQL only supports single table inheritance.
Can a LINQ to SQL IsDiscriminator column NOT inherit?
Now I'm trying to handle it with a collection of classes and abstract classes.
Please help me with my .NET abstract classes.
Use CHECK constraints on the table. These allow you to use any kind of boolean logic (including on other values in the table) to allow/reject the data.
From the Books Online site:
You can create a CHECK constraint with any logical (Boolean) expression that returns TRUE or FALSE based on the logical operators. For the previous example, the logical expression is: salary >= 15000 AND salary <= 100000.
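Applied to the node table in the question, a hedged sketch (column names assumed from the class design) might look like:

ALTER TABLE Nodes ADD CONSTRAINT CK_Nodes_TypeRules CHECK (
       (NodeType = 'LinkNode'     AND Url  IS NOT NULL AND Html IS NULL)
    OR (NodeType = 'HtmlPageNode' AND Html IS NOT NULL AND Url  IS NULL)
    OR (NodeType = 'CodePageNode' AND Html IS NOT NULL AND Url  IS NULL)
);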
It looks like you are attempting the Single Table Inheritance pattern, which is covered in the Object-Relational Structural Patterns section of the book Patterns of Enterprise Application Architecture.
I would recommend the Class Table Inheritance or Concrete Table Inheritance patterns if you wish to enforce data integrity via SQL table constraints.
Though it wouldn't be my first suggestion, you could still use Single Table Inheritance and just enforce the constraints via a Stored Procedure.
You can set up insert/update triggers. Just check whether these fields are null or not null, and reject the insert/update operation if needed. This is a good solution if you want to store all the data in the same table.
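A hedged trigger sketch (table and column names invented); note that SQL Server triggers fire once per statement, so the check has to run over the whole inserted set:

CREATE TRIGGER trg_Nodes_TypeRules ON Nodes
AFTER INSERT, UPDATE
AS
BEGIN
    IF EXISTS (SELECT 1 FROM inserted
               WHERE NodeType = 'LinkNode' AND Url IS NULL)
    BEGIN
        RAISERROR ('LinkNode rows must have a Url.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END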
You could also create a separate table for each class as well.
Have a unique table for each type of node.
Why not just make the class you're building enforce the data integrity for its own type?
EDIT
In that case, you can either a) use logical constraints (see below), b) use stored procedures to do inserts/edits (a good idea regardless), or c) again, just make the class enforce data integrity.
A mixture of c and b would be the course I take: unique stored procedures for adds/edits for each node type (i.e. Insert_Update_NodeType), as well as having the class perform data validation before saving data.
Personally, I always insist on putting data integrity code on the table itself, either via a trigger or a check constraint. The reason why is that you cannot guarantee that only the user interface will update, insert or delete records. Nor can you guarantee that someone might not write a second SP to get around the constraints in the original SP without understanding the actual data integrity rules, or even write it because he or she is unaware of the existence of the SP with the rules. Tables are often affected by DTS or SSIS packages, dynamic queries from the user interface or through Query Analyzer or the query window, or even by scheduled jobs that run code. If you do not put the data integrity code at the table level, sooner or later your data will not have integrity.
It's probably not the answer you want to hear, but to avoid logical inconsistencies you really want to look at database normalisation.
Stephen's answer is the best. But if you MUST, you could add a check constraint to the HtmlOrCode column and the other columns which need to change.
I am not that familiar with SQL Server, but I know that with Oracle you can specify constraints to do what you are looking for. I am pretty sure you can define constraints in SQL Server too, though.
EDIT: I found this link that seems to have a lot of information; kind of long but may be worth a read.
Enforcing Data Integrity in Databases
Basically, there are four primary types of data integrity: entity, domain, referential and user-defined.
Entity integrity applies at the row level; domain integrity applies at the column level, and referential integrity applies at the table level.
Entity Integrity ensures that a table has no duplicate rows and that each row is uniquely identified.
Domain Integrity requires that a set of data values fall within a specific range (domain) in order to be valid. In other words, domain integrity defines the permissible entries for a given column by restricting the data type, format, or range of possible values.
Referential Integrity is concerned with keeping the relationships between tables synchronized.
@Zack: You can also check out this blog for more details about data integrity enforcement: https://www.bugraptors.com/what-is-data-integrity/
SQL Server doesn't know anything about your classes. I think that you'll have to enforce this by using a Factory class that constructs/deconstructs all these for you and makes sure that you're passing the right values depending upon the type.
Technically this is not "enforcing the rules in the database" but I don't think that this can be done in a single table. Fields either accept nulls or they don't.
Another idea could be to explore SQL functions and stored procedures that do the same thing. But you cannot enforce a field to be NOT NULL for one record and NULL for the next one. That's the job of your Business Layer / Factory.
Have you tried NHibernate? It's a much more mature product than Entity Framework, and it's free.