Lazy loading a portion of a record with NHibernate

I'm not sure how to explain this. So here goes...
I'm trying to apply the method for lazy loading blobs described here, but I'm stuck because I only have one table.
I have a schema (fixed, in a legacy system) which looks something like this:
MyTable
ID int
Name char(50)
image byte
This is on Informix, and the byte column is a simple large object. Now normally I would query the table with "SELECT ID, Name, (image is not null) as imageexists..." and handle the blob load later.
I can construct my object model to have two different classes (and thus two different map definitions) to handle the relationship, but how can I "fool" NHibernate into using the same table for this one-to-one relationship?

Short answer: you can't.
You either need to map it twice or (my preference) create a DTO that has the fields you want. In HQL you'd do something like:
select new MyTableDTO(t.ID, t.name) from MyTable t
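For illustration, here's a minimal sketch of what that could look like in C#. The DTO class itself is an assumption, and NHibernate also needs to know the DTO by name (e.g. via an import mapping) for select new to work:

// Sketch only: a DTO whose constructor matches the HQL select-new call.
public class MyTableDTO
{
    public MyTableDTO(int id, string name)
    {
        Id = id;
        Name = name;
    }

    public int Id { get; private set; }
    public string Name { get; private set; }
}

// Usage: the query never touches the blob column.
IList<MyTableDTO> rows = session
    .CreateQuery("select new MyTableDTO(t.ID, t.name) from MyTable t")
    .List<MyTableDTO>();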

Related

Table structure for data with many NULLs

I'm currently trying to model a dynamic data object that can have or lack certain properties (the property names are known for the current requirements). It is not known whether new properties will be added later on (but it is almost certain they will). The modeled object is something along the lines of this:
int id PRIMARY KEY NOT NULL;
int owner FOREIGN KEY NOT NULL;
Date date NOT NULL;
Time time NOT NULL;
Map<String,String> properties;
A property can be of any type (int, bool, string, ...).
I'm not sure how I should model this object in an SQL database. There are two ways I can think of to do this, and I would like some input on which is the better choice in terms of developer work (maintenance), memory consumption, and performance. As a side note: properties are almost always NULL (i.e. not present).
(1) I would have one big table with id, owner, date, time and every property as a column, where missing properties in a row are modeled as NULL, e.g.
TABLE_X
id|owner|date|time|prop_1|prop_2|prop_3|...
This table would have a lot of NULL values.
If new properties need to be added, I would do an ALTER TABLE and add a new column for each one.
Here I would do a "usual"
SELECT * FROM TABLE_X ...
(2) I would have a main table with all NOT NULL data:
TABLE_X
id|owner|date|time
And then have a separate table for every property, like this:
TABLE_X_PROP_N
foreign_key(TABLE_X(id))|value
There would be no NULL values at all: a property either has a value and appears in its corresponding table, or it is NULL and simply has no row there.
To add new properties, I would just add another table.
Here I would do a
SELECT * FROM TABLE_X LEFT JOIN TABLE_X_PROP_1 ON ... LEFT JOIN TABLE_X_PROP_2 ON ...
To repeat the question (so you don't have to scroll up):
Which of the two approaches is better in terms of maintenance (work for the developer), memory consumption (on disk), and performance (more queries per second)? Maybe you also have a better idea of how to deal with this. Thanks in advance.
If you go with Option 2, I would think you need 3 tables:
TABLE_HEADER
id|owner|date|time
TABLE_PROPERTY
id|name
TABLE_PROPERTYVALUE
id|headerID(FK)|propertyID(FK)|value
This makes it easy to add new properties and gives you greater flexibility to iterate faster. The number of properties also has an effect (for example, with 500 properties you aren't going to want a table with 500 columns!). The main downside is that it becomes ugly if you need to attach complex business logic to the properties, since it's a more complex structure to navigate, and you can't enforce data integrity such as NOT NULL on particular fields. If you truly want a property bag like the one in your object structure, then this maps easily. Like everything, which option is most suitable depends on your circumstances.
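As a rough sketch, the three-table layout above could be created like this. The column types are assumptions, connection stands for an open IDbConnection, and some ADO.NET providers require running each statement separately:

// Sketch only: the header/property/value triple from the answer above.
using (IDbCommand cmd = connection.CreateCommand())
{
    cmd.CommandText = @"
        CREATE TABLE TABLE_HEADER (
            id    INT  PRIMARY KEY,
            owner INT  NOT NULL,
            date  DATE NOT NULL,
            time  TIME NOT NULL);
        CREATE TABLE TABLE_PROPERTY (
            id   INT PRIMARY KEY,
            name VARCHAR(100) NOT NULL);
        CREATE TABLE TABLE_PROPERTYVALUE (
            id         INT PRIMARY KEY,
            headerID   INT NOT NULL REFERENCES TABLE_HEADER(id),
            propertyID INT NOT NULL REFERENCES TABLE_PROPERTY(id),
            value      VARCHAR(255));";
    cmd.ExecuteNonQuery();
}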
Solution 2, but why have separate tables for every property? Just put everything in one table:
properties(
    foreign_key(TABLE_X(id)),
    property_name,
    value);
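To read one object's property bag back from that table, something like the following would work. This is only a sketch: the column names table_x_id, property_name and value are assumptions, and parameter syntax varies by ADO.NET provider:

// Sketch: load one row's properties into a Map<String,String>-style
// dictionary, matching the object model from the question.
var properties = new Dictionary<string, string>();
using (IDbCommand cmd = connection.CreateCommand())
{
    cmd.CommandText =
        "SELECT property_name, value FROM properties WHERE table_x_id = @id";
    IDbDataParameter p = cmd.CreateParameter();
    p.ParameterName = "@id";
    p.Value = 42; // id of the TABLE_X row being loaded
    cmd.Parameters.Add(p);

    using (IDataReader reader = cmd.ExecuteReader())
        while (reader.Read())
            properties[reader.GetString(0)] = reader.GetString(1);
}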
Sounds like you're trying to implement an Entity-Attribute-Value (often-viewed-as-an-anti-)pattern here. Are you familiar with it? Here are a few references:
https://softwareengineering.stackexchange.com/questions/93124/eav-is-it-really-bad-in-all-scenarios
http://www.dbforums.com/showthread.php?1619660-OTLT-EAV-design-why-do-people-hate-it
https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model
Personally, I'm extremely wary of this type of setup in an RDBMS. I tend to think that NoSQL document-style databases would be a better fit for these kinds of dynamic structures, though admittedly I have relatively little real-world experience with NoSQL myself.

How can one delete an entity in nhibernate having only its id and type?

I am wondering how one can delete an entity using just its ID and type (as in the mapping) with NHibernate 2.1.
If you are using lazy loading, Load only creates a proxy.
session.Delete(session.Load(type, id));
With NH 2.1 you can use HQL. I'm not sure exactly what it looks like, but something like the following. Note that this is subject to SQL injection; if possible, use parameterized queries with SetParameter() instead (a sketch follows below the one-liner):
session.Delete(string.Format("from {0} where id = {1}", type, id));
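For reference, here's a hedged sketch of the parameterized variant. The entity name can't be bound as a query parameter, so it still has to be inlined (make sure it comes from trusted code, not user input), but the id is bound safely via SetParameter(). This relies on HQL bulk (DML-style) operations, which as far as I know shipped with NH 2.1:

// Sketch: binds the id instead of formatting it into the string.
// Caveat: a bulk HQL delete bypasses cascades (see the pitfalls below).
session.CreateQuery(string.Format("delete from {0} where id = :id", type))
       .SetParameter("id", id)
       .ExecuteUpdate();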
Edit:
For Load, you don't need to know the name of the Id column.
If you need to know it, you can get it by the NH metadata:
sessionFactory.GetClassMetadata(type).IdentifierPropertyName
Another edit.
session.Delete() instantiates the entity
When using session.Delete(), NH loads the entity anyway. At first I didn't like it; then I realized the advantages. If the entity is part of a complex structure using inheritance, collections, or "any" references, it is actually more efficient.
For instance, if classes A and B both inherit from Base, it doesn't try to delete data in table B when the actual entity is of type A. This wouldn't be possible without loading the actual object. It is particularly important when there are many inherited types, each of which also consists of many additional tables.
The same applies when you have a collection of Bases which happen to all be instances of A: after loading the collection into memory, NH knows that it doesn't need to remove any B-stuff.
If entity A has a collection of Bs which contain Cs (and so on), it doesn't try to delete any Cs when the collection of Bs is empty. This is only possible by reading the collection, and it is particularly important when C is complex in its own right, aggregating even more tables, and so on.
The more complex and dynamic the structure is, the more efficient it is to load the actual data instead of "blindly" deleting it.
HQL Deletes have pitfalls
HQL deletes do not load data into memory, but they aren't that smart either. They basically translate the entity name to the corresponding table name and delete from that table, plus some aggregated collection data.
In simple structures this may work well and efficiently. In complex structures, not everything gets deleted, leading to constraint violations or "database memory leaks".
Conclusion
I also tried to optimize deletion with NH. I gave up in most cases, because NH is still smarter: it "just works" and is usually fast enough. One of the most complex deletion algorithms I wrote analyzes the NH mapping definitions and builds delete statements from them. And, no surprise, it is not possible without reading data from the database before deleting. (I only managed to reduce it to loading just the primary keys.)

When inserting a complex object into an SQL database, when should the object be broken up into its respective tables?

Edit: In short, what strategy should one use for insert and select scripts with complex objects (e.g. two select calls, one for each table; or a single select call with unions)?
We have a database insert (PostgreSQL) that includes a list of objects that is serialized (as XML text) and put into a cell in a row among normal strings and such. We would like to create a new table for those lists, with references back to the key of the original item. Where should the object be split off? I don't think it is possible in the SQL query itself, but if so, that would be ideal. Our favorite spot currently is just before we set up our JDBCProcedures.
string name
int id
List<sub-objects>
and currently this is being stored in a DB schema like:
name varchar(20)
id int
subObjs text [or other character type big enough to hold the serialized XML]
Please provide a little more information about the structure of your objects and clarify your question. It's not entirely clear what you're asking here.
That said, let me try to take a stab:
If you have objects in Java code with structure somewhat like this:
string name
int id
object[] list_of_sub-objects
and currently this is being stored in a DB schema like:
name varchar(20)
id int
subObjs text [or other character type big enough to hold the serialized XML]
Is that about right?
And then your question is:
We would like to create a new table with those lists with references back to the key of the original item. Where should the object be split off? I don't think it is possible in the SQL query, but if so that would be ideal.
When you say the list attribute is "serialized" in your existing system, do you mean as XML? It looks like XML parsing within SQL itself is still in development for PostgreSQL, and in any case it's likely to be a lot of trouble to code something like that up if you do not already know how.
But you already have application code which represents your objects in a non-serialized fashion. You could write a function in your application codebase which performs the migration. Load the records from the old database table into application objects according to your existing schema, then write them back into your new pair of DB tables according to your new schema.
This conceptually simplifies the problem down to something you can represent in pseudocode, i.e. "how do I map the structure of my object from the old database schema to the new one?"
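A minimal sketch of that migration loop (the record type and helper functions are hypothetical stand-ins for your existing load/parse/insert code; the shape is the same in any language):

// Sketch only: one-off migration from the old single-table schema to
// the new parent/child pair. LoadOldRecords, ParseSubObjects and the
// two Insert helpers stand in for code you'd write against your schema.
foreach (OldRecord rec in LoadOldRecords(oldConnection))
{
    // Insert the parent row first so the children can reference its key.
    int parentId = InsertParent(newConnection, rec.Id, rec.Name);

    // Deserialize the XML cell back into objects, then write one child
    // row per sub-object, carrying the parent's key as a foreign key.
    foreach (SubObject sub in ParseSubObjects(rec.SubObjsXml))
        InsertChild(newConnection, parentId, sub);
}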
I hope this helps! If you can clarify your structure a bit, I might be able to contribute some more specific pseudocode for the solution I'm proposing here.
We ended up splitting the insert into two calls (one for the main object, one for the sub-objects) so that each table has its own insert, but we created a single select so that we could take advantage of foreign keys in the query.

Storing polymorphic objects in SQL database

[noob warning!] I need to store some data in tables where it is the equivalent of an array of pointers to polymorphic objects, e.g. (pseudo C++):
struct MyData { string name; };
struct MyDataA : MyData { int a, b, c; };
struct MyDataB : MyData { string s; };
MyData* data[100];
I don't really know what google search to enter! How would you store info like this in an SQL database?
My random thoughts:
I could have one table with a column that is the struct identifier and then have redundant columns, but this seems wasteful.
I can have one table for each struct type. These would have a foreign key back to the master array table. But, how do I point to the struct tables?
There are really two major ways to solve this:
table-per-type
table-per-hierarchy
Each has its pros and cons.
Table-per-type gives you more tables (one per type), each storing only the "delta" from the immediate superclass. In the worst case, you need to join a number of tables to assemble all the data for a single instance of a type. Pros: since each separate table stores only what's really relevant for that type, you can do things like set NOT NULL constraints on its columns.
Table-per-hierarchy gives you fewer tables, but each table represents an entire hierarchy, so it will potentially contain lots of columns which aren't filled (in the rows representing base class types). Also, you cannot set things like NOT NULL constraints on the extra columns that make up the derived classes: all those extra columns must be nullable, since they don't exist in the base classes, so you lose some degree of safety here.
See for yourself - there are two really good articles on how to do this (in Entity Framework, but the principles apply to any database and any data mapping technology):
Demystifying The Code: Table Per Type
Demystifying The Code: Table Per Hierarchy
Hope this helps and gives you some inputs!
Marc
I do the "table-per-sublcass" style from the Hibernate docs.
You make a Person table with all the things you know about a person, plus the PersonID. Then you make a Customer table, with only the data that's unique to a Customer (account balance, etc). Put the PersonID in the Customer table. A WebsiteUser might have a CustomerID in it, and so on down the chain.
One-to-one relationships map the IS-A inheritance relationships.
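A rough sketch of that layout as DDL, kept in a string for reference (table and key names follow the answer; the column types and the UserName column are assumptions):

// Sketch of the table-per-subclass layout described above.
const string TablePerSubclassDdl = @"
    CREATE TABLE Person (
        PersonID INT PRIMARY KEY,
        Name     VARCHAR(100) NOT NULL);
    CREATE TABLE Customer (
        CustomerID     INT PRIMARY KEY,
        PersonID       INT NOT NULL REFERENCES Person(PersonID),
        AccountBalance DECIMAL(12,2) NOT NULL);
    CREATE TABLE WebsiteUser (
        WebsiteUserID INT PRIMARY KEY,
        CustomerID    INT NOT NULL REFERENCES Customer(CustomerID),
        UserName      VARCHAR(50) NOT NULL);";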
One possibility is an XML field to store the data; this allows searching and retrieving while also being relatively easy to serialize. (The question says SQL but doesn't specify a specific vendor database, so XML may not work for every DB solution.)
Edit: I'm going to caveat this because it's not entirely clear what needs to be stored/retrieved, or for what purpose, so XML may be entirely inappropriate. I'm throwing it out there as a thought-provoker instead.

Fluent Nhibernate and Dynamic Table Name

I've got a parent and a child object. Depending on a value in the parent object, the table for the child object changes. So, for example, if the parent object has the reference "01", it will look in the table "Child01", whereas if the reference is "02", it will look in the table "Child02". All the child tables are identical in number of columns, names, etc.
My question is: how can I tell Fluent NHibernate (or NHibernate) which table to look at, given that each parent object is unique and can reference a number of different child tables?
I've looked at IClassConvention in Fluent, but that seems to only be called when the session factory is created rather than each time an object is created.
I found only two methods to do this.
1. Close and recreate the NHibernate session factory every time another dynamic table needs to be looked at, using IClassConvention when the factory is built to calculate the table name dynamically from user data (a rough sketch of this follows below). I found this very intensive: it is a large database, and building the session factory each time is a costly operation.
2. Use POCO objects for these tables with custom data access.
As statichippo suggested, I could use a base child object and have multiple child objects. Due to the database size and the number of dynamic tables, this wasn't really a valid option.
I wasn't particularly happy with either solution, but the POCOs seemed the best way for my problem.
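For reference, here's a rough sketch of what the convention from option 1 might look like. CurrentReference is a hypothetical stand-in for wherever the "01"/"02" value comes from, and the convention only runs when the session factory is built:

// Sketch only: a Fluent NHibernate convention that picks the child
// table name at session-factory build time.
public class ChildTableConvention : IClassConvention
{
    public static string CurrentReference = "01";

    public void Apply(IClassInstance instance)
    {
        if (instance.EntityType == typeof(Child))
            instance.Table("Child" + CurrentReference);
    }
}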
NHibernate is intended to be an object-relational mapper. It sounds like you're doing more of a scripting style and hoping to map your data, instead of working in an OOP manner.
It sounds like you have the makings of a class hierarchy, though. What it sounds like you're trying to create in your code (and then map accordingly) is a hierarchy of different kinds of children:
BaseChild
--> SmartChild
--> DumbChild
Each child is either smart or dumb, but since they all have a FirstName, LastName, Age, etc., they are all instances of the BaseChild class, which defines those. The only differences might be that the SmartChild has an IQ and the DumbChild has a FavoriteFootballTeam (this is just an example, no offense to anyone, of course ;).
NHibernate will let you map this sort of relationship in many ways. There could be one table that encompasses all classes or (what it sounds like you want in your case) one table per class.
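For example, a one-table-per-class version could be mapped in Fluent NHibernate roughly like this (property names follow the example above and are assumptions; without a discriminator on the parent map, SubclassMap produces joined subclasses, i.e. a table per class):

// Sketch only: table-per-class mapping of the BaseChild hierarchy.
public class BaseChildMap : ClassMap<BaseChild>
{
    public BaseChildMap()
    {
        Id(x => x.Id);
        Map(x => x.FirstName);
        Map(x => x.LastName);
        Map(x => x.Age);
    }
}

public class SmartChildMap : SubclassMap<SmartChild>
{
    public SmartChildMap() { Map(x => x.IQ); }
}

public class DumbChildMap : SubclassMap<DumbChild>
{
    public DumbChildMap() { Map(x => x.FavoriteFootballTeam); }
}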
Did I understand the issue/what you're looking for?