Say I'm mapping a simple object to a table that contains duplicate records and I want to allow duplicates in my code. I don't need to update/insert/delete on this table, only display the records.
Is there a way that I can put a fake (generated) ID column in my mapping file to trick NHibernate into thinking the rows are unique? Creating a composite key won't work because there could be duplicates across all of the columns.
If this isn't possible, what is the best way to get around this issue?
Thanks!
Edit: Query seemed to be the way to go
The NHibernate mapping makes the assumption that you're going to want to save changes, hence the requirement for an ID of some kind.
If you're allowed to modify the table, you could add an identity column (SQL Server naming - your database may differ) to autogenerate unique Ids - existing code should be unaffected.
If you're allowed to add to the database, but not to the table, you could try defining a view that includes a RowNumber synthetic (calculated) column, and using that as the data source to load from. Depending on your database vendor (and the products handling of views and indexes) this may face some performance issues.
The other alternative, which I've not tried, would be to map your class to a SQL query instead of a table. IIRC, NHibernate supports having named SQL queries in the mapping file, and you can use those as the "data source" instead of a table or view.
If you're data is read only one simple way we found was to wrapper the query in a view and build the entity off the view, and add a newguid() column, result is something like
SELECT NEWGUID() as ID, * FROM TABLE
ID then becomes your uniquer primary key. As stated above this is only useful for read-only views. As the ID has no relevance after the query.
Related
In our database we have many tables with a 'Notes' column. This is important functionality, but for most rows the value of Notes is null. These tables have many columns and we would like to remove some columns for better legibility.
We could add one Notes table for every table that has a notes column. But this would create clutter of a different kind- too many small tables.
My idea is to create a generic Notes table and also a reference table. The Notes table would have a column for the notes text, a column for the id of the row being linked to, and a foreign key to the reference table. The reference table would have a text value for each table for which we need notes. Using these two tables we should be able to link the note back to whichever table and column it came from.
By using this solution, we remove any cases of null values from notes and also slim down some of our tables. All at the modest price of two additional tables. It feels very 'hacky' to me however. Is there a reason why using a 'generic' id column or a reference table of other tables is a bad idea from a DB management perspective?
Managing the references to disparate entities can be really challenging in SQL Server. Postgres, by contrast, supports inheritance which makes this much simpler.
So, my recommendation is to add a notes column to every entity where you want notes. You an add a view to bring all the notes together if you need a view of all the notes.
This has minimal impact on performance or data size. There is no additional overhead for a varchar column, other than the additional NULL bit -- and that is pretty minimal.
IMO, the other solution of managing two tables doesn't bring in much efficiency but adds complexity to the solution. You should probably stick with the the notes column in the original table with datatype as varchar.
Generic id column is not bad inherently but the use of it generally gives smell of bad/hacky design.
Additionaly for SQL Server you can use sparse for the note columns to reduce size.
But i used a similary approach myself. (Note column needed for many columns to write info / changerequest / lockcomment. But normally never used).
Works fine and can be programmed genericaly in source.
But if you need only one comment column per table i wood prefer sparse
I am trying to figure out what the best way to design this database would be. Currently what I have works, but it requires me to hard-code values where I would like it to be dynamic in the future.
Here is my current database design:
As you can see, for both the Qualities and the PressSettingsSet tables, there are many columns that are hard-coded such as BlownInsert, Blowout, Temperature1, Temperature2, etc.
What I am trying to accomplish is to have these be dynamic. Each job will have these same settings, but I would like to allow the users to define these settings. Would it be best to create a table with just a name field and have a one-to-one relationship to another table with a value for the field and a relation to the Job Number?
I hope this makes sense, any help is appreciated. I can make a database diagram of how I think it should work if that is more helpful to what I am trying to convey. I think that what I have in mind will work, but it just seems like it will be creating a lot of extra rows in the database, so I wanted to see if there is possibly a better way.
Would it be best to create a table with just a name field and have a one-to-one relationship to another table with a value for the field and a relation to the Job Number?
That would be the simplest - you could expand that by adding data-effective fields or de-normalize it by putting it all in one table (with just a name and value field).
Are the values for the settings different per job? If so then yes a "key" table" with the name ans a one-to-many relationship to the value per job would be best.
My application has one table called 'events' and each event has approx 30 standard fields, but also user defined fields that could be any name or type, in an 'eventdata' table. Users can define these event data tables, by specifying x number of fields (either text/double/datetime/boolean) and the names of these fields. This 'eventdata' (table) can be different for each 'event'.
My current approach is to create a lookup table for the definitions. So if i need to query all 'event' and 'eventdata' per record, i do so in a M-D relaitionship using two queries (i.e. select * from events, then for each record in 'events', select * from 'some table').
Is there a better approach to doing this? I have implemented this so far, but most of my queries require two distinct calls to the DB - i cannot simply join my master 'events' table with different 'eventdata' tables for each record in in 'events'.
I guess my main question is: can i join my master table with different detail tables for each record?
E.g.
SELECT E.*, E.Tablename
FROM events E
LEFT JOIN 'E.tablename' T ON E._ID = T.ID
If not, is there a better way to design my database considering i have no idea on how many user defined fields there may be and what type they will be.
There are four ways of handling this.
Add several additional fields named "Custom1", "Custom2", "Custom3", etc. These should have a datatype of varchar(?) or similiar
Add a field to hold the unstructured data (like an XML column).
Create a table of name /value pairs which are associated with some type of template. Let them manage the template. You'll have to use pivot tables or similiar to get the data out.
Use a database like MongoDB or another NoSql style product to store this.
The above said, The first one has the advantage of being fast but limits the number of custom fields to the number you defined. Older main frame type applications work this way. SalesForce CRM used to.
The second option means that each record can have it's own custom fields. However, depending on your database there are definite challenges here. Tried this, don't recommend it.
The third one is generally harder to code for but allows for extreme flexibility. SalesForce and other applications have gone this route; including a couple I'm responsible for. The downside is that Microsoft apparently acquired a patent on doing things this way and is in the process of suing a few companies over it. Personally, I think that's bullcrap; but whatever. Point is, use at your own risk.
The fourth option is interesting. We've played with it a bit and the performance is great while coding is pretty darn simple. This might be your best bet for the unstructured data.
Those type of joins won't work because you will need to pivot the eventdata table to make it columns instead of rows. Therefore it depends on which database technology you are using.
Here is an example with MySQL: How to pivot a MySQL entity-attribute-value schema
My approach would be to avoid using a different table for each event, if that's possible.
I would use something like:
Event (EventId, ..., ...)
EventColumnType (EventColumnTypeId, EventTypeId, ColumnName)
EventColumnData (EventColumnTypeId, Data)
You are them limited to the type of data you can store (everything would have to be strings, for example), but you the number of events and columns are unrestricted.
What I'm getting from your description is you have an event table, and then a separate EventData table for each and every event.
Rather than that, why not have a single EventCustomFields table that contains a foreign key to the event table, a field Name (event+field being the PK) and a field value.
Sure it's not the best. You'd be stuck serializing the value or storing everything as a string. And you'd still be stuck doing two queries, one for the event table and one to get it's custom fields, but at least you wouldn't have a new table for every event in the system (yuck x10)
Another, (arguably worse) option is to serialize the custom fields into a single column of the and then deserialize when you need. So your query would be something like
Select E.*, C.*
From events E, customFields C
Where E.ID = C.ID
Is it possible to just impose a limit on your users? I know the tables underneath Sharepoint 2007 had a bunch of columns for custom data that were just named like CustomString1, CustomDate2, etc. That may end up easier than some of the approaches above, where everything is in one column (though that's an approach I've taken as well), and I would think it would scale up better.
The answer to your main question is: no. You can't have different rows in the result set with different columns. The result set is kind of like a table, so each row has to have the same columns. You can fake it with padding and dummy columns, but that's probably not much better.
You could try defining a fixed event data table, with (say) ten of each type of column. Then you'd store the usage metadata in a separate table and just read that in at system startup. The metadata would tell you that event type "foo" has a field "name" mapped to column string0 in the event data table, a field named "reporter" mapped to column string1, and a field named "reportDate" mapped to column date0. It's ugly and wastes space, but it's reasonably flexible. If you're in charge of the database, you can even define a view on the table so to the client it looks like a "normal" table. If the clients create their own tables and just stick the table name in the event record, then obviously this won't fly.
If you're really hardcore you can write a database procedure to query the table structures and serialize everything to a lilst of key/type/value tuples and return that in one long string as the last column, but that's probably not much handier than what you're doing now.
I have to add functionality to an existing application and I've run into a data situation that I'm not sure how to model. I am being restricted to the creation of new tables and code. If I need to alter the existing structure I think my client may reject the proposal.. although if its the only way to get it right this is what I will have to do.
I have an Item table that can me link to any number of tables, and these tables may increase over time. The Item can only me linked to one other table, but the record in the other table may have many items linked to it.
Examples of the tables/entities being linked to are Person, Vehicle, Building, Office. These are all separate tables.
Example of Items are Pen, Stapler, Cushion, Tyre, A4 Paper, Plastic Bag, Poster, Decoration"
For instance a Poster may be allocated to a Person or Office or Building. In the future if they add a Conference Room table it may also be added to that.
My intital thoughts are:
Item
{
ID,
Name
}
LinkedItem
{
ItemID,
LinkedToTableName,
LinkedToID
}
The LinkedToTableName field will then allow me to identify the correct table to link to in my code.
I'm not overly happy with this solution, but I can't quite think of anything else. Please help! :)
Thanks!
It is not a good practice to store table names as column values. This is a bad hack.
There are two standard ways of doing what you are trying to do. The first is called single-table inheritance. This is easily understood by ORM tools but trades off some normalization. The idea is, that all of these entities - Person, Vehicle, whatever - are stored in the same table, often with several unused columns per entry, along with a discriminator field that identifies what type the entity is.
The discriminator field is usually an integer type, that is mapped to some enumeration in your code. It may also be a foreign key to some lookup table in your database, identifying which numbers correspond to which types (not table names, just descriptions).
The other way to do this is multiple-table inheritance, which is better for your database but not as easy to map in code. You do this by having a base table which defines some common properties of all the objects - perhaps just an ID and a name - and all of your "specific" tables (Person etc.) use the base ID as a unique foreign key (usually also the primary key).
In the first case, the exclusivity is implicit, since all entities are in one table. In the second case, the relationship is between the Item and the base entity ID, which also guarantees uniqueness.
Note that with multiple-table inheritance, you have a different problem - you can't guarantee that a base ID is used by exactly one inheritance table. It could be used by several, or not used at all. That is why multiple-table inheritance schemes usually also have a discriminator column, to identify which table is "expected." Again, this discriminator doesn't hold a table name, it holds a lookup value which the consumer may (or may not) use to determine which other table to join to.
Multiple-table inheritance is a closer match to your current schema, so I would recommend going with that unless you need to use this with Linq to SQL or a similar ORM.
See here for a good detailed tutorial: Implementing Table Inheritance in SQL Server.
Find something common to Person, Vehicle, Building, Office. For the lack of a better term I have used Entity. Then implement super-type/sub-type relationship between the Entity and its sub-types. Note that the EntityID is a PK and a FK in all sub-type tables. Now, you can link the Item table to the Entity (owner).
In this model, one item can belong to only one Entity; one Entity can have (own) many items.
your link table is ok.
the trouble you will have is that you will need to generate dynamic sql at runtime. parameterized sql does not typically allow the objects inthe FROM list to be parameters.
i fyou want to avoid this, you may be able to denormalize a little - say by creating a table to hold the id (assuming the ids are unique across the other tables) and the type_id representing which table is the source, and a generated description - e.g. the name value from the inital record.
you would trigger the creation of this denormalized list when the base info is modified, and you could use that for generalized queries - and then resort to your dynamic queries when needed at runtime.
When I create a view I can base it on multiple columns from different tables.
When I want to create a lookup table I need information from one table, for example the foreign key of an order table, to get customer details from another table. I can create a view having parameters to make sure it will get all data that I need. I could also - from what I have been reading - make a lookup table. What is the difference in this case and when should I choose for a lookup table?? I hope this ain't a bad question, I'm not very into db's yet ;).
Creating a view gives you a "live" representation of the data as it is at the time of querying. This comes at the cost of higher load on the server, because it has to determine the values for every query.
This can be expensive, depending on table sizes, database implementations and the complexity of the view definition.
A lookup table on the other hand is usually filled "manually", i. e. not every query against it will cause an expensive operation to fetch values from multiple tables. Instead your program has to take care of updating the lookup table should the underlying data change.
Usually lookup tables lend themselves to things that change seldomly, but are read often. Views on the other hand - while more expensive to execute - are more current.
I think your usage of "Lookup Table" is slightly awry. In normal parlance a lookup table is a code or reference data table. It might consist of a CODE and a DESCRIPTION or a code expansion. The purpose of such tables is to provide a lsit of permitted values for restricted columns, things like CUSTOMER_TYPE or PRIORITY_CODE. This category of table is often referred to as "standing data" because it changes very rarely if at all. The value of defining this data in Lookup tables is that they can be used in foreign keys and to populate Dropdowns and Lists Of Values.
What you are describing is a slightly different scenario:
I need information from one table, for
example the foreign key of an order
table, to get customer details from
another table
Both these tables are application data tables. Customer and Order records are dynamic. Now it is obviously valid to retrieve additional data from the Customer table to display along side the Order data, and in that sense Customer is a "lookup table". More pertinently it is the parent table of Order, because it has the primary key referenced by the foreign key on Order.
By all means build a view to capture the joining logic between Order and Customer. Such views can be quite helpful when building an application that uses the same joined tables in several places.
Here's an example of a lookup table. We have a system that tracks Jurors, one of the tables is JurorStatus. This table contains all the valid StatusCodes for Jurors:
Code: Value
WS : Will Serve
PP : Postponed
EM : Excuse Military
IF : Ineligible Felon
This is a lookup table for the valid codes.
A view is like a query.
Read this tutorial and you may find helpful info when a lookup table is needed:
SQL: Creating a Lookup Table
Just learn to write sql queries to get exactly what you need. No need to create a view! Views are not good to use in many instances, especially if you start to base them on other views, when they will kill performance. Do not use views just as a shorthand for query writing.