Flexible database design for an inventory system - sql

I have to build an inventory management system for my school and this is the problem that I'm facing:
There are multiple types of equipment that I have to store in my system:
Computers, Printers, Cartridges, Projectors, Mice, Keyboards etc.
And there is some common data about each item, regardless of its type:
TEMSID (like a barcode), SerialNumber, PurchaseDate, RegisterDate and others; see the item entity for more.
Also, each item type has its own specific fields that have to be stored.
This is how I am going to deal with it, however I'm not sure about it:
[item] table stores common data
[item] table has one to one relationships with other tables which store more details about specific items.
Repetitive data (Manufacturer, Model, Resolution etc.) is stored in other tables [itemType] to reduce redundancy.
Ignore those FK IDs from item table
I feel like it is a bad design.
If there is another more efficient solution, I'm ready to start designing from scratch.
Thank you in advance!

Ask yourself: what happens if the school gets an item for which you don't have any tables in your database (like Printer and PrinterType)? And what if this happens often? Then you have to add new tables to the database each time.
On the other hand, you have type-specific properties that can still be shared between types, like model or color.
If I were you, I would make a dynamic system, in which the administrator/user can add any item (device) type and attach any properties to it.
I would design the db like this:
We have device types like printer, projector, etc. Then we have our devices table with the basic data and connected to the devicetypes table. Then we have a properties table, in which we hold the other properties of all the device types (one property only once, no redundancy). And finally, we have a propvalues table, where we store the different property values of each device.
For example, in the above example, the device with id 1 has two properties (model and color) with the values hp and true.
With this design, the user/administrator can manage as many properties as they want.
One special thing: the value field should be, for example, a string field, because it has to hold numbers, strings, dates, etc. In the view you then have to cast the value depending on its datatype.
And if you have to manage compatibility data too, you can of course make that more generic as well. But maybe this should be homework :-)
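A minimal sketch of the four tables described above, written in Python with the standard sqlite3 module so it can be run as-is. Only the table names and the model/color example come from the answer; the column names, the datatype column on properties, the serial_number column and its sample value are assumptions for illustration.

```python
import sqlite3

# Column names are assumed; the answer only names the four tables.
SCHEMA = """
CREATE TABLE devicetypes (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE devices (
    id            INTEGER PRIMARY KEY,
    devicetype_id INTEGER NOT NULL REFERENCES devicetypes(id),
    serial_number TEXT NOT NULL
);
CREATE TABLE properties (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL UNIQUE,   -- one property only once
    datatype TEXT NOT NULL           -- tells the view how to cast the value
);
CREATE TABLE propvalues (
    device_id   INTEGER NOT NULL REFERENCES devices(id),
    property_id INTEGER NOT NULL REFERENCES properties(id),
    value       TEXT NOT NULL,       -- always stored as a string
    PRIMARY KEY (device_id, property_id)
);
"""

# Hypothetical cast table: how to turn the stored string back into a value.
CASTS = {"string": str, "int": int, "bool": lambda v: v == "true"}


def build_demo():
    """Create an in-memory database holding the example device with id 1."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO devicetypes VALUES (1, 'printer')")
    conn.execute("INSERT INTO devices VALUES (1, 1, 'SN-001')")
    conn.executemany("INSERT INTO properties VALUES (?, ?, ?)",
                     [(1, "model", "string"), (2, "color", "bool")])
    conn.executemany("INSERT INTO propvalues VALUES (?, ?, ?)",
                     [(1, 1, "hp"), (1, 2, "true")])
    return conn


def device_properties(conn, device_id):
    """Return {property name: cast value} for one device."""
    rows = conn.execute(
        """SELECT p.name, p.datatype, v.value
           FROM propvalues v
           JOIN properties p ON p.id = v.property_id
           WHERE v.device_id = ?""",
        (device_id,),
    )
    return {name: CASTS[datatype](value) for name, datatype, value in rows}
```

With this data, device_properties(build_demo(), 1) gives {'model': 'hp', 'color': True}. Adding a new device type or property is an INSERT, not a schema change.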

Related

How do I make a subtable in Access in this situation?

I'm aware that someone will see this and say "I'm sure there's already an answer to "how do I make a subtable", so this question is repetitive." Well, no answers seem to be relevant to THIS situation. Here goes:
I own a car business. I want to make a rolling list of costs associated with each car, to eventually be summed into "totalCosts".
I have table
ASSETS
stock#,
make,
model,
purchase price,
totalCosts
and table
COSTS
description,
cost (in $)
so I want the COSTS table to contain several costs like:
purchase price $2,000
paint $50
new tires $200
I can figure out how to sum the costs, but mainly I need to know how do I make the COSTS table to exist for each car or stock#. So each stock# would have its list of COSTS, which I would sum and insert into total costs.
The only solution I see now is to make a costs table that contains every cost with a stock # on the same line, but I would like to have a separate costs table for each stock #
This is really the most basic related table you can make, and much the same applies to nearly any database system.
Your cost table will have:
Id - PK id (all tables should have this PK)
Asset ID - standard long number column used to relate back to the assets table
Item - description of the cost (paint, tires, seat repair, etc.)
Cost - the amount of the given item
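Here is a rough sketch of that master/child layout. It uses Python's built-in sqlite3 module as a stand-in for Access, so treat it as an illustration of the table design only, not of Access itself. The stock numbers, makes and models are made up; the three cost rows are the ones from the question.

```python
import sqlite3


def build_demo():
    """One assets table, one costs table; each cost row points at its asset."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE assets (
            id       INTEGER PRIMARY KEY,
            stock_no TEXT NOT NULL UNIQUE,
            make     TEXT,
            model    TEXT
        );
        CREATE TABLE costs (
            id       INTEGER PRIMARY KEY,
            asset_id INTEGER NOT NULL REFERENCES assets(id),
            item     TEXT NOT NULL,
            cost     REAL NOT NULL
        );
    """)
    # Hypothetical cars; the second one has no costs recorded yet.
    conn.executemany(
        "INSERT INTO assets (id, stock_no, make, model) VALUES (?, ?, ?, ?)",
        [(1, "A100", "Ford", "Focus"), (2, "A101", "Honda", "Civic")],
    )
    conn.executemany(
        "INSERT INTO costs (asset_id, item, cost) VALUES (?, ?, ?)",
        [(1, "purchase price", 2000), (1, "paint", 50), (1, "new tires", 200)],
    )
    return conn


def total_costs(conn):
    """Sum the cost rows per stock number; assets without costs total 0."""
    rows = conn.execute("""
        SELECT a.stock_no, COALESCE(SUM(c.cost), 0)
        FROM assets a
        LEFT JOIN costs c ON c.asset_id = a.id
        GROUP BY a.id, a.stock_no
        ORDER BY a.stock_no
    """)
    return dict(rows.fetchall())
```

Note that the total is computed by a query rather than stored in the assets table, so it cannot drift out of date when a cost row is added or changed.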
So you build a main form based on the assets, and then build a sub form that is likely best set up as a “repeating rows” (so-called multiple items) form. You drop that into your main form, and the “costs” form thus becomes a sub-form of the assets form.
So in effect, you attach each cost row to a single asset record. And Access will “set” the Asset ID column for you automatically if you drop in the costs form as a sub form of the assets form.
The form will thus look something like this form:
In the above (an Access application with a basic form I built in Access), I actually have “two” columns in which to select the “item” or “cost” type. I have sunglasses, but on the next row I could add tires, then paint, then whatever I want. Over time, you can add any kind of “new” cost item without having to create a new table or change the database structure.
If your design requires a “change” in the table structure for each new “thing” you enter, then you have the wrong design. Can you imagine an accounting package that you have to “stop” all the time and modify the software?
And then the real mess comes when you attempt to build a report. Once again, reports only work on pre-defined tables; you cannot “switch” tables on the fly for a report.
So any concept you have in regards to spreadsheets etc. must be tossed out of your mind. Computer information systems SIMPLY DO NOT WORK THAT WAY AND CANNOT WORK THAT WAY!
So keep in mind that Relational databases are NOT spreadsheets, and you don’t and CAN NOT adopt designs that require structure changes on the fly. Forms, reports, a query, and even computer code cannot function on tables that change for each record you enter – you MUST adopt a data model that reflects your current needs. That data model if done right can THEN be used for very complex accounting systems, ERP systems, or even a web store front that has 1000’s or even millions of different products. Such data models are designed first, and then the forms and user interface, the reports etc. are THEN created.
The same even goes for job costing. You might have liquids, labor costs, wall paper used in feet, paint used in gallons, and tiles used in meters. So just like an invoice system, job costing systems can handle “different” kinds of costs, and no table changes are required.
In the above, we have an invoice-like setup, and we can add as many things as we want to a tour booking (tickets, jackets, books, skates, tires, paint, seats, and windows, whatever we want); we simply add a new row for whatever we need.
Think of any kind of “invoice” software you used – you can add as many items you want to that invoice.
A relational database does NOT support a whole new table for each new “cost” that you want to build; databases simply do not work this way. So you only (so far) really need a master table of the assets, and then the child table of “costs”.
You can’t create a whole new “cost” table for each asset, as the built-in tools in nearly all databases neither support nor work with creating a new table for each new set of data you create.
And for ease of data entry, you could build a table of cost Items, so the user does not have to type in paint, tires etc., but choose that item from a drop down combo box.
So EVERY single example on the internet, every book, and every article that explains how to relate a master table to a child table is in fact 100% relevant and is how you HAVE to approach this problem.
Relational databases do not support creating of a whole new table for “one” new set of data, because such an approach cannot be used in reports nor does the query language support such a design.

SQL vs NoSQL for data that will be presented to a user after multiple filters have been added

I am about to embark on a project for work that is very outside my normal scope of duties. As a SQL DBA, my initial inclination was to approach the project using a SQL database but the more I learn about NoSQL, the more I believe that it might be the better option. I was hoping that I could use this question to describe the project at a high level to get some feedback on the pros and cons of using each option.
The project is relatively straightforward. I have a set of objects that have various attributes. Some of these attributes are common to all objects whereas some are common only to a subset of the objects. What I am tasked with building is a service where the user chooses a series of filters that are based on the attributes of an object and then is returned a list of objects that matches all^ of the filters. When the user selects a filter, he or she may be filtering on a common or subset attribute but that is abstracted on the front end.
^ There is a chance, depending on user feedback, that the list of objects may match only some of the filters and the quality of the match will be displayed to the user through a score that indicates how many of the criteria were matched.
After watching this talk by Martin Fowler (http://www.youtube.com/watch?v=qI_g07C_Q5I), it would seem that a document-style NoSQL database should suit my needs but given that I have no experience with this approach, it is also possible that I am missing something obvious.
Some additional information - The database will initially have about 5,000 objects with each object containing 10 to 50 attributes but the number of objects will definitely grow over time and the number of attributes could grow depending on user feedback. In addition, I am hoping to have the ability to make rapid changes to the product as I get user feedback so flexibility is very important.
Any feedback would be very much appreciated and I would be happy to provide more information if I have left anything critical out of my discussion. Thanks.
This problem can be solved by using two separate pieces of technology. The first is to use a relatively well designed database schema with a modern RDBMS. By modeling the application using the usual principles of normalization, you'll get really good response out of storage for individual CRUD statements.
Searching this schema, as you've surmised, is going to be a nightmare at scale. Don't do it. Instead look into using Solr/Lucene as your full text search engine. Solr's support for dynamic fields means you can add new properties to your documents/objects on the fly and immediately have the ability to search inside your data if you have designed your Solr schema correctly.
I'm not an expert in NoSQL, so I will not be advocating it. However, I have a few points that can help you address your questions regarding the relational database structure.
The first thing I see right away is that you are talking about inheritance (at least conceptually). Your objects inherit from each other, so derived objects have additional attributes. Say you are adding a new type of object: the first thing you need to do (conceptually) is find a base (parent) object type for it that has a subset of its attributes, and then add the new attributes on top (extending the base object type).
Once you get used to thinking this way, the next thing is inheritance mapping patterns for relational databases. I'll steal terms from Martin Fowler to describe them here.
You can hold inheritance chain in the database by following one of the 3 ways:
1 - Single table inheritance: Whole inheritance chain is in one table. So, all new types of objects go into the same table.
Advantages: your search query has only one table to search, and it must be faster than a join for example.
Disadvantages: table grows faster than with option 2 for example; you have to add a type column that says what type of object is the row; some rows have empty columns because they belong to other types of objects.
2 - Concrete table inheritance: Separate table for each new type of object.
Advantages: if search affects only one type, you search only one table at a time; each table grows slower than in option 1 for example.
Disadvantages: you need to use union of queries if searching several types at the same time.
3 - Class table inheritance: One table for the base type object with its attributes only, additional tables with additional attributes for each child object type. So, child tables refer to the base table with PK/FK relations.
Advantages: all types are present in one table so easy to search all together using common attributes.
Disadvantages: base table grows fast because it contains part of child tables too; you need to use join to search all types of objects with all attributes.
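To make the three mappings concrete, here is a small sketch using Python's sqlite3 module. The object types (printer, projector), their attributes (ppm, lumens) and the sample rows are invented for illustration. Each function builds one mapping and runs the "search all types" query for it, so you can see where the UNION and the JOINs come in.

```python
import sqlite3


def _run(script, query):
    """Build a throwaway in-memory database and run one query against it."""
    db = sqlite3.connect(":memory:")
    db.executescript(script)
    return db.execute(query).fetchall()


def single_table():
    # 1 - everything in one table, with a type column and NULLs for
    # attributes that belong to other types.
    return _run("""
        CREATE TABLE objects (
            id INTEGER PRIMARY KEY, type TEXT NOT NULL, name TEXT NOT NULL,
            ppm INTEGER, lumens INTEGER);
        INSERT INTO objects VALUES (1, 'printer', 'LaserJet', 30, NULL);
        INSERT INTO objects VALUES (2, 'projector', 'Beam X', NULL, 3000);
    """, "SELECT name, ppm, lumens FROM objects ORDER BY name")


def concrete_table():
    # 2 - one full table per type; searching across types needs a UNION.
    return _run("""
        CREATE TABLE printers (
            id INTEGER PRIMARY KEY, name TEXT NOT NULL, ppm INTEGER);
        CREATE TABLE projectors (
            id INTEGER PRIMARY KEY, name TEXT NOT NULL, lumens INTEGER);
        INSERT INTO printers VALUES (1, 'LaserJet', 30);
        INSERT INTO projectors VALUES (1, 'Beam X', 3000);
    """, """SELECT name, ppm, NULL AS lumens FROM printers
            UNION ALL
            SELECT name, NULL, lumens FROM projectors
            ORDER BY name""")


def class_table():
    # 3 - base table with the common attributes, child tables linked by
    # PK/FK; fetching all attributes needs a join per child table.
    return _run("""
        CREATE TABLE objects (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE printers (
            object_id INTEGER PRIMARY KEY REFERENCES objects(id), ppm INTEGER);
        CREATE TABLE projectors (
            object_id INTEGER PRIMARY KEY REFERENCES objects(id), lumens INTEGER);
        INSERT INTO objects VALUES (1, 'LaserJet');
        INSERT INTO objects VALUES (2, 'Beam X');
        INSERT INTO printers VALUES (1, 30);
        INSERT INTO projectors VALUES (2, 3000);
    """, """SELECT o.name, p.ppm, j.lumens
            FROM objects o
            LEFT JOIN printers p ON p.object_id = o.id
            LEFT JOIN projectors j ON j.object_id = o.id
            ORDER BY o.name""")
```

All three functions return the same rows; what differs is the shape of the query and how many tables a new type adds.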
Which one to choose?
It's a trade-off obviously. If you expect to have many types of objects added, I would go with Concrete table inheritance that gives reasonable query and scaling options. Class table inheritance seems to be not very friendly with fast queries and scalability. Single table inheritance seems to work with small number of types better.
Your call, my friend!
May as well make this an answer. I should comment that I'm not strong in NoSQL, so I tend to lean towards SQL.
I'd do this as a three-table set. You will see it referred to as entity-attribute-value (name/value pair) logic on the web... it's a way of handling multiple dynamic attributes for items. Let's say you have a bunch of products and each one has a few attributes.
Prd 1 - a,b,c
Prd 2 - a,d,e,f
Prd 3 - a,b,d,g
Prd 4 - a,c,d,e,f
So here are 4 products and 6 attributes... the same theory will work for hundreds of products and thousands of attributes. The standard way of holding this in one table requires the product info along with 6 columns to store the data (in this setup at least one third of them are null). A new attribute means altering the table to add another column and coming up with a script to populate it for existing rows, or just leaving it null for all of them. Not the most fun, and it can be a headache.
The alternative to this is a name value pair setup. You want a 'header' table to hold the common values amongst your products (like name, or price... things that all products always have). In our example above, you will notice that attribute 'a' is being used on each record... this means attribute a can be a part of the header table as well. We'll call the key column here 'header_id'.
The second table is a reference table that simply stores the attributes that can be assigned to each product and assigns an ID to each. We'll call the table attribute, with attr_id for a key. Rather straightforward: each attribute above will be one row.
Quick example:
attr_id, attribute_name, notes
1,b, the length of time the product takes to install
2,c, spare part required
etc...
It's just a list of all of your attributes and what that attribute means. In the future, you will be adding a row to this table to open up a new attribute for each header.
Final table is a mapping table that actually holds the info. You will have your product id, the attribute id, and then the value. Normally called the detail table:
prd1, b, 5 mins
prd1, c, needs spare jack
prd2, d, 'misc text'
prd3, b, 15 mins
See how the data is stored as product key, value label, value? Any future product added can have any combination of any attributes stored in this table. Adding new attributes is adding a new line to the attribute table and then populating the details table as needed.
I believe there is a wiki for it too... http://en.wikipedia.org/wiki/Entity-attribute-value_model
After this, it's simply figuring out the best methodology to pivot out your data (I'd recommend Postgres as an opensource db option here)
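For what it's worth, here is one way the three tables and the pivot could look. This is a sketch with Python's sqlite3 module rather than Postgres, and the pivot is done with plain conditional aggregation (MAX(CASE ...)), which is only one of several ways to do it. The table and key names follow the description above; the id for attribute d and the name column on header are assumptions.

```python
import sqlite3


def build_demo():
    """Header, attribute and detail tables loaded with the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE header (
            header_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL);
        CREATE TABLE attribute (
            attr_id        INTEGER PRIMARY KEY,
            attribute_name TEXT NOT NULL UNIQUE,
            notes          TEXT);
        CREATE TABLE detail (
            header_id INTEGER NOT NULL REFERENCES header(header_id),
            attr_id   INTEGER NOT NULL REFERENCES attribute(attr_id),
            value     TEXT,
            PRIMARY KEY (header_id, attr_id));
    """)
    conn.executemany("INSERT INTO header VALUES (?, ?)",
                     [(1, "prd1"), (2, "prd2"), (3, "prd3")])
    conn.executemany("INSERT INTO attribute VALUES (?, ?, ?)", [
        (1, "b", "the length of time the product takes to install"),
        (2, "c", "spare part required"),
        (3, "d", None),  # assumed id; the original list stops at "etc..."
    ])
    conn.executemany("INSERT INTO detail VALUES (?, ?, ?)", [
        (1, 1, "5 mins"), (1, 2, "needs spare jack"),
        (2, 3, "misc text"), (3, 1, "15 mins"),
    ])
    return conn


def pivot(conn):
    """Return one row per product with one column per known attribute."""
    names = [r[0] for r in conn.execute(
        "SELECT attribute_name FROM attribute ORDER BY attr_id")]
    # One MAX(CASE ...) column per attribute; the name is bound as a
    # parameter and only used (quoted) as a column alias.
    cols = ", ".join(
        'MAX(CASE WHEN a.attribute_name = ? THEN d.value END) AS "{}"'
        .format(n.replace('"', '""'))
        for n in names)
    sql = f"""SELECT h.name, {cols}
              FROM header h
              LEFT JOIN detail d ON d.header_id = h.header_id
              LEFT JOIN attribute a ON a.attr_id = d.attr_id
              GROUP BY h.header_id, h.name
              ORDER BY h.header_id"""
    return conn.execute(sql, names).fetchall()
```

Because the column list is built from the attribute table, adding a row there adds a column to the pivot without touching the query code.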

Database design: Store data from paper forms in database

Database design question for y'all. I have a form (like, the paper kind) that has several entry points for data. This form has changed, and is expected to change over years. It is being turned into a computer app, so that we can, among other things, quit wasting paper. (And minor things, like have all the data in one central store that can be queried, etc.) I'd like to store all of the forms data in a database, and have it be pretty agnostic as to the changes.
Originally, I was just considering each field to be a string -- and I had a table something like this:
FormId int (FK)
FieldName nvarchar(64)
FieldValue nvarchar(128)
...something like that. It was actually a bit more 3NFy in that FieldName was in another table, associated with an artificial key, so that the field names weren't duplicated all over the place.
However, I'd like to extend this to numeric and drop-down data. I could just store numeric data as strings, but that seems like a pretty crappy idea. Same with drop downs.
I could stop using a table, and actually use columns on the main form table (the one that FormId above references), but that means adding a column for each new item as they come along, and older forms would just be null. (And, unless I stored it, I wouldn't know when that column was created. With the string table above, it's implicit.)
I could extend the table above to something like:
FormId int (FK)
FieldName nvarchar(64)
FieldValueType int -- enum as to which of the columns below are valid (or just let nulls imply that)
FieldValue nvarchar(128)
FieldValueInt int
Combos would have to be in an OTLT (one true lookup table), which I have reservations about, but perhaps it's needed here?
Any advice on StackOverflow? I'm using MSSQL, but this is really a more general question.
Use Nulls. Proper database design is a complicated subject; you may do well to pick up a good reference and do some research on the whole thing (I gather this is a good book on the topic). In general, it sounds like you would be well served by starting with a single table that encapsulates all the fields in your form, and then putting it through the normalization process. And yes, use nulls and do NOT use an int to enumerate which columns are set to valid values; that is exactly what nulls are for.
You could have a separate table for each datatype.
I.e. to fetch an entire form you'd do an N-way join using the form id where N is the number of distinct datatypes you support (+ perhaps extras depending on the info you want - e.g. dropdown values would probably be stored in another table / your fieldname lookup / etc.)
But the design should probably also depend on how you intend to use the data, which you've said nothing about. And it would also depend on just how fast the rate of change is for these forms . . .
By creating a table with a description of your forms, you are actually defining a metadata structure. That's daunting. You would need a lot of the infrastructure needed for proper table description. I think the vendors of your database system spent a lot of effort in doing all that.
At first I thought - what a nice idea! Build your own compatibility-aware table description system!
But then I thought - I'm too stupid to do that on my own. There must be a database system capable of doing that.
So I conclude, not being a db expert, define proper defaults for 'new fields' in new form versions. Handle the compatibility issue in your business logic.
I would strongly advise against having a "generic table" like you describe.
You are essentially reinventing the relational database, which is not a good idea: Queries and updates will be very painful with your structure, and you will not be able to use the more advanced features like foreign keys and triggers, should you need them.
Just make a table(s) with columns for the data fields, and if a form does not have a field, let it be null.
Or, probably even better, have a "base table" (fields that are in every form), and give names/version numbers to updated forms, and have a new table for the new columns that this version adds, then use a synthetic PK to join these new tables to your base table.
I.e.:
base table: id(numeric,PK), name, birthday, town
addresstable1: street, number, postal code, country, base_table_id (foreign key)
addresstable2: po box no, po box code, base_table_id (FK)
and so on.
That way you avoid loads of null fields; your tables are not so wide (always desirable), and your records are implicitly versioned, because the list of tables that have a record belonging to a record in your base table tells you which fields the original form had, hence what kind of form was used originally.
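A quick sketch of that base-table layout, using Python's sqlite3 module. The column types, the sample people and the form_versions helper are my own assumptions; the table and column names follow the example above (with underscores instead of spaces).

```python
import sqlite3

# Tables that each add the fields of one form version (fixed, trusted names).
EXTENSION_TABLES = ("addresstable1", "addresstable2")


def build_demo():
    """Base table plus one extension table per form version."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE base_table (
            id INTEGER PRIMARY KEY, name TEXT NOT NULL,
            birthday TEXT, town TEXT);
        CREATE TABLE addresstable1 (
            street TEXT, number TEXT, postal_code TEXT, country TEXT,
            base_table_id INTEGER NOT NULL UNIQUE REFERENCES base_table(id));
        CREATE TABLE addresstable2 (
            po_box_no TEXT, po_box_code TEXT,
            base_table_id INTEGER NOT NULL UNIQUE REFERENCES base_table(id));
        INSERT INTO base_table VALUES (1, 'Ann', '1980-01-01', 'Springfield');
        INSERT INTO base_table VALUES (2, 'Bob', '1975-06-15', 'Shelbyville');
        INSERT INTO base_table VALUES (3, 'Cy',  '1990-12-31', 'Ogdenville');
        INSERT INTO addresstable1 VALUES ('Main St', '12', '12345', 'US', 1);
        INSERT INTO addresstable2 VALUES ('77', 'PB-9', 2);
    """)
    return conn


def form_versions(conn, base_id):
    """List the extension tables that hold a row for this base record.

    This is the "implicit versioning": the set of tables with a matching
    row tells you which fields the original form had.
    """
    found = []
    for table in EXTENSION_TABLES:
        row = conn.execute(
            f"SELECT 1 FROM {table} WHERE base_table_id = ?", (base_id,)
        ).fetchone()
        if row is not None:
            found.append(table)
    return found
```

In this data, record 1 was entered on the street-address form, record 2 on the PO box form, and record 3 has base fields only.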

What is the preferred way to store custom fields in a SQL database?

My friend is building a product to be used by different independent medical units.
The database stores a vast collection of measurements taken at different times, like the temperature, blood pressure, etc...
Let us assume these are held in a table called exams with columns temperature, pressure, etc... (as well as id, patient_id and timestamp). Most of the measurements are stored as floats, but some are of other types (strings, integers...)
While many of these measurements are handled by their product, it needs to allow the different medical units to record and process other custom measurements. A very nifty UI allows the administrator to edit these custom fields, specify their name, type, possible range of values, etc...
He is unsure as to how to store these custom fields.
He is leaning towards a separate table (say a table custom_exam_data with fields like exam_id, custom_field_id, float_value, string_value, ...)
I worry that this will make searching both more difficult to achieve and less efficient.
I am leaning towards modifying the exam table directly (while avoiding conflicts on column names with some scheme like prefixing all custom fields with an underscore or naming them custom_1, ...)
He worries about modifying the database dynamically and having different schemas for each medical unit.
Hopefully some people with more experience can weigh in on this issue.
Notes:
He is using Ruby on Rails, but I think this question is pretty much framework agnostic, except for the fact that he is only looking for solutions in SQL databases.
I simplified the problem a bit, since the custom fields need to be available for more than one table, but I believe this doesn't really impact the direction to take.
(added) A very generic reporting module will need to search, sort, generate stats, etc.. of this data, so it is required that this data be stored in the columns of the appropriate type
(added) User inputs will be filtered, for the standard fields as well as for the custom fields. For example, numbers will be checked within a given range (can't have a temperature of -12 or +444), etc... Thus, conversion to the appropriate SQL type is not a problem.
I've had to deal with this situation many times over the years, and I agree with your initial idea of modifying the DB tables directly, and using dynamic SQL to generate statements.
Creating string UserAttribute or Key/Value columns sounds appealing at first, but it leads to the inner-platform effect where you end up having to re-implement foreign keys, data types, constraints, transactions, validation, sorting, grouping, calculations, et al. inside your RDBMS. You may as well just use flat files and not SQL at all.
SQL Server provides INFORMATION_SCHEMA views that let you query table schemas at runtime, and you can create and modify tables at runtime with dynamic DDL. This has full type checking, constraints, transactions, calculations, and everything you need already built in; don't reinvent it.
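As a rough illustration of the "alter the real table" approach, here is a sketch in Python with sqlite3 standing in for SQL Server. SQLite has no INFORMATION_SCHEMA, so PRAGMA table_info plays that role here. The helper names, the allowed-type list and the custom_glucose field are hypothetical; the exams columns are the ones from the question.

```python
import re
import sqlite3

ALLOWED_TYPES = {"REAL", "INTEGER", "TEXT"}
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def make_exams_table(conn):
    """The standard exams table from the question (types are assumed)."""
    conn.execute("""
        CREATE TABLE exams (
            id INTEGER PRIMARY KEY, patient_id INTEGER NOT NULL,
            timestamp TEXT NOT NULL, temperature REAL, pressure REAL)
    """)


def column_names(conn, table):
    """Read the live schema (SQLite's stand-in for INFORMATION_SCHEMA.COLUMNS)."""
    if not IDENTIFIER.fullmatch(table):
        raise ValueError(f"bad table name: {table!r}")
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def add_custom_column(conn, table, name, sql_type):
    """Add a typed custom column with dynamic DDL.

    Identifiers cannot be bound as parameters, so the name is checked
    against a strict pattern and the type against a whitelist first.
    """
    if not IDENTIFIER.fullmatch(name):
        raise ValueError(f"bad column name: {name!r}")
    if sql_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported type: {sql_type!r}")
    if name in column_names(conn, table):
        raise ValueError(f"column already exists: {name!r}")
    conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{name}" {sql_type}')
```

After add_custom_column(conn, "exams", "custom_glucose", "REAL"), the new field is an ordinary typed column, so sorting, filtering and statistics work on it like on any standard field.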
It's strange that so many people come up with ad-hoc solutions for this when there's a well-documented pattern for it:
Entity-Attribute-Value (EAV) Model
Two alternatives are XML and Nested Sets. XML is easier to manage but generally slow. Nested Sets usually require some type of proprietary database extension to do without making a mess, like CLR types in SQL Server 2005+. They violate first-normal form, but are nevertheless the fastest-performing solution.
Microsoft Dynamics CRM achieves this by altering the database design each time a change is made. Nasty, I think.
I would say a better option would be to consider an attribute table. Even though these are often frowned upon, it gives you the flexibility you need, and you can always create views using dynamic SQL to pivot the data out again. Just make sure you always use LEFT JOINs and FKs when creating these views, so that the Query Optimizer can do its job better.
I have seen a use of your friend's idea in a commercial accounting package. The table was split into two, first contained fields solely defined by the system, second contained fields like USER_STRING1, USER_STRING2, USER_FLOAT1 etc. The tables were linked by identity value (when a record is inserted into the main table, a record with same identity is inserted into the second one). Each table that needed user fields was split like that.
Well, whenever I need to store some unknown type in a database field, I usually store it as String, serializing it as needed, and also store the type of the data.
This way, you can have any kind of data, working with any type of database.
I would be inclined to store the measurement in the database as a string (varchar) with another column identifying the measurement type. My reasoning is that it will presumably come from the UI as a string, and casting to any other datatype may introduce corruption before the user input gets stored.
The downside is that when you go to filter result-sets by some measurement metric you will still have to perform a casting but at least the storage and persistence mechanism is not introducing corruption.
I can't tell you the best way but I can tell you how Drupal achieves a sort of schemaless structure while still using the standard RDBMSs available today.
The general idea is that there's a schema table with a list of fields. Each row really only has two columns, the 'table':String column and the 'column':String column. For each of these columns it actually defines a whole table with just an id and the actual data for that column.
The trick really is that when you are working with the data it's never more than one join away from the bundle table that lists all the possible columns so you end up not losing as much speed as you might otherwise think. This will also allow you to expand much farther than just a few medical companies unlike the custom_ prefix you were proposing.
MySQL is very fast at returning row data for short rows with few columns. In this way this scheme ends up fairly quick while allowing you lots of flexibility.
As to search, my suggestion would be to index the page content instead of the database content. Use Solr to parse through rendered pages and hold links to the actual page instead of trying to search through the database using clever SQL.
Define two new tables: custom_exam_schema and custom_exam_data.
custom_exam_data has an exam_id column, plus an additional column for every custom attribute.
custom_exam_schema would have a row to describe how to interpret each of the columns of the custom_exam_data table. It would have columns like name, type, minValue, maxValue, etc.
So, for example, to create a custom field to track the number of fingers a person has, you would add ('fingerCount', 'number', 0, 10) to custom_exam_schema and then add a column named fingerCount to the custom_exam_data table.
Someone might say it's bad to change the database schema at run time, but I'd argue that configuring these custom fields is part of set up and won't happen too often. Still, this method lets you handle changes at any time and doesn't risk messing around with your core table schemas.
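A small sketch of the two-table idea, in Python with sqlite3. The function names, the type mapping and the way range checks are enforced in application code are assumptions on my part; the table names, the schema columns and the fingerCount example are from the description above.

```python
import re
import sqlite3

# Hypothetical mapping from the schema's type names to SQL column types.
SQL_TYPES = {"number": "NUMERIC", "string": "TEXT"}
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def create_tables(conn):
    conn.executescript("""
        CREATE TABLE custom_exam_schema (
            name TEXT PRIMARY KEY, type TEXT NOT NULL,
            minValue REAL, maxValue REAL);
        CREATE TABLE custom_exam_data (exam_id INTEGER PRIMARY KEY);
    """)


def add_custom_field(conn, name, field_type, min_value=None, max_value=None):
    """Describe the field in the schema table, then add its column."""
    if not IDENTIFIER.fullmatch(name) or field_type not in SQL_TYPES:
        raise ValueError(f"bad field definition: {name!r}, {field_type!r}")
    conn.execute("INSERT INTO custom_exam_schema VALUES (?, ?, ?, ?)",
                 (name, field_type, min_value, max_value))
    conn.execute(
        f'ALTER TABLE custom_exam_data ADD COLUMN "{name}" {SQL_TYPES[field_type]}')


def set_custom_value(conn, exam_id, name, value):
    """Validate a value against the schema row, then store it."""
    row = conn.execute(
        "SELECT minValue, maxValue FROM custom_exam_schema WHERE name = ?",
        (name,)).fetchone()
    if row is None:
        raise KeyError(name)
    low, high = row
    if (low is not None and value < low) or (high is not None and value > high):
        raise ValueError(f"{name}={value!r} outside [{low}, {high}]")
    # The column name was validated when the field was created.
    conn.execute("INSERT OR IGNORE INTO custom_exam_data (exam_id) VALUES (?)",
                 (exam_id,))
    conn.execute(f'UPDATE custom_exam_data SET "{name}" = ? WHERE exam_id = ?',
                 (value, exam_id))
```

The core exams table is never touched: custom columns live in custom_exam_data, and custom_exam_schema tells the UI and the reporting module how to interpret them.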
Let's say that your friend's database has to store data values from multiple sources, such as demographic values, diagnoses, interventions, physiognomic values, physiological exam values, hospitalisation values etc.
He might also have to define choices. Let's say his database is missing the patient's race and the unit staff need it (some races are less likely to get certain diseases); they might want to use a drop-down with several choices.
I would propose using another table that holds these choices, or you could just use a "Custom_field_choices" table, which in the end is exactly the same thing under a different name.
Considering that the database :
- needs to be flexible
- that data from multiple tables can be added and be customized
- that you might want to keep the integrity of the main structure of your database for distribution and uniformity purpose
- that data MUST have a limit and alarms and warnings
- that data must have units ( 10 kg or 10 pounds) ?
- that data can have a selection of choices
- that data can be with different rights (from simple user to admin)
- that these data might be needed to generate reports without modifying the code (automation)
- that these data might be needed to make cross reference analysis within the system without modifying the code
the custom table would be my solution, modifying each table would end up being too risky.
I would store those custom fields in a table where each record (dataType, dataValue, dataUnit) takes one row. So there would be a one-to-many relation from one sample to its data. You can also create a table to record all the kinds of custom types you would use. For example:
create table DataType
(
id int primary key,
name varchar(100) not null unique,
description text,
uri varchar(255) -- can be used for an ONTOLOGY
);
create table DataRecord
(
id int primary key,
sample_id int not null, -- reference to the sample
dataType_id int not null, -- references DataType
value varchar(100), -- the value as a string
unit varchar(50) -- g, mg/ml, etc.; could also be a link to a table describing the units, just like DataType
);

Define Generic Data Model for Custom Product Types

I want to create a product catalog that allows for intricate details on each of the product types in the catalog. The product types have vastly different data associated with them; some with only generic data, some with a few extra fields of data, some with many fields that are specific to that product type. I need to easily add new product types to the system and respect their configuration, and I'd love tips on how to design the data model for these products as well as how to handle persistence and retrieval.
Some products will be very generic and I plan to use a common UI for editing those products. The products that have extensible configuration associated with them will get new views (and controllers) created for their editing. I expect all custom products to have their own model defined but to share a common base class. The base class would represent the generic product that has no custom fields.
Example products that need to be handled:
Generic product
Description
Light Bulb
Description
Type (with an enum of fluorescent, incandescent, halogen, led)
Wattage
Style (enum of flood, spot, etc.)
Refrigerator
Description
Make
Model
Style (with an enum in the domain model)
Water Filter information
Part number
Description
I expect to use MEF for discovering what product types are available in the system. I plan to create assemblies that contain product type models, views, and controllers, drop those assemblies into the bin, and have the application discover the new product types, and show them in the navigation.
Using SQL Server 2008, what would be the best way to store products of these various types, allowing for new types to be added without having to grow the database schema?
When retrieving data from the database, what's the best way to translate these polymorphic entities into their correct domain models?
Updates and Clarifications
To avoid the Inner Platform Effect, if there is a database table for every product type (to store the products of that type), then I still need a way to retrieve all products that spans product types. How would that be achieved?
I talked with Nikhilk in more detail about his SharePoint reference. Specifically, he was talking about this: http://msdn.microsoft.com/en-us/library/ms998711.aspx. It actually seems pretty attractive. No need to parse XML; and there is some indexing that could be done allowing for simple and fast queries over the data. For instance, I could say "find all 75-watt light bulbs" by knowing that the first int column in the row is the wattage when the row represents a light bulb. Something (NHibernate?) in the app tier would define the mapping from the product type to the userdata schema.
Voted down the schema that has the Property Table because this could lead to lots of rows per product. This could lead to index difficulties, plus all queries would have to essentially pivot the data.
Use a Sharepoint-style UserData table, that has a set of string columns, a set of int columns, etc. and a Type column.
Then you have a list of types table that specifies the schema for each type - its properties, and the specific columns they map to in the UserData table.
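Roughly, that could look like the sketch below (Python with sqlite3 rather than SQL Server 2008, purely for illustration). The table names user_data and type_schema, the number of generic columns and the sample products are assumptions. The query is the "find all 75-watt light bulbs" case from the question: look up which generic column holds the wattage for that type, then filter on it.

```python
import sqlite3

# The fixed set of generic columns available in user_data.
PHYSICAL_COLUMNS = {"string1", "string2", "int1", "int2"}


def build_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE user_data (
            id INTEGER PRIMARY KEY, type TEXT NOT NULL,
            string1 TEXT, string2 TEXT, int1 INTEGER, int2 INTEGER);
        -- Per type: which property lives in which generic column.
        CREATE TABLE type_schema (
            type TEXT NOT NULL, property TEXT NOT NULL,
            column_name TEXT NOT NULL,
            PRIMARY KEY (type, property));
    """)
    conn.executemany("INSERT INTO type_schema VALUES (?, ?, ?)", [
        ("light bulb", "description", "string1"),
        ("light bulb", "wattage", "int1"),
        ("refrigerator", "description", "string1"),
        ("refrigerator", "make", "string2"),
    ])
    conn.executemany(
        "INSERT INTO user_data (id, type, string1, string2, int1) "
        "VALUES (?, ?, ?, ?, ?)", [
            (1, "light bulb", "flood light", None, 75),
            (2, "light bulb", "spot light", None, 40),
            (3, "refrigerator", "side-by-side", "Acme", None),
        ])
    return conn


def find_by_property(conn, product_type, prop, value):
    """Return ids of products of one type whose property equals value."""
    row = conn.execute(
        "SELECT column_name FROM type_schema WHERE type = ? AND property = ?",
        (product_type, prop)).fetchone()
    # Only interpolate column names from the known fixed set.
    if row is None or row[0] not in PHYSICAL_COLUMNS:
        raise KeyError((product_type, prop))
    column = row[0]
    rows = conn.execute(
        f"SELECT id FROM user_data WHERE type = ? AND {column} = ? ORDER BY id",
        (product_type, value))
    return [r[0] for r in rows]
```

The values stay in typed columns, so they can be indexed and compared without casting, at the price of a fixed upper limit on how many properties of each kind a type can have.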
With things like Azure and other utility computing storage you don't even need to define a table. Every store object is basically a dictionary.
I think you need to go with a data model like --
Product Table
ProductId (PK)
ProductName
Details
Property Table
PropertyId (PK)
ProductId (FK)
ParentPropertyId (FK - Self referenced to categorize properties)
PropertyName
PropertyValue
PropertyValueTypeId
Property Value Lookup Table
PropertyValueLookupId (PK)
PropertyId (FK)
LookupValue
And then have a dynamic view based on this. You could use the PropertyValueTypeId column to identify the type, using a convention like (0 - string, 1 - integer, 2 - float, 3 - image etc.), but ultimately you can only store everything untyped. You could also use this column to select the control template to render the corresponding property to the user.
You can use the Value lookup table to keep lookups for a specific property (so that user can choose it from a list)
Summarizing, let's look at the options under consideration for storing product information:
1) some xml format in the database
2) similar to the post above about having x number of type defined columns (sharepoint approach)
3) via a generic table with name and type definitions stored in a lookup table and values in a secondary table with columns id, propertyid, value (similar to #2; however, this approach would provide unlimited property information)
4) some hybrid of the above options, where the product table would have x common columns (for storage of properties common to all products) plus y user-defined columns (this could be m of integer type and n of varchar type). This may be taking the best of #2 and of a normalized structure, as if you knew all the properties of all products. You would be getting the best SQL performance for the properties that you use the most (probably those that are common across all products) while still allowing custom columns for specific properties of each product.
Are there other options? In my opinion I would consider 4 above as the best hybrid of the combinations.
dave
Put as much of the shared anticipated structure in traditional normalized 3NF model, then augment with XML columns as appropriate.
I don't see MEF (or any other ORM) being able to do all this transparently.
I think you should avoid the Inner Platform Effect and actually build tables for your specialized entities. You'll be writing specific code to manage them so why not have proper backing tables too?
It will make your deployment slightly harder - drop in an assembly and run a script - but it will probably save you a lot of pain in the long run.
Jeff,
we currently use an XML field in the Products table to handle all product-specific data. So our Products table has a few common fields that all products share, an XML column which contains whatever a particular product needs additionally, and a few computed fields that reach into the XML and surface some of the frequently queried fields as "virtual" fields on the Products table (e.g. "Style" would be set to whatever the current product defines, or NULL, if the product doesn't have a Style property).
So far, we've been quite flexible with that approach - if you create some decent XSD schemas for your XML, you can even create C# proxy classes for these fields.
Works nicely for us - joining the best of both the relational and XML worlds.
Marc