I have a very simple situation, but I can't seem to get my head around it. I have a table promotions which has many sites. A site can be used with different promotions, so in my (Postgres) database I have 3 tables: promotions, sites and promotions_sites.
In my web application a user can edit a promotion and add a collection of sites (newline-separated). So on save, the collection of sites is saved with the promotion. This works. Still, there are 2 problems:
1) old site records are not removed (when one is deleted from the list of sites)
2) when an existing site is saved, all the original sites are re-created
My question is at which level I should manage the sites:
1) application level: just delete all sites before re-inserting them
2) data access level: is there an NHibernate configuration to do this?
3) database level: create triggers/cascading deletes on the sites table based upon the absence of an item in promotions_sites
First, the automatic orphan deletes. Sites can't be deleted automatically when your relation is defined as many-to-many. But if your promotions_sites table is mapped as a separate entity and you have two one-to-many relations in sites and promotions, you can achieve automatic deletes by setting cascade on those relations to all-delete-orphan.
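For illustration, here is a minimal mapping-by-code sketch of that setup (the entity, property, and column names are assumptions, not taken from the question):

using System.Collections.Generic;
using NHibernate.Mapping.ByCode;
using NHibernate.Mapping.ByCode.Conformist;

public class Promotion
{
    public virtual int Id { get; set; }
    // The link rows, mapped as their own entity instead of many-to-many.
    public virtual IList<PromotionSite> PromotionSites { get; set; }
}

public class PromotionSite
{
    public virtual int Id { get; set; }
    public virtual Promotion Promotion { get; set; }
    public virtual Site Site { get; set; }
}

public class Site
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

public class PromotionMap : ClassMapping<Promotion>
{
    public PromotionMap()
    {
        Id(x => x.Id, m => m.Generator(Generators.Native));
        Bag(x => x.PromotionSites, coll =>
        {
            coll.Key(k => k.Column("promotion_id"));
            coll.Inverse(true);
            // all-delete-orphan: link rows removed from the collection get deleted.
            coll.Cascade(Cascade.All.Include(Cascade.DeleteOrphans));
        }, rel => rel.OneToMany());
    }
}

A similar collection on Site, plus many-to-one mappings on PromotionSite, would complete the picture.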
Second thing - where to manage the collection. You should let NHibernate do this. Assuming you have proper mappings, you should not care about it at the application level, and especially not at the database trigger level. Note that the logic of handling collections in NHibernate is somewhat different from what you may be familiar with at the database level. Generally you need to load the parent object with its existing child collection, modify the collection, and commit the changes back to the database. You should not replace the whole collection.
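As a small sketch of that workflow (session setup and names assumed, building on the mapping above):

using (var tx = session.BeginTransaction())
{
    // Load the parent together with its existing child collection.
    var promotion = session.Get<Promotion>(promotionId);

    // Removing a link lets all-delete-orphan delete the row on commit.
    var stale = promotion.PromotionSites.First(ps => ps.Site.Name == "old-site");
    promotion.PromotionSites.Remove(stale);

    // New links are inserted by the cascade; no explicit Save() call needed.
    promotion.PromotionSites.Add(new PromotionSite { Promotion = promotion, Site = newSite });

    tx.Commit();
}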
But it will be easier to discuss once you show your mappings and the code used to save changes.
I am creating a basic CMS for a small academic project that I want to use later to practice design patterns in code, and I am starting with designing the database.
The simple requirement is that a "Container" of some kind will contain "Pages" or "Controls". Also a "Page" will contain "Controls". Think of a container like the holder for several tabbed pages. So when something is saved, everything can be saved at once by simply calling save on the container.
Anyway, I have 3 tables. "Container", "Page" and "Control".
Think of a "Container" like a holder for all the things that will appear on screen. It can be split into other smaller containers if needed (for wide screen).
Think of a "Page" like a form of some kind. Something that will allow the user to place controls used to collect data.
Think of a "Control" like a label, text box, button, etc. There are various types of controls (not shown in the diagram), but the "Controls" table will hold multiple instances of any given control type. For example, a form that has 5 labels and 5 text boxes, will have 10 corresponding entries in the controls table - 5 entries for each type of control. So a control is a singular instance of a control type, will have a unique ID, and can only be used once.
If a container is deleted, I want this to cascade and delete the relevant pages and controls.
Likewise, if a page is deleted, I want this to cascade and delete the controls that were on that page.
I have 2 problems to solve.
Problem 1: A control can belong to either a container or a page, but not both. Essentially a control can have only one owner or parent, and likewise a page. However, because a control can be put on either a container or a page, I have to connect it with 2 relationships and create a constraint on the PageId and ContainerId columns that ensures only one of them is not null. This I have done successfully, but I am wondering if there is a better way?
Problem 2: Because I am using cascading updates and deletes across 3 tables, I am getting an error:
Introducing FOREIGN KEY constraint 'FK_Pages_Containers' on table
'Pages' may cause cycles or multiple cascade paths. Specify ON DELETE
NO ACTION or ON UPDATE NO ACTION, or modify other FOREIGN KEY
constraints.
I want the controls belonging to a page or container, AND the pages belonging to a container deleted if either the page or container is deleted. So, in my mind, I need the cascading update and delete. So this error is forcing me to change my relationship requirements in a way that will leave orphans.
As a side note, a container could also contain other containers. This might be useful on large wide screens to display a form (page) in two parts, one on the left and one on the right for example, and still be able to save everything by clicking a save button on the parent.
So how can I overcome this? Am I overlooking something in my design?
For clarity here is an ER diagram:
You can see from the stars (*) that it is not saved yet because of the above error. My first goal is to get the database right. I don't like Entity Framework code-first; I want to use either Entity Framework database-first or an alternative ORM that does its thing as needed. But I want data integrity handled by the database, where it should be.
Any suggestions would be welcome.
Based on the comments under the question, I tried a few solutions. Here is the one I came up with that seems to work best so far. I am just posting this for my own reference later, and maybe it will help others. So I still welcome any other suggestions for improvement.
To start, I removed all the relationships and created a container type table, which I linked to a container manager table. Then I linked the container and pages tables with a 1:1 relationship to the container manager table. The logic behind this was that I currently have 2 types of containers, but later I could add more if I wanted.
In the container types table I added data for "Container" and "Page", then referenced this back to the container manager table. I added an additional field in the container manager table to show who the parent of the container or page was; this is a self-referencing relationship. It simplified the container and pages tables somewhat.
Next I repeated the process with the controls table. It was slightly more complicated but the same idea. It just split containers from controls very nicely and then let the management tables link up to "manage" things.
From there it was simply a matter of adding a container parent to the control manager table and relating it back to the container manager table. All the relationships are cascading and work without problems. As you can see, the management tables, combined with the 1:1 relationships to the controls, containers, and pages tables, provide instance-specific information allowing for easy management. I really like that I don't have to build any constraints like I had in my original solution to ensure only one of 2 fields was filled in, as such constraints are not obvious. However, I have noticed that I could technically combine the data from the linked 1:1 tables directly into the management tables and simplify the diagram even more, which I may yet do, as it is easier and faster to look at 2 tables than 5. I would be interested to hear people's opinions on this.
The general use case would be that we know the types of containers or controls in advance, so we can select the type of control we want to put on a page or container. This adds data to the management tables and generates a unique ID for the instance of the container, page, or control being used. Then we use that ID to create the actual container, page, or control instance the user will see. Obviously, saving the data behind the controls is another matter that is not part of the question, so I won't go into my solution for that here.
Thanks to D Mayuri, who inspired this approach with his help and comments about separating out the relationships. I am not sure if this is how he intended me to do it. If he posts his own answer, I will of course accept it and give him credit.
How can I make sure that specific data in the database isn't altered anymore?
We are working with TSQL. Inside the database we store contract revisions. These have a status: draft / active. When the status has become active, the revision may never be altered anymore. A revision can have 8 active modules (each with its own table), each with their own settings and sub-tables. This creates a whole tree of tables with records that may never change anymore when the contract revision has been set to active.
Ideally I would simply mark those records as read-only, but no such thing exists today. The next thing that comes to mind is triggers, but then I would have to add those triggers to a lot of tables, all of which are related to the contract revision.
Now maybe there are other approaches, like a separate database used only for archiving, on which the user only has insert rights. When a contract revision becomes active, it is moved from the main DB to the archive DB (insert is allowed) and can never be altered again (DENY UPDATE/DELETE).
But maybe there are other, more ingenious options that I haven't thought of and you have, perhaps involving the CLR or whatnot.
So how can I make a tree structure of records inside our T-SQL database effectively read-only, in a way that is maintenance-free, easy to understand, quick to set up, and can be applied generically?
Whatever you do (triggers, granted rights...) might be overcome by a user with higher rights; this you know for sure...
Is this just to archive this data?
One idea that comes to mind is to create nested XML with all the data in one big structure and put it into a side table. Create an INSTEAD OF UPDATE, DELETE trigger that simply does nothing. Let these tables be 1:1-related.
You can still work with this data, just not as fast as when reading from regular tables.
If you want, you might even convert the XML to a string and calculate a hash code from it, which you store in a different place to check for manipulation.
The whole process might be done in one single Stored Procedure call.
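Inside a stored procedure the hash could come from T-SQL's HASHBYTES function; if the check is done in application code instead, a minimal C# sketch (names assumed) might look like this:

using System;
using System.Security.Cryptography;
using System.Text;

static class SnapshotHasher
{
    // Hash the serialized XML snapshot. Store the hex string in a separate,
    // protected place and recompute later to detect manipulation.
    public static string Hash(string xmlSnapshot)
    {
        using (var sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(xmlSnapshot));
            return BitConverter.ToString(digest).Replace("-", "");
        }
    }
}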
I've been trying to get my head around NoSQL, and I do see the benefits to embedding data in documents.
What I can't understand, and hope someone can clear up, is how to store data if it must be relational.
For example.
I have many users, and they are all buying products. Every time a user buys a product, we add it under the user's document in Mongo, so it's embedded and it's all great.
The problem I have is when something about that product changes.
Let's say user A buys a car called "Porsche". We add a reference to it under the user's profile. However, in a strange turn of events, Porsche gets purchased by Ferrari.
What do you do now, update each and every record and change the name from Porsche to Ferrari?
Typically in SQL we would create 3 tables: one for users, one for cars (description, model, etc.) and one for mapping users to purchases.
Do you do the same thing in Mongo? It seems like if you go down this route, you are trying to make Mongo do things the SQL way, which is not what it's intended for.
I can understand how certain data is great for embedding (addresses, contact details, comments, etc.), but what happens when you need to reference data that can and does change on a regular basis?
I hope this question is clear.
DBRefs/manual references were made specifically to solve this issue. Instead of manually adding the data to each document and then needing to update it when something changes, you can store a reference to another collection. See the MongoDB documentation for details:
References in Mongo
Then all you would need to do is update the reference collection and the change would be reflected in all downstream locations.
When I used the Mongoose library for Node.js, it actually created 3 collections, similar to how you might do it in SQL. You can use ObjectIds as foreign keys and enrich documents either on the client side or on the backend. There is still no joining, but you can do an 'in' query for the IDs and then enrich the objects that way; Mongoose can do this automatically by 'populating'.
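For example, with the official C# driver the same reference-and-'in'-query pattern might look roughly like this (class and field names are invented for illustration):

using System.Collections.Generic;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public class User
{
    public ObjectId Id { get; set; }
    // References instead of embedded car documents.
    public List<ObjectId> PurchasedCarIds { get; set; }
}

public class Car
{
    public ObjectId Id { get; set; }
    public string Name { get; set; } // rename once here, not once per user
}

public static class UserEnricher
{
    public static async Task<List<Car>> LoadPurchasesAsync(IMongoDatabase db, User user)
    {
        var cars = db.GetCollection<Car>("cars");
        // One 'in' query fetches all referenced cars; no join required.
        var filter = Builders<Car>.Filter.In(c => c.Id, user.PurchasedCarIds);
        return await cars.Find(filter).ToListAsync();
    }
}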
We are creating a system that allows users to create and modify bills for their clients. The modifications need to be maintained as part of the bill for auditing purposes. It is to some extent a point-in-time architecture, but we aren't tracking by time, just by revision. This is an ASP.NET MVC 5, WebAPI 2, Entity Framework 6, SQL Server app using Breeze on the client and the server.
I'm trying to figure out how to get Breeze and our data model working together correctly. When we modify an entity, we essentially keep the old row, make a copy of it with the modifications, and update some entity-state fields with date/time/revision number and so on. We can always get the most recent version of an entity based on its entity ID and an EditState field, where "1" is the most current.
I made a small sample app to work on getting Breeze working as part of the solution and to enable some nice SPA architecture and inline editing on the client, and it all works... except that since our Entity Framework code automatically creates a new entity that contains the modifications, the SaveChanges response contains the original entity but not the new "updated" entity. Reloading the data on the client works, but it would of course be dumb to do that outside of just hacking around for demo purposes.
So I made a new ContextProvider, inherited from EFContextProvider, overrode the AfterSaveEntities method, and then things got a bit more complicated. Not all the entities have this "point in time" / revision functionality, but most of them do. If they do, I can, as I said above, get the latest version of the entity using its EntityId and EditState, but I'm not seeing a straightforward way to get the new entity (I'm pretty new to EF and very new to Breeze), so I'm hoping to find some pointers here.
Would this solution lie in Breeze or in our DataContext? I could just do some reflection, get the type, query the updated entity, and shove it into the saveMap. It seems like that might break down at some point (not sure how or when, but it seems sketchy). Is our architecture bad? Should we have gone the route of creating audit/log tables to store the modified values, instead of keeping the data model somewhat smaller by keeping all the revisions of the entities in their original tables with the revision information and making the queries slightly more complicated? Am I just missing something in EF?
... and to head off the obvious response: I know we should have used a document database, but that wasn't an option on this project. We are stuck in relational land.
I haven't tried this but another approach would be to simply change the EntityState of the incoming entity in the BeforeSaveEntities method from Modified to Added. You will probably need to also update some version field in this 'new' entity so that it doesn't have a primary key conflict with the original.
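A rough sketch of that idea, assuming the Breeze.ContextProvider types and a revisable entity with a Revision field (both are assumptions):

// Inside your EFContextProvider<TContext> subclass; requires the System,
// System.Collections.Generic, System.Linq and Breeze.ContextProvider namespaces.
protected override Dictionary<Type, List<EntityInfo>> BeforeSaveEntities(
    Dictionary<Type, List<EntityInfo>> saveMap)
{
    foreach (var info in saveMap.Values.SelectMany(list => list))
    {
        var bill = info.Entity as Bill; // assumed revisable entity type
        if (bill != null && info.EntityState == EntityState.Modified)
        {
            // Persist the modification as a brand-new row instead of an update...
            info.EntityState = EntityState.Added;
            // ...and bump the version so the primary key doesn't collide.
            bill.Revision += 1;
        }
    }
    return base.BeforeSaveEntities(saveMap);
}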
But... having built apps like this in the past, I really recommend another approach: store the 'historical' entities of each type in a separate table. It can be exactly the same shape as the 'current' table. When you save, you first copy the 'current' entity into the 'historical' table (again with some version number or date scheme for the primary key) and then just update your 'current' entity normally.
This might not give you the answer you expected, but here is an idea:
When saving an object, intercept the save on the server. You get an instance of the object being modified; read the object with the same ID from the database, put a copy of that old object into a legacy table in your database, and continue saving into the main table. That way only the latest revision stays in the main table, while the legacy table contains all previous versions.
So, all you would need to do is have two tables containing the same objects:
public DbSet<MyClass> OriginalMyClasses { get; set; }
public DbSet<MyClassLegacy> LegacyMyClasses { get; set; } // EF6 rejects two sets of the same CLR type in one context, so the legacy table maps to its own (here assumed) class
Override the SaveChanges function and intercept every entry E whose state is Modified: read E's type, get the original and legacy tables, read the object O from the original table with the same ID as E, save O to the legacy table, and finally return base.SaveChanges(); (let it save as it is supposed to by default).
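A hedged sketch of that override in EF6 (every name is an assumption; since EF6 does not allow two DbSets of the same CLR type, the legacy rows get their own class):

using System.Data.Entity;
using System.Linq;

public class MyClass
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Same shape as MyClass, but a separate CLR type mapped to the legacy table.
public class MyClassLegacy
{
    public int Id { get; set; }         // its own identity key
    public int OriginalId { get; set; } // points back to the current row
    public string Name { get; set; }
}

public class MyContext : DbContext
{
    public DbSet<MyClass> OriginalMyClasses { get; set; }
    public DbSet<MyClassLegacy> LegacyMyClasses { get; set; }

    public override int SaveChanges()
    {
        foreach (var entry in ChangeTracker.Entries<MyClass>()
                                           .Where(e => e.State == EntityState.Modified))
        {
            // OriginalValues holds the row as it was loaded from the database.
            var old = (MyClass)entry.OriginalValues.ToObject();
            LegacyMyClasses.Add(new MyClassLegacy { OriginalId = old.Id, Name = old.Name });
        }
        return base.SaveChanges(); // the normal update then overwrites the main row
    }
}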
Let's say I have a table EmailQueue that is used to build emails to send to users, via several unrelated processes. The emails can contain an ever-growing number of different 'content items', but for now let's just say we have News, Events, and Offers. Each of these 'content items' is already populated in its own respective table and will be selectively added to a user's email.
Now I have a design decision to make.
1
I could stick with a normalized pattern and create a mapping table for each of the 'content items' that an email can contain.
|EmailId|NewsId| , |EmailId|OfferId| , ...
The main issue I see with this design is that there is a good bit of overhead every time a new 'content type' is integrated into the email system, both in the database and in the object mapping.
OR
2
I could create 1 mapping table that has a Type reference.
|EmailId|ContentID|ContentType|
Of course the big issue here is that there is no referential integrity. But I feel the object mapping would be much easier to handle, and adding a new object only requires adding a new ContentType row (and of course the required object-mapping code).
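For what it's worth, the object-mapping side of option 2 can stay fairly clean. A rough sketch (all type and member names invented for illustration):

using System;
using System.Collections.Generic;

public enum ContentType { News = 1, Event = 2, Offer = 3 }

// One row of the single mapping table: |EmailId|ContentId|ContentType|
public class EmailContentLink
{
    public int EmailId { get; set; }
    public int ContentId { get; set; }
    public ContentType ContentType { get; set; }
}

// Dispatches on ContentType to load the right domain object. Integrating a
// new content type means one new enum value plus one registered loader.
public class ContentResolver
{
    private readonly Dictionary<ContentType, Func<int, object>> _loaders =
        new Dictionary<ContentType, Func<int, object>>();

    public void Register(ContentType type, Func<int, object> loader)
    {
        _loaders[type] = loader;
    }

    public object Load(EmailContentLink link)
    {
        return _loaders[link.ContentType](link.ContentId);
    }
}

The trade-off remains exactly the one described above: the database cannot enforce that ContentId actually exists in the table that ContentType points at.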
Is one of these solutions better than the other? Or is there a solution better than both of these that I am unaware of?
I'm leaning towards method 2, mainly because this project needs to be developed rapidly, but I'm worried I may regret that decision down the road.
Note: We are using SubSonic as our data access ORM, which does a decent (not perfect) job of handling object graphs through keyed relations. I will still likely need to map the active-record 'content' objects to domain objects, though.