A tree, where each node could have multiple parents

Here's a theoretical/pedantic question: imagine properties where each one could be owned by multiple others. Furthermore, from one iteration of ownership to the next, two neighboring owners could decide to partly combine ownership. For example:
territory 1, t=0: a,b,c,d
territory 2, t=0: e,f,g,h
territory 1, t=1: a,b,g,h
territory 2, t=1: g,h
That is to say, c and d no longer own property; and g and h became fat cats, so to speak.
I'm currently representing this data structure as a tree where each child could have multiple parents. My goal is to cram this into the Composite design pattern; but I'm having issues getting a conceptual footing on how the client might go back and update previous ownership without mucking up the whole structure.
My question is twofold.
Easy: What is a convenient name for this data structure such that I can google it myself?
Hard: What am I doing wrong? When I code I try to keep the mantra, "Keep it simple, Stupid," in my head, and I feel I am breaking this credo.

"My question is twofold. Easy: What is a convenient name for this data structure such that I can google it myself?"
What you have here is not a tree; it is a graph. A multimap will help you here.
Any adjacency list or adjacency matrix will also give you a good start.
Here is a video on adjacency matrices and lists: Youtube on adjacency matrix and list
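As a rough illustration of the multimap idea, here is a minimal Python sketch using the ownership data from the question. The names snapshots and owners_to_territories are made up for this example; the point is only that a dictionary of sets works as an adjacency list in both directions.

```python
from collections import defaultdict

# Ownership snapshots from the question: time step -> territory -> owners.
snapshots = {
    0: {"territory 1": {"a", "b", "c", "d"}, "territory 2": {"e", "f", "g", "h"}},
    1: {"territory 1": {"a", "b", "g", "h"}, "territory 2": {"g", "h"}},
}


def owners_to_territories(snapshot):
    """Invert a territory -> owners multimap into owner -> territories."""
    inverse = defaultdict(set)
    for territory, owners in snapshot.items():
        for owner in owners:
            inverse[owner].add(territory)
    return dict(inverse)


by_owner = owners_to_territories(snapshots[1])
print(sorted(by_owner["g"]))  # ['territory 1', 'territory 2']
print("c" in by_owner)        # False: c owns nothing at t=1
```

Keeping one snapshot per time step means that changing an earlier step never touches the later ones, at the cost of storing each step in full.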
"Hard: What am I doing wrong?"
This is really hard to tell. Perhaps you did not model the relationship properly. It is not that hard, given a good data structure to start with.
And, as you asked about design patterns (but you probably found this out yourself), the Composite pattern will let you model such a setting with ease.

You have a many-to-many relationship between your owners and your territories (properties). I'm not sure what language you're working in, but this sort of thing can be easily represented and tracked in a relational database. (You'd probably want a table for each entity, and the relationship would probably require a third "junction" table. If it's necessary to be able to query "back in time", this could have some sort of "time index" column as well.)
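A minimal sketch of that three-table layout, using Python's built-in sqlite3 module and the data from the question. The table and column names (owner, territory, ownership, t) are assumptions made for illustration, not a prescribed schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owner     (owner_id TEXT PRIMARY KEY);
    CREATE TABLE territory (territory_id INTEGER PRIMARY KEY);
    -- Junction table: one row per (territory, owner) pair at a given time step.
    CREATE TABLE ownership (
        territory_id INTEGER NOT NULL REFERENCES territory(territory_id),
        owner_id     TEXT    NOT NULL REFERENCES owner(owner_id),
        t            INTEGER NOT NULL,   -- the "time index" column
        PRIMARY KEY (territory_id, owner_id, t)
    );
""")

conn.executemany("INSERT INTO owner VALUES (?)", [(c,) for c in "abcdefgh"])
conn.executemany("INSERT INTO territory VALUES (?)", [(1,), (2,)])

# (territory, time step) -> owners, as given in the question.
data = {(1, 0): "abcd", (2, 0): "efgh", (1, 1): "abgh", (2, 1): "gh"}
rows = [(terr, owner, t) for (terr, t), owners in data.items() for owner in owners]
conn.executemany("INSERT INTO ownership VALUES (?, ?, ?)", rows)


def territories_of(owner, t):
    """Return the territories that `owner` holds at time step `t`."""
    cur = conn.execute(
        "SELECT territory_id FROM ownership WHERE owner_id = ? AND t = ? "
        "ORDER BY territory_id",
        (owner, t),
    )
    return [row[0] for row in cur]


print(territories_of("g", 0))  # [2]
print(territories_of("g", 1))  # [1, 2]
print(territories_of("c", 1))  # []
```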
If you are working in an object-oriented language, you might create two classes, Territory and Owner, where the Territory class has a property/member/field which is a collection of references/pointers to Owners and the Owner class has a similar collection of Territories. (One of these two collections may need to contain "weak" references depending on the language.)
In this case, some difficulty may arise if you want to be able to go back and look at the network state at some particular point earlier in time. (If this is what you need, say so and I (or someone else) can post a solution that works for that.)
I'm not sure what level of simplicity you are striving for, but in neither of these cases is updating the ownership relationships really that "hard". Maybe if you posted some code it might be easier to give you more concrete advice.

Hard to tell without more information regarding the business rules, though I have plenty of experience designing graphs where each node could potentially have numerous parents.
A common structure is the Directed Acyclic Graph. The essential rule here is that no path through the graph can cycle back onto itself. For example, the path "A/B/C/B" would not be valid, as B appears twice.
Valid:- "A/B/C", "D/E/C", node C has two parents E and B.
Invalid:- "A/B/C/B", node B repeats in the same path causing a cycle.
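One way to enforce that rule is to check reachability before adding each edge. The following is a small sketch under that assumption; the Dag class and its method names are invented for this example.

```python
class Dag:
    """Directed acyclic graph stored as parent -> set of children."""

    def __init__(self):
        self.children = {}

    def _reachable(self, start, target):
        """Return True if `target` can be reached from `start`."""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.children.get(node, ()))
        return False

    def add_edge(self, parent, child):
        """Add parent -> child, refusing any edge that would close a cycle."""
        if self._reachable(child, parent):
            raise ValueError(f"{parent} -> {child} would create a cycle")
        self.children.setdefault(parent, set()).add(child)
        self.children.setdefault(child, set())

    def parents(self, node):
        return sorted(p for p, kids in self.children.items() if node in kids)


dag = Dag()
for parent, child in [("A", "B"), ("B", "C"), ("D", "E"), ("E", "C")]:
    dag.add_edge(parent, child)

print(dag.parents("C"))  # ['B', 'E']

rejected = False
try:
    dag.add_edge("C", "B")  # would complete the invalid path A/B/C/B
except ValueError:
    rejected = True
print(rejected)  # True
```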

Best Practices for Overlapping Object/Data Entity Types

Occasionally I run into situations where all of the conditions below are true for two highly similar, but not quite identical entities or objects. This makes it difficult for me to decide how to model them, either on the database end or in terms of object modeling. I'm going to try to spell out the issue and my questions in detail, because I've found it to be a really difficult modeling problem to define. I'm trying to do both data and object modeling with these entities, so I'm going to use the terminology of both disciplines a little loosely.
1) Both entities share many identical properties, but have a few unique ones not found in the other.
2) One is not a supertype or subtype of another.
3) The overlap is not due to object inheritance.
4) The objects are used for different purposes in the same domain, but often in close proximity in any workflow. This frequently leads those with even moderate domain knowledge to confuse the entities. On the other hand, this fine separation in purposes leads to greater differences between the methods of the associated objects than their properties.
5) In some situations it may be possible to create bridge tables on the database side to express M2M relationships between the entities. Nevertheless, they have so many properties (or columns, on the database side) in common that it might make sense to store them in the same table.
Some cases in point I've run into include:
1) "Product vs. Project confusion" - especially in software marketing, where Products and Projects share many of the same properties. Normally a product will have multiple projects associated with it, but it is also unusual yet conceivable for a project to be used in multiple products.
2) The subtle differences between Features and Components in software development. A feature is a means of supplying a benefit, from the customer's point of view, while a component is a developer-centric means of implementing features. This is a really subtle distinction which nevertheless counts for a lot. For further discussion see Rod Maupin's post at http://www.installationdeveloper.com/347/features-and-components-101/
3) Templates vs. Types in a lot of different problem domains. For example, when identifying types of guitars through a TypeID column, the TypeTable it refers to would probably have columns corresponding to colors, string sizes, body shapes, etc. A template, on the other hand, is something you'd build a guitar from, so it would have different methods than a Type, perhaps linked to an "Apply Template" or "Make Item from Template" menu command. Nevertheless, it would have many of the same columns or properties as a Type, such as color, shape, string size etc. This distinction raises its head in thousands of different object types and templates in many problem domains, not just this narrow example. To complicate matters further, in some situations it might be helpful to associate multiple Templates with a particular Type, and/or vice-versa.
I haven't run into this problem of overlapping entities often, but when it does occur, it becomes a real bottleneck and leads to a lot of wasted time refactoring the data and object models. I've read books on both topics and done a lot of searches of data/object modeling webpages about the issue, but have yet to see it discussed. The only hits for "overlap" and "data model" I could find on StackOverflow were for differentiating between similar columns in one table or entity, not across tables or entities. My questions are:
1) Is there a formal name for this issue?
2) Is there a simple shortcut or trick of the trade to identify such overlapping entities at the beginning of the modeling process, rather than much further down the line, when late recognition makes refactoring an issue?
3) How should such overlapping entities be handled? I assume that in terms of OOP, they ought to have separate objects since their methods tend to be different. Inheriting one from the other would be awkward though. A more difficult question would be whether or not it would make sense to use separate tables on the database end. Combining them might require a complex series of views plus waste storage space when the properties/columns they don't have in common are left null. Storing them in separate tables might also be wasteful though, if the common properties could be stored in single columns.
It's a tricky issue to even recognize, let alone handle. I have only a moderate amount of experience with data/object modeling, so the input of someone who really knows what they're doing would be helpful. Thanks :)

Your question concerns both database modeling and object-oriented (programming) modeling. Let’s start from an abstract point of view.
You say:
1) Both entities share many identical properties, but have a few unique ones not found in the other.
2) One is not a supertype or subtype of another.
and:
3) The overlap is not due to object inheritance.
But note that inheritance should not be confused with subtyping, even if they are often tied together! See for instance Inheritance (object-oriented programming) in Wikipedia, where this statement is supported by two citations [1,2].
In other words, even if A is not a subtype of B, and B is not a subtype of A, you can find a C from which both A and B inherit attributes.
So, you may or may not think of this C as an “abstract supertype” of both A and B; in any case it is convenient to consider it a common ancestor, at least from a database point of view, so that the common attributes can be factored out into a “supertable”.
Then, from the object-oriented programming side, you can see A and B as subtypes of C or simply as two different things, depending on the characteristics of your Object-Relational Mapping tools, on the problem at hand, etc.
Of course, this way of modelling things does not prevent A and B, in addition to inheriting from C, from having one or more relations between them, as in the Products-Projects example you gave.
So, here is my answer to your three final questions:
1) Yes, it is called inheritance.
2) You can check if two entities have a significant number of common attributes.
3) You can model them in the database with a common table, which perhaps has some common properties such as integrity constraints, and with two tables that have a foreign key to it. Of course this rule is not to be applied blindly; like all human rules, it can have exceptions. From the programming point of view, on the other hand, you can decide whether or not to model them both with a supertype. This depends on many factors, and should be decided on a case-by-case basis.
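To make the "common table plus two tables with a foreign key" layout concrete, here is a minimal sketch using Python's sqlite3 module and the Product/Project example from the question. All table and column names (work_item, price, deadline, and so on) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- C: attributes shared by both entities (column names are made up).
    CREATE TABLE work_item (
        item_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        owner   TEXT
    );
    -- A and B: one row per entity, keyed by the shared row.
    CREATE TABLE product (
        item_id INTEGER PRIMARY KEY REFERENCES work_item(item_id),
        price   REAL
    );
    CREATE TABLE project (
        item_id  INTEGER PRIMARY KEY REFERENCES work_item(item_id),
        deadline TEXT
    );
    -- Optional many-to-many relation between A and B.
    CREATE TABLE product_project (
        product_id INTEGER REFERENCES product(item_id),
        project_id INTEGER REFERENCES project(item_id),
        PRIMARY KEY (product_id, project_id)
    );
""")
conn.execute("INSERT INTO work_item VALUES (1, 'Editor', 'ann')")
conn.execute("INSERT INTO work_item VALUES (2, 'Editor v2 rewrite', 'bob')")
conn.execute("INSERT INTO product VALUES (1, 49.0)")
conn.execute("INSERT INTO project VALUES (2, '2030-01-01')")
conn.execute("INSERT INTO product_project VALUES (1, 2)")

# Reading a Project means joining its own columns with the shared ones.
projects = conn.execute("""
    SELECT w.name, p.deadline
    FROM project AS p JOIN work_item AS w ON w.item_id = p.item_id
""").fetchall()
print(projects)  # [('Editor v2 rewrite', '2030-01-01')]
```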

Is structure (graph) of objects an Aggregate Root worthy of a Repository?

Philosophical DDD question here...
I've seen a lot of Entity vs. Value Object discussions here, but mine is slightly different. Forgive me if this has been covered before.
I'm working in the financial domain at the moment. We have funds (hedge variety). Those funds often invest into other funds. This results in a tree structure of sorts with one fund at the top anchoring it all together.
Obviously, a fund is an Entity (Aggregate Root, even). Things like trades and positions are most likely Value Objects.
My question is: Should the tree structure itself be considered an Aggregate Root?
Some thoughts:
The tree structure is stored in the DB by storing the components and the positions they have in each other. We currently have no coded concept of the tree. The domain is very weak.
The tree structure has no "uniqueness" or identifier.
There is logic needed in many places to "walk" the tree to find the relationships to each other, either top-down, or sometimes bottom-up. This logic needs to be encapsulated somewhere.
There is lots of logic to compute leverage, exposure, etc... and roll it up the tree.
Is it good enough to treat the Fund as a Composite Fund object and that is the Aggregate Root with in-built Invariants? Or is a more formal tree structure useful in this case?

I usually take a more functional/domain approach to designing my aggregates and aggregate roots.
"This results in a tree structure of sorts"
Maybe you can talk with your domain expert to see if that notion deserves to be a first-class citizen with a name of its own in the ubiquitous language (FundTree, FundComposition... ?)
Once that is done, making it an aggregate root will basically depend on whether you consider the entity to be one of the main entry points in the application, i.e. will you sometimes need a reference to a FundTree before even having any reference to a Fund, or if you can afford to obtain it only by traversal of a Fund.
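If the notion does earn a name of its own, a first-class FundTree could be the place where the tree-walking logic lives. The sketch below is purely hypothetical: the Fund fields, the fraction-weighted roll-up, and the numbers are invented to show the shape of the idea, not how exposure is really computed.

```python
from dataclasses import dataclass, field


@dataclass
class Fund:
    name: str
    direct_exposure: float = 0.0                   # hypothetical figure
    holdings: list = field(default_factory=list)   # (child Fund, fraction held)

    def invest(self, child, fraction):
        self.holdings.append((child, fraction))


class FundTree:
    """Owns the walking and roll-up logic for one top-level fund."""

    def __init__(self, root):
        self.root = root

    def exposure(self, fund=None):
        """Roll exposure up the tree, weighting each child by the fraction held."""
        if fund is None:
            fund = self.root
        return fund.direct_exposure + sum(
            fraction * self.exposure(child) for child, fraction in fund.holdings
        )


top = Fund("Top", direct_exposure=100.0)
mid = Fund("Mid", direct_exposure=200.0)
leaf = Fund("Leaf", direct_exposure=400.0)
top.invest(mid, 0.5)
mid.invest(leaf, 0.25)

tree = FundTree(top)
print(tree.exposure())     # 250.0
print(tree.exposure(mid))  # 300.0
```

Here the tree is reachable only from its root Fund, which matches the "obtain it by traversal of a Fund" option; giving FundTree its own repository would be the other option.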

This is really more a question of whether you want to load full trees at all times.
If you are anal about what you define as an aggregate root, then you will find a lot of bloat as you will be loading full object trees any time you load them.
There is no one size fits all approach to this, but in my opinion, you should have your relationships all mapped to your aggregate roots where possible, but in some cases a part of that tree can be treated as an aggregate root when needed.
If you're in a web environment, this is a different decision to a desktop application.
On the web, you are starting again on every page load, so I tend to have a good MODEL to map the relationships and a repository for pretty much every entity (as I always need to save just a small part of something from some popup somewhere) and pull it together with services that are done per aggregate root. It makes the code predictable and stops those... "umm.... is this a root" moments or repositories that become unmanageable.
Then I will have mappers that can give me summary and/or listitem views of large trees as needed and when needed.
On a desktop app, you keep things in memory a lot more, so you will write less code by just working out what your aggregate roots are and loading them when you need them.
There is no right or wrong to this. I doubt you could build a big app of any sort without making compromises on what is considered an aggregate root, and you'll always end up in a situation where 2 roots end up joining each other somewhere.

Organizing interconnected objects

This is a generic question, I don't know if it belongs to Programming or StackOverflow.
I'm writing a little simulation. Without going very deep into its details, consider that many kinds of entities are involved. They correspond to objects, since I'm using an OOP language.
There are Guys that inhabit the world simulated
There are Maps
A map has many Lots, that are pieces of land with some characteristics
There are Tribes (guys belong to tribes)
There is a generic class called Position to locate the elements
There are Bots in control of tribes that move guys around
There is a World that represents the world simulated
and so on.
If the simulated world were laid out as a database, the objects would be tables with lots of references, but in memory I have to use a different strategy. So, for example, a Tribe has an array of Guys as a property. The World has an array of Bots, of Tribes, of Maps. A Map has a Dictionary whose key is a Position and whose value is a Lot. A Guy has a Position that is where he stands.
The way I lay down such connections is pretty much arbitrary. For example, I could have an array of Guys in the World, or an Array of guys per Lot (the guys standing on a piece of land), or an array of Guys per Bot (with the Guys controlled by the bot).
Doing so, I also have to pass around a lot of objects. For example, a Bot must have information about the Map and opponent Guys to decide how to move its Guys.
As said, in a database I'd have a Guys table connected to the Lots table (indicating its position), to the Tribe table (indicating which Tribe it belongs to) and so it would also be easy to query "All the guys in Position [1, 5]". "All the Guys of Tribe 123". "All the Guys controlled by Bot B standing on the Lot b34 not belonging to the Tribe 456" and so on.
I've worked with APIs where to get the simplest information you had to make an instance of the CustomerContextCollection and pass it to CustomerQueryFactory to get back a CustomerInPlaceQuery to... When people criticize OOP and cite verbose abstractions that soon smell ridiculous, that's what I mean. I want to avoid such things and having to rely on deep abstractions and (anti-pattern) abstract contexts.
The question is: what is the preferred, clean way to manage entities and collections of entities that are deeply linked in multiple ways?

It depends on your definition of "clean". In my case, I define clean as: I can implement desired behavior in an obvious, efficient manner.
Building OOP software is not a data modeling exercise. I'd suggest stepping back a little. What does each one of those objects actually do? What methods are you going to implement?
Just because "guys are in a lot" doesn't mean that the lot object needs a collection of guys; it only needs one if there are operations on a lot that affect all the guys in it. And even then, it doesn't necessarily need a collection of guys - it needs a way to get the guys in the lot. This may be an internally stored collection, but it could also be a simple method that calls back into the world to find guys matching a criterion. The implementation of that lookup should be transparent to anyone.
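A small sketch of the "method that calls back into the world" variant, with invented class and method names (World.guys_at, Lot.guys). Callers of Lot.guys() cannot tell whether the lot stores the collection or looks it up.

```python
class Guy:
    def __init__(self, name, position):
        self.name = name
        self.position = position  # (x, y) tuple


class World:
    def __init__(self):
        self.guys = []

    def guys_at(self, position):
        """The one place where the lookup strategy lives."""
        return [g for g in self.guys if g.position == position]


class Lot:
    def __init__(self, world, position):
        self._world = world
        self.position = position

    def guys(self):
        # No stored collection: the lot calls back into the world.
        return self._world.guys_at(self.position)


world = World()
world.guys += [Guy("Chris", (1, 5)), Guy("Dana", (0, 0)), Guy("Eli", (1, 5))]
lot = Lot(world, (1, 5))
print([g.name for g in lot.guys()])  # ['Chris', 'Eli']
```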
From the tenor of your questions, it seems like you're thinking of this from a "how do I generate reports" perspective. Step back and think of the behaviors you're trying to implement first.
Another thing I find extremely valuable is to differentiate between Entities and Values. Entities are objects where identity matters - you may have two guys, both named "Chris", but they are two different objects and remain distinct despite having the same "key". Values, on the other hand, act like ints. From your above list, Position sounds a lot like a value - Position(0,0) is Position(0,0) regardless of which chunk of memory (identity) those bits are stored in. The distinction has a big effect on how you compare and store values vs. entities. For example, your Guy objects (entities) would store their Position as a simple member variable.
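In Python, for instance, the distinction could be sketched like this: a frozen dataclass gives Position value semantics, while a plain class keeps the default identity-based equality for Guy. This is only one possible encoding of the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Value: equality and hashing depend only on the coordinates."""
    x: int
    y: int


class Guy:
    """Entity: keeps default identity-based equality."""

    def __init__(self, name, position):
        self.name = name
        self.position = position  # stored as a simple member variable


a = Guy("Chris", Position(0, 0))
b = Guy("Chris", Position(0, 0))

print(Position(0, 0) == Position(0, 0))  # True: same value
print(a == b)                            # False: two distinct entities
print(a.position == b.position)          # True

# Because values are hashable, they also work as dictionary keys.
lots = {Position(0, 0): "meadow"}
print(lots[a.position])                  # meadow
```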
I've found that a great reference for how to think about such things is Eric Evans's "Domain-Driven Design" book. He's focused on business systems, but the discussions are very valuable for how you think about building OO systems in general.

I would say that no 'true' answer exists to your core question -- a best way to manage collections of entities that are linked in multiple ways. It really depends on the kind of application (simulation) - here are some thoughts:
Is execution time important?
If this is the case, there is really no way around analyzing how your simulator will iterate over (query) the objects from the pool: sketch out the basic simulation loop and check which kinds of events will require iterating over which kinds of model entities (I assume you are developing a discrete-event simulation?). Then you should organize the data structures in a way that optimizes the most frequent/time-consuming events (as opposed to "laying down the connections arbitrarily"). Additionally, you may want to use special data structures (such as k-d trees) to organize entities with properties that you need to query often (e.g., position data). For some typical problems, e.g. collision detection, there are also many approaches for solving them efficiently (so look for suitable libraries/frameworks, e.g. for multi-agent simulation).
How flexible do you want to make it?
If you really want to make it super-flexible and really don't want to decide on the hierarchy of the model entities, why not just use an in-memory database? As you already said, databases are easily applicable to your problem (and you can easily save the model state, which may also be useful).
How clean is clean enough?
If you want to be absolutely sure that the rest of your simulator is not affected by the design choices you make regarding your model representation, hide it behind an interface (say, ModelWorld), which defines methods for all the types of queries your simulator may invoke (this is orthogonal to the second point and may help with the first point, i.e. figuring out what kind of access pattern your simulator exhibits). This allows you to change implementations easily, without affecting any other parts of the simulator code.
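As a sketch of what such a ModelWorld interface might look like in Python: the two query methods and both implementations below are made up for illustration, and guys are reduced to (name, tribe, position) tuples to keep the example short.

```python
from abc import ABC, abstractmethod


class ModelWorld(ABC):
    """The queries the simulator is allowed to ask, and nothing else."""

    @abstractmethod
    def guys_at(self, position): ...

    @abstractmethod
    def guys_of_tribe(self, tribe_id): ...


class ListModelWorld(ModelWorld):
    """Naive implementation: linear scans over one list."""

    def __init__(self, guys):
        self._guys = list(guys)

    def guys_at(self, position):
        return [n for n, _, p in self._guys if p == position]

    def guys_of_tribe(self, tribe_id):
        return [n for n, t, _ in self._guys if t == tribe_id]


class IndexedModelWorld(ModelWorld):
    """Same interface, with dictionaries built up front for fast lookups."""

    def __init__(self, guys):
        self._by_pos, self._by_tribe = {}, {}
        for name, tribe, pos in guys:
            self._by_pos.setdefault(pos, []).append(name)
            self._by_tribe.setdefault(tribe, []).append(name)

    def guys_at(self, position):
        return list(self._by_pos.get(position, []))

    def guys_of_tribe(self, tribe_id):
        return list(self._by_tribe.get(tribe_id, []))


guys = [("Chris", 123, (1, 5)), ("Dana", 456, (1, 5)), ("Eli", 123, (0, 0))]
for world in (ListModelWorld(guys), IndexedModelWorld(guys)):
    # Simulator code sees only ModelWorld, so either implementation works.
    print(world.guys_at((1, 5)), world.guys_of_tribe(123))
```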

SQLite structure advice

I have a book structure with Chapter, Subchapter, Section, Subsection, Article and unknown number of subarticles, sub-subarticles, sub-sub-subarticles etc.
What's the best way to structure this?
One table with child-parent relationships, multiple tables?
Thank you.

To determine whether there are separate tables or one big table involved, you should take a close look at each item - chapter, subchapter, etc. - and decide if they carry different attributes from the others. Does a chapter carry something different from a sub-chapter?
If so, then you're looking at separate tables for Chapter, SubChapter, Section, SubSection, Article. Article still feels hierarchical to me with your sub- sub-sub- sub-sub-sub- etc.
If not, then maybe it is one big table with parent/child, but it looks like you may be talking about 'names' for the depth of the hierarchy, which leans me toward separate tables again.
Also consider how you'll query and what you'll be searching for.

There are a couple of methods to save a tree structure in a relational database. The most commonly used are using parent pointers and nested sets.
The first has a very simple data structure (namely, a pointer to the respective parent element on each object), and is thus easy to implement. On the downside, some queries are not easy to write, as the tree cannot be fully traversed in one step. With plain joins you would need a self-join per layer.
The nested set is easier to query (once you have understood how it works) but is harder to update. Many writes require additional updates to other objects in the tree, which might make it harder to keep them transactionally safe.
A third variant is the materialized path, which I personally consider a good compromise between the former two.
That said, if you want to store trees of arbitrary size (e.g., for sections, sub-sections, sub-sub-sections, ...) you should use one of the mentioned tree implementations. If you have a very limited maximum depth (e.g. max 3 layers) you could get away with creating an explicit data structure. But as things always get more complex than initially thought, I'd advise you to use a real tree implementation.
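For the parent-pointer variant, here is a minimal sketch with Python's sqlite3 module. The node table and its columns are assumptions for this example. Note that SQLite 3.8.3 and later support recursive common table expressions (WITH RECURSIVE), which walk a parent-pointer table in a single query instead of one self-join per layer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE node (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES node(id),  -- NULL for a root
        kind      TEXT NOT NULL,                -- chapter, section, article, ...
        title     TEXT NOT NULL
    )
""")
conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", [
    (1, None, "chapter", "Chapter 1"),
    (2, 1, "subchapter", "Subchapter 1.1"),
    (3, 2, "section", "Section 1.1.1"),
    (4, 3, "article", "Article A"),
    (5, 4, "subarticle", "Subarticle A.1"),
    (6, 1, "subchapter", "Subchapter 1.2"),
])


def subtree(root_id):
    """Return (depth, title) for root_id and all of its descendants."""
    return conn.execute("""
        WITH RECURSIVE tree(id, title, depth) AS (
            SELECT id, title, 0 FROM node WHERE id = ?
            UNION ALL
            SELECT n.id, n.title, tree.depth + 1
            FROM node AS n JOIN tree ON n.parent_id = tree.id
        )
        SELECT depth, title FROM tree ORDER BY depth, id
    """, (root_id,)).fetchall()


for depth, title in subtree(2):
    print("  " * depth + title)
```

The kind column keeps the names for each level in one table; whether that is enough depends on whether the levels carry different attributes.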

How can an object-oriented programmer get his/her head around database-driven programming?

I have been programming in C# and Java for a little over a year and have a decent grasp of object oriented programming, but my new side project requires a database-driven model. I'm using C# and Linq which seems to be a very powerful tool but I'm having trouble with designing a database around my object oriented approach.
My two main questions are:
How do I deal with inheritance in my database?
Let's say I'm building a staff rostering application and I have an abstract class, Event. From Event I derive abstract classes ShiftEvent and StaffEvent. I then have concrete classes Shift (derived from ShiftEvent) and StaffTimeOff (derived from StaffEvent). There are other derived classes, but for the sake of argument these are enough.
Should I have a separate table for ShiftEvents and StaffEvents? Maybe I should have separate tables for each concrete class? Both of these approaches seem like they would give me problems when interacting with the database. Another approach could be to have one Event table, and this table would have nullable columns for every type of data in any of my concrete classes. All of these approaches feel like they could impede extensibility down the road. More than likely there is a third approach that I have not considered.
My second question:
How do I deal with collections and one-to-many relationships in an object oriented way?
Let's say I have a Products class and a Categories class. Each instance of Categories would contain one or more products, but the products themselves should have no knowledge of categories. If I want to implement this in a database, then each product would need a category ID which maps to the categories table. But this introduces more coupling than I would prefer from an OO point of view. The products shouldn't even know that the categories exist, much less have a data field containing a category ID! Is there a better way?

Linq to SQL using a table per class solution:
http://blogs.microsoft.co.il/blogs/bursteg/archive/2007/10/01/linq-to-sql-inheritance.aspx
Other solutions (such as my favorite, LLBLGen) allow other models. Personally, I like the single table solution with a discriminator column, but that is probably because we often query across the inheritance hierarchy and thus see it as the normal query, whereas querying a specific type only requires a "where" change.
All said and done, I personally feel that mapping OO into tables is putting the cart before the horse. There have been continual claims that the impedance mismatch between OO and relations has been solved... and there have been plenty of OO specific databases. None of them have unseated the powerful simplicity of the relation.
Instead, I tend to design the database with the application in mind, map those tables to entities and build from there. Some find this as a loss of OO in the design process, but in my mind the data layer shouldn't be talking high enough into your application to be affecting the design of the higher order systems, just because you used a relational model for storage.

I had the opposite problem: how to get my head around OO after years of database design. Come to that, a decade earlier I had the problem of getting my head around SQL after years of "structured" flat-file programming. There are just enough similarities between class and data entity decomposition to mislead you into thinking that they're equivalent. They aren't.
I tend to agree with the view that once you're committed to a relational database for storage then you should design a normalised model and compromise your object model where unavoidable. This is because you're more constrained by the DBMS than you are with your own code - building a compromised data model is more likely to cause you pain.
That said, in the examples given, you have choices: if ShiftEvent and StaffEvent are mostly similar in terms of attributes and are often processed together as Events, then I'd be inclined to implement a single Events table with a type column. Single-table views can be an effective way to separate out the sub-classes and on most db platforms are updatable. If the classes are more different in terms of attributes, then a table for each might be more appropriate. I don't think I like the three-table idea: "has one or none" relationships are seldom necessary in relational design. Anyway, you can always create an Event view as the union of the two tables.
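A minimal sketch of the single-table option with a type column and one view per sub-class, again using Python's sqlite3 module. The staff_id and location columns are hypothetical stand-ins for whatever the concrete classes need. One caveat for this particular engine: SQLite views are read-only unless you add INSTEAD OF triggers, so the "updatable views" point does not carry over here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (
        event_id   INTEGER PRIMARY KEY,
        event_type TEXT NOT NULL CHECK (event_type IN ('Shift', 'StaffTimeOff')),
        start_at   TEXT NOT NULL,
        end_at     TEXT NOT NULL,
        staff_id   INTEGER,   -- hypothetical column, used by StaffTimeOff
        location   TEXT       -- hypothetical column, used by Shift
    );
    -- One view per concrete class hides the columns that do not apply.
    CREATE VIEW shift AS
        SELECT event_id, start_at, end_at, location
        FROM event WHERE event_type = 'Shift';
    CREATE VIEW staff_time_off AS
        SELECT event_id, start_at, end_at, staff_id
        FROM event WHERE event_type = 'StaffTimeOff';
""")
conn.executemany("INSERT INTO event VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "Shift", "2024-01-01T09:00", "2024-01-01T17:00", None, "Ward 3"),
    (2, "StaffTimeOff", "2024-01-02T00:00", "2024-01-03T00:00", 7, None),
])

shifts = conn.execute("SELECT event_id, location FROM shift").fetchall()
all_events = conn.execute("SELECT COUNT(*) FROM event").fetchone()[0]
print(shifts)      # [(1, 'Ward 3')]
print(all_events)  # 2
```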
As to Product and Category, if one Category can have many Products, but not vice versa, then the normal relational way to represent this is for the product to contain a category id. Yes, it's coupling, but it's only data coupling, and it's not a mortal sin. The column should probably be indexed, so that it's efficient to retrieve all products for a category. If you're really horrified by the notion then pretend it's a many-to-many relationship and use a separate ProductCategorisation table. It's not that big a deal, although it implies a potential relationship that doesn't really exist and might mislead someone coming to the app in future.

In my opinion, these paradigms (the Relational Model and OOP) apply to different domains, making it difficult (and pointless) to try to create a mapping between them.
The Relational Model is about representing facts (such as "A is a person"), i.e. intangible things that have the property of being "unique". It doesn't make sense to talk about several "instances" of the same fact - there is just the fact.
Object Oriented Programming is a programming paradigm detailing a way to construct computer programs to fulfill certain criteria (re-use, polymorphism, information hiding...). An object is typically a metaphor for some tangible thing - a car, an engine, a manager or a person etc. Tangible things are not facts - there may be two distinct objects with identical state without them being the same object (hence the difference between equals and == in Java, for example).
Spring and similar tools provide access to relational data programmatically, so that the facts can be represented by objects in the program. This does not mean that OOP and the Relational Model are the same, or should be confused with each other. Use the Relational Model to design databases (collections of facts) and OOP to design computer programs.
TL;DR version (Object-Relational impedance mismatch distilled):
Facts = the recipe on your fridge.
Objects = the content of your fridge.

Frameworks such as
Hibernate http://www.hibernate.org/
JPA http://java.sun.com/developer/technicalArticles/J2EE/jpa/
can help you to smoothly solve this problem of inheritance. e.g. http://www.java-tips.org/java-ee-tips/enterprise-java-beans/inheritance-and-the-java-persistenc.html

I also got to understand database design, SQL, and particularly the data centered world view before tackling the object oriented approach. The object-relational-impedance-mismatch still baffles me.
The closest thing I've found to getting a handle on it is this: looking at objects not from an object oriented programming perspective, or even from an object oriented design perspective, but from an object oriented analysis perspective. The best book on OOA that I got was written in the early 90s by Peter Coad.
On the database side, the best model to compare with OOA is not the relational model of data, but the Entity-Relationship (ER) model. An ER model is not really relational, and it doesn't specify the logical design. Many relational apologists think that is ER's weakness, but it is actually its strength. ER is best used not for database design but for requirements analysis of a database, otherwise known as data analysis.
ER data analysis and OOA are surprisingly compatible with each other. ER, in turn, is fairly compatible with relational data modeling and hence with SQL database design. OOA is, of course, compatible with OOD and hence with OOP.
This may seem like the long way around. But if you keep things abstract enough, you won't waste too much time on the analysis models, and you'll find it surprisingly easy to overcome the impedance mismatch.
The biggest thing to get over in terms of learning database design is this: data linkages like the foreign key to primary key linkage you objected to in your question are not horrible at all. They are the essence of tying related data together.
There is a phenomenon in pre database and pre object oriented systems called the ripple effect. The ripple effect is where a seemingly trivial change to a large system ends up causing consequent required changes all over the entire system.
OOP contains the ripple effect primarily through encapsulation and information hiding.
Relational data modeling overcomes the ripple effect primarily through physical data independence and logical data independence.
On the surface, these two seem like fundamentally contradictory modes of thinking. Eventually, you'll learn how to use both of them to good advantage.

My guess off the top of my head:
On the topic of inheritance I would suggest having 3 tables: Event, ShiftEvent and StaffEvent. Event has the common data elements kind of like how it was originally defined.
The last one can go the other way, I think. You could have a table with category ID and product ID with no other columns where for a given category ID this returns the products but the product may not need to get the category as part of how it describes itself.

The big question: how can you get your head around it? It just takes practice. You try implementing a database design, run into problems with your design, you refactor and remember for next time what worked and what didn't.
To answer your specific questions... this is a little bit of opinion thrown in, as in "how I would do it", not taking into account performance needs and such. I always start fully normalized and go from there based on real-world testing:
Table Event
    EventID
    Title
    StartDateTime
    EndDateTime
Table ShiftEvent
    ShiftEventID
    EventID
    ShiftSpecificProperty1
    ...
Table Product
    ProductID
    Name
Table Category
    CategoryID
    Name
Table CategoryProduct
    CategoryID
    ProductID
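As a rough illustration of how the Product, Category and CategoryProduct tables fit together, here is a sketch using Python's sqlite3 module. The column types, the composite primary key, the sample rows and the products_in helper are all assumptions made for the example, not part of the design above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product  (ProductID  INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Category (CategoryID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
-- Junction table: one row per (category, product) pairing.
CREATE TABLE CategoryProduct (
    CategoryID INTEGER NOT NULL REFERENCES Category(CategoryID),
    ProductID  INTEGER NOT NULL REFERENCES Product(ProductID),
    PRIMARY KEY (CategoryID, ProductID)
);
-- Sample rows are made up for illustration.
INSERT INTO Product  VALUES (1, 'Hammer'), (2, 'Nails');
INSERT INTO Category VALUES (10, 'Tools'), (20, 'Hardware');
INSERT INTO CategoryProduct VALUES (10, 1), (20, 1), (20, 2);
""")


def products_in(category_id):
    """Return the names of all products linked to one category."""
    rows = conn.execute(
        "SELECT p.Name FROM CategoryProduct cp "
        "JOIN Product p ON p.ProductID = cp.ProductID "
        "WHERE cp.CategoryID = ? ORDER BY p.ProductID",
        (category_id,),
    )
    return [name for (name,) in rows]


tools = products_in(10)
hardware = products_in(20)
```

Note that the Product table carries no category column at all. The pairing lives only in CategoryProduct, which also lets one product sit in several categories.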
Also reiterating what Pierre said - an ORM tool like Hibernate makes dealing with the friction between relational structures and OO structures much nicer.

There are several possibilities in order to map an inheritance tree to a relational model.
NHibernate, for instance, supports the 'table per class hierarchy', 'table per subclass' and 'table per concrete class' strategies:
http://www.hibernate.org/hib_docs/nhibernate/html/inheritance.html
For your second question:
You can create a 1:n relation in your DB, where the Products table of course has a foreign key to the Categories table.
However, this does not mean that your Product class needs to have a reference to the Category instance to which it belongs.
You can create a Category class, which contains a set or list of products, and you can create a product class, which has no notion of the Category to which it belongs.
Again, you can easily do this using (N)Hibernate:
http://www.hibernate.org/hib_docs/reference/en/html/collections.html
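To make the one-directional mapping concrete without tying it to any ORM, here is a hand-rolled sketch in plain Python. The row tuples stand in for what a query against a Products table with a category foreign key might return; the sample data and the load_categories helper are assumptions for illustration, and this is not how (N)Hibernate does it internally.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    product_id: int
    name: str  # deliberately no category field


@dataclass
class Category:
    category_id: int
    name: str
    products: list = field(default_factory=list)


# Made-up rows: (ProductID, Name, CategoryID) and (CategoryID, Name).
product_rows = [(1, "Hammer", 10), (2, "Nails", 20), (3, "Screws", 20)]
category_rows = [(10, "Tools"), (20, "Hardware")]


def load_categories(category_rows, product_rows):
    """Build Category objects and attach products using the foreign key.

    The foreign key is consumed here and never stored on the Product.
    """
    categories = {cid: Category(cid, name) for cid, name in category_rows}
    for pid, name, category_id in product_rows:
        categories[category_id].products.append(Product(pid, name))
    return categories


categories = load_categories(category_rows, product_rows)
hardware_names = [p.name for p in categories[20].products]
```

The foreign key still exists in the database, but it only matters to the loading code. The Product objects handed to the rest of the program know nothing about categories.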

Sounds like you are discovering the Object-Relational Impedance Mismatch.

"The products shouldn't even know that the categories exist, much less have a data field containing a category ID!"
I disagree here. Instead of supplying a category ID yourself, I would let your ORM do it for you. Then in code you would have something like this (borrowing from NHibernate's and Castle's ActiveRecord):
class Category
{
    [HasMany]
    public IList<Product> Products { get; set; }
    ...
}
class Product
{
    [BelongsTo]
    public Category ParentCategory { get; set; }
}
Then if you wanted to see which category a product is in, you'd just do something simple like:
Product.ParentCategory
I think you can set up ORMs differently, but either way, for the inheritance question I ask: why do you care? Either work with objects and forget about the database, or do it a different way. It might seem silly, but unless you really can't have a bunch of tables, or don't want a single table for some reason, why would you care about the database? For instance, I have the same setup with a few inheriting objects, and I just go about my business. I haven't looked at the actual database yet, as it doesn't concern me. What concerns me is the underlying SQL and getting the correct data back.
If you have to care about the database then you're going to need to either modify your objects or come up with a custom way of doing things.

I guess a bit of pragmatism would be good here. Mappings between objects and tables always have a bit of strangeness here and there. Here's what I do:
I use iBATIS to talk to my database (Java to Oracle). Whenever I have an inheritance structure where I want a subclass to be stored in the database, I use a "discriminator". This is a trick where you have one table for all the classes (types), containing every field you could possibly want to store. There is one extra column in the table, containing a string that iBATIS uses to decide which type of object to return.
It looks funny in the database, and sometimes can get you into trouble with relations to fields which are not in all Classes, but 80% of the time this is a good solution.
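The discriminator trick can be sketched by hand without any mapping framework. The example below uses Python and sqlite3; the Event, ShiftEvent and StaffEvent classes, the EventType column and its values are all hypothetical names chosen for illustration, and row_to_object only imitates what a mapper would do from its configuration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Event:
    event_id: int
    title: str


@dataclass
class ShiftEvent(Event):
    shift_name: str


@dataclass
class StaffEvent(Event):
    staff_name: str


conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One table for the whole hierarchy; subtype columns are NULL when unused.
CREATE TABLE Event (
    EventID    INTEGER PRIMARY KEY,
    EventType  TEXT NOT NULL,   -- the discriminator column
    Title      TEXT NOT NULL,
    shift_name TEXT,
    staff_name TEXT
);
INSERT INTO Event VALUES
    (1, 'event', 'Open day',    NULL,    NULL),
    (2, 'shift', 'Night watch', 'night', NULL),
    (3, 'staff', 'Training',    NULL,    'Alice');
""")


def row_to_object(row):
    """Pick the class to build from the discriminator value."""
    event_id, event_type, title, shift_name, staff_name = row
    if event_type == "shift":
        return ShiftEvent(event_id, title, shift_name)
    if event_type == "staff":
        return StaffEvent(event_id, title, staff_name)
    return Event(event_id, title)


events = [
    row_to_object(row)
    for row in conn.execute(
        "SELECT EventID, EventType, Title, shift_name, staff_name "
        "FROM Event ORDER BY EventID"
    )
]
```

The nullable columns are where the "looks funny" part comes from: nothing in the table itself stops a 'shift' row from carrying a staff_name.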
Regarding your relation between category and product, I would add a categoryId column to the product, because that would make life really easy, both SQL-wise and mapping-wise. If you're really stuck on doing the "theoretically correct thing", you can consider an extra table which has only 2 columns, connecting the categories and their products. It will work, but generally this construction is only used when you need many-to-many relations.
Try to keep it as simple as possible. Having an "academic solution" is nice, but it generally means a bit of overkill and is harder to refactor because it is too abstract (like hiding the relations between Category and Product).
I hope this helps.
