Complex Django model relationships - sql

I'm trying to build an SQL database for use in Django. I understand the model from an abstract perspective, but I don't know enough about how to build databases to know if I'm relating these objects correctly. I want to keep this abstract so I can implement the code myself.
Say I have three types of objects: A, B, and C. An A may hold one or more B. A B can hold multiple Bs or Cs in any combination, but it must have at least one (B or C) inside of it. A C merely holds some data and is a simple construct.
Currently I have:
from django.db import models
class A(models.Model):
    # A attributes
    pass
class B(models.Model):
    a = models.ForeignKey(A, on_delete=models.CASCADE)
    # B attributes
class C(models.Model):
    b = models.ForeignKey(B, on_delete=models.CASCADE)
    # C attributes
I know that this just allows a Many-To-One relationship between C→B and B→A, but I don't know how to allow a B to refer to another B.
I would also like to be able to easily write a form where you start with an A with a requirement of at least one B inside of it, and you can add or remove Bs at will. Is this possible?
I think there's probably just a better way of setting up this data, but I don't see it since I'm very new to database organization.
If it helps, I'm designing a form to allow easy writing of workouts for swimming. The A is a workout, which has a title and an author. Each B is a set. Sets are composed of things like "2x50yards freestyle" or "8x100 IM on 2:00" — the Cs. But sometimes a set has sort of a sub-set, which is like a loop.

I highly recommend checking out the Django documentation on models which can be found here: Django Models
Moreover, to make a symmetrical Many to Many relationship, use:
class B(models.Model):
    bs = models.ManyToManyField("self")
Additionally, I recommend making the relationship between A and B a many to many relationship instead of a foreign key. This will allow you to assign the B to many A's while still allowing 1 A to have many B's. The same logic should potentially be taken for B and C.
To answer your question about making B's required for A, I do not think this is possible. Check out this question for more information: Django 1.7: how to make ManyToManyField required?
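To make the self-referencing relationship more concrete, here is a minimal sketch of how a symmetrical B-to-B link can be stored at the SQL level, using Python's built-in sqlite3 module rather than Django. The table and column names (b, b_bs, from_b_id, to_b_id) and the helper functions are made up for illustration and are not necessarily what Django generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE b (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Join table backing the B-to-B many-to-many relationship (hypothetical names).
CREATE TABLE b_bs (
    from_b_id INTEGER NOT NULL REFERENCES b(id) ON DELETE CASCADE,
    to_b_id   INTEGER NOT NULL REFERENCES b(id) ON DELETE CASCADE,
    PRIMARY KEY (from_b_id, to_b_id)
);
""")


def link(conn, first_id, second_id):
    """Store a symmetrical link as two rows, one per direction."""
    conn.executemany(
        "INSERT OR IGNORE INTO b_bs (from_b_id, to_b_id) VALUES (?, ?)",
        [(first_id, second_id), (second_id, first_id)],
    )


def linked_names(conn, b_id):
    """Return the names of every B linked to the given B, sorted by name."""
    rows = conn.execute(
        "SELECT b.name FROM b_bs JOIN b ON b.id = b_bs.to_b_id "
        "WHERE b_bs.from_b_id = ? ORDER BY b.name",
        (b_id,),
    )
    return [name for (name,) in rows]


conn.executemany(
    "INSERT INTO b (id, name) VALUES (?, ?)",
    [(1, "main set"), (2, "kick loop"), (3, "warm-down")],
)
link(conn, 1, 2)
```

Because the link is symmetrical, it does not record which set contains which. If that direction matters for your sets and sub-sets, passing symmetrical=False to ManyToManyField may be the better fit.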

Related

How to implement design in OOP

I have following structure
One organization can have many environments.
One environment can have many Applications.
One application can have many Policies.
I created a class for each entity, i.e.
class Organization,
class Environment,
class Application,
class Policy
Now I want to apply policies to Application.
One policy should have one Policy class object. All instances of Policy are different. Every policy has a unique name and ID.
Inheritance will not work. Consider the following hierarchy:
Organization
Environment(Organization)
API(Environment)
Policy(API)
because every policy would then be required to provide all the details of API, Environment, and Organization.
Can we do aggregation here? Need help on this
All instances of Policy are different.
Every policy has a unique name and ID
You can express that with the constraint:
Policy.allInstances() -> forAll(p1, p2 |
p1 <> p2 implies (p1.name <> p2.name and p1.ID <> p2.ID))
A class diagram from the information you give can be :
I do not use bidirectional relations, on the assumption that a Policy does not know its associated Application(s), which in turn do not know their associated Environment(s), which do not know their associated Organization(s).
I use the multiplicity *, equivalent to 0..*, because nothing in your question says the minimum multiplicity is 1 each time. I do not indicate the multiplicity in the opposite direction of the relations because your question says nothing about them.
Inheritance will not work
"A inherits B" implies "A is a B". None of the classes you give satisfy that, so there is no possible inheritance between them.
Can we do aggregation here
Maybe between Environment and Application, because we can say an environment is composed of applications, but not elsewhere.
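The one-directional relations described above could be sketched in Python roughly as follows. This is only an illustration: the field names and the apply_policy method are assumptions, and the uniqueness check is done per application, which is a simplification of the constraint over all Policy instances.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Policy:
    policy_id: int
    name: str


@dataclass
class Application:
    name: str
    policies: List[Policy] = field(default_factory=list)

    def apply_policy(self, policy: Policy) -> None:
        """Attach a policy, rejecting a duplicate ID or a duplicate name."""
        for existing in self.policies:
            if existing.policy_id == policy.policy_id or existing.name == policy.name:
                raise ValueError("policy ID and name must both be unique")
        self.policies.append(policy)


@dataclass
class Environment:
    name: str
    applications: List[Application] = field(default_factory=list)


@dataclass
class Organization:
    name: str
    environments: List[Environment] = field(default_factory=list)


# Each container only knows what it holds; a Policy knows nothing about its owner.
app = Application("billing-api")
env = Environment("production", [app])
org = Organization("acme", [env])
app.apply_policy(Policy(1, "rate-limit"))
```

Nothing here inherits from anything else: a Policy is reached by walking from the Organization down through its Environments and Applications.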

SQLAlchemy Category column

I am using a SQLAlchemy database to hold data for a flask application. I would like one column in my database to represent a category (e.g. the possible categories may be A, B or C).
I have seen in documentation that this can be achieved by a simple relationship which relates two tables. One table to hold some live data (inclusive of a category ID and a category) and another table to relate a category id to the associated category. http://flask-sqlalchemy.pocoo.org/2.3/quickstart/#simple-relationships
Would this method be considered good practice for including some kind of "category" column in my database? Or is there a simpler/better way. In this case my aim is to prioritise simplicity while maintaining good practice (don't really need best practice if it entails too much complexity).
Additionally, if my category names will never change, is it bad practice to use a constant list of category names to compare input data with in order to validate it? If so, why?
This is more of an SQL question and it isn't related to Python at all.
Anyways, it is actually better to use a reference table as you first suggested.
In this case, a Category table with one-to-many relationship. This allows you to change category name, and enrich Category with more details (like description) that might become useful in the future.
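As a rough illustration of the reference-table approach, here is a sketch using Python's built-in sqlite3 module instead of SQLAlchemy, so it runs without extra dependencies. The table names (category, reading) and helper functions are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
CREATE TABLE category (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    description TEXT
);
CREATE TABLE reading (
    id INTEGER PRIMARY KEY,
    value REAL NOT NULL,
    category_id INTEGER NOT NULL REFERENCES category(id)
);
""")
conn.executemany(
    "INSERT INTO category (id, name) VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "C")],
)


def add_reading(conn, value, category_id):
    """Insert a reading; an unknown category_id raises sqlite3.IntegrityError."""
    cursor = conn.execute(
        "INSERT INTO reading (value, category_id) VALUES (?, ?)",
        (value, category_id),
    )
    return cursor.lastrowid


def category_of(conn, reading_id):
    """Look up the category name of a reading through the reference table."""
    row = conn.execute(
        "SELECT c.name FROM reading r JOIN category c ON c.id = r.category_id "
        "WHERE r.id = ?",
        (reading_id,),
    ).fetchone()
    return row[0]


first_id = add_reading(conn, 4.2, 1)
```

The foreign key does the input validation for you, and renaming a category is a single UPDATE on the category table rather than a change to every data row.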
The other way, using constant list, is considered a bad practice - especially using Enums. You can read more about it in this article: 8 Reasons Why MySQL's ENUM Data Type Is Evil
You can read more about this dilemma here.
Hope it helps.

Many2many as 'composition' of two One2many

Suppose I have three objects A, B, C with relationships one A to many B and one A to many C. This naturally implies the existence of a many B to many C relationship, but the implication is clearly not recognized by the computer.
The questions are,
(i) How can this many2many be defined so that it respects the links as given through the already existing relationships?
(ii) Are there any special means of displaying said relationship in the form-view for each of objects B and C?
(iii) Is it possible that this is inherently the meaning of a many2many relationship and that I should just browse through the plethora of non-existent examples in the documentation?
You should be able to define a related fields.Many2many that uses relationships from B to C. See: Related Fields Documentation
For example:
Model_A:
    b_ids = fields.One2many(comodel_name='B',
                            inverse_name='a_id')
    c_ids = fields.One2many(comodel_name='C',
                            inverse_name='a_id')
Model_B:
    a_id = fields.Many2one(comodel_name='A')
    c_ids = fields.Many2many(comodel_name='C',
                             related='a_id.c_ids')
Model_C:
    a_id = fields.Many2one(comodel_name='A')
    b_ids = fields.Many2many(comodel_name='B',
                             related='a_id.b_ids')
Once you've defined the related fields, all the normal Many2many interactions will work (views, ORM, etc). You can add store=True to the field definition to store the relation in its own database table for easier searching and queries.
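To show what such a derived B-to-C relationship amounts to underneath, here is a small sketch in plain SQL run through Python's sqlite3 module rather than the ORM. The tables, sample rows, and the c_ids_for_b helper are made up for illustration. The link between a B and its Cs is simply a join through the A they share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY);
CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER REFERENCES a(id));
CREATE TABLE c (id INTEGER PRIMARY KEY, a_id INTEGER REFERENCES a(id));
INSERT INTO a (id) VALUES (1), (2);
INSERT INTO b (id, a_id) VALUES (10, 1), (11, 1), (12, 2);
INSERT INTO c (id, a_id) VALUES (20, 1), (21, 2), (22, 2);
""")


def c_ids_for_b(conn, b_id):
    """Return the ids of every C that shares its parent A with the given B."""
    rows = conn.execute(
        "SELECT c.id FROM b JOIN c ON c.a_id = b.a_id "
        "WHERE b.id = ? ORDER BY c.id",
        (b_id,),
    )
    return [row[0] for row in rows]
```

No extra link table is needed in this sketch, because the relationship is fully determined by the two existing a_id columns.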

SQL vs NoSQL for data that will be presented to a user after multiple filters have been added

I am about to embark on a project for work that is very outside my normal scope of duties. As a SQL DBA, my initial inclination was to approach the project using a SQL database but the more I learn about NoSQL, the more I believe that it might be the better option. I was hoping that I could use this question to describe the project at a high level to get some feedback on the pros and cons of using each option.
The project is relatively straightforward. I have a set of objects that have various attributes. Some of these attributes are common to all objects whereas some are common only to a subset of the objects. What I am tasked with building is a service where the user chooses a series of filters that are based on the attributes of an object and then is returned a list of objects that matches all^ of the filters. When the user selects a filter, he or she may be filtering on a common or subset attribute but that is abstracted on the front end.
^ There is a chance, depending on user feedback, that the list of objects may match only some of the filters and the quality of the match will be displayed to the user through a score that indicates how many of the criteria were matched.
After watching this talk by Martin Fowler (http://www.youtube.com/watch?v=qI_g07C_Q5I), it would seem that a document-style NoSQL database should suit my needs but given that I have no experience with this approach, it is also possible that I am missing something obvious.
Some additional information - The database will initially have about 5,000 objects with each object containing 10 to 50 attributes but the number of objects will definitely grow over time and the number of attributes could grow depending on user feedback. In addition, I am hoping to have the ability to make rapid changes to the product as I get user feedback so flexibility is very important.
Any feedback would be very much appreciated and I would be happy to provide more information if I have left anything critical out of my discussion. Thanks.
This problem can be solved by using two separate pieces of technology. The first is to use a relatively well designed database schema with a modern RDBMS. By modeling the application using the usual principles of normalization, you'll get really good response out of storage for individual CRUD statements.
Searching this schema, as you've surmised, is going to be a nightmare at scale. Don't do it. Instead look into using Solr/Lucene as your full text search engine. Solr's support for dynamic fields means you can add new properties to your documents/objects on the fly and immediately have the ability to search inside your data if you have designed your Solr schema correctly.
I'm not an expert in NoSQL, so I will not be advocating it. However, I have a few points that can help you address your questions regarding the relational database structure.
First thing that I see right away is, you are talking about inheritance (at least conceptually). Your objects inherit from each-other, thus you have additional attributes for derived objects. Say you are adding a new type of object, first thing you need to do (conceptually) is to find a base/super (parent) object type for it, that has subset of the attributes and you are adding on top of them (extending base object type).
Once you are used to thinking this way, the next thing to consider is inheritance mapping patterns for relational databases. I'll borrow terms from Martin Fowler to describe them here.
You can hold an inheritance chain in the database in one of 3 ways:
1 - Single table inheritance: Whole inheritance chain is in one table. So, all new types of objects go into the same table.
Advantages: your search query has only one table to search, and it must be faster than a join for example.
Disadvantages: table grows faster than with option 2 for example; you have to add a type column that says what type of object is the row; some rows have empty columns because they belong to other types of objects.
2 - Concrete table inheritance: Separate table for each new type of object.
Advantages: if search affects only one type, you search only one table at a time; each table grows slower than in option 1 for example.
Disadvantages: you need to use union of queries if searching several types at the same time.
3 - Class table inheritance: One table for the base type object with its attributes only, additional tables with additional attributes for each child object type. So, child tables refer to the base table with PK/FK relations.
Advantages: all types are present in one table so easy to search all together using common attributes.
Disadvantages: base table grows fast because it contains part of child tables too; you need to use join to search all types of objects with all attributes.
Which one to choose?
It's a trade-off obviously. If you expect to have many types of objects added, I would go with Concrete table inheritance that gives reasonable query and scaling options. Class table inheritance seems to be not very friendly with fast queries and scalability. Single table inheritance seems to work with small number of types better.
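As a rough illustration of the difference between options 1 and 3, here is a sketch using Python's built-in sqlite3 module. The item types ('device', 'book') and all column names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Single table inheritance: one table, a type column, NULLs for unused columns.
CREATE TABLE item_single (
    id INTEGER PRIMARY KEY,
    type TEXT NOT NULL,
    name TEXT NOT NULL,     -- attribute common to all types
    voltage INTEGER,        -- only used by 'device' rows
    page_count INTEGER      -- only used by 'book' rows
);
INSERT INTO item_single VALUES
    (1, 'device', 'lamp', 230, NULL),
    (2, 'book', 'atlas', NULL, 96);

-- Class table inheritance: a base table plus one child table per type.
CREATE TABLE item_base (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE item_device (
    id INTEGER PRIMARY KEY REFERENCES item_base(id),
    voltage INTEGER NOT NULL
);
INSERT INTO item_base VALUES (1, 'lamp'), (2, 'atlas');
INSERT INTO item_device VALUES (1, 230);
""")

# Option 1: filtering on a subset attribute touches a single table.
single = conn.execute(
    "SELECT name FROM item_single WHERE type = 'device' AND voltage = 230"
).fetchall()

# Option 3: the same filter needs a join from the base table to the child table.
joined = conn.execute(
    "SELECT b.name FROM item_base b JOIN item_device d ON d.id = b.id "
    "WHERE d.voltage = 230"
).fetchall()
```

Both queries return the same item. The difference is in where the NULL columns live and in how many tables a filter on a subset attribute has to touch.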
Your call, my friend!
May as well make this an answer. I should comment that I'm not strong in NoSQL, so I tend to lean towards SQL.
I'd do this as a three table set. You will see it referred to as entity value pair logic on the web...it's a way of handling multiple dynamic attributes for items. Let's say you have a bunch of products and each one has a few attributes.
Prd 1 - a,b,c
Prd 2 - a,d,e,f
Prd 3 - a,b,d,g
Prd 4 - a,c,d,e,f
So here are 4 products and 6 attributes...same theory will work for hundreds of products and thousands of attributes. Standard way of holding this in one table requires the product info along with 6 columns to store the data (in this setup at least one third of them are null). New attribute added means altering the table to add another column to it and coming up with a script to populate existing or just leaving it null for all existing. Not the most fun, can be a headache.
The alternative to this is a name value pair setup. You want a 'header' table to hold the common values amongst your products (like name, or price...things that all products always have). In our example above, you will notice that attribute 'a' is being used on each record...this does mean attribute a can be a part of the header table as well. We'll call the key column here 'header_id'.
Second table is a reference table that is simply going to store the attributes that can be assigned to each product and assign an ID to each. We'll call the table attribute, with attr_id for a key. Rather straightforward: each attribute above will be one row.
Quick example:
attr_id, attribute_name, notes
1,b, the length of time the product takes to install
2,c, spare part required
etc...
It's just a list of all of your attributes and what that attribute means. In the future, you will be adding a row to this table to open up a new attribute for each header.
Final table is a mapping table that actually holds the info. You will have your product id, the attribute id, and then the value. Normally called the detail table:
prd1, b, 5 mins
prd1, c, needs spare jack
prd2, d, 'misc text'
prd3, b, 15 mins
See how the data is stored as product key, value label, value? Any future product added can have any combination of any attributes stored in this table. Adding new attributes is adding a new line to the attribute table and then populating the details table as needed.
I believe there is a wiki for it too... http://en.wikipedia.org/wiki/Entity-attribute-value_model
After this, it's simply figuring out the best methodology to pivot out your data (I'd recommend Postgres as an opensource db option here)
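Here is a minimal sketch of the three-table layout, using Python's built-in sqlite3 module so it runs anywhere. The sample rows follow the example above. The match_scores helper is an assumption of mine about how the filter-and-score requirement from the question could be queried, not a definitive implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE header (header_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE attribute (
    attr_id INTEGER PRIMARY KEY,
    attribute_name TEXT NOT NULL UNIQUE,
    notes TEXT
);
CREATE TABLE detail (
    header_id INTEGER NOT NULL REFERENCES header(header_id),
    attr_id INTEGER NOT NULL REFERENCES attribute(attr_id),
    value TEXT NOT NULL,
    PRIMARY KEY (header_id, attr_id)
);
INSERT INTO header VALUES (1, 'prd1'), (2, 'prd2'), (3, 'prd3');
INSERT INTO attribute VALUES
    (1, 'b', 'the length of time the product takes to install'),
    (2, 'c', 'spare part required'),
    (3, 'd', NULL);
INSERT INTO detail VALUES
    (1, 1, '5 mins'), (1, 2, 'needs spare jack'),
    (2, 3, 'misc text'), (3, 1, '15 mins');
""")


def match_scores(conn, filters):
    """Return (product name, number of matched filters), best match first.

    filters maps an attribute name to the value it must have.
    """
    if not filters:
        return []
    clause = " OR ".join("(a.attribute_name = ? AND d.value = ?)" for _ in filters)
    params = [part for pair in filters.items() for part in pair]
    sql = (
        "SELECT h.name, COUNT(*) AS score "
        "FROM detail d "
        "JOIN attribute a ON a.attr_id = d.attr_id "
        "JOIN header h ON h.header_id = d.header_id "
        "WHERE " + clause + " "
        "GROUP BY h.header_id, h.name "
        "ORDER BY score DESC, h.name"
    )
    return conn.execute(sql, params).fetchall()
```

A product that matches every filter has a score equal to the number of filters, so "match all" and "match some, ranked by quality" can both be served by the same query.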

CoreData referencing

My application is CoreData based but there may be a common theory for all relational databases:
I have a Output-Input to-many relationship in my model. There are potentially an unlimited number of links under this relationship for each entity. What is the best way to identify a specific input or output?
The only way I have achieved this so far is to place an intermediate entity in the relationship that can hold an output and input name. Then an entity can cycle through its inputs/outputs to find the right relationship when required. Is there a better way?
Effectively I am trying to provide a generic entity that can have any number of relationships with other generic entity.
Apologies if my description isn't the clearest.
Edit in response to the answer below:
Firstly, thank you for your response. I certainly have a two-way to-many relationship in mind. But if a widget has 2 other widgets linked to its Inputs relationship, what is the best way of determining which input is supplying, say, 'Age' or 'Years Service' when both may have this property but I'm only interested in a specific value from each?
I'm as confused as Joshua - which tells me that it may be that you haven't got a clear picture of what you're trying to achieve or that it is somewhat complex (both?).
My best guess is that you have something like:
Entity Widget
Attributes:
identifier
Relationships
outputWidgets <<->> Widget
inputWidgets <<->> Widget
(where as per standard a ->> is a to-many relationship and <<->> is a to-many relationship with a to-many reverse relationship).
So each widget will be storing the set of widgets that it has as outputs and the set of widgets it has as inputs.
Thus a specific widget maintains a set of inputWidgets and outputWidgets. Each of these relationships is also reversed so you can - for any of the widgets in the input or output - see that your widget is in their list of inputs or outputs.
This is bloody ugly though.
I think your question is how to achieve the above while labelling a relationship. You mention you want to have a string identifier (unique?) for each relationship.
You could do this via:
Where you create a new widgetNamedRelationship for each double sided relationship. Note that I'm assuming that every relationship is double sided.
Then for each widget you have a set of named inputs and named outputs. This also allows for widgets to be attached to themselves but only if there are separate input and output busses.
So then for your example "age" in your implementation class for Widget instance called aWidget you'd have something like:
NSPredicate *agePredicate = [NSPredicate predicateWithFormat:@"name='age'"];
NSSet *ageInputs = [aWidget.inputs filteredSetUsingPredicate:agePredicate];
Have I understood the question?
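Since the idea is not specific to Core Data, the named intermediate entity can also be sketched in a language-neutral way. The Python below is only an illustration of the structure: the class names, fields, and helper functions are assumptions, not Core Data API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Widget:
    identifier: str
    inputs: List["NamedLink"] = field(default_factory=list)
    outputs: List["NamedLink"] = field(default_factory=list)


@dataclass(eq=False)
class NamedLink:
    """Intermediate entity that labels one connection between two widgets."""
    name: str
    source: Widget
    target: Widget


def connect(source, target, name):
    """Create one named link and register it on both ends."""
    link = NamedLink(name, source, target)
    source.outputs.append(link)
    target.inputs.append(link)
    return link


def inputs_named(widget, name):
    """Return the widgets feeding the given widget through links with this name."""
    return [link.source for link in widget.inputs if link.name == name]


hr = Widget("hr")
payroll = Widget("payroll")
form = Widget("form")
connect(hr, form, "age")
connect(payroll, form, "years service")
```

Looking up "age" on the form widget then plays the same role as the predicate filter over aWidget.inputs.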
There really is no better way if you want to be able to take full advantage of the conveniences of fast and efficient in-store querying. It's unclear what you're asking in your additional comments, which I suppose is why you haven't gotten any answers yet.
Keep in mind Core Data supports many-to-many relationships without a "join table."
If Widget has many Inputs or Outputs (which I suspect could be the same entity), then a many-to-many, two-way relationship (a relationship with an inverse, in Core Data parlance) between Widget and Input is all you need. Then all you need to do is see if your Input instance is in the Widget instance's -inputs or if a Widget instance is in the Input instance's -widgets.
Is that what you were looking for? If not, please try to clarify your question (by editing it, not by appending comments :-)).