I would like to implement something along the lines of multi table inheritance for my rails application. I am familiar with how STI works and was wondering if the implementation would be similar.
My situation is as follows (names of tables have been changed):
I have a table Employee, and Employee has many janitors and programmers. Janitors and Programmers have many different types of work utensils, so a custodial Table would fit the janitor and Tech table would fit the programmer. Well the jobs could be endless and the attributes for the jobs (janitors, programmers etc) are different so they must be separate tables. I want to consolidate a table called Jobs which belongs under Employee. This table Jobs will have a job_type (here it can be either janitor or programmer) and a utensil_type (custodial, tech). How can I properly implement what this scenario is trying to achieve?
I know how important the type is for STI so I want to know how I can implement this MTI for my rails problem?
Maybe the ActiveRecord::ActsAs gem will fit your needs: https://github.com/hzamani/active_record-acts_as
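For what it's worth, here is a rough sketch of the table layout that multi table inheritance usually boils down to, independent of any gem. It uses Python's built-in sqlite3 only so it can be run as-is; the column names (custodial_tool, tech_stack) and the load_job helper are made up for illustration, and the utensil tables are left out. The job_type column plays the same role as the type column in STI.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE employees (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- Shared columns live in the parent table.
CREATE TABLE jobs (
    id          INTEGER PRIMARY KEY,
    employee_id INTEGER NOT NULL REFERENCES employees(id),
    job_type    TEXT NOT NULL CHECK (job_type IN ('janitor', 'programmer'))
);
-- One child table per job type, sharing the parent's primary key.
CREATE TABLE janitors (
    job_id         INTEGER PRIMARY KEY REFERENCES jobs(id),
    custodial_tool TEXT NOT NULL      -- hypothetical janitor-only attribute
);
CREATE TABLE programmers (
    job_id     INTEGER PRIMARY KEY REFERENCES jobs(id),
    tech_stack TEXT NOT NULL          -- hypothetical programmer-only attribute
);
""")

conn.execute("INSERT INTO employees (id, name) VALUES (1, 'Alex')")
conn.execute("INSERT INTO jobs (id, employee_id, job_type) VALUES (10, 1, 'janitor')")
conn.execute("INSERT INTO janitors (job_id, custodial_tool) VALUES (10, 'mop')")

# Maps the discriminator value to the child table that holds the extra columns.
CHILD_TABLES = {"janitor": "janitors", "programmer": "programmers"}

def load_job(job_id):
    """Return the job type plus the type-specific row for one job."""
    (job_type,) = conn.execute(
        "SELECT job_type FROM jobs WHERE id = ?", (job_id,)).fetchone()
    child = CHILD_TABLES[job_type]
    detail = conn.execute(
        f"SELECT * FROM {child} WHERE job_id = ?", (job_id,)).fetchone()
    return job_type, detail
```

In Rails terms, each child table would get its own model that delegates the shared attributes to the parent record, which is roughly the kind of wiring a gem like the one above is meant to save you from writing by hand.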
I'm a junior developer working on a new Rails App and need some assistance in deciding which design pattern is best to use to model different types of orders. I was planning on utilizing the Single Table Inheritance pattern for these problems, but have heard several developers say I should stay away from this pattern. Any ideas are appreciated!
Orders need different types. Each type shares some database columns, however will also need some columns which are not shared by the other types (as we further develop the business and begin to handle new types of orders). I want to avoid duplicating columns in different database tables, and I also want to avoid having one table for everything as each type will not need everything from a single table.
Orders can be created either by customers or by retailers on behalf of one of their customers. Another type of Order is the WelcomeKit, which we send to customers directly before they send in their order.
All Orders need the following:
customer_id
fullfilled - boolean
fullfilled_date
outgoing_shipment_id
total
sub_total
discount_total
All RepairOrders also need the following:
arrived (boolean)
arrival_date
incoming_shipment_id
repairer_id (the store doing the repair)
All RetailOrders need everything from Orders and RepairOrders and also:
retailer_id (the store creating the order)
retailer_order_id (the id of the order from the retailer)
All WelcomeKits only need information from the Order table.
Is Single Table Inheritance the best way to handle this? Are there any other patterns that are better suited for long term maintainability? Any specific help on both database and model design would be awesome! Thanks.
I highly recommend you to read the book by Sandi Metz, Practical Object-Oriented Design in Ruby, as it touches upon this very subject in great detail. Since I don't have it with me right now, I will copy a short paragraph written by Ben Johnson that kind of summarizes the overall strategy:
Always prefer composition to inheritance (classes) unless you can
defend inheritance. Also, consider if you have one basic thing with
subtypes or something that is made up of parts.
If X is-a Y → Inheritance (class)
If X has-a Y → Composition
If X behaves-like-a Y → Duck type (module)
Good design naturally progresses towards small, independent objects that rely on abstractions.
I think that in your particular situation inheritance may be a good fit.
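If you do go with inheritance backed by a single table, the database side might look something like the sketch below. It is only an illustration: it uses Python's sqlite3 so it runs anywhere, keeps just a handful of the columns from the question, and the CHECK constraints and the insert_order helper are my own additions rather than anything Rails generates for you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    id                INTEGER PRIMARY KEY,
    type              TEXT NOT NULL
                      CHECK (type IN ('WelcomeKit', 'RepairOrder', 'RetailOrder')),
    -- shared by every order type
    customer_id       INTEGER NOT NULL,
    fullfilled        INTEGER NOT NULL DEFAULT 0,
    total             NUMERIC NOT NULL DEFAULT 0,
    -- RepairOrder and RetailOrder only
    arrived           INTEGER,
    repairer_id       INTEGER,
    -- RetailOrder only
    retailer_id       INTEGER,
    retailer_order_id TEXT,
    -- keep subtype columns empty on rows that should not use them
    CHECK (type = 'RetailOrder' OR retailer_id IS NULL),
    CHECK (type <> 'WelcomeKit' OR repairer_id IS NULL)
);
""")

def insert_order(order_type, customer_id, **extra):
    """Insert one order; extra keyword arguments are subtype columns."""
    cols = ["type", "customer_id", *extra]
    marks = ", ".join("?" for _ in cols)
    conn.execute(
        f"INSERT INTO orders ({', '.join(cols)}) VALUES ({marks})",
        (order_type, customer_id, *extra.values()))

insert_order("WelcomeKit", 1)
insert_order("RetailOrder", 1, repairer_id=7, retailer_id=3,
             retailer_order_id="R-100")

# A WelcomeKit must not carry retailer data, so this insert is refused.
try:
    insert_order("WelcomeKit", 2, retailer_id=3)
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The trade-off is visible in the table itself: every new order type adds nullable columns that most rows never use, which is the usual argument people make against STI once the subtypes start to diverge.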
I have seen an article in Dzone regarding Post and Post Details (two different entities) and the relations between them. There the post and its details are in different tables. But as I see it, Post Detail is an embeddable part because it cannot be used without the "parent" Post. So what is the logic to separate it in another table?
Could you give me a clearer explanation of when to use which one?
Embeddable classes represent the state of their parent classes. So to take your example, a StackOverflow POST has an ID which is invariant and used in an unbreakable URL for sharing e.g. http://stackoverflow.com/q/44017535/146325. There are a series of other attributes (state, votes, etc) which are scalar properties. When the post gets edited we have various versions of the text (which are kept and visible to people with sufficient rep). Those are your POST DETAILS.
"what is the logic to separate it in another table?"
Because keeping different things in separate tables is what relational databases do. The standard way of representing this data model is a parent table POST and child table POST_DETAIL with a defined relationship enforced through a foreign key.
Embeddable is a concept from object-oriented programming. Oracle does support object-relational constructs in the database. So it would be possible to define a POST_DETAIL Type and create a POST Table which has a column declared as a nested table of that Type. However, that would be a bad design for two reasons:
The SQL for working with nested tables is clunky. For instance, to get the POST and the latest version of its text would require unnesting the collection of details every time we need to display it. Computationally not much different from joining to a child table and filtering on latest version flag, but harder to optimise.
Children can have children themselves. In the case of Posts, Tags are details because they can vary due to editing. But if you embed TAG in POST_DETAIL embedded in POST how easy would it be to find all the Posts with an [oracle] tag?
This is the difference between Object-Oriented design and relational design.
OO is strongly hierarchical: everything belongs to something and the way to get the detail is through the parent. This approach works well when dealing with single instances of things, and so is appropriate for UI design.
Relational prioritises commonality: everything of the same type is grouped together with links to other things. This approach is suited for dealing with sets of things, and so is appropriate for data management tasks (do you want to find all the employees who work in BERLIN or whose job is ENGINEER or who are managed by ELLIOTT?)
"give me a more clear explanation when to use which one"
Always store the data relationally in separate tables. Build APIs using OO patterns when it makes sense to do so.
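To make the parent/child layout concrete, here is a small sketch of POST, POST_DETAIL and a tag table, with the two queries discussed above. It runs on SQLite through Python's sqlite3 purely for convenience (not Oracle), and the post_detail_tag table, the column names and both helper functions are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the foreign keys in SQLite
conn.executescript("""
CREATE TABLE post (
    post_id INTEGER PRIMARY KEY,
    votes   INTEGER NOT NULL DEFAULT 0
);
-- One row per edited version of a post.
CREATE TABLE post_detail (
    post_id INTEGER NOT NULL REFERENCES post(post_id),
    version INTEGER NOT NULL,
    body    TEXT NOT NULL,
    PRIMARY KEY (post_id, version)
);
-- Children of the child: tags can differ between versions.
CREATE TABLE post_detail_tag (
    post_id INTEGER NOT NULL,
    version INTEGER NOT NULL,
    tag     TEXT NOT NULL,
    PRIMARY KEY (post_id, version, tag),
    FOREIGN KEY (post_id, version) REFERENCES post_detail(post_id, version)
);
INSERT INTO post (post_id) VALUES (1), (2);
INSERT INTO post_detail VALUES
    (1, 1, 'first draft'), (1, 2, 'edited text'), (2, 1, 'other post');
INSERT INTO post_detail_tag VALUES
    (1, 1, 'sql'), (1, 2, 'oracle'), (2, 1, 'java');
""")

def latest_body(post_id):
    """Text of the most recent version of a post."""
    row = conn.execute(
        "SELECT body FROM post_detail WHERE post_id = ? "
        "ORDER BY version DESC LIMIT 1", (post_id,)).fetchone()
    return row[0]

def posts_tagged(tag):
    """Ids of posts that carry the tag in any version, without touching post."""
    rows = conn.execute(
        "SELECT DISTINCT post_id FROM post_detail_tag WHERE tag = ? "
        "ORDER BY post_id", (tag,)).fetchall()
    return [post_id for (post_id,) in rows]
```

The tag lookup is a one-table query here; with TAG embedded in POST_DETAIL embedded in POST, the same question would mean unnesting two levels of collections first.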
I'm new to database design and have some uncertainties about how best to model this particular case. I'd appreciate any suggestions for this fairly simple scenario.
When a production task begins, two people are involved at all times. One is in charge of the production, and a second is tasked with quality assurance. For any task in the database, it must be possible to identify these two people. They'll both exist in a Person table and have IDs, so I just want the best way to relate them to the production task. The following rules exist:
Either person may be swapped out for a different person at any time.
Each task always involves both people (Neither of these are null).
There are never any other people involved in the task that we want to record.
Each person may be involved in multiple tasks, or none at all.
If we had a whole host of relationships between the task and the people, I'd create some sort of convoluted relationship structure describing their relationship (As producer, quality assurance person, overseer, etc.), but here I feel as though it's sensible to just stick the IDs of the two people in the Task table, in separate columns for Production Person and Quality Assurance Person. Is this bad for some reason that I can't see?
What has really prompted my question is that I'm trying to design exactly that in DBDesigner 4, which I'm new to, and it just doesn't like it. When I try to set up a second non-identifying relationship between Task and Person, it won't give me a second field. It also won't seem to let me rename the fields in Task that refer to the persons, so it'd be impossible to differentiate between the two anyway. Since no-one else seems to share this problem, I've begun to wonder whether it's a good idea at all. Is it standard to introduce additional tables as soon as there are two or more links between two entities? What would that look like if I wanted to enforce the above rules? I can't see how I'd ensure that an n:m table always has entries for both people working on the task.
If you are confident your requirements will stay this rigid forever, then just create two NOT NULL FKs:
This declaratively enforces that exactly two people are associated to the task at all times, which would not be readily achievable with just the junction table (as you already noted).
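A minimal sketch of the two-foreign-key layout, run against SQLite through Python's sqlite3 so it can be tried directly. The table and column names (person, task, production_person_id, qa_person_id) and the is_rejected helper are placeholders of my own, not anything prescribed by DBDesigner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
-- Two separate, mandatory references to the same table, told apart by name.
CREATE TABLE task (
    task_id              INTEGER PRIMARY KEY,
    production_person_id INTEGER NOT NULL REFERENCES person(person_id),
    qa_person_id         INTEGER NOT NULL REFERENCES person(person_id)
);
INSERT INTO person VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
INSERT INTO task VALUES (100, 1, 2);
""")

# Swapping a person out is a plain UPDATE of one column.
conn.execute("UPDATE task SET qa_person_id = 3 WHERE task_id = 100")

def is_rejected(sql):
    """True if the database refuses the statement on constraint grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# A task without a QA person violates NOT NULL.
missing_qa = is_rejected("INSERT INTO task VALUES (101, 1, NULL)")
# A task pointing at a non-existent person violates the foreign key.
unknown_person = is_rejected("INSERT INTO task VALUES (102, 1, 99)")
```

Both bad inserts fail inside the database itself, with no trigger or application code involved, which is the "declarative" part.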
OTOH, if you anticipate your requirements might change at some point in the future, then the added flexibility of junction table might be more important than the completely declarative enforcement of your business rules.
I'm not familiar with DBDesigner, and therefore with your particular problem, but in ER modeling in general, multiple relationships with the same entity are distinguished by their "rolenames" which determine the names of migrated attributes (see the section on "Rolenames" in the chapter 3 of the ERwin Methods Guide). Try locating something along those lines in the UI of your tool.
If you only want to know the current state, and not who held the role previously, @Branko Dimitrijevic's solution will work.
But if the statement 'Either person may be swapped out for a different person at any time' implies you need to know who previously held that role consider a 3 table design
Task; TaskID, <other details>
Assignee; TaskID, PeopleID, role, start_date, end_date
People; PeopleID, <other details>
Then, in the Assignee table, you need constraints to ensure that for each TaskID and role combination the dates are reasonable (e.g. they don't overlap or leave gaps) and that only one person holds each role for each task at a time. Managing this would probably require code, either in triggers or in the application.
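As a partial illustration, the "only one active person per role" rule can be pushed into the database with a partial unique index, while the no-gaps rule is handled by the code that performs the swap. This is a sketch on SQLite via Python's sqlite3; the swap function, the index name and the convention that a NULL end_date means "current" are assumptions, and the Task and People tables are omitted to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assignee (
    task_id    INTEGER NOT NULL,
    people_id  INTEGER NOT NULL,
    role       TEXT NOT NULL CHECK (role IN ('production', 'qa')),
    start_date TEXT NOT NULL,   -- ISO dates, so text comparison sorts correctly
    end_date   TEXT,            -- NULL means "currently holds the role"
    CHECK (end_date IS NULL OR end_date >= start_date)
);
-- At most one open assignment per task and role.
CREATE UNIQUE INDEX one_active_per_role
    ON assignee (task_id, role) WHERE end_date IS NULL;
""")

def swap(task_id, role, new_person, day):
    """Close the current assignment (if any) and open a new one on the same day."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "UPDATE assignee SET end_date = ? "
            "WHERE task_id = ? AND role = ? AND end_date IS NULL",
            (day, task_id, role))
        conn.execute(
            "INSERT INTO assignee (task_id, people_id, role, start_date) "
            "VALUES (?, ?, ?, ?)",
            (task_id, new_person, role, day))

def current_holder(task_id, role):
    """people_id of whoever holds the role right now, or None."""
    row = conn.execute(
        "SELECT people_id FROM assignee "
        "WHERE task_id = ? AND role = ? AND end_date IS NULL",
        (task_id, role)).fetchone()
    return row[0] if row else None

swap(1, "qa", 2, "2024-01-01")
swap(1, "qa", 3, "2024-02-01")
```

Because the old row is closed on the same day the new one starts, the history has no gap as long as every change goes through swap; nothing in the schema stops someone from bypassing it, which is why triggers or application code are still needed for the full set of rules.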
So for my software engineering course, as a part of the larger project, we need to implement a database using HSQLDB. Unfortunately, I haven't taken database design yet, and 3 out of 5 people in our group have dropped the course, leaving this part for me to do.
As of now, I've come up with this ER Diagram for our project:
What we have is a list of courses, and each course contains many modules. Every account can be registered in any course, giving them access to each module of the course, which is graded, and then the mark is stored on their account.
I think the diagram I've come up with represents this fairly well; however, I just started learning about this today, so I'm still a bit shaky, so to say.
Is there anything that jumps out as wrong about this, or parts that could be improved?
P.s - I just noticed in the module table, it contains grade, which should actually be in module_grade.
The course_grade table is absolutely useless in your model. You should store the grade information inside course_grade and module_grade instead of the module directly. Think of module as master data (so something you want to use for all students) which means that you should not store student specific information inside it.
I would also add timestamps to your model at least inside the tables that have the grade information so that you can at least check when the student got the information. If you also have the information available who gave the grade you should probably store that as well.
If you are using SQL to access your model, think about renaming the foreign key columns in course_grade and module_grade so that each column name is unique across the schema. This makes queries much more readable imo. For example, in course_grade you could rename course_id to cg_course_id.
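Putting those suggestions together, a module_grade table might look roughly like this. The sketch runs on SQLite through Python's sqlite3 rather than HSQLDB, so the column types would need adjusting; the mg_ prefix, graded_at, graded_by and the course_average helper are illustrative names, and computing a course grade as the average of module grades is purely an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- module is master data: nothing student-specific in here.
CREATE TABLE module  (module_id  INTEGER PRIMARY KEY,
                      course_id  INTEGER NOT NULL REFERENCES course(course_id),
                      title      TEXT NOT NULL);
CREATE TABLE account (account_id INTEGER PRIMARY KEY, username TEXT NOT NULL);
-- One row per student and module, with when and by whom it was graded.
CREATE TABLE module_grade (
    mg_account_id INTEGER NOT NULL REFERENCES account(account_id),
    mg_module_id  INTEGER NOT NULL REFERENCES module(module_id),
    grade         INTEGER NOT NULL,
    graded_at     TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    graded_by     INTEGER REFERENCES account(account_id),
    PRIMARY KEY (mg_account_id, mg_module_id)
);
INSERT INTO course  VALUES (1, 'Databases');
INSERT INTO module  VALUES (10, 1, 'ER modelling'), (11, 1, 'Normalisation');
INSERT INTO account VALUES (500, 'student'), (501, 'teacher');
INSERT INTO module_grade (mg_account_id, mg_module_id, grade, graded_by)
VALUES (500, 10, 80, 501), (500, 11, 90, 501);
""")

def course_average(account_id, course_id):
    """Average of a student's module grades within one course."""
    row = conn.execute(
        "SELECT AVG(grade) FROM module_grade "
        "JOIN module ON module_id = mg_module_id "
        "WHERE mg_account_id = ? AND course_id = ?",
        (account_id, course_id)).fetchone()
    return row[0]
```

Thanks to the prefixed column names, the join condition needs no table aliases, which is the readability gain mentioned above.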
I read up on database structuring and normalization and decided to remodel the database behind my learning thingie to reduce redundancy.
I have different types of entries that can be learned. Gap texts/cloze tests (one text, many gaps) and simple known-unknown (one question, one answer) types.
Now I'm in a bit of a pickle:
gaps need exactly the same columns in the user table as question-answer types
but they need fewer columns than question-answer types (all that info is in the clozetests table)
I'm wishing for a "magic" foreign key that can point both to the gap and the terms table. Of course their ids would overlap though. I don't like having both a term_id and gap_id in the user_terms, that seems inelegant (but is the most elegant I can come up with after googling for a while, not knowing what name this pickle goes by).
I don't want a user_gaps analogue to user_terms, because then I'd be in the same pickle when it comes to the table user_terms_answers.
I put up this cardboard cutout collage of my schema. I didn't remove the stuff that isn't relevant for this question, but I can do that if anyone's confusion can be remedied like that. I think it looks super tidy already. Tidier than my mental concept of this at least.
Did I say any help would be greatly appreciated? Answerers might find themselves adulated for their wisdom.
Background story if you care, it's not really relevant to the question.
Before remodeling I had them all in one table (because I added the gap texts in a hurry), so that the gap texts were "normal" items without answers, while the gaps were items without questions. The application linked them together.
Edit
I added an answer after SO coughed up some helpful posts. I'm not yet 100% satisfied. I try to write views for common queries to this set up now and again I feel like I'll have to pull application logic for something that is database turf.
As mentioned in the comment, it is hard to answer without knowing the whole story. So, here is a story and a model to match. See if you can adapt this to your example.
School of (foreign) languages offers exams for several levels of language proficiency. The school maintains many pre-made tests for each level of each language (LangLevelTestNo).
Each test contains several (many) questions. Each question can be simple or of the cloze-text type. Correct answers are stored for each simple question. Correct terms are stored for each gap of each cloze-text question.
A student can take an exam for a language level and is presented with one of the pre-made tests. For each student exam, an exam form is maintained which stores the student's answers for each question of the exam. Like a question, an answer may be of a simple or of a cloze-text type.
After I edited my question, Stack Overflow started suggesting the right related questions to me.
I knew this was a common problem, but I really couldn't find it, just couldn't come up with the right search terms, I guess.
The following threads address similar problems and I'll try to apply that logic to my own design. They all propose adding a higher-level description for (in my case terms and gaps) like items. That makes sense and reflects the logic behind my application.
Relation Database Design
Foreign Key on multiple columns in one of several tables
Foreign Key refering to primary key across multiple tables
And this good person illustrates how to retrieve the data once it's broken up across tables. He also clued me in to the keyword class table inheritance, so now I know what to google.
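For anyone landing here with the same pickle, this is roughly what class table inheritance could look like for terms and gaps. It is only a sketch on SQLite via Python's sqlite3, not my actual schema: the items supertype table, the rename of user_terms to user_items, the column names and the answers_for helper are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Supertype: one row per learnable thing, whatever its kind.
CREATE TABLE items (
    item_id   INTEGER PRIMARY KEY,
    item_type TEXT NOT NULL CHECK (item_type IN ('term', 'gap'))
);
CREATE TABLE terms (
    item_id  INTEGER PRIMARY KEY REFERENCES items(item_id),
    question TEXT NOT NULL,
    answer   TEXT NOT NULL
);
CREATE TABLE clozetests (
    clozetest_id INTEGER PRIMARY KEY,
    body         TEXT NOT NULL
);
CREATE TABLE gaps (
    item_id      INTEGER PRIMARY KEY REFERENCES items(item_id),
    clozetest_id INTEGER NOT NULL REFERENCES clozetests(clozetest_id),
    answer       TEXT NOT NULL
);
-- A single foreign key now covers both kinds of item, and ids cannot overlap.
CREATE TABLE user_items (
    user_id INTEGER NOT NULL,
    item_id INTEGER NOT NULL REFERENCES items(item_id),
    known   INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (user_id, item_id)
);
INSERT INTO items VALUES (1, 'term'), (2, 'gap');
INSERT INTO terms VALUES (1, 'Hund', 'dog');
INSERT INTO clozetests VALUES (7, 'Der ___ bellt.');
INSERT INTO gaps VALUES (2, 7, 'Hund');
INSERT INTO user_items (user_id, item_id) VALUES (42, 1), (42, 2);
""")

def answers_for(user_id):
    """Resolve each of a user's items to its answer, regardless of subtype."""
    return conn.execute(
        "SELECT i.item_id, i.item_type, COALESCE(t.answer, g.answer) "
        "FROM user_items u JOIN items i ON i.item_id = u.item_id "
        "LEFT JOIN terms t ON t.item_id = i.item_id "
        "LEFT JOIN gaps g ON g.item_id = i.item_id "
        "WHERE u.user_id = ? ORDER BY i.item_id", (user_id,)).fetchall()
```

A table like user_terms_answers would reference items in the same way, so the "magic" foreign key turns into an ordinary one.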
I'll post back with my edited schema once I've applied this. It does seem more elegant like this.
Edited schema