This question already has answers here:
Semantic relationship in UML
(3 answers)
Closed 2 months ago.
This post was edited and submitted for review 2 months ago and failed to reopen the post:
Original close reason(s) were not resolved
According to the UML specification and other valid resources, association is both a semantic relationship and a structural relationship.
There are questions:
What is the semantic relationship?
What is structural relationship?
Edit:
This question was closed due to a misunderstanding, but I did not receive a good answer to my question.
My question is about the structural and semantic relationship, not the meaning of these two words.
You could simply look into a dictionary. (Here it's Oxford)
Semantic
relating to meaning in language or logic.
Structure
the arrangement of and relations between the parts or elements of
something complex: flint is extremely hard, like diamond, which has a
similar structure.
a building or other object constructed from
several parts. [...]
Structure is what you can see. E.g. you can see lots of bricks glued together. That's the structure. The semantics will be "house" (if it's that) or could be "art" as well as "ruin". So the semantics is what you interpret a structure to be.
Relationships are no different. You can see the connection between two classes. But what it means will be defined by semantics. You will need some context to determine it. Once you have that (requirements, explanatory SDs etc.) you're done.
Related
I have seen an article in Dzone regarding Post and Post Details (two different entities) and the relations between them. There the post and its details are in different tables. But as I see it, Post Detail is an embeddable part because it cannot be used without the "parent" Post. So what is the logic to separate it in another table?
Please give me a more clear explanation when to use which one?
Embeddable classes represent the state of their parent classes. So to take your example, a StackOverflow POST has an ID which is invariant and used in an unbreakable URL for sharing e.g. http://stackoverflow.com/q/44017535/146325. There are a series of other attributes (state, votes, etc) which are scalar properties. When the post gets edited we have various versions of the text (which are kept and visible to people with sufficient rep). Those are your POST DETAILS.
"what is the logic to separate it in another table?"
Because keeping different things in separate tables is what relational databases do. The standard way of representing this data model is a parent table POST and child table POST_DETAIL with a defined relationship enforced through a foreign key.
Embeddable is a concept from object-oriented programming. Oracle does support object-relational constructs in the database. So it would be possible to define a POST_DETAIL Type and create a POST Table which has a column declared as a nested table of that Type. However, that would be a bad design for two reasons:
The SQL for working with nested tables is clunky. For instance, to get the POST and the latest version of its text would require unnesting the collection of details every time we need to display it. Computationally not much different from joining to a child table and filtering on latest version flag, but harder to optimise.
Children can have children themselves. In the case of Posts, Tags are details because they can vary due to editing. But if you embed TAG in POST_DETAIL embedded in POST how easy would it be to find all the Posts with an [oracle] tag?
This is the difference between Object-Oriented design and relational design.
OO is strongly hierarchical: everything is belongs to something and the way to get the detail is through the parent. This approach works well when dealing with single instances of things, and so is appropriate for UI design.
Relational prioritises commonality: everything of the same type is grouped together with links to other things. This approach is suited for dealing with sets of things, and so is appropriate for data management tasks (do you want to find all the employees who work in BERLIN or whose job is ENGINEER or who are managed by ELLIOTT?)
"give me a more clear explanation when to use which one"
Always store the data relationally in separate tables. Build APIs using OO patterns when it makes sense to do so.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
This is somewhat of a meta question, but because it relates to a database design, I thought I should post it here.
I'm building a site that includes Q+A and was wondering how I should structure my SQL database, so naturally, I looked to the best of the best. However, the Stack Exchange database schema seems to defy what I've learned about creating maintainable/extensible table hierarchies.
As you can see, Stack Exchange stores all of its "Posts" in one table, except for comments, which has its own table. Post types include questions, answers, and various wiki things. This results in a lot of NULL columns in the table. For example, questions have titles, tags, and answerCounts, while answers don't, so all answer entries have NULL for all three of those columns. If more post types are added over time, this will progressively become less maintainable. And the fact that comments is the only type of post that has its own table just seems inconsistent.
What I've read states that it's generally preferred to use an object subclass hierarchy, in which there's a generic "Posts" table along with a bunch of tables for each type of post that all have one column that maps back to the corresponding entry in the "Posts" table. This keeps the number of null columns to a minimum and makes it more extensible, but slows down queries because they'll require more joins.
So why does Stack Exchange use this giant table method? Is it just the result of ages of modifications to an old database? More specifically, should I use this model for my own Q+A system or stick with an object subclass hierarchy (my Q+A/forum system will closely resemble SO's, with several types of posts including questions, answers, polls, reviews, etc.)?
This is a classic case of so-called "Object-relational impedance mismatch". Specifically, you are taking about mapping OO's inheritance into a relational database structure. There are several common ways of doing that -
A table per subclass,
A table per leaf subclass, and
A table per class hierarchy (with a discriminator)
Each of these strategies is perfectly valid. Moreover, the structures could be mixed as needed.
It looks like Stack Exchange used a table per class hierarchy approach, with PostTypeId serving as a discriminator. This approach is as valid as any other approach that they could have taken. It is also one of the simplest ones to take from the maintenance standpoint, because it lets you construct manual queries with less work.
There is another thing in the structure of the table that you did not mention: it is not normalized. Specifically, there are AnswerCount and CommentCount fields that store information that could be obtained by aggregating the table (i.e. running a SELECT COUNT(*) FROM ... WHERE ... AND other.ParentId = p.Id ...) This is a common tradeoff between normalization and speed of execution: most likely, the profiling has indicated that the aggregation takes significant amount of time, so the counts have been moved into the "parent" record.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
I need to store message data in SQL. Cannot decide which way to go here.
There is a main class Message, say (simplified):
class Message(models.Model):
user_id = models.ForeignKey(User)
text = models.TextField()
Plus, there are other Message classes that inherit this one.
class MmsMessage(Message):
imagedata = models.ForeignKey(ImageData)
And so on. These other message classes of course can have more than 1 additional field.
Now, I am evaluating the best (fastest) design pattern to make this work.
In around 25% of cases I will not be needing additional fields, simply raw Message objects (Message.objects.all). In other cases, I need all data. Additional fields may not necessarily be searchable. Nonetheless, it would be nice thing to have.
I was thinking about:
A: Inheritance (concrete, abstract)
Abstract inheritance is out. I loose the ability to do: Message.objects.all() which is unacceptable.
Concrete inheritance seems to me like a way to go. Tried two approaches. django-model-utils one (select_subclasses) which doesn't need additional queries, but due to lots of inner joins and redundant data in results it is very slow compared to other solutions.
django_polymorphic (still concrete inheritance) approach (using contenttypes to know what we are dealing with and then select related fields) is ~4 times faster than select_subclasses (at least on postgresql) - which was a small surprise for me (it requires +n queries where n is a number of child types but still faster due to simpler joins and no unnecessary data results). Tested on 10 000 objects across 20 different Message child types.
B: EAV model (many to many for additional attributes)
Haven't tested EAV model but I doubt it will be faster than inheritance solution. When I know what column names and types I want, it seems that EAV model loses all its charm.
[UPDATED - horse_with_no_name] B1: hstore - similiar to EAV with many to many but possibly much faster (no joins, backend support)
Great to add dictionary like custom fields.
Downsides: I lose compatibility with other django database backends (I would prefer not to), also it is type-agnostic, key and value is TEXT. I am also worried about making Message table raw queries slower in general due to many TEXT fields in hstore dict.
C: XML field in Message table for additional data
XML field in Message table is something that feels a little fishy to me. What if I dont need these additional fields (from message child types) to be searchable or indexable - is XML field a good solution?
What is the best option in your opinion?
The simple answer here is to just use a single table. However the real question is why you're putting stuff in a database in the first place. If your intent is to scale up to large sizes, then you probably want to look at a hybrid storage model (indexing messages in the DB and storing the raw message in something like hbase).
I read up on database structuring and normalization and decided to remodel the database behind my learning thingie to reduce redundancy.
I have different types of entries that can be learned. Gap texts/cloze tests (one text, many gaps) and simple known-unknown (one question, one answer) types.
Now I'm in a bit of a pickle:
gaps need exactly the same columns in the user table as question-answer types
but they need less columns than question-answer types (all that info is in the clozetests table)
I'm wishing for a "magic" foreign key that can point both to the gap and the terms table. Of course their ids would overlap though. I don't like having both a term_id and gap_id in the user_terms, that seems unelegant (but is the most elegant I can come up with after googling for a while, not knowing what name this pickle goes by).
I don't want a user_gaps analogue to user_terms, because then I'd be in the same pickle when it comes to the table user_terms_answers.
I put up this cardboard cutout collage of my schema. I didn't remove the stuff that isn't relevant for this question, but I can do that if anyone's confusion can be remedied like that. I think it looks super tidy already. Tidier than my mental concept of this at least.
Did I say any help would be greatly appreciated? Answerers might find themselves adulated for their wisdom.
Background story if you care, it's not really relevant to the question.
Before remodeling I had them all in one table (because I added the gap texts in a hurry), so that the gap texts were "normal" items without answers, while the gaps where items without questions. The application linked them together.
Edit
I added an answer after SO coughed up some helpful posts. I'm not yet 100% satisfied. I try to write views for common queries to this set up now and again I feel like I'll have to pull application logic for something that is database turf.
As mentioned in the comment, it is hard to answer without knowing the whole story. So, here is a story and a model to match. See if you can adapt this to you example.
School of (foreign) languages offers exams for several levels of language proficiency. The school maintains many pre-made tests for each level of each language (LangLevelTestNo).
Each test contains several (many) questions. Each question can be simple or of the close-text-type. Correct answers are stored for each simple question. Correct terms are stored for each gap of each close-text question.
Student can take an exam for a language level and is presented with one of the pre-made tests. For each student exam, the exam form is maintained which stores students answers for each question of the exam. Like a question, an answer may be of a simple of of a close-text-type.
After editing my question some Stackoverflow started relating the right questions to me.
I knew this was a common problem, but I really couldn't find it, just couldn't come up with the right search terms, I guess.
The following threads address similar problems and I'll try to apply that logic to my own design. They all propose adding a higher-level description for (in my case terms and gaps) like items. That makes sense and reflects the logic behind my application.
Relation Database Design
Foreign Key on multiple columns in one of several tables
Foreign Key refering to primary key across multiple tables
And this good person illustrates how to retrieve the data once it's broken up across tables. He also clues me to the keyword class table inheritance, so now I know what to google.
I'll post back with my edited schema once I've applied this. It does seem more elegant like this.
Edited schema
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
I'm working on a fairly simple survey system right now. The database schema is going to be simple: a Survey table, in a one-to-many relation with Question table, which is in a one-to-many relation with the Answer table and with the PossibleAnswers table.
Recently the customer realised she wants the ability to show certain questions only to people who gave one particular answer to some previous question (eg. Do you buy cigarettes? would be followed by What's your favourite cigarette brand?, there's no point of asking the second question to a non-smoker).
Now I started to wonder what would be the best way to implement this conditional questions in terms of my database schema? If question A has 2 possible answers: A and B, and question B should only appear to a user if the answer was A?
Edit: What I'm looking for is a way to store those information about requirements in a database. The handling of the data will be probably done on application side, as my SQL skills suck ;)
Survey Database Design
Last Update: 5/3/2015
Diagram and SQL files now available at https://github.com/durrantm/survey
If you use this (top) answer or any element, please add feedback on improvements !!!
This is a real classic, done by thousands. They always seems 'fairly simple' to start with but to be good it's actually pretty complex. To do this in Rails I would use the model shown in the attached diagram. I'm sure it seems way over complicated for some, but once you've built a few of these, over the years, you realize that most of the design decisions are very classic patterns, best addressed by a dynamic flexible data structure at the outset.
More details below:
Table details for key tables
answers
The answers table is critical as it captures the actual responses by users.
You'll notice that answers links to question_options, not questions. This is intentional.
input_types
input_types are the types of questions. Each question can only be of 1 type, e.g. all radio dials, all text field(s), etc. Use additional questions for when there are (say) 5 radio-dials and 1 check box for an "include?" option or some such combination. Label the two questions in the users view as one but internally have two questions, one for the radio-dials, one for the check box. The checkbox will have a group of 1 in this case.
option_groups
option_groups and option_choices let you build 'common' groups.
One example, in a real estate application there might be the question 'How old is the property?'.
The answers might be desired in the ranges:
1-5
6-10
10-25
25-100
100+
Then, for example, if there is a question about the adjoining property age, then the survey will want to 'reuse' the above ranges, so that same option_group and options get used.
units_of_measure
units_of_measure is as it sounds. Whether it's inches, cups, pixels, bricks or whatever, you can define it once here.
FYI: Although generic in nature, one can create an application on top of this, and this schema is well-suited to the Ruby On Rails framework with conventions such as "id" for the primary key for each table. Also the relationships are all simple one_to_many's with no many_to_many or has_many throughs needed. I would probably add has_many :throughs and/or :delegates though to get things like survey_name from an individual answer easily without.multiple.chaining.
You could also think about complex rules, and have a string based condition field in your Questions table, accepting/parsing any of these:
A(1)=3
( (A(1)=3) and (A(2)=4) )
A(3)>2
(A(3)=1) and (A(17)!=2) and C(1)
Where A(x)=y means "Answer of question x is y" and C(x) means the condition of question x (default is true)...
The questions have an order field, and you would go through them one-by one, skipping questions where the condition is FALSE.
This should allow surveys of any complexity you want, your GUI could automatically create these in "Simple mode" and allow for and "Advanced mode" where a user can enter the equations directly.
one way is to add a table 'question requirements' with fields:
question_id (link to the "which brand?" question)
required_question_id (link to the "do you smoke?" question)
required_answer_id (link to the "yes" answer)
In the application you check this table before you pose a certain question.
With a seperate table, it's easy adding required answers (adding another row for the "sometimes" answer etc...)
Personally, in this case, I would use the structure you described and use the database as a dumb storage mechanism. I'm fan of putting these complex and dependend constraints into the application layer.
I think the only way to enforce these constraints without building new tables for every question with foreign keys to others, is to use the T-SQL stuff or other vendor specific mechanisms to build database triggers to enforce these constraints.
At an application level you got so much more possibilities and it is easier to port, so I would prefer that option.
I hope this will help you in finding a strategy for your app.