I need some advice on modelling my data tables. I need to apply an inheritance hierarchy to my tables using SQL Server and Hibernate. Could anyone show me a basic example? A link to a tutorial would be fine too.
Cheers...
Set up the tables so that the derived table shares its primary key with the base table.
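A minimal sketch of the shared-primary-key layout, run against in-memory SQLite through Python's sqlite3 module as a stand-in for SQL Server. The Person and Employee tables and their columns are hypothetical, chosen only for illustration. As far as I know this is the layout that Hibernate's JOINED inheritance strategy maps to, but check that against your own mapping.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE Person (
    PersonID INTEGER PRIMARY KEY,
    Name     TEXT NOT NULL
);
-- The derived table's primary key is also a foreign key to the base table.
CREATE TABLE Employee (
    PersonID   INTEGER PRIMARY KEY REFERENCES Person(PersonID),
    Department TEXT NOT NULL
);
""")
conn.execute("INSERT INTO Person (PersonID, Name) VALUES (1, 'Ann')")
conn.execute("INSERT INTO Employee (PersonID, Department) VALUES (1, 'Sales')")

# A complete Employee is the base row joined to the derived row on the shared key.
employee = conn.execute("""
    SELECT p.PersonID, p.Name, e.Department
    FROM Person p
    JOIN Employee e ON e.PersonID = p.PersonID
""").fetchone()

# A derived row without a matching base row is rejected by the foreign key.
try:
    conn.execute("INSERT INTO Employee (PersonID, Department) VALUES (99, 'IT')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The common columns live only in the base table, so nothing is duplicated, at the cost of one join per level of the hierarchy when reading.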
MS SQL Server isn't an object-orientated database, it is a relational database. It sounds like you should be using views over your base tables rather than duplicating columns.
Duplicating columns is unnecessary; it would no doubt hurt performance, and maintenance would become a nightmare.
Maybe edit your question to include more details on what you are trying to achieve.
Another way is to duplicate the common attributes in the child tables and make the parent table a VIEW that selects those common attributes from all the children.
CREATE VIEW Parent
AS
SELECT ID, Name FROM Child1
UNION ALL
SELECT ID, Name FROM Child2 ...
The problem is that ID must be unique across all the child tables (using GUIDs is preferable).
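The view approach with GUID keys could look roughly like this. It is a sketch on in-memory SQLite rather than SQL Server; the Extra1 and Extra2 columns are hypothetical child-specific attributes, and the GUIDs are generated client-side with Python's uuid module (on SQL Server a uniqueidentifier column would be the natural counterpart).

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Each child repeats the common columns (ID, Name) and adds its own.
CREATE TABLE Child1 (ID TEXT PRIMARY KEY, Name TEXT NOT NULL, Extra1 TEXT);
CREATE TABLE Child2 (ID TEXT PRIMARY KEY, Name TEXT NOT NULL, Extra2 TEXT);

-- The "parent" is only a view over the common columns.
CREATE VIEW Parent AS
    SELECT ID, Name FROM Child1
    UNION ALL
    SELECT ID, Name FROM Child2;
""")

# GUIDs keep the IDs unique across both child tables without a shared counter.
id1, id2 = str(uuid.uuid4()), str(uuid.uuid4())
conn.execute("INSERT INTO Child1 (ID, Name, Extra1) VALUES (?, ?, ?)", (id1, "first", "a"))
conn.execute("INSERT INTO Child2 (ID, Name, Extra2) VALUES (?, ?, ?)", (id2, "second", "b"))

parent_rows = conn.execute("SELECT ID, Name FROM Parent ORDER BY Name").fetchall()
```

Note that nothing in the schema itself stops two children from reusing an ID; the uniqueness relies on how the keys are generated.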
I have lots of SQL tables. The tables are "dependent", i.e. foreign key constraints are defined between them.
I need to transfer the tables from SQL to CSV. What is the correct way to do that:
Define tables exactly as they are defined in sql? (What should I do with the foreign keys?)
Try to generate other tables by joining the existing ones based on foreign keys in order to hide the foreign keys dependencies?
Maybe there are other options? What are the pros and cons?
Thanks,
Note: This is needed for another application that runs some analytics on the data.
I would suggest creating a view in SQL that contains all the information from all the tables you need in your CSV later.
The view already implements the dependencies (the link between two rows from different tables) and links everything together in one table.
It would be much easier than your second proposal of creating a new table, because the view will do all the work for you.
I guess you will need your dependencies.
So you should not ignore them.
Here is a quick example of how they work:
Let's say you have two tables: the first is named persons and the second cars. The persons table has three columns: ID, Name, Age. The cars table has two: ID, Car. To see which person has which car, you check which ID from the first table has which value for Car in the second one.
If you link them together in a view, the result is one single table with the columns ID, Name, Age, Car.
The view performs that same lookup for you.
Later you can simply export the view to CSV.
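The persons/cars example could be sketched as follows, again using in-memory SQLite via Python rather than SQL Server. The view name person_cars and the sample rows are made up for illustration, and the CSV is written to a string buffer only to keep the sketch self-contained; a real export would write to a file.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE persons (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER);
-- cars.ID holds the owning person's ID, as in the description above.
CREATE TABLE cars (ID INTEGER REFERENCES persons(ID), Car TEXT);

INSERT INTO persons VALUES (1, 'Alice', 30), (2, 'Bob', 45);
INSERT INTO cars VALUES (1, 'Volvo'), (2, 'Fiat');

-- The view resolves the foreign key once, so the export needs no join logic.
CREATE VIEW person_cars AS
    SELECT p.ID, p.Name, p.Age, c.Car
    FROM persons p
    JOIN cars c ON c.ID = p.ID;
""")

cursor = conn.execute("SELECT ID, Name, Age, Car FROM person_cars ORDER BY ID")

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow([column[0] for column in cursor.description])  # header row
writer.writerows(cursor)                                       # data rows
csv_text = buffer.getvalue()
```

One flat file like this is convenient for analytics, but rows repeat the person columns once per car, so a person with several cars appears several times.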
Maybe I can help you better if you describe your needs in a bit more detail.
What kind of data is in your tables, and how are they linked (what are the primary and foreign keys)?
If a database has a parent table and two child tables, is it better to use joins to get the children or to add a flag to distinguish them?
For example, the parent table is Person[Person_Name, Person_ID]. The first child table is Employee[Person_ID, Employee_ID, Department] and the other child is Customer[Person_ID, Location, Rank].
So, is it a good idea to add a flag [isEmployee] or [isCustomer] to the parent table (Person) and save the effort of joining the tables on "Person_ID"?
Another case would be with one child, for example, the parent table would be Member[Member_Name, Member_ID] and a child table GoldenMember[Member_ID, Phone_Number, EMail].
Now in this case, if I want to show the info of a specific Member, I need to do a join between the tables to see whether it's a Golden Member or not. If the flag "isGolden" were in the Member table, would it save us a join?
So, which is better and why?
Thanks in advance :)
There is no "better" unless you provide criteria for measurement of "goodness".
SQL's support for entity subtyping is inadequate. You can hack your way around any of the shortcomings that there are, but each hack will do no more than introduce new problems of its own.
Additional "Type" columns on the top level introduce the problem of database updating becoming more complex. Defective update procedures will corrupt the database's integrity.
Leaving out the additional "Type" columns at the top level will make the problem of formulating read queries more complex (more joins, notably). Many people would add here "and degrade performance", but it's unlikely that you will suffer noticeably from this.
Choose which difficulty is the easiest to live with in your particular use case.
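For the "no flag" option, the membership test can be derived at read time, as in this sketch (in-memory SQLite via Python, with made-up sample rows; the table and column names follow the Member/GoldenMember example from the question). The isGolden column here is computed by the query, not stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Member (
    Member_ID   INTEGER PRIMARY KEY,
    Member_Name TEXT NOT NULL
);
CREATE TABLE GoldenMember (
    Member_ID    INTEGER PRIMARY KEY REFERENCES Member(Member_ID),
    Phone_Number TEXT,
    EMail        TEXT
);
INSERT INTO Member VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO GoldenMember VALUES (1, '555-0100', 'ann@example.com');
""")

# isGolden is derived from the presence of a child row, so there is no
# stored flag that an update procedure could leave out of sync.
rows = conn.execute("""
    SELECT m.Member_ID,
           m.Member_Name,
           CASE WHEN g.Member_ID IS NULL THEN 0 ELSE 1 END AS isGolden
    FROM Member m
    LEFT JOIN GoldenMember g ON g.Member_ID = m.Member_ID
    ORDER BY m.Member_ID
""").fetchall()
```

With a stored flag the read becomes a single-table query, but every insert into or delete from GoldenMember then has to update Member as well, which is the update complexity described above.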
I'd like to hear opinions on whether it is better to keep forum categories and subcategories in the same table or in two separate tables...
Let's say you have a table ForumCategories. By adding a column ParentId referencing the PK Id in the same table, you could easily keep both the main categories and subcategories in the same table.
Alternatively, you could create a separate table ForumSubCategories and make the Id on that table a FK referencing PK Id column of the ForumCategories table.
Both solutions would work but what are the pros and cons of each solution?
Obviously, this is a more generic question that can apply to many other scenarios; I just couldn't come up with a better phrasing in a hurry...
I can't think of any benefits of using 2 tables. Using 2 tables is going to constrain you to a 2-level tree. If you look at things as objects, then subcategories really are just category objects. So put them in the same table. The 1-table structure will be simpler to design around and develop queries for.
If you know for sure that your forums will have only 2 levels of categories, then having 2 tables is reasonable.
However, storing categories in one table with a foreign key to itself allows you to store a tree of categories with virtually unlimited levels.
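A sketch of the single-table version with a self-referencing foreign key, plus a recursive query that walks the tree to any depth. It runs on in-memory SQLite via Python; the Name column and the sample categories are hypothetical. SQLite spells the recursive common table expression as WITH RECURSIVE, whereas SQL Server uses plain WITH for the same construct.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ForumCategories (
    Id       INTEGER PRIMARY KEY,
    ParentId INTEGER REFERENCES ForumCategories(Id),  -- NULL for a main category
    Name     TEXT NOT NULL
);
INSERT INTO ForumCategories (Id, ParentId, Name) VALUES
    (1, NULL, 'Programming'),
    (2, 1,    'Databases'),
    (3, 2,    'SQL Server'),
    (4, NULL, 'Off-topic');
""")

# Start at one category and repeatedly pull in rows whose ParentId
# points at a row already in the result.
tree = conn.execute("""
    WITH RECURSIVE subtree(Id, Name, Depth) AS (
        SELECT Id, Name, 0 FROM ForumCategories WHERE Id = ?
        UNION ALL
        SELECT c.Id, c.Name, s.Depth + 1
        FROM ForumCategories c
        JOIN subtree s ON c.ParentId = s.Id
    )
    SELECT Id, Name, Depth FROM subtree ORDER BY Depth
""", (1,)).fetchall()
```

The same table holds three levels here without any schema change, which is what a separate ForumSubCategories table could not do.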
If they are the same entity (Category), the table can link to itself. A top-level category would have a null parent ID, or it could be linked to itself. This limits each category to a single parent unless you have a second table to handle possible many-to-many relationships.
They must have the same fields, or you're going to have unnecessary fields for one type or the other. If they're not the same, that is the reason to use a separate table.
This is typical for an employee table. The supervisor is another employee record.
If I were asked to write a query that JOINs more than three tables, what is the best way to go about understanding the relationships between the tables before I code? Should I use a Database Diagram in SQL Server, or would I be given the necessary information? What would you recommend?
Thanks in advance for your time.
You could use the diagramming tools in SQL Server Management Studio to discover any Foreign Key relationships between those tables. It might be quicker than using the GUI to inspect each table in Design mode, and viewing its Relationships dialog.
Consider creating an ad-hoc View with those 3 tables. This will help you produce the SQL statement that you'd need. If any relationships exist on those 3 tables, you'll have the JOIN statement created for you by the tool.
Right click Views -> New View.
Pick the tables you need, click Add.
All relationships are displayed in the Diagram Pane, and the SQL Pane will show the SELECT statement with the required JOINs.
Depending on the convention they use for creating foreign keys, it shouldn't be too hard to find the relationships between tables.
The convention we use is
dbo.TableA(ID PK)
dbo.TableB(ID PK, TableAID FK)
dbo.TableC(ID PK, TableBID FK)
...
If they don't use any convention at all, or didn't even create foreign key constraints, you can take that as an opportunity to educate them about the importance of conventions, i.e. the time and money lost by not using them.
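When the constraints do exist, they can also be read from the database's metadata instead of a diagram. The sketch below uses the TableA/TableB/TableC convention above on in-memory SQLite via Python; the Label column and the sample rows are hypothetical. PRAGMA foreign_key_list is SQLite-specific; on SQL Server the corresponding information lives in catalog views such as sys.foreign_keys and sys.foreign_key_columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (ID INTEGER PRIMARY KEY, Label TEXT);
CREATE TABLE TableB (ID INTEGER PRIMARY KEY, TableAID INTEGER REFERENCES TableA(ID));
CREATE TABLE TableC (ID INTEGER PRIMARY KEY, TableBID INTEGER REFERENCES TableB(ID));
INSERT INTO TableA VALUES (1, 'root');
INSERT INTO TableB VALUES (10, 1);
INSERT INTO TableC VALUES (100, 10);
""")

# Collect (child table, child column, parent table, parent column) tuples.
relationships = []
for table in ("TableA", "TableB", "TableC"):
    for fk in conn.execute(f"PRAGMA foreign_key_list({table})"):
        # Row layout: id, seq, table, from, to, on_update, on_delete, match
        relationships.append((table, fk[3], fk[2], fk[4]))

# Each discovered relationship becomes one ON clause of the join.
joined = conn.execute("""
    SELECT a.Label, b.ID, c.ID
    FROM TableA a
    JOIN TableB b ON b.TableAID = a.ID
    JOIN TableC c ON c.TableBID = b.ID
""").fetchall()
```

With a naming convention like TableAID, the ON clauses can nearly be written from the column names alone; the metadata query is the fallback when the names give no hint.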
We're building an RDBMS-based web site for a federal semantic network (RDF, Protege, etc). This is basically a large collection of nodes, each having a large and indefinite set of named relationships to (and from) other nodes.
My first thought is a single table for all the nodes (name, description, etc), plus one table per named relationship. Any better ideas out there?
On further reflection, two tables in total might do: one for nodes (id, name, description), and another for relations (id, name, description, from, to),
where from and to are ids in the nodes table (ints). Still on the right track?
You could optimize the performance by creating 2 rows per relation.
Let's say you have a table Items and a table Relations and that Person A has a relation with Person B. The Relations table has a left and right column, both referring to Items. Now, if you only have one row for this relation, and you want all relations for a certain Item, you would have a query looking like this:
SELECT * FROM Relations WHERE LeftItemId = #ItemId OR RightItemId = #ItemId
The OR in this query will ruin your performance! If you duplicate the row and switch the relation (left becomes right and vice versa), the query looks like this:
SELECT * FROM Relations WHERE LeftItemId = #ItemId
With the right index this one will go blazingly fast.
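A sketch of the two-rows-per-relation idea on in-memory SQLite via Python. The add_relation helper, the Name column on Relations, and the sample data are hypothetical; the composite primary key starts with LeftItemId, so it doubles as the index that the single-column lookup needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Items (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Relations (
    LeftItemId  INTEGER NOT NULL REFERENCES Items(Id),
    RightItemId INTEGER NOT NULL REFERENCES Items(Id),
    Name        TEXT NOT NULL,
    -- Leading column LeftItemId makes this key usable as the lookup index.
    PRIMARY KEY (LeftItemId, RightItemId, Name)
);
INSERT INTO Items VALUES (1, 'Person A'), (2, 'Person B'), (3, 'Person C');
""")


def add_relation(left_id, right_id, name):
    """Store the relation twice, once in each direction (hypothetical helper)."""
    conn.executemany(
        "INSERT INTO Relations (LeftItemId, RightItemId, Name) VALUES (?, ?, ?)",
        [(left_id, right_id, name), (right_id, left_id, name)],
    )


add_relation(1, 2, "knows")
add_relation(3, 1, "knows")

# Every relation touching item 1 is found through LeftItemId alone, no OR needed.
related = conn.execute(
    "SELECT RightItemId FROM Relations WHERE LeftItemId = ? ORDER BY RightItemId",
    (1,),
).fetchall()
```

The trade-off is that every write touches two rows, and a mirrored pair no longer says which node was the original "from" side, so directed relations would need an extra column to record that.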
No, that should be fine. Pay attention to the primary key and indexes, so that performance is good.
If you didn't have a single table for the nodes, you'd have to define a lot of relation tables. Each new node type would require a new relation table with every old node type. That could get out of hand quickly.
So a single table sounds best. You can always use a 1:1 relation to extend it, if you need additional fields for certain node types.
If you're using SQL Server 2008, you might want to consider the new HierarchyID data type to store your hierarchy in. It's optimized for storage.