SQL - What is best to do when multiple tables have the same columns - sql

I have different tables in my scheme with different columns, but I want to store data of when was the table modified or when was the data stored, so I added some columns to specify that.
I realized that I had to add the same "modification_date" and "modification_time" columns to all my tables, so I thought about making a new table called DATA_INFO so I won't need to do so, but every table has a different PRIMARY KEY and I don't know which one to add as FOREIGN KEY to the DATA_INFO table.
I don't know if I have to maybe add all of them or is there another way to do what I need.

It's better to have the same "modification_datetime" column in all tables, rather than trying to keep that data in a central table.
That's what we have done at every shop I've worked in.

I want to emphasize that a separate table is not reasonable for this purpose. The lack of an obvious foreign key is a hint.
Unlike Tab Allerman, tables that I create are much less likely to be updated, so I have three additional columns on most tables:
CreatedBy -- the user who created the row
CreatedAt -- when the row was creatd
CreatedOn -- the system where the table was created
The most important point is that this information can -- in many databases -- be implemented using default values rather than triggers. That is a big advantage of working within a single row. The fewer triggers, the better.

Related

is it necessary to have foreign key for simple tables

have a table called RoundTable
It has the following columns
RoundName
RoundDescription
RoundType
RoundLogo
Now the RoundType will be having values like "Team", "Individual", "Quiz"
is it necessary to have a one more table called "RoundTypes" with columns
TypeID
RoundType
and remove the RoundType from the rounds table and have a column "TypeID" which has a foreign key to this RoundType table?
Some say that if you have the RoundType in same table it is like hard-coding as there will be lot of round types in future.
is it like if there are going to be only 2-3 round types, i need not have foreign key??
Is it necessary? Obviously not. SQL works fine either way. In a properly defined database, you would do one of two things for RoundType:
Have a lookup table
Have a constraint that checks that values are within an agreed upon set (and I would put enums into this category)
If you have a lookup table, I would advocate having an auto-incremented id (called RoundTypeId) for it. Remember, that in a larger database, such a table would often have more than two columns:
CreatedAt -- when it was created
CreatedBy -- who created it
CreatedOn -- where it was created (important for distributed systems)
Long name
In a more advanced system, you might also need to internationalize the system -- that is, make it work for multiple languages. Then you would be looking up the actual string value in other tables.
is it like if there are going to be only 2-3 round types, i need not
have foreign key??
Usually it's just the opposite: If you have a different value for most of the records (like in a "lastName" column) you won't use a lookup table.
If, however, you know that you will have a limited set of allowed/possible values, a lookup table referenced via a foreign key is probably the better solution.
Maybe read up on "database normalization", starting perhaps # Wikipedia.
Actually you need to have separate table if you have following association between entities,
One to many
Many to many
because of virtue of these association simple DBMS becomes **R**DBMS ( Relation .)
Now ask simple question,
Whether my single record in round table have multiple roundTypes?
If so.. Make a new table and have foreign key in ROUNDTable.
Otherwise no.
yeah I think you should normalize it. Because if you will not do so then definitely you have to enter the round types (value) again and again for each record which is not good practice at all in case if you have large data. so i will suggest you to make another table
however later on you can make a view to get the desired result as fallow
create view vw_anyname
as
select RoundName, RoundDescription , RoundLogo, RoundType from roundtable join tblroundtype
on roundtable.TypeID = tblroundtype .typeid
select * from vw_anyname

A single table that represents multiple tables

I have a problem with finding a way to represent multiple tables hash tables into a single table.
Say I have 3 tables with the format:
Table1(Table1_PK1,Table1_PK2,Table1_PK3,Table1_Hash)
Table2(Table2_PK1,Table2_PK2,Table2_Hash)
Table3(Table3_Pk1,Table3_PK2,Table3_PK3,Table3_PK4,Table3_PK5,Table3_Hash)
Table1_PK1,Table1_PK2,Table1_PK3... are columns and they might have different datatypes (VARCHAR, INT or DATETIME ...).
My question is if there is a way to create a single table (fixed number of columns) that can represent all of these 3 tables (may be more in practical).
I am trying to do this for my database tool. Each table actual a table which contains primary keys and a hash data associating with them.
Since you're apparently building a database tool, not a database, it might make more sense to do this in application code rather than in a database table.
In a different answer, you commented
I am still looking for a dynamic way to do it without knowing how many primary keys a table can have.
A table can have only one primary key. That primary key can consist of more than one column, though. (You already knew this; you were just using the wrong words, which might confuse others.)
A table can also have an arbitrary number of other keys, which will be either declared (as NOT NULL UNIQUE) or "undeclared" (by creating an index that guarantees uniqueness over a set of columns).
You can look all that stuff up at run time in one or both of two ways. (Links go to documentation for PostgreSQL.)
System tables, sometimes called system catalogs
information_schema views
As far as I know, all modern SQL platforms implement at least one of these interfaces. The information_schema views are covered in the SQL standards, but there seems to be some room for interpretation. They don't look quite the same on all platforms.
Why combine the 3 tables into one? Would be really bad db design. But here's a way to do it:
The one table will have a column for each of the 3 tables' columns you want in the final table. I am making the assumption that TableX_Hash is the same type, so that remains as one unique column:
Table_All_in_One (
Table1_PK1,
Table1_PK2,
Table1_PK3,
# space just for clarity of grouping
Table2_PK1,
Table2_PK2,
Table3_PK1,
Table3_PK2,
Table3_PK3,
Table3_PK4,
Table3_PK5,
TableX_Hash # Assuming all the _Hash'es are the same type+length,
# otherwise, add Table1_Hash, Table2_Hash, Table3_Hash
# This can be your new primary key
)
The Primary Keys (PKx) are required to be non-NULL only in their own tables. For this table, they have to allow nulls. The idea is that each row of this new table will only hold the data for one of the tables. The other columns will be empty for that row. If you want to associate the row of one table with another, you can add that to the same row or add FK_Table1_Hash, FK_Table2_Hash and FK_Table3_Hash columns which will refer to the TableX_Hash value of a record.
PS: I wonder if what you are really looking for is a View and not this really bad all-in-one table.
Edit: Combining them into one "without knowing how many primary keys a table can have." as per your comment:
Store all the _PKs concatenated into one column:
Table_All_in_One (
New_PK,
TableX_Hash,
Table1_PKx, # Concatenated PKs of Table1
Table2_PKx, # Concatenated PKs of Table2, etc.
...,
# OR just one
TableX_PKs, # concatenate all the PK's into one VARCHAR field
# Add a pipe `|` between them optionally.
Table_Num # If using just one, then you'll need to store the table number
)
You will not be able to conveniently pick records based on part of their composite primary key. It will always have to be TableX_PKs = CONCAT_WS('|', Table1_PK1, Table1_PK2, ...). So your only dependency is the number of PKs in the original column.
In order to model a bunch of tables you will need 3 tables. An entity table that contains the table names of the tables you wish to set up this way called a factor or entity table. A Factor_detail table that contains all the columns and their associated properties of the tables. A table, factor_detail_value, for storing things like lookup values for lookup tables. I'm trying to learn more about this myself as well because we are using this technique at work as well. Genrate sql on the fly for any table so encoded, and store the data in a repository pertiinant to the data itself. This way if a table changes and you need to add a column or change a datatype, you can add a row to the factor detail table without affecting a database shut down in production. In most businesses a four hour shut down to make a sql data table change can cost thousands of dollars. If you are dealing with insurance for example, each additional state that you sell insurance in has different requirements for being able to seel it and that will result in table changes. We reduced our table count way down from over 700 tables in this manner also we can make changes without database shut down thus avoiding the loss in revenue.

Table referenced by other tables having different PKs

I would like to create a table called "NOTES". I was thinking this table would contain a "table_name" VARCHAR(100) which indicates what table put in the note, a "key" or multiple "key" columns representing the primary key values of the table that this note applies to and a "note" field VARCHAR(MAX). When other tables use this table they would supply THEIR primary key(s) and their "table_name" and get all the notes associated with the primary key(s) they supplied. The problem is that other tables might have 1, 2 or more PKs so I am looking for ideas on how I can design this...
What you're suggesting sounds a little convoluted to me. I would suggest something like this.
Notes
------
Id - PK
NoteTypeId - FK to NoteTypes.Id
NoteContent
NoteTypes
----------
Id - PK
Description - This could replace the "table_name" column you suggested
SomeOtherTable
--------------
Id - PK
...
Other Columns
...
NoteId - FK to Notes.Id
This would allow you to keep your data better normalized, but still get the relationships between data that you want. Note that this assumes a 1:1 relationship between rows in your other tables and Notes. If that relationship will be many to one, you'll need a cross table.
Have a look at this thread about database normalization
What is Normalisation (or Normalization)?
Additionally, you can check this resource to learn more about foreign keys
http://www.w3schools.com/sql/sql_foreignkey.asp
Instead of putting the other table name's and primary key's in this table, have the primary key of the NOTES table be NoteId. Create an FK in each other table that will make a note, and store the corresponding NoteId's in the other tables. Then you can simply join on NoteId from all of these other tables to the NOTES table.
As I understand your problem, you're attempting to "abstract" the auditing of multiple tables in a way that you might abstract a class in OOP.
While it's a great OOP design principle, it falls flat in databases for multiple reasons. Perhaps the largest single reason is that if you cannot envision it, neither will someone (even you) looking at it later have an easy time reassembling the data. Smaller that that though, is that while you tend to think of a table as a container and thus similar to an object, in reality they are implemented instances of this hypothetical container you are trying to put together and operate better if you treat them as such. By creating an audit table specific to a table or a subset of tables that share structural similarity and data similarity, you increase the performance of your database and you won't run in to strange trigger or select related issues later.
And you can't envision it not because you're not good at what you're doing, but rather, the structure is not conducive to database logging.
Instead, I would recommend that you create separate logging tables that manage the auditing of each table you want to audit or log. In fact, some fast google searches show many scripts already written to do much of this tasking for you: Example of one such search
You should create these individual tables and then if you want to be able to report on multiple table or even all tables at once, you can create a stored procedure (if you want to make queries based on criterion) or a view with an included SELECT statement that JOINs and/or UNIONs the tables you are interested in - in a fashion that makes sense to the report type. You'll still have to write new objects in to the view, but even with your original table design, you'd have to account for that.

Difference between a db view and a lookuptable

When I create a view I can base it on multiple columns from different tables.
When I want to create a lookup table I need information from one table, for example the foreign key of an order table, to get customer details from another table. I can create a view having parameters to make sure it will get all data that I need. I could also - from what I have been reading - make a lookup table. What is the difference in this case and when should I choose for a lookup table?? I hope this ain't a bad question, I'm not very into db's yet ;).
Creating a view gives you a "live" representation of the data as it is at the time of querying. This comes at the cost of higher load on the server, because it has to determine the values for every query.
This can be expensive, depending on table sizes, database implementations and the complexity of the view definition.
A lookup table on the other hand is usually filled "manually", i. e. not every query against it will cause an expensive operation to fetch values from multiple tables. Instead your program has to take care of updating the lookup table should the underlying data change.
Usually lookup tables lend themselves to things that change seldomly, but are read often. Views on the other hand - while more expensive to execute - are more current.
I think your usage of "Lookup Table" is slightly awry. In normal parlance a lookup table is a code or reference data table. It might consist of a CODE and a DESCRIPTION or a code expansion. The purpose of such tables is to provide a lsit of permitted values for restricted columns, things like CUSTOMER_TYPE or PRIORITY_CODE. This category of table is often referred to as "standing data" because it changes very rarely if at all. The value of defining this data in Lookup tables is that they can be used in foreign keys and to populate Dropdowns and Lists Of Values.
What you are describing is a slightly different scenario:
I need information from one table, for
example the foreign key of an order
table, to get customer details from
another table
Both these tables are application data tables. Customer and Order records are dynamic. Now it is obviously valid to retrieve additional data from the Customer table to display along side the Order data, and in that sense Customer is a "lookup table". More pertinently it is the parent table of Order, because it has the primary key referenced by the foreign key on Order.
By all means build a view to capture the joining logic between Order and Customer. Such views can be quite helpful when building an application that uses the same joined tables in several places.
Here's an example of a lookup table. We have a system that tracks Jurors, one of the tables is JurorStatus. This table contains all the valid StatusCodes for Jurors:
Code: Value
WS : Will Serve
PP : Postponed
EM : Excuse Military
IF : Ineligible Felon
This is a lookup table for the valid codes.
A view is like a query.
Read this tutorial and you may find helpful info when a lookup table is needed:
SQL: Creating a Lookup Table
Just learn to write sql queries to get exactly what you need. No need to create a view! Views are not good to use in many instances, especially if you start to base them on other views, when they will kill performance. Do not use views just as a shorthand for query writing.

Normalization Help

I am refactoring an old Oracle 10g schema to try to introduce some normalization. In one of the larger tables, there is a text field that has at most, 10-15 possible values. In my mind, it seems that this field is an example of unnecessary data duplication and should be extracted to a separate table.
After examining the data, I cannot find one relevant piece of information that could be associated with that text value. Basically, if I pulled that value out and put it into its own table, it would be the only field in that table. It exists today as more of a 'flag' field. Should I create a two-column table with a surrogate key, keep it as it is, or do something entirely different? Am I doing more harm than good by trying to minimize data duplication on this field?
You might save some space by extracting the column to a separate table. This is called a lookup table. It can give you a couple of other benefits:
You can declare a foreign key constraint to the lookup table, so you can rely on the column in the main table never having any value other than the 10-15 values you want.
It's easy to query for a concise list of all permitted values, by querying the lookup table. This can be faster than using SELECT DISTINCT on the main table's column. It also returns values that are permitted, but not currently used in the main table.
If you change a value in the lookup table, it automatically applies to all rows in the main table that reference it.
However, creating a lookup table with one column is not strictly normalization. You're just replacing one value with another. The attribute in the main table either already supports a normal form, or not.
Using surrogate keys (vs. natural keys) also has nothing to do with normalization. A lot of people make this mistake.
However, if you move other attributes into the lookup table, attributes that depend only on the lookup value and therefore would create repeating groups (violating 3NF) in the main table if you left them there, then that would be normalization.
If you want normalization break it out.
I think of these types of data in DBs as the equivalent of enums in C,C++,C#. Mostly you put them in the table as documentation.
I often have an ID, Name, Description, and auditing columns for them (eg modified by, modified date, create date, create by, active.) The description field is rarely used.
Example (some might say there are more than just 2)
Gender
ID Name Audit Columns...
1 Male
2 Female
Then in your contacts you would have a GenderID column which would link to this one.
Of course you don't "need" the table. You could have external documentation somewhere that says 1=Male, 2=Female -- but I think these tables serve to document a system.
If it's really a free-entry text field that's not re-used somewhere else in the database, and there's just a single field without repeated instances, I'd probably go ahead and leave it as it is. If you're determined to break it out I'd create a 'validation' table with a surrogate key and the text value, then put the surrogate key in the base table.
Share and enjoy.
Are these 10-15 values actually meaningful, or are they really just flags? If they're meaningful pieces of text and it seems wasteful to replicate them, then sure create a lookup table. But if they're just arbitrary flag values, then your new table will be nothing more than a mapping from one arbitrary value to another, and not terribly helpful.
A completely separate question is whether all or most of the rows in your big table even have a value for this column. If not, then indeed you have a good opportunity for normalization and can create a separate table linking the primary key from your base table with the flag value.
Edit: One thing. If there's some chance that one of these "flag" values is likely to be wholesale replaced with another value at some point in the future, that would be another good reason to create a table.