implementing complex check constraints - sql

I have the following tables:
CREATE TABLE group_systems
(
group_name,
system_name,
section_name,
created_date,
decom_date,
status (Active, Deactivate)
)
CREATE TABLE systems
(
system_name,
section_name
)
A system is identified by the key (system_name, section_name). There can be duplicate system names but no duplicate section names.
In the groups table, I want to enforce the constraint that only one system in a section in a group can be active. However, because the groups table is also a history table, I can't just use the unique constraint (group_name, section_name, system_name). I have to use a check constraint that runs a subquery. There are also some additional constraints that rely on subqueries.
The problem is that inserting a benchmark of 100k records takes a long time (due to the subqueries).
Is it better to build another table active_systems_for_groups that references back to the group_systems table? That way, I can add the unique constraint to active_systems_for_groups that enforces only one active system per section per group and keep building complex constraints by adding more tables.
Is there a better way to handle complex check constraints?

You can enforce the "single active record" pattern in two ways:
(1) The solution you suggest, which is to create a table that holds only the primary key values of the active records from the multiple-records-allowed table. Those values also serve as a primary key in the active records table.
(2) Adding a column to another table that represents the objects that can have only a single active record each. In this case that would mean adding a column active_group_name to systems. This column would be a foreign key to the multiple-records-allowed table.
Which is preferable depends, in part, on whether every section is required to have an active group, whether it's common (but not required) for a section to have an active group, or whether it's only occasionally true that a section has an active group.
In the first case (required), you would use option (2) and the column could be declared NOT NULL, preserving complete normalization. In the second case (common) you would need to make the column NULLable but I'd probably still use that technique for convenience of JOINs. In the third case (occasional), I'd probably use option (1) since it might well improve performance when JOINing to get the active records.

Since you never answered which RDBMS you're using I'll throw this out there for others who might be interested in another way to easily handle this constraint in SQL Server (2008 or later).
You can use a filtered unique index to effectively put a constraint on the number of "active" rows for a given type. As an example:
CREATE UNIQUE INDEX My_Table_active_IDX ON My_Table (some_pk) WHERE active = 1
This approach has several advantages:
It's declarative
It's self-contained within the single table (no FKs, no other objects that you need to keep updated, etc.)
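To make the filtered-index idea concrete against the question's group_systems table, here is a minimal sketch. It runs through Python's sqlite3 module only so that it is self-contained; SQLite calls this feature a partial index. The index name group_systems_one_active and the column types are assumptions, not part of the original schema. On SQL Server 2008 or later the CREATE UNIQUE INDEX ... WHERE statement has the same shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE group_systems (
    group_name   TEXT NOT NULL,
    system_name  TEXT NOT NULL,
    section_name TEXT NOT NULL,
    status       TEXT NOT NULL CHECK (status IN ('Active', 'Deactivate'))
);
-- Uniqueness is enforced only over rows where status = 'Active',
-- so any number of historical (deactivated) rows can coexist.
CREATE UNIQUE INDEX group_systems_one_active
    ON group_systems (group_name, section_name)
    WHERE status = 'Active';
""")


def add_row(group_name, system_name, section_name, status):
    """Insert one row; return False if the filtered unique index rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO group_systems VALUES (?, ?, ?, ?)",
                (group_name, system_name, section_name, status),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    add_row("g1", "sysA", "sec1", "Deactivate"),  # history row
    add_row("g1", "sysB", "sec1", "Deactivate"),  # another history row
    add_row("g1", "sysC", "sec1", "Active"),      # the one active row
    add_row("g1", "sysD", "sec1", "Active"),      # second active row: rejected
]
print(results)
```

Because the check is an index lookup rather than a subquery per inserted row, it should also sidestep the bulk-insert slowdown described in the question, though that is worth measuring on the real data.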

Related

SQL - What is best to do when multiple tables have the same columns

I have different tables in my schema with different columns, but I want to record when each table was modified or when its data was stored, so I added some columns for that.
I realized that I had to add the same "modification_date" and "modification_time" columns to all my tables, so I thought about making a new table called DATA_INFO so I won't need to do so, but every table has a different PRIMARY KEY and I don't know which one to add as FOREIGN KEY to the DATA_INFO table.
I don't know whether I have to add all of them or whether there is another way to do what I need.
It's better to have the same "modification_datetime" column in all tables, rather than trying to keep that data in a central table.
That's what we have done at every shop I've worked in.
I want to emphasize that a separate table is not reasonable for this purpose. The lack of an obvious foreign key is a hint.
Unlike Tab Allerman, I create tables that are much less likely to be updated, so I have three additional columns on most tables:
CreatedBy -- the user who created the row
CreatedAt -- when the row was created
CreatedOn -- the system where the row was created
The most important point is that this information can -- in many databases -- be implemented using default values rather than triggers. That is a big advantage of working within a single row. The fewer triggers, the better.
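A minimal sketch of the default-value approach, run through Python's sqlite3 module so it is self-contained. The table name orders and the placeholder default 'unknown' are assumptions for illustration. SQLite has no built-in "current user" default, so CreatedBy falls back to a constant here; many other databases offer a built-in for the current user that can be used as a default instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE orders (
    order_id  INTEGER PRIMARY KEY,
    amount    NUMERIC NOT NULL,
    -- Audit columns filled in by defaults, with no trigger involved.
    CreatedBy TEXT NOT NULL DEFAULT 'unknown',
    CreatedAt TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
""")

# The INSERT names only the business column; the audit columns fill themselves.
conn.execute("INSERT INTO orders (amount) VALUES (9.99)")
row = conn.execute("SELECT CreatedBy, CreatedAt FROM orders").fetchone()
print(row)
```

Defaults only cover creation. A "last modified" column still needs the application or a trigger to update it, which is the trade-off between the two answers above.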

SQL Server: How to allow duplicate records on small table

I have a small table "ImgViews" that only contains two columns, an ID column called "imgID" + a count column called "viewed", both set up as int.
The idea is to use this table only as a counter so that I can track how often an image with a certain ID is viewed / clicked.
The table has no primary or foreign keys and no relationships.
However, when I enter some data for testing and try entering the same imgID multiple times, it always appears greyed out and with a red error icon.
Usually this makes sense, as you don't want duplicate records, but since the purpose is different here, duplicates do make sense for me.
Can someone tell me how I can achieve this or work around it? What would be a common way to do this?
Many thanks in advance, Tim.
To address your requirement to store non-unique values, simply remove primary keys, unique constraints, and unique indexes. I expect you may still want a non-unique clustered index on ImgID to improve performance of aggregate queries that would otherwise require a scan of the entire table and a sort. I suggest you store an insert timestamp, not to provide uniqueness, but to facilitate purging data by date, should the need arise in the future.
You must have some unique index on that table. Make sure there is no unique index and no unique or primary key constraint.
Or, SSMS simply doesn't know how to identify the row that was just inserted because it has no key.
It is generally not best practice to have a table without a (logical) primary key. In your case, I'd make the image id the primary key and increment the counter. The MERGE statement is well-suited for performing an insert or update at the same time. Alternatives exist.
If you don't like that, create a surrogate primary key (an identity column set as the primary key).
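Here is a minimal sketch of the "image id as primary key, increment the counter" idea. It uses one of the alternatives to MERGE, an UPDATE followed by an INSERT when no row was touched, and runs through Python's sqlite3 module so it is self-contained. The function name record_view is an assumption; the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE ImgViews (
    imgID  INTEGER PRIMARY KEY,        -- one row per image
    viewed INTEGER NOT NULL DEFAULT 0  -- running view count
)
""")


def record_view(img_id):
    """Increment the counter for img_id, creating the row on first view."""
    with conn:  # both statements run in one transaction
        cur = conn.execute(
            "UPDATE ImgViews SET viewed = viewed + 1 WHERE imgID = ?",
            (img_id,),
        )
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO ImgViews (imgID, viewed) VALUES (?, 1)",
                (img_id,),
            )


for img in (7, 7, 3, 7):
    record_view(img)

counts = dict(conn.execute("SELECT imgID, viewed FROM ImgViews"))
print(counts)
```

With concurrent writers on a server database, the update-then-insert pair needs appropriate locking or an atomic upsert statement such as MERGE, so treat this as an outline of the logic rather than a production pattern.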
At the moment you have no way of addressing a specific row. That makes the table a little unwieldy.
If you allow multiple rows to be absolutely identical, how would you update or delete one of those rows?
How would you expect the database to "know" which row you are referring to?
At the very least add a separate identity column (preferably as the clustered index, too).
As a side note: It's weird that you "like to avoid unneeded data" but at the same time insert duplicates over and over again instead of simply adding up the click count per single image...
Use SQL statements, not the GUI, if the table has no primary key or unique constraint.

is it necessary to have foreign key for simple tables

I have a table called RoundTable
It has the following columns
RoundName
RoundDescription
RoundType
RoundLogo
Now the RoundType will have values like "Team", "Individual", "Quiz"
Is it necessary to have one more table called "RoundTypes" with columns
TypeID
RoundType
and remove the RoundType from the rounds table and have a column "TypeID" which has a foreign key to this RoundTypes table?
Some say that if you have the RoundType in the same table it is like hard-coding, as there will be a lot of round types in future.
Is it the case that if there are only going to be 2-3 round types, I don't need a foreign key?
Is it necessary? Obviously not. SQL works fine either way. In a properly defined database, you would do one of two things for RoundType:
Have a lookup table
Have a constraint that checks that values are within an agreed upon set (and I would put enums into this category)
If you have a lookup table, I would advocate having an auto-incremented id (called RoundTypeId) for it. Remember that in a larger database, such a table would often have more than two columns:
CreatedAt -- when it was created
CreatedBy -- who created it
CreatedOn -- where it was created (important for distributed systems)
Long name
In a more advanced system, you might also need to internationalize the system -- that is, make it work for multiple languages. Then you would be looking up the actual string value in other tables.
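The two options (lookup table versus a constraint on an agreed set of values) can be sketched side by side. This is an illustration run through Python's sqlite3 module, not a recommendation of either. The table RoundTableChecked and the helper try_insert are hypothetical names; RoundTable, RoundTypes, and RoundTypeId follow the question and the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
-- Option 1: lookup table referenced by a foreign key.
CREATE TABLE RoundTypes (
    RoundTypeId INTEGER PRIMARY KEY,
    RoundType   TEXT NOT NULL UNIQUE
);
CREATE TABLE RoundTable (
    RoundName        TEXT PRIMARY KEY,
    RoundDescription TEXT,
    RoundLogo        TEXT,
    RoundTypeId      INTEGER NOT NULL REFERENCES RoundTypes (RoundTypeId)
);
INSERT INTO RoundTypes (RoundType) VALUES ('Team'), ('Individual'), ('Quiz');

-- Option 2: the allowed values are fixed in a CHECK constraint.
CREATE TABLE RoundTableChecked (
    RoundName TEXT PRIMARY KEY,
    RoundType TEXT NOT NULL
        CHECK (RoundType IN ('Team', 'Individual', 'Quiz'))
);
""")


def try_insert(sql, params):
    """Run one INSERT; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


fk_sql = "INSERT INTO RoundTable (RoundName, RoundTypeId) VALUES (?, ?)"
check_sql = "INSERT INTO RoundTableChecked (RoundName, RoundType) VALUES (?, ?)"

ok_fk = try_insert(fk_sql, ("Opening", 1))               # 1 is 'Team'
bad_fk = try_insert(fk_sql, ("Final", 99))                # no such round type
ok_check = try_insert(check_sql, ("Opening", "Quiz"))
bad_check = try_insert(check_sql, ("Final", "Relay"))     # not in the agreed set
print(ok_fk, bad_fk, ok_check, bad_check)
```

The practical difference is that adding a round type is an INSERT with option 1 and a schema change with option 2.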
"Is it the case that if there are only going to be 2-3 round types, I don't need a foreign key?"
Usually it's just the opposite: If you have a different value for most of the records (like in a "lastName" column) you won't use a lookup table.
If, however, you know that you will have a limited set of allowed/possible values, a lookup table referenced via a foreign key is probably the better solution.
Maybe read up on "database normalization", starting perhaps at Wikipedia.
Actually, you need a separate table if you have one of the following associations between entities:
One to many
Many to many
By virtue of these associations, a simple DBMS becomes an **R**DBMS (relational).
Now ask a simple question:
Can a single record in the round table have multiple round types?
If so, make a new table and have a foreign key to ROUNDTable in it.
Otherwise, no.
Yes, I think you should normalize it. If you don't, you will have to enter the round type value again and again for each record, which is not good practice if you have a lot of data. So I suggest you make another table.
Later on, you can create a view to get the desired result, as follows:
create view vw_anyname
as
select RoundName, RoundDescription, RoundLogo, RoundType from roundtable join tblroundtype
on roundtable.TypeID = tblroundtype.TypeID
select * from vw_anyname

Is ID column required in SQL?

Traditionally I have always used an ID column in SQL (mostly mysql and postgresql).
However, I am wondering if it is really necessary if the rest of the columns in each row make it unique. In my latest project I have the "ID" column set as my primary key, however I never call it or use it in any way, as the data in the row makes it unique and is much more useful for me.
So, if every row in a SQL table is unique, does it need a primary key ID column, and are there any performance changes with or without one?
Thanks!
EDIT/Additional info:
The specific example that made me ask this question is a table I am using as a many-to-many-to-many-to-many table (if we still call it that at that point). It has 4 columns (plus ID), each of which holds an ID from an external table, and each row will always be numeric and unique. Only one of the columns is allowed to be null.
I understand that for normal tables an ID primary key column is a VERY good thing to have. But I get the feeling on this particular table it just wastes space and slows down adding new rows.
If you really do have some pre-existing column in your data set that already does uniquely identify your row - then no, there's no need for an extra ID column. The primary key however must be unique (in ALL circumstances) and cannot be empty (must be NOT NULL).
In my 20+ years of experience in database design, however, this is almost never truly the case. Most "natural" ID's that appear to be unique aren't - ultimately. US Social Security Numbers aren't guaranteed to be unique, and most other "natural" keys end up being almost unique - and that's just not good enough for a database system.
So if you really do have a proper, unique key in your data already - use it! But most of the time, it's easier and more convenient to have just a single surrogate ID that you can guarantee will be unique over all rows.
Don't confuse the logical model with the implementation.
The logical model shows a candidate key (all columns) which could make your primary key.
Great. However...
In practice, having a multi column primary key has downsides: it's wide, not good when clustered etc. There is plenty of information out there and in the "related" questions list on the right
So, you'd typically
add a surrogate key (ID column)
add a unique constraint to keep the other columns unique
the ID column will be the clustered key (can be only one per table)
You can make either key the primary key now
The main exception is link or many-to-many tables that link 2 ID columns: a surrogate isn't needed (unless you have a braindead ORM)
Edit, a link: "What should I choose for my primary key?"
Edit2
For many-many tables: SQL: Do you need an auto-incremental primary key for Many-Many tables?
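The two layouts discussed above (surrogate ID plus a unique constraint, versus a composite primary key on a link table) can be compared in a small sketch. It runs through Python's sqlite3 module for self-containment; the table names link_surrogate and link_natural, the columns a_id and b_id, and the helper insert_pair are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Surrogate key, with a unique constraint keeping the natural columns unique.
CREATE TABLE link_surrogate (
    id   INTEGER PRIMARY KEY,
    a_id INTEGER NOT NULL,
    b_id INTEGER NOT NULL,
    UNIQUE (a_id, b_id)
);
-- Plain link table: the pair of IDs is itself the primary key.
CREATE TABLE link_natural (
    a_id INTEGER NOT NULL,
    b_id INTEGER NOT NULL,
    PRIMARY KEY (a_id, b_id)
);
""")


def insert_pair(table, a_id, b_id):
    """Insert one (a_id, b_id) pair; return False if it is a duplicate."""
    try:
        with conn:
            # The table name comes from this script, never from user input.
            conn.execute(
                f"INSERT INTO {table} (a_id, b_id) VALUES (?, ?)",
                (a_id, b_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


surrogate = [insert_pair("link_surrogate", 1, 2), insert_pair("link_surrogate", 1, 2)]
natural = [insert_pair("link_natural", 1, 2), insert_pair("link_natural", 1, 2)]
print(surrogate, natural)
```

Both layouts reject the duplicate pair, so the choice is about key width, clustering, and tooling rather than about whether uniqueness is enforced.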
Yes, you could have many attributes (values) in a record (row) that you could use to make a record unique. This would be called a composite primary key.
However, it will generally be slower, because a wider primary index is more expensive to build and maintain. In many relational database management systems (RDBMS), the primary index is used not only to determine uniqueness, but also to order and structure records on disk.
A simple primary key of one incrementing value is generally the most performant and the easiest solution for the RDBMS to manage.
You should have one column in every table that is unique.
EDITED...
This is one of the fundamentals of database table design. It's the row identifier - the identifier identifies which row(s) are being acted upon (updated/deleted etc). Relying on column combinations that are "unique", eg (first_name, last_name, city), as your key can quickly lead to problems when two John Smiths exist, or worse when John Smith moves city and you get a collision.
In most cases, it's best to use an artificial key that's guaranteed to be unique - like an auto increment integer. That's why they are so popular - they're needed. Commonly, the key column is simply called id, or sometimes <tablename>_id. (I prefer id)
If natural data is available that is unique and present for every row (perhaps retinal scan data for people), you can use that, but all too often, such data isn't available for every row.
Ideally, you should have only one unique column. That is, there should only be one key.
Using IDs to key tables means you can change the content as needed without having to repoint things.
For example, if every row points to a unique user, what would happen if that user changed their name to, let's say, John Blblblbe, which was already in the db? And then, what would happen if your software wants to pick up John Blblblbe's details: whose details would be picked up, the old John's or those of the one who has changed his name? If the answer to both questions is 'nothing special is going to happen' then, yep, you don't really need an "ID" column :]
Important:
Also, a numeric ID column is much faster when you're looking for an exact row, even when the table hasn't got any indexing keys or has more than one unique key.
If you are sure that some other column is going to have unique data for every row and is never going to be NULL, then there is no need for a separate ID column to distinguish each row from the others; you can make that existing column the primary key for your table.
No, single-attribute keys are not essential and nor are surrogate keys. Keys should have as many attributes as are necessary for data integrity: to ensure that uniqueness is maintained, to represent accurately the universe of discourse and to allow users to identify the data of interest to them. If you have already identified a suitable key and if you don't find any real need to create another one then it would make no sense to add redundant attributes and indexes to your table.
An ID can be more meaningful; for example, an employee ID can encode which department the employee is from, the year they joined, and so on. Apart from that, an RDBMS supports lots of operations on IDs.

Multiple foreign keys to a single column

I'm defining a database for a customer/order system where there are two highly distinct types of customers. Because they are so different, having a single customer table would be very ugly (it'd be full of null columns, as many columns are pointless for one type or the other).
Their orders though are in the same format. Is it possible to have a CustomerId column in my Order table which has a foreign key to both the Customer Types? I have set it up in SQL server and it's given me no problems creating the relationships, but I'm yet to try inserting any data.
Also, I'm planning on using nHibernate as the ORM, could there be any problems introduced by doing the relationships like this?
No, you can't have a single field as a foreign key to two different tables. How would you tell where to look for the key?
You would at least need a field that tells what kind of user it is, or two separate foreign keys.
You could also put the information that is common for all users in one table and have separate tables for the information that is specific for the user types, so that you have a single table with user id as primary key.
A foreign key can only reference a single primary key, so no. However, you could use a bridge table:
CustomerA <---- CustomerA_Orders ----> Order
CustomerB <---- CustomerB_Orders ----> Order
So Order doesn't even have a foreign key; whether this is desirable, though...
I inherited a SQL Server database where this was done (a single column used in four foreign key relationships with four unrelated tables), so yes, it's possible. My predecessor is gone, though, so I can't ask why he thought it was a good idea.
He used a GUID column ("uniqueidentifier" type) to avoid the ambiguity problem, and he turned off constraint checking on the foreign keys, since it's guaranteed that only one will match. But I can think of lots of reasons that you shouldn't, and I haven't thought of any reasons you should.
Yours does sound like the classical "specialization" problem, typically solved by creating a parent table with the shared customer data, then two child tables that contain the data unique to each class of customer. Your foreign key would then be against the parent customer table, and your determination of which type of customer would be based on which child table had a matching entry.
You can create a foreign key referencing multiple tables. This feature is to allow vertical partitioning of your table and still maintain referential integrity. In your case however, this is not applicable.
Your best bet would be to have a CustomerType table with possible columns CustomerTypeID and CustomerID, where CustomerID is the PK, and then reference CustomerID from your Order table.
Raj
I know this is a very old question; however if other people are finding this question through the googles, and you don't mind adding some columns to your table, a technique I've used (using the original question as a hypothetical problem to solve) is:
Add a [CustomerType] column. The purpose of storing a value here is to indicate which table holds the PK for your (assumed) [CustomerId] FK column. Optional - addition of a check constraint (to ensure CustomerType is in CustomerA or CustomerB) will help you sleep better at night.
Add a computed column for each [CustomerType], eg:
[CustomerTypeAId] as case when [CustomerType] = 'CustomerA' then [CustomerId] end persisted
[CustomerTypeBId] as case when [CustomerType] = 'CustomerB' then [CustomerId] end persisted
Add your foreign keys to the calculated (and persisted) columns.
Caveat: I'm primarily in a MSSQL environment; so I don't know how well this translates to other DBMS (ie: Postgres, ORACLE, etc).
As noted, if the key is, say, 12345, how would you know which table to look it up in? You could, I suppose, do something to ensure that the key values for the two tables never overlapped, but this is too ugly and painful to contemplate. You could have a second field that says which customer type it is. But if you're going to have two fields, why not have one field for customer type 1 id and another for customer type 2 id?
Without knowing more about your app, my first thought is that you really should have a general customer table with the data that is common to both, and then have two additional tables with the data specific to each customer type. I would think that there must be a lot of data common to the two -- basic stuff like name and address and customer number at the least -- and repeating columns across tables sucks big time. The additional tables could then refer back to the base table. As there is then a single key for the base table, the issue of foreign keys having to know which table to refer to evaporates.
Two distinct types of customer is a classic case of types and subtypes or, if you prefer, classes and subclasses. Here is an answer from another question.
Essentially, the class-table-inheritance technique is like Arnand's answer. The use of the shared-primary-key technique is what allows you to get around the problems created by two types of foreign key in one column. The foreign key will be customer-id. That will identify one row in the customer table, and also one row in the appropriate kind of customer type table, as the case may be.
Create a "customer" table that includes all the columns that hold the same data for both types of customer.
Then create tables "customer_a" and "customer_b".
Use "customer_id" from the "customer" table as a foreign key in "customer_a" and "customer_b".
          customer
             |
    -------------------
    |                 |
customer_a       customer_b
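The parent/child layout described in several answers above (shared customer table, one child table per customer type, orders pointing at the parent) can be sketched as follows. It runs through Python's sqlite3 module so it is self-contained. The columns name, loyalty_tier, and tax_number are hypothetical, and the table is called orders rather than Order only to avoid the reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL            -- data common to both types
);
-- Each child table shares the parent's primary key.
CREATE TABLE customer_a (
    customer_id  INTEGER PRIMARY KEY REFERENCES customer (customer_id),
    loyalty_tier TEXT
);
CREATE TABLE customer_b (
    customer_id INTEGER PRIMARY KEY REFERENCES customer (customer_id),
    tax_number  TEXT
);
-- Orders need a single foreign key, to the parent table.
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer (customer_id)
);
INSERT INTO customer VALUES (1, 'Alice'), (2, 'Acme Ltd');
INSERT INTO customer_a VALUES (1, 'gold');
INSERT INTO customer_b VALUES (2, 'TX-1');
INSERT INTO orders VALUES (10, 1), (11, 2);
""")

# The customer type is whichever child table has a matching row.
order_types = conn.execute("""
SELECT o.order_id,
       CASE WHEN a.customer_id IS NOT NULL THEN 'A'
            WHEN b.customer_id IS NOT NULL THEN 'B' END AS customer_type
FROM orders AS o
LEFT JOIN customer_a AS a ON a.customer_id = o.customer_id
LEFT JOIN customer_b AS b ON b.customer_id = o.customer_id
ORDER BY o.order_id
""").fetchall()

# An order for a customer that does not exist is rejected by the foreign key.
try:
    with conn:
        conn.execute("INSERT INTO orders VALUES (12, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(order_types, orphan_rejected)
```

Nothing in this sketch stops one customer_id from appearing in both child tables; if the types must be mutually exclusive, that needs an extra type column or constraint on top of this layout.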