SQL Trigger After INSERT DELETE for Numbering - sql

I have a table with two columns, and together they make up the primary key for this table:
Column A | Number
----------+--------
Elephant | 1
Elephant | 2
Giraff | 1
Giraff | 2
Giraff | 3
Now, I want a trigger so that if you delete Giraff 2, then Giraff 1 stays the same, but Giraff 3 becomes Giraff 2.
This trigger should also handle the case where I insert an Elephant without a number: it should just go and pick 3 as the number.
So I am thinking of a trigger after insert and delete,
but I need an if statement/loop that goes through each row, re-evaluating the numbers and updating them if necessary.
Any ideas?

Two things.
This is a terrible idea, I can’t urge you enough not to do it. Whatever reason you think you have for this is wrong.
That said, yes, you could do this with triggers. But you don't need to delete an arbitrary record and renumber; you could instead just delete from the end, which points to a better solution than a trigger: use stored procedures instead.
That said, this is a terrible idea, please don’t do it. Please ask another question, giving the reason why you think you should do this. Get a better opinion.
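For what it's worth, here is a minimal sketch of what that could look like, assuming SQL Server and a table named animal_numbers(name, number) - both names are made up, so adapt them to your schema:

CREATE PROCEDURE dbo.AddAnimal
    @name varchar(255)
AS
BEGIN
    -- Pick the next free number for this name: MAX + 1, or 1 if none exists yet.
    INSERT INTO dbo.animal_numbers (name, number)
    SELECT @name, COALESCE(MAX(number), 0) + 1
    FROM dbo.animal_numbers WITH (UPDLOCK, HOLDLOCK)
    WHERE name = @name;
END;
GO

CREATE PROCEDURE dbo.RemoveAnimal
    @name varchar(255)
AS
BEGIN
    -- Delete from the end: drop the highest-numbered row, so nothing needs renumbering.
    DELETE FROM dbo.animal_numbers
    WHERE name = @name
      AND number = (SELECT MAX(number)
                    FROM dbo.animal_numbers
                    WHERE name = @name);
END;
GO

Wrapped in procedures like these, the numbers stay dense without any trigger, but the warnings above still apply.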

The re-evaluation that you are requesting leaves a lot of room for things to go wrong. With very small amounts of data it wouldn't be as much of a problem, but as the data grows and multiple users are working in it you will see problems.
You should consider the reason why you want the PK to change. You're better off redesigning the table to have a unique key, storing ANIMALSPECIES as a field, and counting the number by animal species. You may also want to add a field like ACTIVE and filter by its value. This way you can keep the data for later purposes.
I do not recommend what you are requesting.
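For illustration, a rough sketch of that redesign (assuming SQL Server; the table and column names are only illustrative):

CREATE TABLE ANIMALS (
    ID int IDENTITY(1,1) PRIMARY KEY,  -- stable surrogate key, never renumbered
    ANIMALSPECIES varchar(50) NOT NULL,
    ACTIVE bit NOT NULL DEFAULT 1      -- soft-delete flag instead of renumbering
);

-- Count per species on demand, looking only at active rows.
SELECT ANIMALSPECIES, COUNT(*) AS SpeciesCount
FROM ANIMALS
WHERE ACTIVE = 1
GROUP BY ANIMALSPECIES;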

Don't do this. Instead, have an auto-incrementing column. This looks like:
create table animals (
    animal_id int generated always as identity, -- depends on database
    a varchar(255)
);
Then create a view to calculate the value you want:
create view v_animals as
    select a,
           row_number() over (partition by a order by animal_id) as number
    from animals;
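Querying the view then gives a dense per-animal numbering that is recomputed on every read, so deletes need no trigger at all; for example:

select a, number
from v_animals
where a = 'Giraff'
order by number;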

Related

How to force ID column to remain sequential even if a record has been deleted, in SQL Server?

I don't know what the best wording for this question is, but I have a table that has 2 columns: ID and NAME.
When I delete a record from the table, the related ID is deleted with it and the sequence is broken.
Take this example:
if I delete row number 2, the sequence of the ID column will be: 1, 3, 4
How do I make it: 1, 2, 3?
ID's are meant to be unique for a reason. Consider this scenario:
**Customers**
id value
1 John
2 Jackie
**Accounts**
id customer_id balance
1 1 $500
2 2 $1000
In the case of a relational database, say you were to delete "John" from the database. Now Jackie would take on the customer_id of 1. When Jackie goes in to check her balance, she will now show $500 short.
Granted, you could go through and update all of her other records, but A) this would be a massive pain in the ass. B) It would be very easy to make mistakes, especially in a large database.
Ids (primary keys in this case) are meant to be the rock that holds your relational database together, and you should always be able to rely on that value regardless of the table.
As JohnFx pointed out, should you want a value that shows the order of the user, consider using a built in function when querying.
In SQL Server identity columns are not guaranteed to be sequential. You can use the ROW_NUMBER function to generate a sequential list of ids when you query the data from the database:
SELECT
    ROW_NUMBER() OVER (ORDER BY Id) AS SequentialId,
    Id AS UniqueId,
    Name
FROM dbo.Details
If you want sequential numbers don't store them in the database. That is just a maintenance nightmare, and I really can't think of a very good reason you'd even want to bother.
Just generate them dynamically using T-SQL's ROW_NUMBER function when you query the data.
The whole point of an Identity column is creating a reliable identifier that you can count on pointing to that row in the DB. If you shift them around you undermine the main reason you WANT an ID.
In a real world example, how would you feel if the IRS wanted to change your SSN every week so they could keep the Social Security Numbers sequential after people died off?

Uniqueness in many-to-many

I couldn't figure out what terms to google, so help tagging this question or just pointing me toward a related question would be helpful.
I believe that I have a typical many-to-many relationship:
CREATE TABLE groups (
    id integer PRIMARY KEY);
CREATE TABLE elements (
    id integer PRIMARY KEY);
CREATE TABLE groups_elements (
    groups_id integer REFERENCES groups,
    elements_id integer REFERENCES elements,
    PRIMARY KEY (groups_id, elements_id));
I want to have a constraint that there can only be one groups_id for a given set of elements_ids.
For example, the following is valid:
groups_id | elements_id
1 | 1
1 | 2
2 | 2
2 | 3
The following is not valid, because then groups 1 and 2 would be equivalent.
groups_id | elements_id
1 | 1
1 | 2
2 | 2
2 | 1
Not every subset of elements must have a group (this is not the power set), but new subsets may be formed. I suspect that my design is incorrect since I'm really talking about adding a group as a single entity.
How can I create identifiers for subsets of elements without risk of duplicating subsets?
That is an interesting problem.
One solution, albeit a clunky one, would be to store a concatenation of groups_id and elements_id in the groups table: 1-1-2, and make it a unique index.
Trying to do a search for duplicate groups before inserting a new row would be an enormous performance hit.
The following query would spit out offending group ids:
with group_elements_arr as (
    select groups_id, array_agg(elements_id order by elements_id) elements
    from groups_elements
    group by groups_id )
select elements, count(*), array_agg(groups_id) offending_groups
from group_elements_arr
group by elements
having count(*) > 1;
Depending on the size of groups_elements and its change rate, you might get away with stuffing something along these lines into a trigger watching groups_elements. If that's not fast enough, you can materialize group_elements_arr into a real table managed by triggers.
And I think the trigger should be FOR EACH STATEMENT and INITIALLY DEFERRED, to make it easy to build up a new group.
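A minimal sketch of such a trigger, reusing the query above (the function and trigger names are made up; note that PostgreSQL constraint triggers, the deferrable kind, are row-level, so a deferred variant would look slightly different):

create or replace function check_duplicate_groups() returns trigger as $$
begin
    if exists (
        select 1
        from (select groups_id,
                     array_agg(elements_id order by elements_id) as elements
              from groups_elements
              group by groups_id) g
        group by g.elements
        having count(*) > 1
    ) then
        raise exception 'duplicate element set in groups_elements';
    end if;
    return null;  -- return value is ignored for AFTER statement triggers
end;
$$ language plpgsql;

create trigger groups_elements_no_duplicates
    after insert or update or delete on groups_elements
    for each statement
    execute procedure check_duplicate_groups();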
This link from user ypercube was most helpful: unique constraint on a set. In short, a bit of what everyone is saying is correct.
It's a question of tradeoffs, but here are the best options:
a) Add a hash or some other combination of element values to the groups table and make it unique, then populate the groups_elements table off of it using triggers. Pros of this method are that it preserves querying ability and enforces the constraint so long as you deny naked updates to groups_elements. Cons are that it adds complexity and you've now introduced logic like "how do you uniquely represent a set of elements" into your database.
b) Leave the tables as-is and control the access to groups_elements with your access layer, be it a stored procedure or otherwise. This has the advantage of preserving querying ability and keeps the database itself simple. However, it means that you are moving an analytic constraint into your access layer, which necessarily means that your access layer will need to be more complex. Another point is that it separates what the data should be from the data itself, which has both pros and cons. If you need faster access to whether or not a set already exists, you can attack that problem separately.
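A rough sketch of option (a) above; the elements_hash column and the way the canonical form is built (here, sorted element ids joined with '-') are both assumptions:

alter table groups add column elements_hash text unique;

-- The application (or a trigger) keeps elements_hash as the canonical form of the set,
-- so the group holding elements {1, 2} is stored as '1-2'.
-- Inserting a second group with the same element set then fails on the unique constraint:
insert into groups (id, elements_hash) values (3, '1-2');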

redundant column

I have a database that has two tables, these tables look like this
codes
id | code | member_id
1 | 123 | 2
2 | 234 | 1
3 | 345 |
4 | 456 | 3
members
id | code_id | other info
1 | 2 | blabla
2 | 1 | blabla
3 | 4 | blabla
The basic idea is that if a code is taken then its member_id field is filled in. However, this creates a circular link (members points to codes, codes points to members). Is there a different way of doing this? Is this actually a bad thing?
Update
To answer your questions: there are three different code tables with approx. 3.5 million codes each, and each table is searched depending on different criteria. If the member_id column is empty then the code is unclaimed; otherwise the code is claimed. This is done so that when we are searching the database we do not need to join another table to tell whether a code is claimed.
The members table contains the claimants for every single code, so all 10.5 million members.
The additional info has things like mobile, flybuys.
The mobile is how we identify the member, but each entry is considered a different member.
It's a bad thing because you can end up with anomalies. For example:
codes
id | code | member_id
1 | 123 | 2
members
id | code_id | other info
2 | 4 | blabla
See the anomaly? Code 1 references its corresponding member, but that member doesn't reference the same code in return. The problem with anomalies is you can't tell which one is the correct, intended reference and which one is a mistake.
Eliminating redundant columns reduces the chance for anomalies. This is a simple process that follows a few very well defined rules, called rules of normalization.
In your example, I would drop the codes.member_id column. I infer that a member must reference a code, but a code does not necessarily reference a member. So I would make members.code_id reference codes.id. But it could go the other way; you don't give enough information for the reader to be sure (as @OMG Ponies commented).
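For illustration, a sketch of that structure (the column types are guesses):

CREATE TABLE codes (
    id int PRIMARY KEY,
    code varchar(20) NOT NULL
);
CREATE TABLE members (
    id int PRIMARY KEY,
    code_id int NOT NULL UNIQUE REFERENCES codes (id),  -- each member claims exactly one code
    other_info varchar(255)
);

-- A code is "claimed" exactly when some member references it, so unclaimed codes are:
SELECT c.id, c.code
FROM codes c
LEFT JOIN members m ON m.code_id = c.id
WHERE m.id IS NULL;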
Yeah, this is not good because it presents opportunities for data integrity problems. You've got a one-to-one relationship, so either remove Code_id from the members table, or member_id from the codes table. (in this case it seems like it would make more sense to drop code_id from members since it sounds like you're more frequently going to be querying codes to see which are not assigned than querying members to see which have no code, but you can make that call)
You could simply drop the member_id column and use a foreign key relationship (or its absence) to signify the relationship or lack thereof. The code_id column would then be used as a foreign key to the code. Personally, I do think it's bad simply because it makes it more work to ensure that you don't have corrupt relationships in the DB -- i.e., you have to check that the two columns are synchronized between the tables -- and it doesn't really add anything in the general case. If you are running into performance problems, then you may need to denormalize, but I'd wait until it was definitely a problem (and you'd likely replicate more than just the id in that case).
It depends on what you're doing. If each member always gets exactly one unique code then just put the actual code in the member table.
If there are a set of codes and several members share a code (but each member still has just one) then remove the member_id from the codes table and only store the unique codes. Access a specific code through a member. (you can still join the code table to search on codes)
If a member can have multiple codes then remove the code_id from the member table and the member_id from the code table, and create a third table that relates members to codes. Each record in the member table should be a unique record and each record in the code table should be a unique record.
What is the logic behind having the member code in the code table?
It's unnecessary since you can always just do a join if you need both pieces of information.
By having it there you create the potential for integrity issues since you need to update BOTH tables whenever an update is made.
Yes this is a bad idea. Never set up a database to have circular references if you can help it. Now any change has to be made both places and if one place is missed, you have a severe data integrity problem.
First question: can each code be assigned to more than one member? Or can each member have more than one code? (This includes over time as well as at any one moment, if you need historical records of who had what code when.) If the answer to either is yes, then your current structure cannot work. If the answer to both is no, why do you need two tables?
If you can have multiple codes and multiple members, you need a bridging table that has member id and code id. If you can have multiple members assigned one code, put the code id in the members table. If it is the other way around, it should be the member id in the code table. Then properly set up the foreign key relationship.
@Bill Karwin correctly identifies this as a probable design flaw which will lead to anomalies.
Assuming code and member are distinct entities, I would create a third table...
What is the relationship between a code and member called? An oath? If this is a real life relationship, someone with domain knowledge in the business will be able to give it a name. If not look for further design flaws:
oaths
code_id | member_id
1 | 2
2 | 1
4 | 3
The data suggest that a unique constraint is required for (code_id, member_id).
Once the data is 'scrubbed', drop the columns codes.member_id and members.code_id.
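A minimal sketch of that third table, reusing the hypothetical "oaths" name from above:

CREATE TABLE oaths (
    code_id   int NOT NULL REFERENCES codes (id),
    member_id int NOT NULL REFERENCES members (id),
    PRIMARY KEY (code_id, member_id)
    -- add UNIQUE (code_id) and/or UNIQUE (member_id) if each side may appear only once
);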

Change all primary keys in access table to new numbers

I have an Access table with an automatic primary key, a date, and other data. The first record starts at 36, due to deleted records. I want to change all the primary keys so they begin at 1 and increment, ordered by the date. What's the best way to do this?
I want to change the table from this:
| TestID | Date | Data |
| 36 | 12/02/09 | .54 |
| 37 | 12/04/09 | .52 |
To this:
| TestID | Date | Data |
| 1 | 12/02/09 | .54 |
| 2 | 12/04/09 | .52 |
EDIT: Thanks for the input and to those who answered. I think some were reading a little too much into my question, which is okay because it still adds to my learning and thinking process. The purpose of my question was twofold: 1) it would simply be nicer for me to have the PK match the order of my data's dates, and 2) to learn whether something like this was possible for later use, such as if I want to add a new column to the table which numbers the tests, labels the type of test, etc. I am trying to learn a lot at once right now, so I sometimes get confused about where to start. I am building .NET apps and trying to learn SQL and database management, and it is sometimes confusing finding the right info with the different RDBMSs and ways to interact with them.
Following from MikeW, you can use the following SQL command to copy the data from the old table to the new one (leave TestID out of the column lists so the AutoNumber field assigns fresh values):
INSERT INTO NewTable ([Date], Data)
SELECT [Date], Data
FROM OldTable
ORDER BY [Date];
The new TestID will start from 1 if you use an AutoIncrement field.
I would create a new table, with autoincrement.
Then select all the existing data into it, ordering by date. That will result in the IDs being recreated from "1".
Then you could drop the original table, and rename the new one.
Assuming no foreign keys - if so you'd have to drop and recreate those too.
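A rough Jet/Access SQL sketch of those steps, with the append query from the other answer in the middle (the table names and column types are only illustrative):

CREATE TABLE NewTable (
    TestID COUNTER PRIMARY KEY,  -- AutoNumber, restarts at 1 in the new table
    [Date] DATETIME,
    Data   DOUBLE
);

INSERT INTO NewTable ([Date], Data)
SELECT [Date], Data
FROM OldTable
ORDER BY [Date];  -- IDs get assigned in date order

DROP TABLE OldTable;
-- Jet SQL has no RENAME TABLE, so renaming NewTable back to the original name
-- is done in the Access UI or through DAO.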
An Autonumber used as a surrogate primary key is not data, but metadata used to do nothing but connect records in related tables. If you need to control the values in that field, then it's data, and you can't use an Autonumber; you have to roll your own autoincrement routine. You might want to look at this thread for a starting point, but code for this for use in Access is available everywhere Access programmers congregate on the Net.
I agree that auto-generated IDENTITY values should have no meaning, even for the coder, but for education purposes, here's how to reseed the IDENTITY using ADO:
ACC2000: Cannot Change Default Seed and Increment Value in UI
Note the article is out of date because it says, "there are no options available in the user interface (UI) for you to make this change." In later versions of Access, the SQL DDL can be executed when in ANSI-92 Query Mode, e.g. something like this:
ALTER TABLE MyTable ALTER COLUMN TestID IDENTITY (1, 1) NOT NULL;

SQL Table Design - Identity Columns

SQL Server 2008 Database Question.
I have 2 tables, for arguments sake called Customers and Users where a single Customer can have 1 to n Users. The Customers table generates a CustomerId which is a seeded identity with a +1 increment on it. What I'm after in the Users table is a compound key comprising the CustomerId and a sequence number such that in all cases, the first user has a sequence of 1 and subsequent users are added at x+1.
So the table looks like this...
CustomerId (PK, FK)
UserId (PK)
Name
...and if, for example, Customer 485 had three users, the data would look like...
CustomerId | UserId | Name
-----------+--------+------
485 | 1 | John
485 | 2 | Mark
485 | 3 | Luke
I appreciate that I can manually add the 1, 2, 3, ..., n entries for UserId; however, I would like this to happen automatically on row insert in SQL, so that in the example shown I could effectively insert rows with just the CustomerId and the Name, with SQL Server protecting the identity etc. Is there a way to do this through the database design itself? When I set UserId as an identity it runs 1 to infinity across all customers, which isn't what I am looking for - have I got a setting wrong somewhere, or is this not an option?
Hope that makes sense - thanks for your help
I can think of no automatic way to do this without implementing a custom stored procedure that inserts the rows and increments the Id appropriately, although others with more knowledge may have a better idea.
However, this smells to me of naturalising a surrogate key - which is not always a good idea.
More info here:
http://www.agiledata.org/essays/keys.html
That's not really an option with a regular identity column, but you could set up an insert trigger to auto-populate the user id.
The naive way to do this would be to have the trigger select the max user id from the users table for the customer id on the inserted record, then add one to that. However, you'll run into concurrency problems there if more than one person is creating a user record at the same time.
A better solution would be to have a NextUserID column on the customers table. In your trigger you would:
Start a transaction.
Increment the NextUserID for the customer (locking the row).
Select the updated next user id.
Use that for the new User record.
Commit the transaction.
This should ensure that simultaneous additions of users don't result in the same user id being used more than once.
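A rough T-SQL sketch of that idea (the NextUserID column, the trigger name, and the column list are assumptions about your schema):

ALTER TABLE dbo.Customers ADD NextUserID int NOT NULL DEFAULT 0;
GO

CREATE TRIGGER trg_Users_AssignUserId
ON dbo.Users
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @base TABLE (CustomerId int PRIMARY KEY, BaseId int);

    -- Steps 2 and 3: bump NextUserID per customer and capture its old value.
    -- The UPDATE locks the customer row, which serializes concurrent inserts
    -- for the same customer (steps 1 and 5 are implicit - a trigger already
    -- runs inside the statement's transaction).
    UPDATE c
    SET c.NextUserID = c.NextUserID + i.cnt
    OUTPUT inserted.CustomerId, deleted.NextUserID INTO @base (CustomerId, BaseId)
    FROM dbo.Customers AS c
    JOIN (SELECT CustomerId, COUNT(*) AS cnt
          FROM inserted
          GROUP BY CustomerId) AS i
        ON i.CustomerId = c.CustomerId;

    -- Step 4: use the reserved range for the new User rows.
    INSERT INTO dbo.Users (CustomerId, UserId, Name)
    SELECT ins.CustomerId,
           b.BaseId + ROW_NUMBER() OVER (PARTITION BY ins.CustomerId
                                         ORDER BY (SELECT NULL)),
           ins.Name
    FROM inserted AS ins
    JOIN @base AS b ON b.CustomerId = ins.CustomerId;
END;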
All that said, I would recommend that you just don't do it. It's more trouble than it's worth and just smells like a bad idea to begin with.
So you want a generated user_id field that increments within the confines of a customer_id.
I can't think of one database where that concept exists.
You could implement it with a trigger. But my question is: WHY?
Surrogate keys are supposed to not have any kind of meaning. Why would you try to make a key that, simultaneously, is the surrogate and implies order?
My suggestions:
Create a date_created field, defaulting to getDate(). That will allow you to know the order (time based) in which each user_id was created.
Create an ordinal field - which can be updated by a trigger, to support that order.
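A small sketch of the first suggestion (the column names are only illustrative):

ALTER TABLE dbo.Users ADD date_created datetime NOT NULL DEFAULT GETDATE();

-- Derive the per-customer ordering at query time instead of baking it into the key.
SELECT CustomerId,
       ROW_NUMBER() OVER (PARTITION BY CustomerId ORDER BY date_created) AS UserSeq,
       Name
FROM dbo.Users;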
Hope that helps.