Updating multiple rows which are in conflict with unique index - sql

I am using Microsoft SQL Server and I have a master-detail scenario where I need to store the order of details. So in the Detail table I have ID, MasterID, Position and some other columns. There is also a unique index on MasterID and Position. It works OK except one case: when I have some existing details and I change their order. For example when I change a detail on position 3 with a detail on position 2. When I save the detail on position 2 (which in the database has Position equal to 3) SQL Server protests, because the index uniqueness constraint.
How to solve this problem in a reasonable way?
Thank you in advance
Lukasz Glaz

This is a classic problem and the answer is simple: if you want to move item 3 to position 2, you must first change the sort column of 2 to a temporary number (e.g. 99). So it goes like this:
Move 2 to 99
Move 3 to 2
Move 99 to 3
You must be careful, though, that your temporary value is never used in normal processing and that you respect multiple threads if applicable.
Update: BTW - one way to deal with the "multiple users may be changing the order" issue is to do what I do: give each user a numberical ID and then add this to the temporary number (my staff ID is actually the Unique Identity field ID from the staff table used to gate logins). So, for example, if your positions will never be negative, you might use -1000 - UserID as your temporary value. Trust me on one thing though: you do not want to just assume that you'll never have a collision. If you think that and one does occur, it'll be extremely hard to debug!
Update: GUZ points out that his users may have reordered an entire set of line items and submitted them as a batch - it isn't just a switch of two records. You can approach this in one of two ways, then.
First, you could change the existing sort fields of the entire set to a new set of non-colliding values (e.g. -100 - (staffID * maxSetSize) + existingOrderVal) and then go record-by-record and change each record to the new order value.
Or you could essentially treat it like a bubble sort on an array where the orderVal value is the equivalent of your array index. Either this makes perfect sense to you (and is obvious) or you should stick with solution 1 (which is easier in any event).

you could just remove the unique constraint (but leave an index key) on the order column, and ensure uniqueness in your code if necessary.

Related

Updating the index in an SQL database (python)

while creating a tkinter application to store book information, I realize that simply deleting a row of information from the SQL database does not update the indexes. Kinda hard to explain but here is a picture of what I meant:
link to picture. (still young on this account, so pictures can't be embedded, sorry for the inconvenience)
As you can see, the first column represents the index and index 3 is missing because I deleted it. Is there a way such that upon deleting a row, anything below it just shifts up to cover for the empty spot?
Your use of the word "index" must be based on the application language, not the database language. In databases, indexes are additional data structures that speed certain operations on tables.
You are referring to an "id" column, presumably one that is defined automatically as identity, auto_increment, serial, or whatever the underlying database uses.
A very important point is that deleting a row from a table does not affect other rows in the table (unless you have gone through the work of writing triggers to make that happen). It just deletes the rows.
The second more important point is that you do not want to change the "identity" of rows -- and that is what the column you are calling an "index" is doing. It identifies the row. It not only identifies the row today, but it identifies the same row tomorrow. And, if it existed, yesterday. That is, you don't want to change the identity.
This is even more important when you have foreign key relationships -- that is, other tables that refer to this row. Those relationships could get all messed up if the ids start changing.
SQL does offer a simple way to get a number with no gaps:
select row_number() over (order by "index") as seqnum
from t;

How to increment a Couner column by 1 with each inserted row BUT per each guid

Apologies for the clumsy title feel free to suggest an improvement.
I have a table Records and there's a UserId column that refers to who's made the deposition. There's also Counter column which is identity(1,1) (it's important to keep in mind that it's not same one as the Id column that is the primary key).
The problem got obvious when we started depositing from different accounts, because before, the user could ask for record number 123 through 127, getting 5 amounts but now, their picks might be 123, 125, 126 or even worse - nothing at all.
The only option to handle it as far I can imagine to create a business logic layer that checks for the highest deposition counter for a user and adds the new record with that increased by one.
But it sure would be nice to have it automagically working. Something like identity(1,1,guid). Is it possible?
The only option to handle it as far I can imagine to create a business
logic layer that checks for the highest deposition counter for a user
and adds the new record with that increased by one.
Time to learn.
Add the last given number to the account table.
Use a trigger to assign higher numbers in the insert event of SQL Server.
Finished. This obviously assumes your relational database is used in a relational fashion so the records are related to a user table, not just holding a user id (which would be a terrible design).
If not, you can also maintain a table of all seen GUID's by means of said trigger.
To maintain such a column, you would need a trigger.
You might consider calculating the value when you query the table:
select r.*, row_number() over (partition by guid order by id) as seqnum
from records r;

rebuild/refresh my table's PK list - gap in numbers

I have finished all my changes to a database table in sql server management studio 2012, but now I have a large gap between some values due to editing. Is there a way to keep my data, but re-assign all the ID's from 1 up to my last value?
I would like this cleaned up as I populate dropdownlists with these values and then I make interactions with my database with the assumption that my dropdownlist index and the table's ID match up, which is not the case right now.
My current DB has a large gap from 7 to 28, I would like to shift everything from 28 and up, back down to 8, 9, 10, 11, ect... so that my database has NO gaps from 1 and onward.
If the solution is tricky please give me some steps as I am new to SQL.
Thank you!
Yes, there are any number of ways to "close the gaps" in an auto generated sequence. You say you're new to SQL so I'll assume you're also new to relational concepts. Here is my advice to you: don't do it.
The ID field is a surrogate key. There are several aspects of surrogates one must be mindful of when using them, but the one I want to impress upon you is,
-- A surrogate key is used to make the row unique. Other than the guarantee that
-- the value is unique, no other assumptions may be made concerning the value.
-- In particular, no meaning may be derived from the value as to the contents of
-- the row or the row's relationship to any other row.
You have designed your app with a built-in assumption of the value of the key field (that they will be consecutive). Already it is causing you problems. Do you really want to go through this every time you make changes to the table? And suppose a future feature requires you to filter out some of the choices according to an option the user has selected? Or enable the user to specify the order of the items? Not going to be easy. So what is the solution?
You can create an additional (non-visible) field in the dropdown list that contains the key value. When the user makes a selection, use that index to get the key value of the selection and then go out to the database and get whatever additional data you need. This will work if you populate the list from the entire table or just select a few according to some as yet unknown filtering criteria or change the order in any way.
Viola. You never have this problem again, no matter how often you add and remove rows in the table.
However, on the off chance that you are as stubborn as me (not likely!) or just refuse to listen to the melodious voice of reason and experience, then try this:
Create a new table exactly like the old table, including auto incrementing PK.
Populate the new table using a Select from the old table. You can specify any order you want.
Drop the old table.
Rename the new table to the old table name.
You will have to drop and redefine any FKs from other tables. But this entire process
can be placed in a script because if you do this once, you'll probably do it again.
Now all the values are consecutive. Until you edit the table again...
You should refactor the code for your dropdown list and not the PK of the table.
If you do not agree, you can do one of the following:
Insert another column holding the dropdown's "order of appearance", make a unique index on it and fill this by hand (or programmatically).
Replace the SERIAL with an INT would work, make a unique index on the column and fill this by hand (or programmatically).
Remove the large ids and reseed your serial - the code depending on your DBMS
This happens to me all the time. If you don't have any foreign key constraints then it should be an easy fix.
Remember a DELETE statement will remove the record but keep the identity seed the same. (If I remove id # 5 and #5 was the last record inserted then SQL server still stores the identity seed value at "6").
TRUNCATING the table will reset the identity seed back to it's original value.
INSERT_IDENTITY [TABLE] ON can also be used to insert the correct data in the correct order if tuncating cannot happen.
SELECT *
INTO #tempTable
FROM [TableTryingToFix]
TRUNCATE TABLE [TableTryingToFix];
INSERT INTO [TableTryingToFix] (COL1, COL2, COL3, ETC)
SELECT COL1, COL2, COL2, ETC
FROM #tempTable
ORDER BY oldTableID

Too many columns design question

I have a design question.
I have to store approx 100 different attributes in a table which should be searchable also. So each attribute will be stored in its own column. The value of each attribute will always be less than 200, so I decided to use TINYINT as data type for each attribute.
Is it a good idea to create a table which will have approx 100 columns (Each of TINYINT)? What could be wrong in this design?
Or should I classify the attributes into some groups (Say 4 groups) and store them in 4 different tables (Each approx have 25 columns)
Or any other data storage technique I have to follow.
Just for example the table is Table1 and it has columns Column1,Column2 ... Column100 of each TINYINT data type.
Since size of each row is going to be very small, Is it OK to do what I explained above?
I just want to know the advantages/disadvantages of it.
If u think that it is not a good idea to have a table with 100 columns, then please suggest other alternatives.
Please note that I don;t want to store the information in composite form (e.g. few xml columns)
Thanks in advance
Wouldn't a many-to-many setup work here?
Say Table A would have a list of widget, which your attributes would apply to
Table B has your types of attributes (color, size, weight, etc), each as a different row (not column)
Table C has foreign keys to the widget id (Table A) and the attribute type (Table B) and then it actually has the attribute value
That way you don't have to change your table structure when you've got a new attribute to add, you simply add a new attribute type row to Table C
Its ok to have 100 columns. Why not? Just employ code generation to reduce handwriting of this columns.
I wouldn't worry much about the number of columns per se (unless you're stuck using some really terrible relational engine, in which case upgrading to a decent one would be my most hearty recommendation -- what engine[s] do you plan/need to support, btw?) but about the searchability thereby.
Does the table need to be efficiently searchable by the value of an attribute? If you need 100 indexes on that table, THAT might make insert and update operations slow -- how frequent are such modifications (vs reads to the table and especially searches on attribute values) and how important is their speed to you?
If you do "need it all" there just possibly may be no silver bullet of a "perfect" solution, just compromises among unpleasant alternatives -- more info is needed to weigh them. Are typical rows "sparse", i.e. mostly NULL with just a few of the 100 attributes "active" for any given row (just different subsets for each)? Is there (at least statistically) some correlation among groups of attributes (e.g. most of the time when attribute 12 is worth 93, attribute 41 will be worth 27 or 28 -- that sort of thing)?
BAsed on your last, it seems to me that you may have a bad design. WHat is the nature of these columns? Are you storing information together that shouldn't be together, are you storing information that shoul be in related tables?
So really what we need to best help you is to see what the nature of the data you have is.
what would be in
column1,column3,column10 vice column4,column15,column20,column25
I had a table with 250 columns. There's nothing wrong. For some cases, it's how it works.
unless some of the columns you are defining have a meaning "per se" as independent entities and they can be shared by multiple rows. In that case, it makes sense to normalize out the set of columns in a different table, and put a column in the original table (possibly with a foreign key constraint)
I think the correct way is to have a table that looks more like:
CREATE TABLE [dbo].[Settings](
[key] [varchar](250) NOT NULL,
[value] tinyint NOT NULL
) ON [PRIMARY]
Put an index on the key column. You can eventually make a page where the user can update the values.
Having done a lot of these in the real world, I don't understand why anyone would advocate having each variable be its own column. You have "approx 100 different attributes" so far, you don't think you are going to want to add and delete to this list? Every time you do it is a table change and a production release? You will not be able to build something to hand the maintenance off to a power user. Your reports are going to be hard-coded too? Things take off and you reach the max number of columns of 1,024 are you going to rework the whole thing?
It is nothing to expand the table above - add Category, LastEditDate, LastEditBy, IsActive, etc. or to create archiving functionality. Much more awkward to do this with the column based solution.
Performance is not going to be any different with this small amount of data, but to rely on the programmer to make and release a change every time the list changes is unworkable.

Database-wide unique-yet-simple identifiers in SQL Server

First, I'm aware of this question, and the suggestion (using GUID) doesn't apply in my situation.
I want simple UIDs so that my users can easily communicate this information over the phone :
Hello, I've got a problem with order
1584
as opposed to
hello, I've got a problem with order
4daz33-d4gerz384867-8234878-14
I want those to be unique (database wide) because I have a few different kind of 'objects' ... there are order IDs, and delivery IDs, and billing-IDs and since there's no one-to-one relationship between those, I have no way to guess what kind of object an ID is referring to.
With database-wide unique IDs, I can immediately tell what object my customer is referring to. My user can just input an ID in a search tool, and I save him the extra-click to further refine what is looking for.
My current idea is to use identity columns with different seeds 1, 2, 3, etc, and an increment value of 100.
This raises a few question though :
What if I eventually get more than 100 object types? granted I could use 1000 or 10000, but something that doesn't scale well "smells"
Is there a possibility the seed is "lost" (during a replication, a database problem, etc?)
more generally, are there other issues I should be aware of?
is it possible to use an non integer (I currently use bigints) as an identity columns, so that I can prefix the ID with something representing the object type? (for example a varchar column)
would it be a good idea to user a "master table" containing only an identity column, and maybe the object type, so that I can just insert a row in it whenever a need a new idea. I feel like it might be a bit overkill, and I'm afraid it would complexify all my insertion requests. Plus the fact that I won't be able to determine an object type without looking at the database
are there other clever ways to address my problem?
Why not use identities on all the tables, but any time you present it to the user, simply tack on a single char for the type? e.g. O1234 is an order, D123213 is a delivery, etc.? That way you don't have to engineer some crazy scheme...
Handle it at the user interface--add a prefix letter (or letters) onto the ID number when reporting it to the users. So o472 would be an order, b531 would be a bill, and so on. People are quite comfortable mixing letters and digits when giving "numbers" over the phone, and are more accurate than with straight digits.
You could use an autoincrement column to generate the unique id. Then have a computed column which takes the value of this column and prepends it with a fixed identifier that reflects the entity type, for example OR1542 and DL1542, would represent order #1542 and delivery #1542, respectively. Your prefix could be extended as much as you want and the format could be arranged to help distiguish between items with the same autoincrement value, say OR011542 and DL021542, with the prefixes being OR01 and DL02.
I would implement by defining a generic root table. For lack of a better name call it Entity. The Entity table should have at a minimum a single Identity column on it. You could also include other fields that are common accross all your objects or even meta data that tells you this row is an order for example.
Each of your actual Order, Delivery...tables will have a FK reference back to the Entity table. This will give you a single unique ID column
Using the seeds in my opinion is a bad idea, and one that could lead to problems.
Edit
Some of the problems you mentioned already. I also see this being a pain to track and ensure you setup all new entities correctly. Imagine a developer updating the system two years from now.
After I wrote this answer I had thought a but more about why your doing this, and I came to the same conclusion that Matt did.
MS's intentional programing project had a GUID-to-word system that gave pronounceable names from random ID's
Why not a simple Base36 representation of a bigint? http://en.wikipedia.org/wiki/Base_36
We faced a similar problem on a project. We solved it by first creating a simple table that only has one row: a BIGINT set as auto-increment identity.
And we created an sproc that inserts a new row in that table, using default values and inside a transaction. It then stores the SCOPE_IDENTITY in a variable, rolls back the transaction and then returns the stored SCOPE_IDENTITY.
This gives us a unique ID inside the database without filling up a table.
If you want to know what kind of object the ID is referring to, I'd lose the transaction rollback and also store the type of object along side the ID. That way findout out what kind of object the Id is referring to is only one select (or inner join) away.
I use a high/low algorithm for this. I can't find a description for this online though. Must blog about it.
In my database, I have an ID table with an counter field. This is the high part. In my application, I have a counter that goes from 0 to 99. This is the low part. The generated key is 100 * high + low.
To get a key, I do the following
initially high = -1
initially low = 0
method GetNewKey()
begin
if high = -1 then
high = GetNewHighFromDatabase
newkey = 100 * high + low.
Inc low
If low = 100 then
low = 0
high = -1
return newKey
end
The real code is more complicated with locks etc but that is the general gist.
There are a number of ways of getting the high value from the database including auto inc keys, generators etc. The best way depends on the db you are using.
This algorithm gives simple keys while avoiding most the db hit of looking up a new key every time. In testing, I found it had similar performance to guids and vastly better performance than retrieving an auto inc key every time.
You could create a master UniqueObject table with your identity and a subtype field. Subtables (Orders, Users, etc.) would have a FK to UniqueObject. INSTEAD OF INSERT triggers should keep the pain to a minimum.
Maybe an itemType-year-week-orderNumberThisWeek variant?
o2009-22-93402
Such identifier can consist of several database column values and simply formatted into a form of an identifier by the software.
I had a similar situation with a project.
My solution: By default, users only see the first 7 characters of the GUID.
It's sufficiently random that collisions are extremely unlikely (1 in 268 million), and it's efficient for speaking and typing.
Internally, of course, I'm using the entire GUID.