Unique constraint vs pre checking - sql

I use SQL Server 2008, and I have a table with a column of type varchar(X) which I want to have unique values.
What is the best way to achieve that? Should I use a unique constraint and catch the exception, or should I pre-check before inserting a new value?
One issue: the application is used by many users, so I guess that pre-checking might result in a race condition if two users insert the same value at the same time.
Thanks

The race condition is an excellent point to be aware of.
Why not do both? Pre-check so you can give good feedback to the user, but definitely have the unique constraint as your ultimate safeguard.
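A minimal sketch of the "do both" idea, using Python's built-in sqlite3 as a stand-in for SQL Server so it can be run as-is. The table name `items`, the column `name`, and the helper `add_item` are made up for illustration. The pre-check only provides friendly feedback; the UNIQUE constraint is what actually prevents duplicates when two sessions race.

```python
import sqlite3

# Hypothetical table and column names; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name VARCHAR(50) NOT NULL UNIQUE)")


def add_item(conn, name):
    """Return 'ok', or 'duplicate' if the name already exists."""
    # Pre-check: good for user feedback, but it cannot prevent a race.
    exists = conn.execute(
        "SELECT 1 FROM items WHERE name = ?", (name,)
    ).fetchone()
    if exists:
        return "duplicate"
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO items (name) VALUES (?)", (name,))
    except sqlite3.IntegrityError:
        # Another session inserted the same value after our pre-check.
        return "duplicate"
    return "ok"


first = add_item(conn, "alpha")
second = add_item(conn, "alpha")
```

Both paths report the duplicate the same way, so the caller does not need to know whether the pre-check or the constraint caught it.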

Let the DB do the work for you. Create the unique constraint.

If it's a requirement that the values be unique, then a constraint is the only guaranteed way to achieve that. Reliable pre-checking would require a level of locking that makes that part of your system essentially single-user.

Use a constraint (UNIQUE or PRIMARY KEY). That way the key is enforced for every application. You could perform additional checks and handling in a stored procedure if you need to, either before or after the insert.

Related

constraints different moments

I have a table of schedules.
So my question is this: how can I add a constraint that forbids a vessel from being scheduled more than once a day?
Thanks ahead.
Simply add a unique constraint/index on the vessel and date:
create unique index unq_tourschedule_vesselid_tourdate on tourschedule(vesselid, tourdate);
(A unique constraint is implemented using a unique index.)
You should do this in the database, so even manual changes to the data enforce this constraint.
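To see the unique index at work, here is a small runnable sketch using Python's sqlite3 in place of the asker's database. Only `tourschedule`, `vesselid`, `tourdate` and the index name come from the answer above; the column types and the `schedule` helper are assumptions. It also assumes `tourdate` holds a date without a time part, otherwise two tours on the same day would not collide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the thread does not show the real table definition.
conn.execute(
    "CREATE TABLE tourschedule (vesselid INTEGER NOT NULL, tourdate TEXT NOT NULL)"
)
conn.execute(
    "CREATE UNIQUE INDEX unq_tourschedule_vesselid_tourdate "
    "ON tourschedule (vesselid, tourdate)"
)


def schedule(vesselid, tourdate):
    """Try to schedule a tour; return False if the index rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO tourschedule (vesselid, tourdate) VALUES (?, ?)",
                (vesselid, tourdate),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    schedule(1, "2024-05-01"),  # first tour for vessel 1 that day
    schedule(1, "2024-05-01"),  # same vessel, same day: rejected
    schedule(1, "2024-05-02"),  # same vessel, another day
    schedule(2, "2024-05-01"),  # another vessel, same day
]
```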
It depends on what level you need to "prevent" the scheduling. Do you want to prevent it from the UI, the middle-tier, or at the database level?
UI - Do an AJAX check against DB or middle-tier check and prevent insertion of the record there (not a secure solution, but worth mentioning because it informs your users of an existing record).
Middle Tier - best place. Query your DB to see if a record exists with that given vesselID and TourDate. If any records are returned, do not allow insertion. You could then redirect to the page with a helpful message to the user. Business logic goes here typically, and it is best to decouple your business logic from your database.
Database level - most robust, but least maintainable and bad practice for business logic visibility. Many options, all of them cumbersome:
Stored procedure - upon insert, check the records, same procedure as middle tier, but you have to funnel your "error" message up through all the tiers.
Compound key using vesselID and TourDate ensures automatically that only unique entries can be inserted.
Constraint on the table data upon insertion - not just an index, which is for searching optimization, but an actual constraint. This constraint may be added to an existing table or be part of the table creation statement itself.
Yes, I have created a unique index and everything worked out all right. Thank you for helping me out.

SQL Server: How to allow duplicate records on small table

I have a small table "ImgViews" that only contains two columns, an ID column called "imgID" + a count column called "viewed", both set up as int.
The idea is to use this table only as a counter so that I can track how often an image with a certain ID is viewed / clicked.
The table has no primary or foreign keys and no relationships.
However, when I enter some data for testing and try entering the same imgID multiple times it always appears greyed out and with a red error icon.
Usually this makes sense, as you don't want duplicate records, but as the purpose is different here, duplicates do make sense for me.
Can someone tell me how I can achieve this or work around it ? What would be a common way to do this ?
Many thanks in advance, Tim.
To address your requirement to store non-unique values, simply remove primary keys, unique constraints, and unique indexes. I expect you may still want a non-unique clustered index on ImgID to improve performance of aggregate queries that would otherwise require a scan of the entire table and a sort. I suggest you store an insert timestamp, not to provide uniqueness, but to facilitate purging data by date, should the need arise in the future.
You must have some unique index on that table. Make sure there is no unique index and no unique or primary key constraint.
Or, SSMS simply doesn't know how to identify the row that was just inserted because it has no key.
It is generally not best practice to have a table without a (logical) primary key. In your case, I'd make the image id the primary key and increment the counter. The MERGE statement is well-suited for performing an insert or update at the same time. Alternatives exist.
If you don't like that, create a surrogate primary key (an identity column set as the primary key).
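The "image id as primary key, increment the counter" approach can be sketched as follows. This is not the MERGE statement itself: it uses Python's sqlite3 as a stand-in and the simpler "insert the row if missing, then update" pattern inside one transaction, which is one of the alternatives alluded to above. The helper name `record_view` is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# imgID is the primary key, so there is exactly one counter row per image.
conn.execute(
    "CREATE TABLE ImgViews (imgID INTEGER PRIMARY KEY, viewed INTEGER NOT NULL)"
)


def record_view(img_id):
    """Count one view for img_id, creating its row on first use."""
    with conn:  # both statements run in a single transaction
        conn.execute(
            "INSERT OR IGNORE INTO ImgViews (imgID, viewed) VALUES (?, 0)",
            (img_id,),
        )
        conn.execute(
            "UPDATE ImgViews SET viewed = viewed + 1 WHERE imgID = ?",
            (img_id,),
        )


for img in (7, 7, 9, 7):
    record_view(img)

counts = dict(conn.execute("SELECT imgID, viewed FROM ImgViews ORDER BY imgID"))
```

After the four calls, `counts` holds one entry per image rather than four duplicate rows.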
At the moment you have no way of addressing a specific row. That makes the table a little unwieldy.
If you allow multiple rows to be absolutely identical, how would you update/delete one of those rows?
How would you expect the database to "know" which row you referred to?
At the very least, add a separate identity column (preferably as the clustered index, too).
As a side note: it's weird that you "like to avoid unneeded data" but at the same time insert duplicates over and over again instead of simply adding up the click count per image...
Use SQL statements, not the GUI, if the table has no primary key or unique constraint.

Fixing holes/gaps in numbers generated by Postgres sequence

I have a postgres database that uses sequences extensively to generate primary keys of tables.
After a lot of usage of this database (add/update/delete operations), the columns that use sequences for primary keys now have a lot of holes/gaps in them, and the sequence value itself is very high.
My question is: is there any way to fix these gaps in the primary keys, which should in turn bring down the maximum value in those columns, and then reset the sequence?
Note: A lot of these columns are also referenced by other columns as ForeignKeys.
If you feel the need to fill gaps in auto-generated PostgreSQL sequence numbers, I have the feeling you need another field in your table, like some kind of "number" you increment programmatically, either in your code or in a trigger.
It is possible to solve this problem, but it is expensive for the database (especially in I/O) and the gaps are guaranteed to recur. I would not worry about this problem. If you get close to the limit of a 4-byte integer key (about 2.1 billion in PostgreSQL, where integers are signed), upgrade your primary and foreign keys to BIGSERIAL and BIGINT. If you're getting close to 2^63... well... I'd be interested in hearing more about your application. :~]
Postgres allows you to update PKs, although a lot of people think it's bad practice. So you could lock the table and UPDATE. (You can build an oldkey, newkey mapping table in all sorts of ways, e.g., with a window function.) All the FK relationships have to be marked to cascade on update. Then you can reset the id sequence with setval.
Personally, I would just use a BIGSERIAL. If you have so many updates and deletes that you may run out even so, maybe there is some composite PK based on (say) a timestamp and id that would help you.
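For what it's worth, here is a rough sketch of the renumbering idea, run against SQLite through Python's sqlite3 rather than Postgres so that it is self-contained. The `parent` and `child` tables are invented, and the old-to-new mapping is built in Python instead of with a window function. Renumbering in ascending order of the old id means each new id is never larger than the old one, so an update cannot collide with a row that has not been moved yet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE child (id INTEGER PRIMARY KEY, "
    "parent_id INTEGER REFERENCES parent(id) ON UPDATE CASCADE)"
)
conn.executemany("INSERT INTO parent VALUES (?, ?)", [(3, "a"), (7, "b"), (10, "c")])
conn.executemany("INSERT INTO child VALUES (?, ?)", [(1, 7), (2, 10), (3, 10)])
conn.commit()

with conn:  # renumber everything in one transaction
    old_ids = [row[0] for row in conn.execute("SELECT id FROM parent ORDER BY id")]
    for new_id, old_id in enumerate(old_ids, start=1):
        if new_id != old_id:
            # ON UPDATE CASCADE rewrites child.parent_id to match.
            conn.execute("UPDATE parent SET id = ? WHERE id = ?", (new_id, old_id))

parents = conn.execute("SELECT id, name FROM parent ORDER BY id").fetchall()
children = conn.execute("SELECT id, parent_id FROM child ORDER BY id").fetchall()
```

In Postgres the last step would be to move the sequence to the new maximum, for example `SELECT setval('parent_id_seq', (SELECT max(id) FROM parent))`, where the sequence name is an assumption.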

Does this query guarantee me a 'race free' PK value?

I was just reading How to avoid a database race condition when manually incrementing PK of new row.
There were a lot of good suggestions, like having a separate table to get the PK values.
So I wonder if a query like this:
INSERT INTO Party VALUES(
(SELECT MAX(id)+1 FROM
(SELECT id FROM Party) as x),
'A-XXXXXXXX-X','Joseph')
could avoid race conditions?
Is the whole statement guaranteed to be atomic? Is it in MySQL? In PostgreSQL?
The best way to avoid race conditions while creating primary keys in a relational database is to allow the database to generate the primary keys.
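As a small illustration of letting the database generate the key, here is a sketch using Python's sqlite3, where an `INTEGER PRIMARY KEY AUTOINCREMENT` column plays the role that AUTO_INCREMENT plays in MySQL or a serial/identity column plays in PostgreSQL. The column names `code` and `name` and the helper `add_party` are assumptions; the question only shows the values being inserted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names other than id are assumed for this example.
conn.execute(
    "CREATE TABLE Party ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "code TEXT NOT NULL, "
    "name TEXT NOT NULL)"
)


def add_party(code, name):
    """Insert a row without supplying an id and return the generated one."""
    with conn:
        cur = conn.execute(
            "INSERT INTO Party (code, name) VALUES (?, ?)", (code, name)
        )
    return cur.lastrowid


first_id = add_party("A-XXXXXXXX-X", "Joseph")
second_id = add_party("B-XXXXXXXX-X", "Maria")
```

The application never computes `MAX(id)+1` itself, so there is no window in which two sessions can pick the same value.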
It would work on tables which use table-level locking (MyISAM), but on InnoDB etc. it could deadlock or produce duplicate keys, I think, depending on the isolation level in use.
In any case doing this is an extremely bad idea as it won't work well in the general case, but might appear to work during low-concurrency testing. It's a recipe for trouble.
You'd be better off using another table and incrementing a value in there; that's more likely to be race-free / deadlock-free.
No, you still have a problem: if two queries try to increment at the same time, one may run its inner SELECT and then the other may be processed before the first one inserts, so both compute the same id.
Your best bet, if you want a guarantee and don't want the database generating the value, is to have a unique key on the column.
If the insert fails with an error, try your query again; once the computed key is unique, it will work.
In this case, your best bet is to first insert only the id and any other non-null columns, and then do an update to set the nullable columns to whatever is correct.
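The "unique key plus retry" suggestion could look roughly like this. It is a sketch in Python's sqlite3, not the asker's MySQL/PostgreSQL setup; the column names, `insert_with_retry`, and the deliberately stale `racy_next_id` (which simulates a second session having taken id 1 after our read) are all invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The primary key doubles as the unique key that catches collisions.
conn.execute(
    "CREATE TABLE Party (id INTEGER PRIMARY KEY, code TEXT NOT NULL, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO Party VALUES (1, 'B-XXXXXXXX-X', 'Maria')")
conn.commit()


def max_plus_one(conn):
    return conn.execute("SELECT COALESCE(MAX(id), 0) + 1 FROM Party").fetchone()[0]


stale = iter([1])  # first read is stale: it "saw" the table before id 1 existed


def racy_next_id(conn):
    return next(stale, None) or max_plus_one(conn)


def insert_with_retry(conn, code, name, next_id, max_attempts=5):
    """Insert with an application-chosen id, retrying on key collisions."""
    for _ in range(max_attempts):
        candidate = next_id(conn)
        try:
            with conn:
                conn.execute(
                    "INSERT INTO Party (id, code, name) VALUES (?, ?, ?)",
                    (candidate, code, name),
                )
            return candidate
        except sqlite3.IntegrityError:
            continue  # someone else took this id; compute a new one
    raise RuntimeError("could not allocate a unique id")


new_id = insert_with_retry(conn, "A-XXXXXXXX-X", "Joseph", racy_next_id)
```

The first attempt collides on id 1, the retry recomputes the id and succeeds. Capping the attempts keeps a busy table from turning this into an endless loop.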

Is it OK not to use a Primary Key When I don't Need one

If I don't need a primary key should I not add one to the database?
You do need a primary key. You just don't know that yet.
A primary key uniquely identifies a row in your table.
The fact it's indexed and/or clustered is a physical implementation issue and unrelated to the logical design.
You need one for the table to make sense.
If you don't need a primary key then don't use one. I usually have the need for primary keys, so I usually use them. If you have related tables you probably want primary and foreign keys.
Yes, but only in the same sense that it's okay not to use a seatbelt if you're not planning to be in an accident. That is, it's a small price to pay for a big benefit when you need it, and even if you think you don't need it odds are you will in the future. The difference is you're a lot more likely to need a primary key than to get in a car accident.
You should also know that some database systems create a primary key for you if you don't, so you're not saving that much in terms of what's going on in the engine.
No, unless you can find an example of, "This database would work so much better if table_x didn't have a primary key."
You can make an argument to never use a primary key, if performance, data integrity, and normalization are not required. Security and backup/restore capabilities may not be needed, but eventually, you put on your big-boy pants and join the real world of database implementation.
Yes, a table should ALWAYS have a primary key... unless you don't need to uniquely identify the records in it. (I like to make absolute statements and immediately contradict them)
When would you not need to uniquely identify the records in a table? Almost never. I have done this before, though, for things like audit log tables: data that won't be updated or deleted, and won't be constrained in any way. Essentially structured logging.
A primary key will always help with query performance. So if you ever need to query using the key, reference it from a foreign key, or use it as a lookup, then yes, create a primary key.
I don't know. I have used a couple tables where there is just a single row and a single column. Will always only be a single row and a single column. There is no foreign key relationships.
Why would I put a primary key on that?
A primary key is mainly formally defined to aid referential integrity; however, if the table is very small, or is unlikely to contain unique data, then it's an unnecessary overhead.
Defining indexes on the table can normally be used to imply a primary key without formally declaring one.
However you should consider that defining the Primary key can be useful for Developers and Schema generation or SQL Dev tools, as having the meta data helps understanding, and some tools rely on this to correctly define the Primary/foreign key relationships in the model.
Well...
Each table in a relational DB needs a primary key. As already noted, a primary key is data that identifies a record uniquely...
You might get away with not having an "ID" field if you have an N-M table that joins two different tables and you can uniquely identify the record by the values of both columns you join (a composite primary key).
Having a table without a primary key is against the first normal form and has no place in a relational DB.
You should always have a primary key, even if it's just an ID. Maybe NoSQL is what you're after instead (just asking)?
That depends very much on how sure you can be that you don't need one. If you have just the slightest bit of doubt, add one - you'll thank yourself later. An indicator being if the data you store could be related to other data in your DB at one point.
One use case I can think of is a logging kind of table, in which you simply dump one entry after the other (to properly process them later). You probably won't need a primary key there, if you're storing enough data to filter out the relevant messages (like a date). Of course, it's questionable to use an RDBMS for this.