I'm not exactly sure how to phrase this, but here goes...
We have a table structure like the following:
Id | Timestamp | Type | Clientid | ..others..
001 | 1234567890 | TYPE1 | CL1234567 |.....
002 | 1234561890 | TYPE1 | CL1234567 |.....
For the data given above, I would like a constraint so that those two rows cannot exist together. Essentially, I want the table to be
Unique for (Type, ClientId, CEIL(Timestamp/10000)*10000)
I don't want rows with the same data, created within X time of each other, to be added to the DB; i.e., I would like a constraint violation in this case. The problem is that the above constraint is not something I can actually create.
Before you ask: I know, I know... why, right? Well, I know a certain scenario should not be happening, but alas it is. I need a sort of stopgap measure for now, so I can buy some time to investigate the actual matter. Let me know if you need additional info...
Yes, Oracle (11g and later) supports calculated (virtual) columns:
SQL> alter table test add calc_column as (trunc(timestamp/10000));
Table altered.
SQL> alter table test
add constraint test_uniq
unique (type, clientid, calc_column);
Table altered.
should do what you want.
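If you are on an older Oracle version without virtual columns, a function-based unique index should give the same effect without adding a column at all; a sketch using the same names as above:

CREATE UNIQUE INDEX test_uniq ON test (type, clientid, TRUNC(timestamp / 10000));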
AFAIK, older versions of Oracle do not support computed columns the way SQL Server does. You can mimic the functionality of a computed column using triggers.
Here are the steps for this:
Add a column called CEILCalculation to your table.
On your table, put a trigger that updates CEILCalculation with the value of CEIL(Timestamp/10000)*10000 (see the sketch below).
Create a unique index on the three columns: (Type, ClientId, CEILCalculation).
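A minimal sketch of that trigger, assuming the table is named test as in the other answer (the trigger name, table name, and column names are assumptions):

CREATE OR REPLACE TRIGGER trg_test_ceil
BEFORE INSERT OR UPDATE ON test
FOR EACH ROW
BEGIN
  -- keep the bucket column in sync with the raw timestamp
  :NEW.CEILCalculation := CEIL(:NEW.Timestamp / 10000) * 10000;
END;
/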
If you do not want to modify the table structure, you can put a BEFORE INSERT trigger on the table and check for validity there.
http://www.techonthenet.com/oracle/triggers/before_insert.php
I have a table place2022 which has a very long CHAR column
timestamp | user_id | pixel_color | coordinate
-----------------+------------------------------------------------------------------------------------------+-------------+------------
17:38:20.021+00 | p0sXpmkcmg1KLiCdK5e4xKdudb1f8cjscGs35082sKpGBfQIw92nZ7yGvWbQ/ggB1+kkRBaYu1zy6n16yL/yjA== | #FF4500 | 371,488
17:38:20.024+00 | Ctar52ln5JEpXT+tVVc8BtQwm1tPjRwPZmPvuamzsZDlFDkeo3+ItUW89J1rXDDeho6A4zCob1MKmJrzYAjipg== | #51E9F4 | 457,493
17:38:20.025+00 | rNMF5wpFYT2RAItySLf9IcFZwOhczQhkRhmTD4gv0K78DpieXrVUw8T/MBAZjj2BIS8h5exPISQ4vlyzLzad5w== | #000000 | 65,986
17:38:20.025+00 | u0a7l8hHVvncqYmav27EARAE6ciLtpUTPXMI33lDrUmtj5Ei3ixlfRuG28KUvs7r5LpeiE/iOKPALVjkILhrYg== | #3690EA | 73,961
The user_ids are already hashes, so all I really care about here is having some sort of id column which is 1-1 with the user_id.
I've counted the number of unique user_ids: 10381163, which fits into 24 bits. Therefore, I can compress the id field down to a 32-bit integer using the obvious scheme of "assign 1 to the first new user_id you see, 2 to the second new user_id you see", etc. I don't even care whether the user_ids are mapped in the order they're seen: I just need them mapped to 32-bit ints in some invertible manner. I'd also like to persist this mapping somewhere so that, if I want to, I can go backwards.
What would be the best way to achieve this? I imagine that we could create a new table (create table place2022_user_ids as select distinct(user_id) from place2022;?) and then reverse-lookup the user_id column in that table, but I don't know quite how to formulate the queries and also make sure that I'm not doing something ridiculously slow.
I am using postgresql, if it matters.
If you have a recent (>8) version of Postgres, you can add an auto-increment id column to an existing table:
ALTER TABLE place2022
ADD COLUMN id SERIAL PRIMARY KEY;
NB: if the table already has a primary key, you will need to drop that constraint first.
See drop primary key constraint in postgresql by knowing schema and table name only
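For the mapping-table approach sketched in the question, something like the following should work (Postgres; the CHAR(88) width is an assumption based on the sample hashes):

-- build the id <-> user_id mapping once; SERIAL assigns 1, 2, 3, ...
CREATE TABLE place2022_user_ids (
    id      SERIAL PRIMARY KEY,
    user_id CHAR(88) UNIQUE NOT NULL
);

INSERT INTO place2022_user_ids (user_id)
SELECT DISTINCT user_id FROM place2022;

-- forward lookup: replace the long hash with the compact id
SELECT p.timestamp, u.id, p.pixel_color, p.coordinate
FROM place2022 p
JOIN place2022_user_ids u USING (user_id);

-- reverse lookup: recover the original hash from an id
SELECT user_id FROM place2022_user_ids WHERE id = 42;

The UNIQUE constraint creates the index that keeps both the join and the reverse lookup fast.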
Hi, I am using DataStage to import Hive data into Oracle. I don't have any primary key constraints in Hive, but on Oracle I have a composite primary key.
For example, I have data with no duplicates when comparing whole records, but which does have duplicates on the primary key columns.
The table:
Table name: item_details (Hive, no primary key constraints)
| Id | mfg_date   | item | exp_date   |
| 1  | 12-01-2018 | abc  | 31-03-2018 |
| 2  | 12-01-2018 | cde  | 28-02-2018 |
| 3  | 15-01-2018 | efg  | 10-04-2018 |
| 4  | 12-01-2018 | abc  | 10-04-2018 |
The mfg_date and item columns together are the primary key of the target table (Oracle), which has the same structure.
I need to push the data into the target table, but the job reports a primary key violation and aborts.
Can anybody give me a solution?
PS: We cannot change the schema of the tables.
This is what I use [[twt]] for; it's faster than doing Lookup and Join stages.
Start by changing the output SQL from Insert to Custom SQL.
Then you can create a custom SQL statement like the one below:
insert into <<target table>>
select Id, mfg_date, item, exp_date from [[twt]]
where not exists (select 1 from <<target table>>
                  where <<target table>>.mfg_date = [[twt]].mfg_date
                    and <<target table>>.item = [[twt]].item)
It will load only the records whose key does not already exist in the target.
Because Custom SQL allows multiple statements, you can do your whole load this way.
By using [[twt]], you change this to an ELT style of loading and control the inserts yourself.
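Since the target is Oracle, a MERGE statement can express the same conditional load in one step; a sketch, assuming the staged rows land in a hypothetical work table called stage_item_details:

-- insert only rows whose (mfg_date, item) key is not already in the target
MERGE INTO item_details tgt
USING (SELECT id, mfg_date, item, exp_date FROM stage_item_details) src
ON (tgt.mfg_date = src.mfg_date AND tgt.item = src.item)
WHEN NOT MATCHED THEN
    INSERT (id, mfg_date, item, exp_date)
    VALUES (src.id, src.mfg_date, src.item, src.exp_date);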
Thanks.
I am developing an application in which I have multiple tables:
Table Name : User_master
|Id (PK,AI) | Name | Email | Phone | Address
Table Name: User_auth
|Id (PK,AI) | user_id(FK_userMaster) | UserName | password | Active_Status
Table Name: Userbanking_details
|Id (PK,AI) | user_id(FK_userMaster) | Bank Name | account Name | IFSC
Now, what I want is that updates should not overwrite records directly; instead, the changes should be versioned, meaning I want to keep a log of all previous updates a user has made.
That means if a user updates their address, the previous address should still be kept in the table as history.
I have tried adding the fields version_name, version_latest, and updated_version_of, and inserting a new record on each update, like this:
|Id (PK,AI) | Name | Email | Phone | Address |version_name |version_latest| updated_version_of
1 | ABC |ABC@gm.com|741852|LA |1 |0 |1
2 | ABC |ABC@gm.com|852741|NY |2 |1 |1
Now the problem: the user table is referenced by foreign keys from the two other tables, so when a record is updated this way, the relationship is lost because the updated row gets a new ID.
I want to preserve the old data as history, while new transactions take effect only against the newly updated record.
How can I achieve this?
Depending upon your use case, you can either add a JSON field to each table for storing previous states, or create an identical history table for each table.
Dump the entire row into the JSON history column every time the user updates anything.
Or insert a new row into the history table for each update of the original (a sketch of this variant follows).
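A minimal sketch of the history-table variant (MySQL syntax assumed, since the question's columns look auto-increment; names and column sizes are assumptions):

-- history table mirroring User_master, plus audit metadata
CREATE TABLE user_master_history (
    history_id INT AUTO_INCREMENT PRIMARY KEY,
    user_id    INT NOT NULL,   -- the User_master.Id this row is a snapshot of
    name       VARCHAR(100),
    email      VARCHAR(100),
    phone      VARCHAR(20),
    address    VARCHAR(200),
    changed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- copy the current row into history, then update in place;
-- User_master.Id never changes, so foreign keys stay intact
INSERT INTO user_master_history (user_id, name, email, phone, address)
SELECT Id, Name, Email, Phone, Address FROM User_master WHERE Id = 1;

UPDATE User_master SET Address = 'NY' WHERE Id = 1;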
Storing historical records and current records in the same table is not good practice in a transactional system.
The reasons are:
There will be more I/O due to scanning more pages to identify a record
Additional maintenance effort on the table
Transactions get bigger and longer, and can cause timeout issues
Additional effort of cascading referential integrity changes to child tables
I would suggest keeping historical records in a separate table. You can use the OUTPUT clause to capture the historical records and insert them into that separate table. That way, your referential integrity remains the same. In the historical table, you don't need a PK defined.
Below is a sample using the OUTPUT clause with UPDATE. You can read more about the OUTPUT clause here.
DECLARE @Updated table( [ID] int,
    [Name_old] varchar(50),
    [Email_old] varchar(50),
    [Phone_old] varchar(50),
    [Address_old] varchar(50),
    [ModifiedDate_old] datetime);

UPDATE User_Master
SET Email = 'NewEmail@Email.com', Name = 'newName', Phone = 'NewPhone', Address = 'NewAddress',
    ModifiedDate = GETDATE()
OUTPUT deleted.Id AS ID, deleted.Name AS Name_old, deleted.Email AS Email_old,
    deleted.Phone AS Phone_old, deleted.Address AS Address_old, deleted.ModifiedDate AS ModifiedDate_old
INTO @Updated
WHERE [Id] = 1;

INSERT INTO User_Master_History
SELECT * FROM @Updated;
When I have faced this situation in the past I have solved it in the following ways:
First Method
Recommended method.
Have a second table which acts as a change history. Because you are not adding rows to the main table your foreign keys maintain integrity.
There are now mechanisms in SQL Server to do this automatically.
SQL Server temporal tables (SQL Server 2016 and later; sketched below)
SQL Server Change Data Capture (available since SQL Server 2008)
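A minimal sketch of a system-versioned temporal table (SQL Server 2016+; table and column names are assumed from the question):

CREATE TABLE dbo.User_master
(
    Id      INT NOT NULL PRIMARY KEY,
    Name    NVARCHAR(100),
    Email   NVARCHAR(100),
    Phone   NVARCHAR(20),
    Address NVARCHAR(200),
    -- system columns maintained automatically on every update
    ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo   DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.User_master_History));

-- read the row as it was at a point in time
SELECT * FROM dbo.User_master FOR SYSTEM_TIME AS OF '2020-01-01' WHERE Id = 1;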
Second Method
I don't recommend this as a good design, but it does work.
Treat one record as the primary record, and this record maintains a foreign key relationship with records in other related tables which are subject to change tracking.
Always update this primary record with any changes thereby maintaining integrity of the foreign keys.
Add a self-referencing key to this table e.g. Parent and a date-of-change column.
Each time the primary record is updated, store the old values into a new record in the same table, and set the Parent value to the id of the primary record. Again the primary record never changes, and therefore your foreign key relationships maintain integrity.
Using the date-of-change column in conjunction with the change history allows you to reconstruct the exact values at any point in time.
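A sketch of this second method in T-SQL (the Parent and ChangedAt columns are the additions described above):

-- self-reference plus change timestamp
ALTER TABLE User_master ADD Parent INT NULL REFERENCES User_master(Id);
ALTER TABLE User_master ADD ChangedAt DATETIME NULL;

-- archive the old values as a child row of the stable primary record...
INSERT INTO User_master (Name, Email, Phone, Address, Parent, ChangedAt)
SELECT Name, Email, Phone, Address, Id, GETDATE()
FROM User_master WHERE Id = 1;

-- ...then update the primary record in place, keeping its Id
UPDATE User_master SET Address = 'NewAddress' WHERE Id = 1;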
For database design: if the value of a column comes from a fixed list of strings (such as a status or type), should I create a new table and reference it with a foreign key, or just store the plain strings in the same table?
For example, I have an orders table with a status column:
----------------------------
| id | price | status |
----------------------------
| 1 | 10.00 | pending |
| 2 | 03.00 | in_progress |
| 3 | xx.xx | done |
An alternative to the table above is to have an order_status table and store a status_id in the orders table. I'm not sure whether another table is necessary here.
If there are more than just a few different values, and/or values are frequently added, you should go with a normalized data model, i.e. a separate table.
Otherwise you might also go for a plain column, but then you need to add a CHECK (status IN ('pending','in_progress','done')) to avoid wrong data. This way you get the same consistency without the FK.
To save space you might use abbreviations (one or a few characters, e.g. 'p', 'i', 'd'), but not meaningless numbers (1, 2, 3). Resolving the long values can be done at the view level using CASE (see the sketch below).
ENUMs are proprietary, so IMHO it's better to avoid them...
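A sketch of that combination, abbreviated codes guarded by a CHECK constraint, with the long labels resolved in a view (generic SQL; names assumed):

CREATE TABLE orders (
    id     INT PRIMARY KEY,
    price  DECIMAL(10,2),
    status CHAR(1) NOT NULL CHECK (status IN ('p', 'i', 'd'))
);

-- resolve the abbreviations back to readable labels
CREATE VIEW orders_v AS
SELECT id, price,
       CASE status
           WHEN 'p' THEN 'pending'
           WHEN 'i' THEN 'in_progress'
           WHEN 'd' THEN 'done'
       END AS status
FROM orders;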
It's not good practice to create a table just for static values.
Instead, you could use the ENUM type (shown here in MySQL), which has a preset list of values, as in the example:
CREATE TABLE orders (
id INT,
price DOUBLE,
status ENUM('pending', 'in progress', 'done')
);
There are pros and cons for each solution, and you need to pick the best one for your own project; you may have to switch later if the initial choice turns out badly.
In your case, storing status directly can be good enough. But if you want to prevent invalid status stored in your database, or you have a very long status text, you may want to store them separately with a foreign key constraint.
ENUM is another solution. However, if you need a new status later, you have to change your table definition, which can be a very bad thing.
If the status has extra data associated with it, like display order or a colour, then you would need a separate table. Also, choosing pre-entered values from a table prevents semi-duplicate values (for example, one person might write "in progress" whereas another might write "in_progress" or "progressing") and aids in searching for orders with the same status.
I would go for a separate table, as it allows more capabilities and reduces errors.
I would use an order_status table with the status literal as the primary key. Then, in your orders table, cascade updates on the status column in case you modify the literals in the order_status table. This way you have data consistency and avoid join queries (see the sketch below).
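A sketch of that design (generic SQL; in MySQL the foreign key must be written at table level, as here):

CREATE TABLE order_status (
    status VARCHAR(20) PRIMARY KEY
);

INSERT INTO order_status (status) VALUES ('pending'), ('in_progress'), ('done');

CREATE TABLE orders (
    id     INT PRIMARY KEY,
    price  DECIMAL(10,2),
    status VARCHAR(20) NOT NULL,
    FOREIGN KEY (status) REFERENCES order_status (status) ON UPDATE CASCADE
);

With ON UPDATE CASCADE, renaming 'in_progress' in order_status automatically rewrites every referencing row in orders.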
I have an Access table with an automatic primary key, a date, and other data. The first record starts at 36 due to deleted records. I want to change all the primary keys so they begin at 1 and increment, ordered by the date. What's the best way to do this?
I want to change the table from this:
| TestID | Date | Data |
| 36 | 12/02/09 | .54 |
| 37 | 12/04/09 | .52 |
To this:
| TestID | Date | Data |
| 1 | 12/02/09 | .54 |
| 2 | 12/04/09 | .52 |
EDIT: Thanks for the input and to those who answered. I think some were reading a little too much into my question, which is okay because it still adds to my learning and thinking process. The purpose of my question was twofold: 1) it would simply be nicer for me to have the PK match the order of my data's dates, and 2) I wanted to learn whether something like this is possible for later use, such as adding a new column to the table which numbers the tests, labels the type of test, etc. I am trying to learn a lot at once right now, so I sometimes get confused about where to start. I am building .NET apps and trying to learn SQL and database management, and it is sometimes confusing to find the right info with the different RDBMSs and ways to interact with them.
Following on from MikeW, you can use the following SQL command to copy the data from the old table to the new one:

INSERT INTO NewTable ([Date], [Data])
SELECT [Date], [Data]
FROM OldTable
ORDER BY [Date];

The new TestID will start from 1 if TestID in NewTable is an AutoNumber field; leave it out of the column list so Access numbers the rows for you.
I would create a new table, with autoincrement.
Then select all the existing data into it, ordering by date. That will result in the IDs being recreated from "1".
Then you could drop the original table, and rename the new one.
Assuming no foreign keys - if so you'd have to drop and recreate those too.
An AutoNumber used as a surrogate primary key is not data, but metadata used to do nothing but connect records in related tables. If you need to control the values in that field, then it's data, and you can't use an AutoNumber; you have to roll your own autoincrement routine. You might want to look at this thread for a starting point, but code for this for use in Access is available everywhere Access programmers congregate on the Net.
I agree that the value of the auto-generated IDENTITY values should have no meaning, even for the coder, but for education purposes, here's how to reseed the IDENTITY using ADO:
ACC2000: Cannot Change Default Seed and Increment Value in UI
Note that the article is out of date where it says "there are no options available in the user interface (UI) for you to make this change." In later versions of Access, DDL like the following can be executed when in ANSI-92 query mode:

ALTER TABLE MyTable ALTER COLUMN TestID COUNTER(1, 1);