Combine results from multiple views and add identifier - sql

I have a view that selects and joins data from several tables.
I have this same view in multiple databases on different servers. (it's part of the same application installed on various servers)
What I'm trying to do is use the 'Export Data' wizard to create an SSIS package that copies the data from these views to a single data warehouse database.
However, since I can't guarantee that there won't be identical rows in the views, I want to add an ID column in the data warehouse db. But I can't seem to get it to work.
Usually, when you want an auto-increment ID, you simply insert NULL into that column. So I added a NULL value to the view's SELECT list, and I added an ID column to the destination table, with Identity and auto-increment on.
However, when I run the Export Data wizard, it gives an error:
'The value violated the integrity constraints for the column.'
Does anyone have an idea how to combine data from different views on different db servers and add a unique identifier in the destination table?
Cheers, CJ

You can't insert NULL into the primary key. If the ID column is auto-incremented, just leave it out of the INSERT entirely. Example:
CREATE TABLE test_table
(
ID int IDENTITY(1,1) PRIMARY KEY,
test varchar(255) NOT NULL
);
INSERT INTO test_table (test)
VALUES ('some value');
And ID will be set to 1 for this record.
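The same idea carries over to copying rows in from a view: name only the data columns and let the identity column fill itself. This is a minimal T-SQL sketch, not the wizard's own output. The names source_rows, dw_rows and source_name are made up for illustration, and source_rows stands in for one server's view.

```sql
-- T-SQL sketch. All object names here are hypothetical.
CREATE TABLE source_rows                -- stands in for one server's view
(
    test varchar(255) NOT NULL
);

CREATE TABLE dw_rows                    -- destination table in the warehouse
(
    ID int IDENTITY(1,1) PRIMARY KEY,
    source_name varchar(50) NOT NULL,   -- which server the row came from
    test varchar(255) NOT NULL
);

-- Two identical rows, to mimic duplicates in the view.
INSERT INTO source_rows (test) VALUES ('some value'), ('some value');

-- ID is not mentioned, so SQL Server assigns it; identical source rows
-- still end up with distinct IDs.
INSERT INTO dw_rows (source_name, test)
SELECT 'server_a', test
FROM source_rows;
```

Tagging each row with its source is optional, but it makes it possible to tell later which server a row came from.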

Related

Postgres BIGSERIAL does not share sequence when inserts are made with multiple remote fdw sources

I am trying to insert into a table in a Postgres database from two other Postgres databases using Foreign Data Wrappers. The objective is to have an auto-generated primary key that is independent of the source, as there will be more than two sources.
I first defined the tables like so:
Target database:
create table dummy (
dummy_pk bigserial primary key
-- other fields
);
Source databases:
create foreign table dummy (
dummy_pk bigserial
-- other fields
) server ... ;
This solution worked fine as long as I inserted from only one source. When I tried to insert from the other one, without specifying dummy_pk, I got this message:
Duplicate key (dummy_pk)=(1)
Because Postgres tries to insert an id of 1, I believe each source's foreign table uses its own sequence. I changed the source tables a bit in an attempt to let the target table's sequence generate the id:
create foreign table dummy (
dummy_pk bigint
-- other fields
) server ... ;
This time I got a different error:
NULL value violates NOT NULL constraint on column « dummy_pk »
Therefore I believe the source server sends the target a query in which dummy_pk is null, and the target does not replace it with the default value.
So, is there a way I can force the use of the target's sequence in a query executed on the source? Maybe I have to share that sequence; can I create a foreign sequence? I cannot remove the column from the foreign tables, as I need read access to it.
Thanks!
Remove dummy_pk from the foreign tables so that the destination table receives neither NULL nor a value for it, and therefore falls back to its DEFAULT (or to NULL if no DEFAULT is specified). If you try to pass DEFAULT through the foreign table, it will use the DEFAULT of the foreign table instead.
create foreign table dummy (
/*dummy_pk bigserial,*/
column1 text,
column2 int2,
-- other fields
) server ... ;
Another way would be to fetch sequence values from the destination server using dblink, but I think removing the column is the better option (if you can afford to have it removed from the foreign tables).
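For the dblink alternative, a rough sketch run on a source server might look like the following. It assumes the dblink extension is installed, that the foreign table keeps dummy_pk as a plain bigint, and that the target's sequence has the default bigserial name dummy_dummy_pk_seq. The connection string is a placeholder.

```sql
-- PostgreSQL sketch, run on a source server.
-- Assumptions: dblink is installed, 'host=target_host dbname=target_db' is a
-- placeholder connection string, and dummy_dummy_pk_seq is the sequence that
-- bigserial created on the target.
INSERT INTO dummy (dummy_pk /*, other fields */)
SELECT t.next_pk /*, other values */
FROM dblink('host=target_host dbname=target_db',
            'SELECT nextval(''dummy_dummy_pk_seq'')')
     AS t(next_pk bigint);
```

Every inserted row costs one round trip to the target for nextval, which is part of why dropping the column from the foreign table is the simpler route.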

How to make a field NOT NULL in a multi-tenant database

This is a multi-tenant app. All records have a client id to separate client data. Customers can insert their own data in this table and decide whether their own field is nullable or NOT NULL. Therefore, setting the whole field NOT NULL will not work. I need to make a field NOT NULL for a specific client id only.
I am currently querying the database to check whether the value is null. On INSERT, I check whether the value being inserted is null and, if so, I throw an error. I would like the database to do all these checks. Is this possible in a multi-tenant database like this?
Also, I need suggestions for SQL Server, Oracle and PostgreSQL. Thanks
With PostgreSQL at least, you could do this with table inheritance.
You could define an inherited table for this specific client which included the required constraint.
Consider the following example:
psql=> CREATE TABLE a(client INT NOT NULL, id SERIAL, foo TEXT);
CREATE TABLE
psql=> CREATE TABLE b(foo TEXT NOT NULL, CHECK (CLIENT=1) ) INHERITS (a);
NOTICE: moving and merging column "foo" with inherited definition
DETAIL: User-specified column moved to the position of the inherited column.
CREATE TABLE
psql=> INSERT INTO b(client,foo) VALUES (1,'a');
INSERT 0 1
psql=> INSERT INTO b(client,foo) VALUES (1,NULL);
ERROR: null value in column "foo" violates not-null constraint
DETAIL: Failing row contains (1, 2, null).
The table 'b' in this case inherits from 'a' but has a different definition for column 'foo' including a not-null constraint. Also note that I have used a check constraint to ensure that only records for client 1 can go into this table.
For this to work, either your application would have to be updated to insert client records into the correct table, or you would need to write a trigger that does that automatically. Examples of how to do that are given in the manual section on partitioning.
Your application can still make queries against the parent table ('a' from my example) and get the records for all clients, including any in child tables.
You won't be able to do this with a column constraint. I think you're going to have to write a trigger.
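A trigger along those lines could be sketched as follows for PostgreSQL. The tables records and required_fields, the column foo_required and the function name are all hypothetical. The idea is that each client's rule lives in a small settings table which the trigger consults on every write.

```sql
-- PostgreSQL sketch. All table, column and function names are hypothetical.
CREATE TABLE required_fields (
    client       int PRIMARY KEY,
    foo_required boolean NOT NULL      -- true: this client wants foo NOT NULL
);

CREATE TABLE records (
    client int NOT NULL,
    id     serial,
    foo    text
);

CREATE FUNCTION enforce_required_foo() RETURNS trigger AS $$
BEGIN
    -- Reject the row only if this client has marked foo as required.
    IF NEW.foo IS NULL AND EXISTS (
        SELECT 1
        FROM required_fields r
        WHERE r.client = NEW.client AND r.foo_required
    ) THEN
        RAISE EXCEPTION 'foo is required for client %', NEW.client;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER records_required_foo
BEFORE INSERT OR UPDATE ON records
FOR EACH ROW EXECUTE PROCEDURE enforce_required_foo();
```

Because the rule is data rather than schema, a customer can switch it on or off with an ordinary UPDATE on required_fields. SQL Server and Oracle have their own trigger syntax, so this would need porting for them.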

postgresql: error duplicate key value violates unique constraint

This question has been asked by several people, but my problem seems to be different.
I have to merge identically structured tables from different PostgreSQL databases into a new DB. I connect to the remote db using dblink, read the table in that db, and insert it into the table in the current DB, like below:
INSERT INTO t_types_of_dementia
SELECT *
FROM dblink('host=localhost port=5432 dbname=snap-cadence password=xxxxxx',
'SELECT dementia_type, snapid FROM t_types_of_dementia')
AS x(dementia_type VARCHAR(256), snapid integer);
The first time, this query runs fine, but when I run it again, or run it against some other remote database's table, it gives me this error:
ERROR: duplicate key value violates unique constraint
"t_types_of_dementia_pkey"
I want this new table to be populated with the entries of the other tables from the other dbs.
Some of the proposed solutions talk about sequences, but I am not using any.
The structure of the table in the current db is
CREATE TABLE t_types_of_dementia(
dementia_type VARCHAR(256),
snapid integer NOT NULL,
PRIMARY KEY (dementia_type,snapid)
);
P.S. There is a specific reason why both columns are used as the primary key, which may not be relevant to this discussion because the same issue happens in other tables where this is not the case.
As the error message tells you, you cannot have two rows with the same values in the columns dementia_type, snapid, since together they need to be unique.
You have to make sure that the two databases do not contain the same values for dementia_type, snapid.
A workaround would be to add a surrogate key column to your table, for example alter table t_types_of_dementia add column id serial, and use that as the primary key instead of your current one.
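Spelled out, the surrogate-key workaround could look roughly like this. The constraint name comes from the error message in the question and the dblink call is the asker's own. The column name id is an assumption.

```sql
-- PostgreSQL sketch of the surrogate-key workaround.
-- The constraint name is taken from the error message in the question.
ALTER TABLE t_types_of_dementia DROP CONSTRAINT t_types_of_dementia_pkey;
ALTER TABLE t_types_of_dementia ADD COLUMN id serial PRIMARY KEY;

-- Name the data columns so that id is filled from its sequence.
INSERT INTO t_types_of_dementia (dementia_type, snapid)
SELECT dementia_type, snapid
FROM dblink('host=localhost port=5432 dbname=snap-cadence password=xxxxxx',
            'SELECT dementia_type, snapid FROM t_types_of_dementia')
     AS x(dementia_type VARCHAR(256), snapid integer);
```

With this change, repeated (dementia_type, snapid) pairs are stored as separate rows instead of being rejected, so it only fits if duplicates are acceptable in the merged table.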

Is there a smart way to append a number to a PK identity column in a Relational database w/o total catastrophe?

It's far from the ideal situation, but I need to fix a database by appending the number "1" to the PK Identity column, which has FK relations to four other tables. I'm basically making a four-digit number a five-digit number. I need to maintain the relations. I could store the number in a var, do a Set query and append the 1, and do that for each table...
Is there a better way of doing this?
You say you are using an identity column for your primary key, so before you update the numbers you will have to run SET IDENTITY_INSERT <table> ON and then turn it off again after the update.
As long as you have cascading updates set for your relations the other tables should be updated automatically.
EDIT: As it's not possible to change an identity value I guess you have to export the data, set the new identity values (+10000) and then import your data again.
Anyone have a better suggestion...
Consider adding another field to the PK instead of extending the length of the PK field. Your new field will have to cascade to the related tables, like a field length increase would, but you get to retain your original PK values.
My suggestion is:
Stop writing to the tables.
Copy the tables to new tables with the new PK.
Rename the old tables to backup names.
Rename the new tables to the original table name.
Count the rows in all the tables and double check your work.
Continue using the tables.
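The copy-and-rename steps above can be sketched in T-SQL as follows. The table dbo.Customer and its columns are hypothetical, and adding 10000 stands in for whatever key transform is needed.

```sql
-- T-SQL sketch. dbo.Customer and its columns are hypothetical.
CREATE TABLE dbo.Customer
(
    id   int IDENTITY(1,1) PRIMARY KEY,
    name varchar(100) NOT NULL
);

-- Copy to a new table with the new key. Because id + 10000 is an
-- expression, the new column is a plain int without the IDENTITY property.
SELECT id + 10000 AS id, name
INTO dbo.Customer_new
FROM dbo.Customer;

-- Keep the old table as a backup and swap the new one into place.
EXEC sp_rename 'dbo.Customer', 'Customer_backup';
EXEC sp_rename 'dbo.Customer_new', 'Customer';

-- Double check: both counts should be equal.
SELECT (SELECT COUNT(*) FROM dbo.Customer_backup) AS old_rows,
       (SELECT COUNT(*) FROM dbo.Customer)        AS new_rows;
```

SELECT ... INTO copies only the data, so the primary key, foreign keys, indexes and triggers have to be recreated on the new table afterwards.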
Changing a PK after the fact is not fun.
If the column in question has an identity property on it, it gets complicated. This is more-or-less how I'd do it:
Back up your database.
Put it in single user mode. You don't need anybody mucking around whilst you do the surgery.
Execute the ALTER TABLE statements necessary to
disable the primary key constraint on the table in question
disable all triggers on the table in question
disable all foreign key constraints referencing the table in question.
Clone your table, giving it a new name and a column-for-column identical definition. Don't bother with any triggers, indices, foreign keys or other constraints. Omit the identity property from the table's definition.
Create a new 'map' table that will map your old id values to the new value:
create table dbo.pk_map
(
old_id int not null primary key clustered ,
new_id int not null unique nonclustered ,
)
Populate the map table:
insert dbo.pk_map
select old_id = old.id ,
new_id = f( old.id ) -- f(x) is the desired transform
from dbo.tableInQuestion old
Populate your new table, giving the primary key column the new value:
insert dbo.tableInQuestion_NEW
select id = map.new_id ,
...
from dbo.tableInQuestion old
join dbo.pk_map map on map.old_id = old.id
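Truncate the original table: TRUNCATE TABLE dbo.tableInQuestion. Be aware that SQL Server will not truncate a table that is referenced by a foreign key constraint, even a disabled one; in that case use DELETE FROM dbo.tableInQuestion, or drop the foreign keys instead of disabling them. Emptying the table is safe at this point, since you've disabled all the triggers and foreign key constraints.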
Execute SET IDENTITY_INSERT dbo.tableInQuestion ON.
Reload the original table:
-- an explicit column list is required while IDENTITY_INSERT is ON
insert dbo.tableInQuestion ( id , ... )
select id , ...
from dbo.tableInQuestion_NEW
Execute SET IDENTITY_INSERT dbo.tableInQuestion OFF
Execute drop table dbo.tableInQuestion_NEW. We're all done with it.
Execute DBCC CHECKIDENT( 'dbo.tableInQuestion' , RESEED ) to get the identity counter back in sync with the data in the table.
Now, use the map table to propagate the changed primary key column down the line. Depending on your E-R model, this can get complicated as foreign keys referencing the updated column may themselves be part of a composite primary key.
When you're all done, start re-enabling the constraints and triggers you disabled. Make sure you do this using the WITH CHECK option. Fix any problems thus uncovered.
Finally, drop the map table, and clear the single user flag and bring your system(s) back online.
Piece of cake! (or something.)
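The disable, propagate and re-enable parts of this procedure can be sketched as below. dbo.childTable, its parent_id column and the constraint name FK_child_parent are hypothetical stand-ins for whatever references the table in your schema.

```sql
-- T-SQL sketch. dbo.childTable, parent_id and FK_child_parent are hypothetical.

-- Disable the referencing foreign key and the triggers before the surgery.
ALTER TABLE dbo.childTable NOCHECK CONSTRAINT FK_child_parent;
ALTER TABLE dbo.tableInQuestion DISABLE TRIGGER ALL;

-- ... rewrite the primary key values as described above ...

-- Propagate the new key values to the referencing table via the map.
UPDATE c
SET    c.parent_id = m.new_id
FROM   dbo.childTable AS c
JOIN   dbo.pk_map     AS m ON m.old_id = c.parent_id;

-- Re-enable; WITH CHECK makes SQL Server validate the existing rows.
ALTER TABLE dbo.tableInQuestion ENABLE TRIGGER ALL;
ALTER TABLE dbo.childTable WITH CHECK CHECK CONSTRAINT FK_child_parent;
```

If the final statement fails, some child rows still point at ids that no longer exist, which is exactly the kind of problem the WITH CHECK option is meant to surface.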
Consider this approach:
Reset the identity seed to 10000 + the current seed.
Set identity insert on.
Insert into the table from the values already in the table, adding 10000 to the identity column on the way.
Example:
-- dbo.MyTable, id, column1 and column2 are placeholder names
DECLARE @original_max int = (SELECT MAX(id) FROM dbo.MyTable);
SET IDENTITY_INSERT dbo.MyTable ON;
INSERT dbo.MyTable (id, column1, column2)
SELECT id + 10000, column1, column2
FROM dbo.MyTable
WHERE id <= @original_max;
SET IDENTITY_INSERT dbo.MyTable OFF;
After the insert you know each new identity value is exactly 10000 more than the original.
Update the foreign keys by adding 10000.

How to move data between multiple database's table while maintaining foreign-key relationships/referential integrity?

I'm trying to figure out the best way to move/merge a couple of tables' worth of data from multiple databases into one.
I have a schema similar to the following:
CREATE TABLE Products(
ProductID int IDENTITY(1,1) NOT NULL,
Name varchar(250) NOT NULL,
Description varchar(1000) NOT NULL,
ImageID int NULL
)
CREATE TABLE Images (
ImageID int IDENTITY(1,1) NOT NULL,
ImageData image NOT NULL
)
With a foreign-key of the Products' ImageID to the Images' ImageID.
So what's the best way to move the data contained within these tables from multiple source databases into one destination database with the same schema? My primary issue is maintaining the links between the products and their respective images.
In SQL Server, you can enable identity inserts:
SET IDENTITY_INSERT NewTable ON
<insert queries here>
SET IDENTITY_INSERT NewTable OFF
While identity insert is enabled, you can insert a value in the identity column like any other column. This allows you to just copy the tables, for example from a linked server:
-- a column list is required while IDENTITY_INSERT is ON; id, col1, col2 are placeholder names
insert into newdb.dbo.NewTable (id, col1, col2)
select id, col1, col2
from oldserver.olddb.dbo.OldTable
I first put the data in staging tables (adding a newid column to each). I temporarily add an oldid column to the table I'm merging into. I insert the data into the parent table, putting the correct oldid in the oldid column. I then use the oldid column to join to the staging table and populate its newid column. Now I have the new FK ids for the child tables and can insert using them. If you have SQL Server 2008, you can use the OUTPUT clause to return the old and new ids to a temp table and work from there rather than adding the column. I prefer to have the change explicitly stored in a staging table, though, to troubleshoot issues in the conversion. At the end, null out the values in the oldid column if you are then going to add records from a third database, or drop the column if you are done. Leave the staging tables in place for about a month, to make research into any questions easier.
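One way to get the old and new ids out of the OUTPUT clause is sketched below, using the Products and Images tables from the question. A plain INSERT can only output columns of the inserted row, so this sketch uses MERGE with a condition that never matches, which lets OUTPUT also see the source's old id. SourceImages and SourceProducts are hypothetical staging copies of one source database's tables, and I have not checked the statement against the legacy image column type specifically.

```sql
-- T-SQL sketch (SQL Server 2008 or later).
-- SourceImages and SourceProducts are hypothetical staging copies of one
-- source database's Images and Products tables.
DECLARE @ImageMap TABLE
(
    OldImageID int PRIMARY KEY,
    NewImageID int NOT NULL
);

-- ON 1 = 0 never matches, so every source row is inserted, and OUTPUT can
-- pair the source's old id with the newly generated identity value.
MERGE INTO Images AS tgt
USING SourceImages AS src
    ON 1 = 0
WHEN NOT MATCHED THEN
    INSERT (ImageData) VALUES (src.ImageData)
OUTPUT src.ImageID, inserted.ImageID INTO @ImageMap (OldImageID, NewImageID);

-- Insert the products, translating each old ImageID through the map.
-- LEFT JOIN keeps products whose ImageID is NULL.
INSERT INTO Products (Name, Description, ImageID)
SELECT p.Name, p.Description, m.NewImageID
FROM SourceProducts AS p
LEFT JOIN @ImageMap AS m ON m.OldImageID = p.ImageID;
```

The whole script has to be repeated per source database, since old ids from different sources can collide and the map is only valid for one source at a time.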
In this case you could move the images and then move the products. This would ensure that any image a product has a reference to will already be in place.