I was trying to figure out an easy and fast way to create a table A from another table B which has more than 400 columns. Just a create or replace table statement as below would have worked.
create or replace table A
AS select * from B where 1=2
However, I need to create table A with an extra column as ID and also need to add clustering on this column ID. Altering the table later will also do but I understand that I cannot add clustering once the table is created. I do not want to write a DDL specifying all 400 columns. Can this be achieved in an easy and faster way?
I was also looking at options to create table dynamically by using INFORMATION_SCHEMA.COLUMNS information. However, I am not yet sure of that.
I have a table people with less than 100,000 records and I have taken a backup of this table using the following:
create table people_backup as select * from people
I add some new records to my people table over time, but eventually I want to merge the records from my backup table into people. Unfortunately I cannot simply DROP my table as my new records will be lost!
So I want to update the records in my people table using the records from people_backup, based on their primary key id and I have found 2 ways to do this:
MERGE the tables together
use some sort of fancy correlated update
Great! However, both of these methods use SET and make me specify what columns I want to update. Unfortunately I am lazy and the structure of people may change over time and while my CTAS statement doesn't need to be updated, my update/merge script will need changes, which feels like unnecessary work for me.
Is there a way merge entire rows without having to specify columns? I see here that not specifying columns during an INSERT will direct SQL to insert values by order, can the same methodology be applied here, is this safe?
NB: The structure of the table will not change between backups
Given that your table is small, you could simply
DELETE FROM table t
WHERE EXISTS( SELECT 1
FROM backup b
WHERE t.key = b.key );
INSERT INTO table
SELECT *
FROM backup;
That is slow and not particularly elegant (particularly if most of the data from the backup hasn't changed) but assuming the columns in the two tables match, it does allow you to not list out the columns. Personally, I'd much prefer writing out the column names (presumably those don't change all that often) so that I could do an update.
I have a students table in postgres that is populated via an external source. Each night we populate the students_swap table, and then after the long running operation is complete we rename it to students and the original table then becomes students_swap to be used the next day.
The problem with this is that when we add a new column or index to the original table we must remember to also do so on the swap table. I am attempting to automate some of this w/ the following:
-- Drop the swap table if it's already there...
DROP TABLE IF EXISTS students_swap;
-- Recreate the swap table using the original as a template...
CREATE TABLE students_swap AS SELECT * FROM students WHERE 1=2;
... populate the swap table ....
ALTER TABLE students RENAME TO students_temp;
ALTER TABLE students_swap RENAME TO ps_students;
ALTER TABLE students_temp RENAME TO students_swap;
This works well for creating the table structure but no indices are created for the swap table.
My question is how do I copy all of the indexes in addition to the table structure to make sure my original table and swap table stay in sync?
Use create table ... like instead:
CREATE TABLE students_swap (LIKE students INCLUDING ALL);
This will include indexes, primary keys and check constraints but will not re-create the foreign keys.
Edit:
INCLUDING ALL will also copy the default settings for columns populated by sequences (e.g. a column defined as serial). It sounds as if you want that. If you do not want that, then use INCLUDING INDEXES INCLUDING CONSTRAINTS instead.
I have a huge existing Order Management Application.
Now, in the main ORDER Table, i am adding a new column: IS_HISTORICAL. If its value is: TRUE, means the Order is Historical now, and should not show up in application.
Now, i have to modify many SQL Queries in my existing application so that they select only those orders whose IS_HISTORICAL is 'FALSE' - i.e add following in WHERE clause:
AND IS_HISTORICAL='FALSE'
Question: *Is there a easier way - so that i do not have to modify so many application queries (to hide away historical orders)?
Essentially all ORDERS marked as IS_HISTORICAL='TRUE' should become invisible/un-available for read/updates!!*
Note: Right now the table sizes are not very huge, but ultimately i intend to partition the table by IS_HISTORICAL true/false.
If you're only going to use the historical data for analysis then I prefer Florin's solution as the amount of data you need to look at for each query remains smaller. It makes the analysis queries more difficult as you need to UNION ALL but everything else will run "quicker" (it may not be noticable).
If some applications/users require access to the historical data the better solution would be to rename your table and create a view on top of it with the query that you need.
The problem with re-writing all your queries is that you're going to forget one or get one incorrect, either now or in the future. A view removes that problem for you as the query is static, every time you query the view the additional conditions you require are automatically added.
Something like:
rename orders to order_history;
create or replace view orders as
select *
from order_history
where is_historical = 'FALSE';
Two further points.
I wouldn't bother with TRUE / FALSE, if the table gets large it's a lot of additional data to scan. Create your column as a VARCHAR2(1) and use T / F or Y / N, they are as immediately obvious but are smaller. Alternatively use a NUMBER(1,0) and 1 / 0.
Don't forget to put a constraint on your table so that the IS_HISTORICAL column can only have the values you've chosen.
If you're only ever going to have the two values then you may want to consider a CHECK CONSTRAINT:
alter table order_history
add constraint chk_order_history_historical
check ( is_historical in ('T','F') );
Otherwise, maybe you should do this anyway, use a FOREIGN KEY CONSTRAINT. Define an extra table, ORDER_HISTORY_TYPES
create table order_history_types (
id varchar2(1)
, description varchar2(4000)
, constraint pk_order_history_types primary key (id)
);
Fill it with your values and then add the foreign key:
alter table order_history
add constraint fk_order_history_historical
foreign key (is_historical)
references order_history_types (id)
You could look into using Virtual Private Database/row-level security. This can be used to automatically add the is_historical = 'FALSE' predicate when certain conditions are met (e.g. you're connected as the application user).
If the user only need nonhistorical records, an option is to create an ORDER_HIST table and move there the historical records. (delete and insert)
If some users/applications need both type of records then the partition aproach is the best.
Hi Im struggling a bit with this and could use some ideas...
Say my database has the following tables ;
Customers
Supplers
SalesInvoices
PurchaseInvoices
Currencies
etc etc
I would like to be able to add a "Notes" record to ANY type of record
The Notes table would like this
NoteID Int (PK)
NoteFK Int
NoteFKType Varchar(3)
NoteText varchar(100)
NoteDate Datetime
Where NoteFK is the PK of a customer or supplier etc and NoteFKType says what type of record the note is against
Now i realise that I cannot add a FK which references multiple tables without NoteFK needing to be present in all tables.
So how would you design the above ?
The note FK needs to be in any of the above tables
Cheers,
Daniel
You have to accept the limitation that you cannot teach the database about this foreign key constraint. So you will have to do without the integrity checking (and cascading deletes).
Your design is fine.
It is easily extensible to extra tables, you can have multiple notes per entity, and the target tables do not even need to be aware of the notes feature.
An advantage that this design has over using a separate notes table per entity table is that you can easily run queries across all notes, for example "most recent notes", or "all notes created by a given user".
As for the argument of that table growing too big, splitting it into say five table will shrink the table to about a fifth of its size, but this will not make any difference for index-based access. Databases are built to handle big tables (as long as they are properly indexed).
I think your design is ok, if you can accept the fact, that the db system will not check whether a note is referencing an existing entity in other table or not. It's the only design I can think of that doesn't require duplication and is scalable to more tables.
The way you designed it, when you add another entity type that you'd like to have notes for, you won't have to change your model. Also, you don't have to include any additional columns in your existing model, or additional tables.
To ensure data integrity, you can create set of triggers or some software solution that will clean notes table once in a while.
I would think twice before doing what you suggest. It might seem simple and elegant in the short term, but if you are truly interested in data integrity and performance, then having separate notes tables for each parent table is the way to go. Over the years, I've approached this problem using the solutions found in the other answers (triggers, GUIDs, etc.). I've come to the conclusion that the added complexity and loss of performance isn't worth it. By having separate note tables for each parent table, with an appropriate foreign key constraints, lookups and joins will be simple and fast. When combining the related items into one table, join syntax becomes ugly and your notes table will grow to be huge and slow.
I agree with Michael McLosky, to a degree.
The question in my mind is: What is the technical cost of having multiple notes tables?
In my mind, it Is preferable to consolidate the same functionality into a single table. It aso makes reporting and other further development simpler. Not to mention keeping the list of tables smaller and easier to manage.
It's a balancing act, you need to try to predetermine both the benefits And the costs of doing something like this. My -personal- preference is database referential integrity. Application management of integrity should, in my opinion, be limitted ot business logic. The database should ensure the data is always consistent and valid...
To actually answer your question...
The option I would use is a check constraint using a User Defined Function to check the values. This works in M$ SQL Server...
CREATE TABLE Test_Table_1 (id INT IDENTITY(1,1), val INT)
GO
CREATE TABLE Test_Table_2 (id INT IDENTITY(1,1), val INT)
GO
CREATE TABLE Test_Table_3 (fk_id INT, table_name VARCHAR(64))
GO
CREATE FUNCTION id_exists (#id INT, #table_name VARCHAR(64))
RETURNS INT
AS
BEGIN
IF (#table_name = 'Test_Table_1')
IF EXISTS(SELECT * FROM Test_Table_1 WHERE id = #id)
RETURN 1
ELSE
IF (#table_name = 'Test_Table_2')
IF EXISTS(SELECT * FROM Test_Table_2 WHERE id = #id)
RETURN 1
RETURN 0
END
GO
ALTER TABLE Test_Table_3 WITH CHECK ADD CONSTRAINT
CK_Test_Table_3 CHECK ((dbo.id_exists(fk_id,table_name)=(1)))
GO
ALTER TABLE [dbo].[Test_Table_3] CHECK CONSTRAINT [CK_Test_Table_3]
GO
INSERT INTO Test_Table_1 SELECT 1
GO
INSERT INTO Test_Table_1 SELECT 2
GO
INSERT INTO Test_Table_1 SELECT 3
GO
INSERT INTO Test_Table_2 SELECT 1
GO
INSERT INTO Test_Table_2 SELECT 2
GO
INSERT INTO Test_Table_3 SELECT 3, 'Test_Table_1'
GO
INSERT INTO Test_Table_3 SELECT 3, 'Test_Table_2'
GO
In that example, the final insert statement would fail.
You can get the FK referential integrity, at the costing of having one column in the notes table for each other table.
create table Notes (
id int PRIMARY KEY,
note varchar (whatever),
customer_id int NULL REFERENCES Customer (id),
product_id int NULL REFERENCES Product (id)
)
Then you'll need a constraint to make sure that you have only one of the columns set.
Or maybe not, maybe you might want a note to be able to be associated with both a customer and a product. Up to you.
This design would require adding a new column to Notes if you want to add another referencing table.
You could add a GUID field to the Customers, Suppliers, etc. tables. Then in the Notes table, change the foreign key to reference that GUID.
This does not help for data integrity. But it makes M-to-N relationships easily possible to any number of tables and it saves you from having to define a NoteFKType column in the Notes table.
You can easily implement "multi"-foreign key with triggers. Triggers will give you very flexible mechanism and you can do any integrity checks you wish.
Why dont you do it the other way around and have a foreign key in other tables (Customer, Supplier etc etc) to NotesID. This way you have one to one mapping.