Automate schema creation in PostgreSQL (maybe plpgsql?)

I have a big database (in PostgreSQL 8.4) that I am trying to analyse in smaller parts. To do that, I copy parts of the content of the big database into other schemas (I know it is somewhat against the philosophy of databases to copy data, but without that step the analysis is too slow).
There are quite a few SQL commands that I need to run to get a new schema with all the necessary tables inside it. However, the difference between creating one schema and creating another is very small (in principle it's just the name of the schema and a different value in a WHERE clause).
My question is the following:
Is it possible to write a function that takes a certain value as a parameter and uses this parameter in the WHERE clause (and as the name of the schema)?
If it is possible, which language would you suggest (maybe plpgsql?), and what would such a script look like (just as a skeleton)?
Thank you in advance!

Not sure I'm making perfect sense of your question, but it sounds like you should be using the temporary schema:
create temporary table foo as select * from bar where ...
On occasion it's also useful to use the same name:
create temporary table foo as select * from foo where ...
Else yes, dynamic SQL works:
create function do_stuff(_table regclass) returns void as $$
begin
    execute 'select 1 from ' || _table;
end; $$ language plpgsql strict;
select do_stuff('schemaname.tablename');
http://www.postgresql.org/docs/9.0/interactive/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN
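For the schema-per-value case in the question, a minimal sketch along those lines might look like the following; the source table big_schema.big_table and its category column are assumptions, and quote_ident/quote_literal guard the interpolated values (format() is not yet available on 8.4):
create function clone_part(_name text) returns void as $$
begin
    -- create a schema named after the parameter
    execute 'create schema ' || quote_ident(_name);
    -- copy the relevant slice of the big table into it;
    -- big_schema.big_table and category are placeholders
    execute 'create table ' || quote_ident(_name) || '.big_table as '
         || 'select * from big_schema.big_table '
         || 'where category = ' || quote_literal(_name);
end; $$ language plpgsql strict;
select clone_part('customer_a');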

Related

Query from multiple tables dynamically

I want to query an object from the DB that exists in any one of the tables. I am not sure which table a particular object belongs to. For example, let's say my DB consists of various tables like Domestic_Passengers, Asian_Passengers, US_Passengers. And this table list may grow; in the future we may add a UK_Passengers table too.
So, I want to query something like
SELECT * FROM
(SELECT table_name FROM user_tables where table_name like '%PASSENGER')
WHERE NAME LIKE 'John%'
Is this possible?
That's a very bad database design.
I would suggest a view like this:
CREATE OR REPLACE VIEW PASSENGERS AS
SELECT * FROM Domestic_Passengers
UNION ALL
SELECT * FROM Asian_Passengers
UNION ALL
SELECT * FROM US_Passengers;
And then select from this view.
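For example (the NAME column is taken from the question's sketch):
SELECT * FROM PASSENGERS WHERE name LIKE 'John%';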
If this is not possible, then you need to run dynamic SQL in a PL/SQL package, but that involves some code.
The best answer depends on a lot of details, such as whether you can create database objects, how static the tables are, and how the query will be consumed.
If you can create schema objects, and the list of tables is somewhat stable, then Wernfried's answer of building a view is probably best.
If you can create schema objects, but the list of tables is very dynamic, and your application understands ref cursors, you should probably create a function that creates the SELECT and returns it through a ref cursor, like in this answer.
If you cannot create schema objects, then you're limited to the DBMS_XMLGEN/XMLTABLE trick. In a single query, build a string for the SELECT statement you want, run it through DBMS_XMLGEN to create an XMLType, and then use XMLTABLE to transform the XML back into rows and columns. This approach is slow and ugly, but it's the only way to have dynamic SQL in SQL without creating any custom PL/SQL objects. See my answer here for an example.
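A rough sketch of that trick follows; the NAME column and the %PASSENGERS name pattern are assumptions taken from the question, and note that DBMS_XMLGEN uppercases element names:
SELECT x.name
FROM (SELECT DBMS_XMLGEN.GETXMLTYPE(
               'SELECT name FROM ' || t.table_name ||
               ' WHERE name LIKE ''John%''') AS xml_doc
      FROM user_tables t
      WHERE t.table_name LIKE '%PASSENGERS') d,
     -- shred each per-table XML document back into rows and columns
     XMLTABLE('/ROWSET/ROW' PASSING d.xml_doc
              COLUMNS name VARCHAR2(100) PATH 'NAME') x;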

What will provide better performance, a dedicated pgsql trigger function per table or a global one?

I have hundreds of tables which are configured with update triggers.
I was wondering what will be a better approach:
1. Create a per-table trigger function, where the trigger code (whose logic is the same for all tables) is specific to the table.
2. Create a global function that handles all tables by building dynamic SQL statements, and configure it as the trigger function for all the tables.
I was wondering if the per-table functions will run faster, since PL/pgSQL can pre-compile the function and reuse its plan, while the global function needs to build the SQL statement dynamically from the table name on each call.
To make it clearer, in the per-table function I can write, for TableA:
insert into log_table values('TableA', x, y, z)
while in the global one I will need to write it as:
EXECUTE 'insert into log_table values(' || quote_literal(current_table) || ', x, y, z)'
It may be better not to use dynamic SQL, because of execution plan caching.
See the section "39.10.2. Plan Caching" here.
But only real testing will show performance difference, if any.
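Note also that a single shared trigger function does not necessarily need dynamic SQL at all: PL/pgSQL exposes the firing table's name in the automatic TG_TABLE_NAME variable, so the insert can stay static. A minimal sketch, assuming a log_table layout loosely based on the question:
create function log_update() returns trigger as $$
begin
    -- TG_TABLE_NAME identifies the firing table, so no EXECUTE is needed
    insert into log_table values (TG_TABLE_NAME, now(), user);
    return new;
end; $$ language plpgsql;
create trigger tablea_log after update on TableA
    for each row execute procedure log_update();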

TSQL substitution of key words and code blocks

I have blocks of TSQL that I want to create a MACRO for and then reuse in my SQL file. I want this to be a 'compile' time thing only.
Eg:
?set? COMMON = "Field1 int, Field2 char(1),";
?set? MAKEONE = "create table";
MAKEONE XXX (
COMMON
Field3 int
);
Please don't ask why I would want to ... :)
... it is for SQL Server.
Ok, what about conditional execution of SQL:
?set? ISYES = true;
?if? ISYES
create table AAA (...)
?else?
create table BBB (...)
What you are asking makes little sense in SQL terms.
Based on your examples:
A CREATE TABLE is exactly that: a CREATE TABLE. Why Macro it? You aren't going to substitute "CREATE PROCEDURE".
Having "common" fields would indicate poor design
You also have to consider:
constraints, keys and indexes
permissions of using dynamic SQL
the cost of developing a "framework" to do what SQL already does
permissions of your objects
Now, what is the business problem you are trying to solve?
Instead of asking about your chosen solution...
Edit: question updated as I typed above:
IF (a condition)
EXEC ('CREATE TABLE ...')
ELSE IF (a condition)
EXEC ('CREATE TABLE ...')
...
Note that much DDL in SQL Server must be in its own batch or be the first statement in a batch; hence the use of dynamic SQL again.
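Fleshing out that skeleton, a sketch might look like this; the @UseAAA flag and the column lists are invented for illustration, and EXEC also sidesteps the first-in-batch rule for DDL such as CREATE VIEW or CREATE PROCEDURE:
DECLARE @UseAAA bit;
SET @UseAAA = 1;
IF (@UseAAA = 1)
    -- each EXEC runs its DDL in its own batch
    EXEC ('CREATE TABLE AAA (Field1 int, Field2 char(1), Field3 int)');
ELSE
    EXEC ('CREATE TABLE BBB (Field1 int, Field3 int)');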

PL/SQL embedded insert into table that may not exist

I much prefer using this 'embedded' style of insert in a PL/SQL block (as opposed to the execute immediate style of dynamic SQL, where you have to delimit quotes etc.).
-- a contrived example
PROCEDURE CreateReport( customer IN VARCHAR2, reportdate IN DATE ) IS
BEGIN
-- drop table, create table with explicit column list
CreateReportTableForCustomer;
INSERT INTO TEMP_TABLE
VALUES ( customer, reportdate );
END;
/
The problem here is that Oracle checks whether 'temp_table' exists and has the correct number of columns, and throws a compile error if it doesn't.
So I was wondering if there's any way round that? Essentially I want to use a placeholder for the table name to trick Oracle into not checking whether the table exists.
EDIT:
I should have mentioned that a user is able to execute any 'report' (as above): a mechanism that will execute an arbitrary query but always write to the temp_table (in the user's schema). Thus each time the report proc is run, it drops the temp_table and recreates it with, most probably, a different column list.
You could use a dynamic SQL statement to insert into the maybe-existent temp_table, and then catch and handle the exception that occurs when the table doesn't exist.
Example:
execute immediate 'INSERT INTO '||TEMP_TABLE_NAME||' VALUES ( :customer, :reportdate )' using customer, reportdate;
Note that having the table name vary in a dynamic SQL statement is not very good, so if you ensure the table names stay the same, that would be best.
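A sketch of that catch-and-handle pattern (ORA-00942 is Oracle's "table or view does not exist" error; temp_table_name is a variable, as above):
begin
  execute immediate
    'INSERT INTO ' || temp_table_name || ' VALUES ( :customer, :reportdate )'
    using customer, reportdate;
exception
  when others then
    if sqlcode = -942 then
      null; -- table missing: recreate it or skip, as the report requires
    else
      raise;
    end if;
end;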
Maybe you should be using a global temporary table (GTT). These are permanent table structures that hold temporary data for an Oracle session. Many different sessions can insert data into the same GTT, and each will only be able to see their own data. The data is automatically deleted either on COMMIT or when the session ends, according to the GTT's definition.
You create the GTT (once only) like this:
create global temporary table my_gtt
(customer number, report_date date)
on commit delete rows; -- or "preserve rows", as applicable
Then your programs can just use it like any other table - the only difference being it always begins empty for your session.
Using GTTs is much preferable to dropping/recreating tables on the fly. If your application needs a different structure for each report, I strongly suggest you work out all the different structures the reports need, and create a separate GTT for each, instead of creating ordinary tables at runtime.
That said, if this is just not feasible (and I've seen good examples when it's not, e.g. in a system that supports a wide range of ad-hoc requests from users), you'll have to go with the EXECUTE IMMEDIATE approach.
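Once the GTT exists, the procedure body stays plain static SQL, for example:
-- rows are private to this session and vanish per the ON COMMIT clause
insert into my_gtt (customer, report_date) values ( customer, reportdate );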

How to refresh the definition of a T-SQL user-defined function after a dependent object is redefined?

Consider the following sequence of events:
A view v_Foo is defined
A user-defined function GetFoo() is defined that includes all columns from v_Foo (using 'Select * ...')
The definition of v_Foo changes and now includes more columns
I now want GetFoo() to include the new columns in v_Foo, but it still references the old definition
I can just re-run the script that created GetFoo in the first place and all will be well; but that's problematic for reasons I won't go into. Is there any other way to refresh the definition of a user-defined function so that it's in sync with its dependent objects?
Short, easy answer is No.
You have to redefine the RETURNS TABLE definition of the tabular UDF GetFoo()
whenever the definition of v_Foo changes.
But there is a way to get around it (read: not practical):
Create a DDL trigger on the ALTER_VIEW event.
Then use dynamic SQL to recreate GetFoo().
It would be nice to see the definition of the function. All you've said is that it uses SELECT *. Can you be more specific?
You also forgot to tell us what version of SQL Server you are using. If >= 2005, have you looked at sp_refreshsqlmodule?
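If so, a one-liner like this refreshes the function's cached metadata after v_Foo changes (assuming GetFoo lives in the dbo schema):
EXEC sys.sp_refreshsqlmodule @name = N'dbo.GetFoo';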
Curious what your reasons are for insisting on SELECT *. Lots of discussion about it here, but the cons still outweigh the pros by a large margin, IMHO:
Bad habits to kick : using SELECT * / omitting the column list