Trying to query a Redshift table within a SELECT statement - SQL

Current table1:
col1
-------------
schema.table1
schema.table2
schema.table3
Desired table1:
col1           col2
------------------------------------------------------------
schema.table1  value of (select count(*) from schema.table1)
schema.table2  value of (select count(*) from schema.table2)
schema.table3  value of (select count(*) from schema.table3)
This is not working, and I tried using a function too, but a function doesn't allow the use of FROM:
select col1, (select count(*) from col1)
from table1
I am trying to create this query in Redshift. Can anyone please help me out?

To perform this task you will need a stored procedure AND a defined cursor. The stored procedure allows for looping and the cursor provides the ability to execute a newly created statement (dynamic querying).
For example:
Create the starting materials: 3 tables and a table that references these tables.
create table foo as (select 1 as A);
create table goo as (select 2 as A);
create table hoo as (select 3 as A);
create table tabs as (select 'foo' as tab union all select 'goo' union all select 'hoo');
Next, define the stored procedure that will create and run the dynamic SQL:
CREATE OR REPLACE PROCEDURE count_tabs(curs1 INOUT refcursor)
AS
$$
DECLARE
    row record;
    statement varchar := '';
    union_needed BOOL := false;
BEGIN
    -- Build one SELECT per table listed in "tabs" and glue them together with UNION ALL
    FOR row IN SELECT tab FROM tabs LOOP
        IF union_needed THEN
            statement := statement || ' UNION ALL ';
        END IF;
        statement := statement || 'select \'' || row.tab || '\' as table_name, count(*) as table_count from ' || row.tab;
        union_needed := true;
    END LOOP;
    RAISE NOTICE 'sql to execute: %', statement;
    -- Run the dynamically built statement and expose the result through the cursor
    OPEN curs1 FOR EXECUTE statement;
END;
$$ LANGUAGE plpgsql;
Lastly, we call the procedure and fetch from the cursor:
call count_tabs('mycursor');
fetch 1000 from mycursor;
A few notes on this:
This assumes you want the results as output on your bench. If you want to create a table with the results instead, that is doable within the same structure (a sketch follows after these notes).
Since the FROM clause values are unknown at compile time, this needs to be done in two steps: create the query, then execute the query.
I believe you can have the procedure walk this same cursor itself, but doing so is exceptionally slow.
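If you do want the results in a table rather than fetched from the cursor, the same structure works; here is a rough sketch (the results_table name and the CREATE TABLE AS step are my additions, not part of the original answer):
CREATE OR REPLACE PROCEDURE count_tabs_to_table()
AS
$$
DECLARE
    row record;
    statement varchar := '';
    union_needed BOOL := false;
BEGIN
    -- Build the same UNION ALL statement as in the cursor version above
    FOR row IN SELECT tab FROM tabs LOOP
        IF union_needed THEN
            statement := statement || ' UNION ALL ';
        END IF;
        statement := statement || 'select \'' || row.tab || '\' as table_name, count(*) as table_count from ' || row.tab;
        union_needed := true;
    END LOOP;
    -- Materialize the result instead of opening a cursor
    -- (results_table is an assumed name, used only for illustration)
    EXECUTE 'drop table if exists results_table';
    EXECUTE 'create table results_table as ' || statement;
END;
$$ LANGUAGE plpgsql;
call count_tabs_to_table();
select * from results_table;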

Related

How to get value printed on Postgres

I have a requirement to translate this into an SQL script.
I am using the information schema to get all the columns of a table and print each column's distinct count.
I was able to get the count, but I am not able to print the column name properly; please see the code below.
I have to pass the value of "colum_lbl" to my SELECT clause, but if I do so it gives me a GROUP BY error.
So I put "colum_lbl" within quotes; now every row of the result has the hardcoded string 'colum_lbl' as its value, and I need to replace it with the actual value read in the FOR loop.
Any other, more efficient method for this requirement would be very much appreciated. Thanks in advance.
do $$
DECLARE
colum_lbl text;
BEGIN
DROP TABLE IF EXISTS tmp_table;
CREATE TABLE tmp_table
(
colnm varchar(50),
cnt integer
);
FOR colum_lbl IN
SELECT distinct column_name
FROM information_schema.columns
WHERE table_schema = 'cva_aggr'
AND table_name = 'employee' AND column_name in ('empid','empnm')
LOOP
EXECUTE
'Insert into tmp_table
SELECT '' || colum_lbl || '',count(distinct ' || colum_lbl || ')
FROM employee ';
END LOOP;
END; $$
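For reference, one way to build this dynamic statement safely in Postgres is with format(), using %L for the label as a literal and %I for the column as an identifier. A minimal sketch along those lines (my own rewrite of the block above, not tested against the original schema):
do $$
DECLARE
    colum_lbl text;
BEGIN
    DROP TABLE IF EXISTS tmp_table;
    CREATE TABLE tmp_table
    (
        colnm varchar(50),
        cnt integer
    );
    FOR colum_lbl IN
        SELECT distinct column_name
        FROM information_schema.columns
        WHERE table_schema = 'cva_aggr'
          AND table_name = 'employee' AND column_name in ('empid','empnm')
    LOOP
        -- %L embeds the label as a quoted literal, %I as a quoted identifier
        EXECUTE format(
            'INSERT INTO tmp_table SELECT %L, count(DISTINCT %I) FROM employee',
            colum_lbl, colum_lbl);
    END LOOP;
END; $$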

How to return result of many select statements as one custom table

I have a table (let's name it source_tab) where I store a list of all database tables that meet some criteria.
tab_name   description
--------   -----------
table1     some_desc1
table2     some_desc2
Now I need to execute a select statement on each of these tables and return the result as a table (I created a custom TYPE). However I have a problem - when using BULK COLLECT, only the result of the last select statement is returned. The same issue occurred with OPEN cursor. Is there any possibility to achieve this goal, other than concatenating all the select statements using UNION ALL and executing them as one statement? And because I'm a beginner in SQL, my second question is: is it OK to use this dynamic SQL in terms of SQL injection issues? Below is a simplified version of my code:
CREATE OR REPLACE FUNCTION my_function RETURN newly_created_table_type IS
ret_tab_type newly_created_table_type;
BEGIN
for r in (select * from source_tab)
loop
execute immediate 'select value1, value2,''' || r.tab_name || ''' from ' || r.tab_name bulk collect into ret_tab_type;
end loop;
return ret_tab_type;
END;
I'm using Oracle 11.
In your case you are trying to populate a collection dynamically and want the result in a single collection. That is not possible in a single loop the way it is written, because each BULK COLLECT replaces the previous contents of the collection. Also, as mentioned by @OldProgrammer, a pipelined function (PIPE ROW) would be a better solution from a performance point of view. See the demo below:
--Tables and Values:
CREATE TABLE SOURCE_TAB(TAB_NAME VARCHAR2(100), DESCRIPTION VARCHAR2(100));
/
SELECT * FROM SOURCE_TAB;
/
INSERT INTO SOURCE_TAB VALUES('table1','some_desc1');
INSERT INTO SOURCE_TAB VALUES('table2','some_desc2');
/
CREATE TABLE TABLE1(COL1 NUMBER, COL2 NUMBER);
/
INSERT INTO TABLE1 VALUES(1,2);
INSERT INTO TABLE1 VALUES(3,4);
INSERT INTO TABLE1 VALUES(5,6);
/
Select * from TABLE1;
/
CREATE TABLE TABLE2(COL1 NUMBER, COL2 NUMBER);
/
INSERT INTO TABLE2 VALUES(7,8);
INSERT INTO TABLE2 VALUES(9,10);
INSERT INTO TABLE2 VALUES(11,12);
/
Select * from TABLE2;
/
--Object Created
--UDT
CREATE OR REPLACE TYPE NEWLY_CREATED_TABLE_TYPE IS OBJECT (
VALUE1 NUMBER,
VALUE2 NUMBER
);
/
--Type of UDT
CREATE OR REPLACE TYPE NEWLY_CRTD_TYP AS TABLE OF NEWLY_CREATED_TABLE_TYPE;
/
--Function
CREATE OR REPLACE FUNCTION MY_FUNCTION
    RETURN NEWLY_CRTD_TYP PIPELINED
AS
    CURSOR CUR_TAB IS
        SELECT * FROM SOURCE_TAB;
    RET_TAB_TYPE NEWLY_CRTD_TYP;
BEGIN
    FOR I IN CUR_TAB
    LOOP
        -- Here I made sure that all the tables have COL1 & COL2 columns, since dynamic SQL is used.
        EXECUTE IMMEDIATE 'select NEWLY_CREATED_TABLE_TYPE(COL1, COL2) from ' || I.TAB_NAME
            BULK COLLECT INTO RET_TAB_TYPE;
        EXIT WHEN CUR_TAB%NOTFOUND;
        FOR REC IN 1 .. RET_TAB_TYPE.COUNT
        LOOP
            PIPE ROW (RET_TAB_TYPE(REC));
        END LOOP;
    END LOOP;
    RETURN;
END;
/
Output:
SQL> Select * from table(MY_FUNCTION);
    VALUE1     VALUE2
---------- ----------
         1          2
         3          4
         5          6
         7          8
         9         10
        11         12
6 rows selected.
Maybe you can combine all the queries into one using UNION ALL before execution, if the number and type of columns to be retrieved from all the tables are identical.
CREATE OR REPLACE FUNCTION my_function
RETURN newly_created_table_type
IS
ret_tab_type newly_created_table_type;
v_query VARCHAR2 (4000);
BEGIN
SELECT LISTAGG (' select VALUE1,VALUE2 FROM ' || tab_name, ' UNION ALL ')
WITHIN GROUP (ORDER BY tab_name)
INTO v_query
FROM source_tab;
EXECUTE IMMEDIATE v_query BULK COLLECT INTO ret_tab_type;
RETURN ret_tab_type;
END;
You could then use a single select statement to get all the values.
select * FROM TABLE ( my_function );

How to select a column from all tables in which it resides?

I have many tables that have the same column 'customer_number'.
I can get a list of all these tables with this query:
SELECT table_name FROM ALL_TAB_COLUMNS
WHERE COLUMN_NAME = 'customer_number';
The question is how do I get all the records that have a specific customer number from all these tables without running the same query against each of them.
To get records from a table, you have to write a query against that table. So you can't get ALL the records from tables with the specified field without a query against each one of these tables.
If there is a subset of columns that you are interested in, and this subset is shared among all tables, you can use a UNION/UNION ALL operation like this:
select * from (
select customer_number, phone, address from table1
union all
select customer_number, phone, address from table2
union all
select customer_number, phone, address from table3
)
where customer_number = 'my number'
Or, in the simple case where you just want to know which tables have records about a particular client:
select * from (
select 'table1' src_tbl, customer_number from table1
union all
select 'table2', customer_number from table2
union all
select 'table3', customer_number from table3
)
where customer_number = 'my number'
Otherwise you have to query each table separately.
DBMS_XMLGEN enables you to run dynamic SQL statements without custom PL/SQL.
Sample Schema
create table table1(customer_number number, a number, b number);
insert into table1 values(1,1,1);
create table table2(customer_number number, a number, c number);
insert into table2 values(2,2,2);
create table table3(a number, b number, c number);
insert into table3 values(3,3,3);
Query
--Get CUSTOMER_NUMBER and A from all tables with the column CUSTOMER_NUMBER.
--
--Convert XML to columns.
select
table_name,
to_number(extractvalue(xml, '/ROWSET/ROW/CUSTOMER_NUMBER')) customer_number,
to_number(extractvalue(xml, '/ROWSET/ROW/A')) a
from
(
--Get results as XML.
select table_name,
xmltype(dbms_xmlgen.getxml(
'select customer_number, a from '||table_name
)) xml
from user_tab_columns
where column_name = 'CUSTOMER_NUMBER'
);
TABLE_NAME   CUSTOMER_NUMBER   A
----------   ---------------   -
TABLE1                     1   1
TABLE2                     2   2
Warnings
These overly generic solutions often have issues. They won't perform as well as plain old SQL statements and they are more likely to run into bugs. In general, these types of solutions should be avoided for production code. But they are still very useful for ad hoc queries.
Also, this solution assumes that you want the same columns from each row. If each row is different then things get much more complicated and you may need to look into technologies like ANYDATASET.
I assume you want to automate this. Two approaches.
SQL to generate SQL scripts
spool run_rep.sql
set head off pages 0 lines 200 trimspool on feedback off
SELECT 'prompt ' || table_name || chr(10) ||
'select ''' || table_name ||
''' tname, CUSTOMER_NUMBER from ' || table_name || ';' cmd
FROM all_tab_columns
WHERE column_name = 'CUSTOMER_NUMBER';
spool off
@run_rep.sql
PLSQL
A similar idea, using dynamic SQL:
DECLARE
  TYPE rcType IS REF CURSOR;
  rc   rcType;
  CURSOR c1 IS SELECT table_name FROM all_tab_columns WHERE column_name = 'CUST_NUM';
  cmd  VARCHAR2(4000);
  cNum NUMBER;
BEGIN
  FOR r1 IN c1 LOOP
    cmd := 'SELECT cust_num FROM ' || r1.table_name;
    OPEN rc FOR cmd;
    LOOP
      FETCH rc INTO cNum;
      EXIT WHEN rc%NOTFOUND;
      -- Probably best to INSERT this into a temp table and then
      -- SELECT * from that, to avoid DBMS_OUTPUT buffer-full issues
      DBMS_OUTPUT.PUT_LINE('T: ' || r1.table_name || ' C: ' || cNum);
    END LOOP;
    CLOSE rc;
  END LOOP;
END;
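As the comment in the loop suggests, you could write the hits to a results table instead of DBMS_OUTPUT and query it afterwards. A rough sketch of that variant (cust_num_hits is an assumed table name, not from the original answer):
-- Assumed results table, created once beforehand:
-- CREATE TABLE cust_num_hits (tname VARCHAR2(128), cust_num NUMBER);
DECLARE
  rc   SYS_REFCURSOR;
  cNum NUMBER;
BEGIN
  FOR r1 IN (SELECT table_name FROM all_tab_columns WHERE column_name = 'CUST_NUM') LOOP
    OPEN rc FOR 'SELECT cust_num FROM ' || r1.table_name;
    LOOP
      FETCH rc INTO cNum;
      EXIT WHEN rc%NOTFOUND;
      -- Store each hit instead of printing it
      INSERT INTO cust_num_hits VALUES (r1.table_name, cNum);
    END LOOP;
    CLOSE rc;
  END LOOP;
  COMMIT;
END;
/
SELECT * FROM cust_num_hits;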

Find all tables updated on a specific date

I'm using an Oracle DB, and I'm trying to find all tables that were updated on a certain date. All of the tables that track updates have a column called DT_UPDATE. I've been trying this:
SELECT * FROM
(SELECT TABLE_NAME FROM ALL_TAB_COLUMNS WHERE COLUMN_NAME = 'DT_UPDATE')
WHERE DT_UPDATE = <date>
But get this error:
ORA-00904: "DT_UPDATE": invalid identifier
00904. 00000 - "%s: invalid identifier"
*Cause:
*Action:
Error at Line: 3 Column: 7
I've also tried aliasing the nested Select clause.
As @zaratustra said, you have to use dynamic SQL. You can do something like this:
set serveroutput on
declare
counter number;
begin
for r in (
select owner, table_name
from all_tab_columns
where column_name = 'DT_UPDATE'
) loop
execute immediate 'select count(*) from "'
|| r.owner || '"."' || r.table_name
|| '" where dt_update = :dt and rownum = 1'
into counter
using date '2014-07-07';
if counter = 1 then
dbms_output.put_line(r.table_name);
end if;
end loop;
end;
/
For each table_name (and owner, for completeness) identified in all_tab_columns as having a column called dt_update, a new dynamic select is generated, in the form:
select count(*) from "<owner>"."<table_name>"
where dt_update = date '2014-07-07'
and rownum = 1;
The rownum = 1 filter lets the query execution stop as soon as a matching row is found; since you said you want to know which tables were updated, not how many rows or exactly which rows, if one row matches then that is all you really need to know. So for every table the dynamic query gets either 0 or 1.
For any tables that have at least one row matching the date, this prints the table name using dbms_output, so you have to have that enabled - with set serveroutput on, or with the DBMS_OUTPUT panel in SQL Developer, or your favourite client's equivalent.
If I create some tables with that column, but only populate one with the date I'm looking for:
create table tab1 (dt_update date);
create table tab2 (dt_update date);
create table tab3 (dt_update date);
insert into tab1 values (trunc(sysdate) - 1);
insert into tab2 values (trunc(sysdate));
... then running my anonymous block produces:
anonymous block completed
TAB1
Use your own target date, obviously. This assumes your date field doesn't contain a time component. If it does then you'd need to turn that into a range to cover the whole day.
You could also turn this into a pipelined function that takes a date as an argument; this also handles date fields with time elements:
create or replace function get_updated_tables(p_date date)
return sys.odcivarchar2list pipelined as
counter number;
begin
for r in (
select owner, table_name
from all_tab_columns
where column_name = 'DT_UPDATE'
) loop
execute immediate 'select count(*) from "'
|| r.owner || '"."' || r.table_name
|| '" where dt_update >= :dt1 and dt_update < :dt2'
|| ' and rownum = 1'
into counter
using p_date, p_date + interval '1' day;
if counter = 1 then
pipe row (r.table_name);
end if;
end loop;
return;
end;
/
Then you can query it with:
select column_value from table(get_updated_tables(date '2014-07-07'));
COLUMN_VALUE
------------------------------
TAB1
Dynamic SQL is interesting, as you said in a comment, but should only be used when necessary. The generated statement can't be parsed until it's executed, so you might not spot syntax or other errors until run-time. Also make sure you use bind variables for values (but not object names) to avoid SQL injection.
Let's assume we have three tables with the field dt_update, and each of them has one record (doesn't matter if more):
create table tt1 (
dt_update date
);
insert into tt1 values (sysdate);
create table tt2 (
dt_update date
);
insert into tt2 values (sysdate - 1);
create table tt3 (
dt_update date
);
insert into tt3 values (sysdate - 2);
This anonymous PL/SQL block prints only the names of tables that have a record whose dt_update value is greater than or equal to the deadline date:
declare
type table_names_tp is table of user_tables.table_name%type index by binary_integer;
table_names table_names_tp;
l_res number(1);
l_deadline date := to_date('2014-07-08', 'YYYY-MM-DD');
begin
select table_name
BULK COLLECT INTO table_names
from user_tab_columns
where lower(column_name) = 'dt_update'
;
for i in table_names.first..table_names.last
loop
execute immediate 'select count(*) from dual where exists (select null from ' || table_names(i) || ' where dt_update >= :dead_line)'
into l_res
using l_deadline;
if l_res = 1
then
DBMS_OUTPUT.put_line('Table ' || table_names(i) || ' was updated after ' || l_deadline);
end if;
end loop;
end;
You can use this code as an example to start writing your own. Pay careful attention to protecting yourself from SQL injection: DO NOT(!) concatenate your values into the statement text, always use bind variables instead. Binding also lets Oracle cache the query plan in the SGA and reuse it, so repeated executions only need a soft parse.
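To make the bind-variable point concrete, here is a small contrast using the tt1 table from above (illustrative only):
declare
  l_res      number;
  l_deadline date := to_date('2014-07-08', 'YYYY-MM-DD');
begin
  -- Risky: the value is concatenated into the statement text, so every distinct
  -- date produces a new SQL text (hard parse) and invites injection for string inputs.
  execute immediate 'select count(*) from tt1 where dt_update >= to_date('''
      || to_char(l_deadline, 'YYYY-MM-DD') || ''', ''YYYY-MM-DD'')'
    into l_res;

  -- Preferred: only object names would ever be concatenated; the value is bound,
  -- so the statement text stays constant and the cached plan can be reused.
  execute immediate 'select count(*) from tt1 where dt_update >= :dead_line'
    into l_res
    using l_deadline;
end;
/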

Dynamically select the columns to be used in a SELECT statement

I would love to be able to use the system tables (Oracle in this case) to drive which fields are used in a SELECT statement. Something like:
SELECT
(
select column_name
from all_tab_cols
where table_Name='CLARITY_SER'
AND OWNER='CLARITY'
AND data_type='DATE'
)
FROM CLARITY_SER
This syntax doesn't work, as the subquery returns multiple rows, instead of one row with multiple columns.
Is it possible to generate a SQL statement dynamically by querying the table schema information in order to select only certain columns?
** edit **
Do this without using a function or procedure, if possible.
You can do this:
declare
l_sql varchar2(32767);
rc sys_refcursor;
begin
l_sql := 'select ';
for r in
( select column_name
from all_tab_cols
where table_Name='CLARITY_SER'
AND OWNER='CLARITY'
AND data_type='DATE'
)
loop
l_sql := l_sql || r.column_name || ',';
end loop;
l_sql := rtrim(l_sql,',') || ' from clarity_ser';
open rc for l_sql;
...
end;
No, it's not possible to specify a column list dynamically in SQL. You'll need to use a procedural language to run the first query, use that to construct a second query, then run the second query.
You could use dynamic SQL. Create a function that takes the table name, owner, data type, executes the inner query and returns a comma-separated list of column names, or an array table if you prefer. Then construct the outer query and execute it with execute immediate.
CREATE FUNCTION get_column_list(
table_name IN varchar2,
owner_name IN varchar2,
data_type IN varchar2)
RETURN varchar2
IS
BEGIN
...... (get columns and return comma-separated list)
END;
/
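Where the body is elided above, one possible implementation is to aggregate the matching column names with LISTAGG; this is only a sketch of what that body might look like, reusing the parameter names declared above:
CREATE OR REPLACE FUNCTION get_column_list(
    table_name IN varchar2,
    owner_name IN varchar2,
    data_type IN varchar2)
RETURN varchar2
IS
    l_cols varchar2(4000);
BEGIN
    -- Qualify the parameters with the function name so they are not
    -- confused with the identically named dictionary columns.
    SELECT listagg(c.column_name, ',') WITHIN GROUP (ORDER BY c.column_id)
      INTO l_cols
      FROM all_tab_cols c
     WHERE c.table_name = get_column_list.table_name
       AND c.owner = get_column_list.owner_name
       AND c.data_type = get_column_list.data_type;
    RETURN l_cols;
END;
/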
If your function returns a comma-separated list you can inline it:
execute immediate 'select ' || get_column_list(table_name, owner_name, datatype) || ' from ' || table_name
Admittedly it's a long time since I played with Oracle, so I may be a bit off, but I'm pretty sure this is quite doable.
In SQLPlus you could do this:
COLUMN cols NEW_VALUE cols
SELECT max( ltrim( sys_connect_by_path( column_name, ',' ), ',' ) ) cols
FROM
(
select rownum rn, column_name
from all_tab_cols
where table_Name='CLARITY_SER'
and OWNER='CLARITY'
AND data_type='DATE'
)
start with rn = 1 connect by rn = prior rn +1
;
select &cols from clarity.clarity_ser;