How to select values from a table whose name is derived from another table?

I have a table called folder that stores the names of other tables (named fileXXX, where X is a digit), all with the same structure, in the same Postgres DB.
I want to build a SQL statement that retrieves the names of all the fileXXX tables from the folder table and combines them into a single SQL statement with this structure:
SELECT * FROM _file001_
UNION
SELECT * FROM _file002_
UNION
SELECT * FROM _file003_
...
I've found a lot of examples of how to use SELECT statements in the WHERE clause, but none for using one in the FROM clause in this way.

It is possible to write a function for that.
1. Query all table names from the information schema:
SELECT table_name
FROM information_schema.tables
WHERE table_name LIKE 'file%'
2. Instead of selecting table_name on its own, embed the table name in a query string:
SELECT
'SELECT * FROM ' || table_name
...
3. Aggregate all result rows with string_agg, using UNION ALL as the delimiter:
SELECT
string_agg(/*see (2)*/, ' UNION ALL ')
...
This produces the query from the question, except that it uses UNION ALL (which keeps duplicate rows) where the question used UNION.
4. Finally, this string can be executed as a real query inside a function:
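CREATE OR REPLACE FUNCTION union_all() RETURNS TABLE (ids int) AS $$
DECLARE
    _t text := '';
BEGIN
    -- Build "SELECT * FROM file001 UNION ALL SELECT * FROM file002 ..."
    -- format('%I') quotes each table name as an identifier
    SELECT string_agg(format('SELECT * FROM %I', table_name), ' UNION ALL ')
    INTO _t
    FROM information_schema.tables
    WHERE table_name LIKE 'file%';
    -- The RETURNS TABLE column list must match the structure of the fileXXX tables
    RETURN QUERY EXECUTE _t;
END;
$$ LANGUAGE plpgsql;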
Call this function:
SELECT * FROM union_all();
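The same two-step idea (read the table names, then build and run one UNION ALL statement) can be sketched outside the database. This is only an illustration, using Python's sqlite3 module as a stand-in for Postgres; the folder column name `name`, the `ids` column, and the helpers `quote_ident` and `build_union_all` are assumptions made for the example, not part of the question.

```python
import sqlite3

def quote_ident(name: str) -> str:
    """Quote an SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

def build_union_all(conn: sqlite3.Connection) -> str:
    """Read table names from folder and join one SELECT per table with UNION ALL."""
    # Assumption: folder has a single text column called name.
    names = [row[0] for row in conn.execute("SELECT name FROM folder ORDER BY name")]
    return " UNION ALL ".join(f"SELECT * FROM {quote_ident(n)}" for n in names)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE folder (name TEXT);
    CREATE TABLE file001 (ids INTEGER);
    CREATE TABLE file002 (ids INTEGER);
    INSERT INTO folder VALUES ('file001'), ('file002');
    INSERT INTO file001 VALUES (1), (2);
    INSERT INTO file002 VALUES (2), (3);
""")

query = build_union_all(conn)        # step 1: build the statement text
rows = conn.execute(query).fetchall()  # step 2: execute it
print(query)
print(rows)
```

Unlike the function above, this sketch takes the names from the folder table, as the question asked, rather than from the catalog.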

Related

Trying to query a redshift within SELECT statement

Current table1:
col1
-------------
schema.table1
schema.table2
schema.table3
Desired table1:
col1 col2
------------------------------------------------------------
schema.table1 value of (select count(*) from schema.table1)
schema.table2 value of (select count(*) from schema.table2)
schema.table3 value of (select count(*) from schema.table3)
The query below does not work. I also tried using a function, but the function doesn't allow the use of 'FROM':
select col1, (select count(*) from col1)
from table1
I am trying to create this query in redshift. Can anyone please help me out?
To perform this task you will need a stored procedure AND a defined cursor. The stored procedure allows for looping and the cursor provides the ability to execute a newly created statement (dynamic querying).
For example:
Create the starting materials, 3 tables and a table that references these tables.
create table foo as (select 1 as A);
create table goo as (select 2 as A);
create table hoo as (select 3 as A);
create table tabs as (select 'foo' as tab union all select 'goo' union all select 'hoo');
Next, define the stored procedure that will create the dynamic SQL:
CREATE OR REPLACE procedure count_tabs(curs1 INOUT refcursor)
AS
$$
DECLARE
    row record;
    statement varchar := '';
    union_needed BOOL := false;
BEGIN
    for row in select tab from tabs LOOP
        IF union_needed THEN
            statement := statement || ' UNION ALL ';
        END IF;
        statement := statement || 'select \'' || row.tab || '\' as table_name, count(*) as table_count from ' || row.tab ;
        union_needed := true;
    END LOOP;
    RAISE NOTICE 'sql to execute: %',statement;
    open curs1 for execute statement;
END;
$$ LANGUAGE plpgsql;
Lastly we need to call the procedure and execute the cursor
call count_tabs('mycursor');
fetch 1000 from mycursor;
A few notes on this:
This assumes you want the results as output on your bench. If you want to create a table with the results, this is doable with the same structure.
Since the FROM clause value(s) are unknown at compile time, this needs to be done in 2 steps: create the query, and then execute the query.
I believe you can have the procedure walk this same cursor itself, but doing this is exceptionally slow.
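The two steps named in the notes (create the query, then execute it) can be illustrated with a small client-side sketch. It uses Python's sqlite3 module as a stand-in for Redshift and reuses the foo, goo, hoo and tabs tables from the example; the function name `count_tabs` mirrors the procedure but is otherwise an assumption of this sketch.

```python
import sqlite3

def count_tabs(conn: sqlite3.Connection):
    """Build one UNION ALL count query from the names in tabs, then execute it."""
    tabs = [row[0] for row in conn.execute("SELECT tab FROM tabs")]
    parts = []
    for tab in tabs:
        # The table name is used twice: as a bound value (the label)
        # and as a quoted identifier (the FROM target).
        ident = '"' + tab.replace('"', '""') + '"'
        parts.append(f"SELECT ? AS table_name, count(*) AS table_count FROM {ident}")
    if not parts:
        return []
    statement = " UNION ALL ".join(parts)    # step 1: create the query
    return conn.execute(statement, tabs).fetchall()  # step 2: execute it

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo AS SELECT 1 AS A;
    CREATE TABLE goo AS SELECT 2 AS A;
    CREATE TABLE hoo AS SELECT 3 AS A;
    CREATE TABLE tabs AS SELECT 'foo' AS tab UNION ALL SELECT 'goo' UNION ALL SELECT 'hoo';
""")
counts = count_tabs(conn)
print(counts)
```

Binding the label as a parameter avoids hand-escaping quotes inside the generated statement; only the identifier has to be spliced into the text.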

How to get value printed on Postgres

I have a requirement that I need to translate into an SQL script.
I am using the information schema to get all the columns of a table and print their distinct counts.
I was able to get the count, but I am not able to print the column name properly.
Please find the code below.
I have to pass the value of "colum_lbl" to my SELECT clause; if I do so, I get a GROUP BY error.
So I passed "colum_lbl" within quotes. Now every row of the result has the hardcoded 'colum_lbl' as its value, and I have to replace it with the actual value read in the FOR loop.
Any other efficient method for this requirement would be very much appreciated. Thanks in advance.
do $$
DECLARE
colum_lbl text;
BEGIN
DROP TABLE IF EXISTS tmp_table;
CREATE TABLE tmp_table
(
colnm varchar(50),
cnt integer
);
FOR colum_lbl IN
SELECT distinct column_name
FROM information_schema.columns
WHERE table_schema = 'cva_aggr'
AND table_name = 'employee' AND column_name in ('empid','empnm')
LOOP
EXECUTE
'Insert into tmp_table
SELECT '' || colum_lbl || '',count(distinct ' || colum_lbl || ')
FROM employee ';
END LOOP;
END; $$
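The difficulty described above comes from using the column name in two roles at once: as a string value (the label stored in colnm) and as an identifier (the argument of count(distinct ...)). The sketch below separates the two roles, using Python's sqlite3 module as a stand-in for Postgres; the sample rows and the helpers `quote_ident` and `distinct_counts` are assumptions for illustration. In PL/pgSQL, format() with %L for the literal and %I for the identifier plays the analogous role.

```python
import sqlite3

def quote_ident(name: str) -> str:
    """Quote an SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

def distinct_counts(conn: sqlite3.Connection, table: str, columns):
    """Return (column_name, distinct_count) pairs for the given columns."""
    results = []
    for col in columns:
        # The label is bound as a value; the column itself is a quoted identifier.
        sql = f"SELECT ?, count(DISTINCT {quote_ident(col)}) FROM {quote_ident(table)}"
        results.append(conn.execute(sql, (col,)).fetchone())
    return results

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (empid INTEGER, empnm TEXT);
    INSERT INTO employee VALUES (1, 'a'), (2, 'a'), (3, 'b');
""")
result = distinct_counts(conn, "employee", ["empid", "empnm"])
print(result)
```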

How to use pl/pgSQL to handle 'comma separated list' returns?

I am trying to UNION ALL many tables into a new table. The columns of the old tables are the same, but their order differs, so the SQL statement below gives the wrong result:
CREATE TABLE sum_7_2018_xia_weijian
AS
(
SELECT * FROM huiwen
UNION
SELECT * FROM penglai
UNION
SELECT * FROM baoluo
UNION
SELECT * FROM dongge
UNION
SELECT * FROM resultdonglu
UNION
SELECT * FROM resultwencheng
UNION
SELECT * FROM tan_illeg
);
I finally corrected it, but the SQL statements are too redundant:
Step 1. Get the column names of one of the old tables, named huiwen:
SELECT string_agg(column_name, ',')
FROM information_schema.columns
WHERE table_schema = 'public' AND table_name = 'huiwen';
results:
> string_agg
> ----------------------------------------------------------------------
>
> gid,id,geom,sxm,sxdm,sxxzqdm,xzqhdm,xzmc,sfzgjsyd,sfkfbj,sfjbnt,sfld,sflyhx,sfhyhx
Step 2. Union the tables into a new table. I copy the string_agg output of table huiwen into each SELECT to keep the order of the columns, which is clumsy.
CREATE TABLE sum_2018_xia_weijian
AS
(
SELECT gid,id,geom,sxm,sxdm,sxxzqdm,xzqhdm,xzmc,sfzgjsyd,sfkfbj,sfjbnt,sfld,sflyhx,sfhyhx
FROM huiwen
UNION ALL
SELECT gid,id,geom,sxm,sxdm,sxxzqdm,xzqhdm,xzmc,sfzgjsyd,sfkfbj,sfjbnt,sfld,sflyhx,sfhyhx
FROM penglai
UNION ALL
SELECT gid,id,geom,sxm,sxdm,sxxzqdm,xzqhdm,xzmc,sfzgjsyd,sfkfbj,sfjbnt,sfld,sflyhx,sfhyhx
FROM baoluo
);
results:
> Query returned successfully: 2206 rows affected, 133 msec execution time.
I tried to optimize this in pl/pgSQL by declaring a variable to hold the column names, but could not find any SQL data type that can handle this. Using the RECORD pseudo-type results in an error:
CREATE or replace FUNCTION ct() RETURNS RECORD AS $$
DECLARE
clms RECORD;
BEGIN
SELECT column_name INTO clms
FROM information_schema.columns
WHERE table_schema = 'public' AND table_name = 'huiwen';
RETURN clms;
END;
$$ LANGUAGE plpgsql;
CREATE TABLE sum_2018_xia_weijian
AS
(
SELECT ct() FROM huiwen
UNION ALL
SELECT ct() FROM penglai
UNION ALL
SELECT ct() FROM baoluo
UNION ALL
SELECT ct() FROM dongge
UNION ALL
SELECT ct() FROM resultdonglu
UNION ALL
SELECT ct() FROM resultwencheng
UNION ALL
SELECT ct() FROM tan_illeg
);
You may use STRING_AGG twice to build the UNION ALL. You can get all the columns in a specific order by explicitly ordering by column_name inside the string_agg.
Here's a generic function which takes an array of table names and a final table name.
CREATE or replace FUNCTION fn_create_tab(tname_arr TEXT[], p_tab_name TEXT)
RETURNS VOID AS $$
DECLARE
l_select TEXT;
BEGIN
select STRING_AGG(query,' UNION ALL ' ) INTO l_select
FROM
(
SELECT 'select ' || string_agg( column_name,','
ORDER BY column_name ) || ' from ' || table_name as query
FROM information_schema.columns
WHERE table_schema = 'public' AND table_name = ANY (tname_arr)
GROUP BY table_name
) s;
IF l_select IS NOT NULL
THEN
EXECUTE format ('DROP TABLE IF EXISTS %I',p_tab_name);
EXECUTE format ('create table %I AS %s',p_tab_name,l_select);
END IF;
END;
$$ LANGUAGE plpgsql;
Now, run the function like this:
select fn_create_tab(ARRAY['huiwen','penglai'],'sum_2018_xia_weijian');
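The key step is that every SELECT lists its columns explicitly in the same (sorted) order, so the positional matching done by UNION ALL lines up. A minimal sketch of that step, using Python's sqlite3 module as a stand-in for Postgres and a cut-down, three-column version of huiwen and penglai; the sample data and the helper `ordered_union` are assumptions of this example.

```python
import sqlite3

def quote_ident(name: str) -> str:
    """Quote an SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

def ordered_union(conn: sqlite3.Connection, tables) -> str:
    """Build a UNION ALL in which every SELECT lists its columns sorted by name."""
    parts = []
    for table in tables:
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        info = conn.execute(f"PRAGMA table_info({quote_ident(table)})").fetchall()
        cols = sorted(row[1] for row in info)
        col_list = ", ".join(quote_ident(c) for c in cols)
        parts.append(f"SELECT {col_list} FROM {quote_ident(table)}")
    return " UNION ALL ".join(parts)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE huiwen  (gid INTEGER, id INTEGER, sxm TEXT);
    CREATE TABLE penglai (sxm TEXT, gid INTEGER, id INTEGER);  -- same columns, other order
    INSERT INTO huiwen  VALUES (1, 10, 'a');
    INSERT INTO penglai VALUES ('b', 2, 20);
""")
union_sql = ordered_union(conn, ["huiwen", "penglai"])
union_rows = conn.execute(union_sql).fetchall()
print(union_sql)
print(union_rows)
```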
Instead of making the programming block complex, you can follow the rules below from the documentation of UNION and UNION ALL:
The number of columns in all queries must be the same.
The corresponding columns must have the compatible data type.
The column names of the first query determine the column names of the combined result set.
The GROUP BY and HAVING clauses are applied to each individual query, not the final result set.
The ORDER BY clause is applied to the combined result set, not within the individual result set.
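Following the third point, put the table whose column order you expect in the result first in your UNION query. Note that columns are combined by position, not by name, so every later SELECT must still list its columns in that same order.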

Display column based on certain condition

I want to display columns based on a certain condition. Is that possible? When I try the query below, I get an error.
select (
select column_name
from all_tab_cols
where table_name='BED_2016_MAR_CIT4114A_FYP1_G_'
and column_name like '%na%'
)
from BED_2016_MAR_CIT4114A_FYP1_G_;
A SQL query must list the columns it is using explicitly. You can do what you want using dynamic SQL (execute immediate). For example:
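declare
  -- SQL is a reserved word in PL/SQL, so the variable is named v_sql
  v_sql varchar2(4000);
  cols varchar2(4000);
begin
  -- Collect the matching column names as a comma-separated list
  select listagg(column_name, ',') within group (order by column_name)
  into cols
  from all_tab_cols
  where table_name = 'BED_2016_MAR_CIT4114A_FYP1_G_' and column_name like '%na%' ;
  v_sql := '
create table newtab as
select #cols
from BED_2016_MAR_CIT4114A_FYP1_G_';
  -- Substitute the column list into the statement, then run it
  v_sql := replace(v_sql, '#cols', cols);
  execute immediate v_sql;
end;
/
select *
from newtab;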
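The shape of that approach (look up the matching column names first, then splice them into a statement and run it) can be tried out in a small sketch. It uses Python's sqlite3 module as a stand-in for Oracle; the `students` table, its columns, and the helper `select_matching_columns` are hypothetical names chosen for the example.

```python
import sqlite3

def quote_ident(name: str) -> str:
    """Quote an SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

def select_matching_columns(conn: sqlite3.Connection, table: str, needle: str):
    """Select only those columns of table whose name contains needle."""
    info = conn.execute(f"PRAGMA table_info({quote_ident(table)})").fetchall()
    # Index 1 of each PRAGMA row is the column name; match case-insensitively.
    cols = [row[1] for row in info if needle.lower() in row[1].lower()]
    if not cols:
        return [], []
    col_list = ", ".join(quote_ident(c) for c in cols)
    rows = conn.execute(f"SELECT {col_list} FROM {quote_ident(table)}").fetchall()
    return cols, rows

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER, first_name TEXT, last_name TEXT, grade TEXT);
    INSERT INTO students VALUES (1, 'Ann', 'Lee', 'A');
""")
matched_cols, matched_rows = select_matching_columns(conn, "students", "na")
print(matched_cols, matched_rows)
```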
Looks like the problem is in the where condition:
where table_name='BED_2016_MAR_CIT4114A_FYP1_G_'
and column_name like '%na%'
Remove the condition for table_name. You are already selecting from BED_2016_MAR_CIT4114A_FYP1_G_ so only columns in that table will be shown.
Here is a more simplified version of the query:
select column_name
from BED_2016_MAR_CIT4114A_FYP1_G_
where column_name like '%na%'
Hope that helped

How to select a column from all tables in which it resides?

I have many tables that have the same column 'customer_number'.
I can get a list of all these tables with this query:
SELECT table_name FROM ALL_TAB_COLUMNS
WHERE COLUMN_NAME = 'customer_number';
The question is: how do I get all the records that have a specific customer number from all these tables without running the same query against each of them?
To get records from a table, you have to write a query against that table. So you can't get all the records from the tables with the specified field without a query against each one of those tables.
If there is a subset of columns that you are interested in, and this subset is shared among all the tables, you may use a UNION/UNION ALL operation like this:
select * from (
select customer_number, phone, address from table1
union all
select customer_number, phone, address from table2
union all
select customer_number, phone, address from table3
)
where customer_number = 'my number'
Or, in the simple case where you just want to know which tables have records for a particular client:
select * from (
select 'table1' src_tbl, customer_number from table1
union all
select 'table2', customer_number from table2
union all
select 'table3', customer_number from table3
)
where customer_number = 'my number'
Otherwise you have to query each table separately.
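If the list of tables is not known in advance, the src_tbl query above can be generated rather than typed by hand. The following is a sketch of that idea using Python's sqlite3 module as a stand-in for Oracle; table1, table2 and table3 are illustrative tables, and the helper `find_customer` is an assumption of this example.

```python
import sqlite3

def quote_ident(name: str) -> str:
    """Quote an SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

def find_customer(conn: sqlite3.Connection, customer_number):
    """Return (src_tbl, customer_number) rows from every table that has the column."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    parts, params = [], []
    for table in tables:
        info = conn.execute(f"PRAGMA table_info({quote_ident(table)})").fetchall()
        if "customer_number" in {row[1].lower() for row in info}:
            # The table name is bound as a value so it appears as the src_tbl label.
            parts.append(f"SELECT ? AS src_tbl, customer_number FROM {quote_ident(table)}")
            params.append(table)
    if not parts:
        return []
    sql = ("SELECT * FROM (" + " UNION ALL ".join(parts) + ") "
           "WHERE customer_number = ?")
    return conn.execute(sql, params + [customer_number]).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (customer_number NUMBER, a NUMBER, b NUMBER);
    CREATE TABLE table2 (customer_number NUMBER, a NUMBER, c NUMBER);
    CREATE TABLE table3 (a NUMBER, b NUMBER, c NUMBER);
    INSERT INTO table1 VALUES (1, 1, 1);
    INSERT INTO table2 VALUES (2, 2, 2);
    INSERT INTO table3 VALUES (3, 3, 3);
""")
found = find_customer(conn, 2)
print(found)
```

Each table is still queried once, as the answer says it must be; the generation only saves writing the statement manually.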
DBMS_XMLGEN enables you to run dynamic SQL statements without custom PL/SQL.
Sample Schema
create table table1(customer_number number, a number, b number);
insert into table1 values(1,1,1);
create table table2(customer_number number, a number, c number);
insert into table2 values(2,2,2);
create table table3(a number, b number, c number);
insert into table3 values(3,3,3);
Query
--Get CUSTOMER_NUMBER and A from all tables with the column CUSTOMER_NUMBER.
--
--Convert XML to columns.
select
table_name,
to_number(extractvalue(xml, '/ROWSET/ROW/CUSTOMER_NUMBER')) customer_number,
to_number(extractvalue(xml, '/ROWSET/ROW/A')) a
from
(
--Get results as XML.
select table_name,
xmltype(dbms_xmlgen.getxml(
'select customer_number, a from '||table_name
)) xml
from user_tab_columns
where column_name = 'CUSTOMER_NUMBER'
);
TABLE_NAME  CUSTOMER_NUMBER  A
----------  ---------------  -
TABLE1                    1  1
TABLE2                    2  2
Warnings
These overly generic solutions often have issues. They won't perform as well as plain old SQL statements, and they are more likely to run into bugs. In general, these types of solutions should be avoided in production code, but they are still very useful for ad hoc queries.
Also, this solution assumes that you want the same columns from each row. If each row is different then things get much more complicated and you may need to look into technologies like ANYDATASET.
I assume you want to automate this. Two approaches.
SQL to generate SQL scripts
spool run_rep.sql
set head off pages 0 lines 200 trimspool on feedback off
SELECT 'prompt ' || table_name || chr(10) ||
'select ''' || table_name ||
''' tname, CUSTOMER_NUMBER from ' || table_name || ';' cmd
FROM all_tab_columns
WHERE column_name = 'CUSTOMER_NUMBER';
spool off
@run_rep.sql
PLSQL
Similar idea to use dynamic sql:
DECLARE
  TYPE rcType IS REF CURSOR;
  rc rcType;
  -- The dictionary view is all_tab_columns
  CURSOR c1 IS SELECT table_name FROM all_tab_columns WHERE column_name = 'CUST_NUM';
  cmd VARCHAR2(4000);
  cNum NUMBER;
BEGIN
  FOR r1 IN c1 LOOP
    cmd := 'SELECT cust_num FROM ' || r1.table_name;
    OPEN rc FOR cmd;
    LOOP
      FETCH rc INTO cNum;
      EXIT WHEN rc%NOTFOUND;
      -- Probably best to INSERT this into a temp table and then
      -- select from that to avoid DBMS_OUTPUT buffer-full issues
      DBMS_OUTPUT.PUT_LINE('T:' || r1.table_name || ' C: ' || cNum);
    END LOOP;
    CLOSE rc;
  END LOOP;
END;