Replacing concat operation in Where clause - sql

We have a requirement of querying the ALL_TABLES view, based on a combination of schema name and table name.
There are two schemas "A" and "B" and they have same table "TAB1" in both of them, here my requirement is to select the table associated with schema A and not the schema B.
Currently, we are doing a concatenation operation on the table name and owner name for achieving it as shown below
There will be multiple owner and table name combinations available within a single query
select table_name from all_tables where concat(owner_name,table_name) in ('ATAB1','ATAB2','BTAB2','CTAB1')
select table_name from all_tables where concat(owner_name,table_name) not in ('ATAB1','ATAB2','BTAB2','CTAB1')
Here there are three schemas A, B and C with their respective table name combinations
How can we achieve the same result without using the CONCAT function ?

WHERE 0=1
OR (owner_name = 'A' AND table_name = 'T1')
OR (owner_name = 'B' AND table_name = 'T2')
OR (owner_name = 'A' AND table_name = 'T3')
The strange 0=1 is just to make the lines below syntactically identical for easy mainenance and/or code-generation. The optimizer removes it.

Oracle allows for multiple columns in an IN condition (see the documentation for some more examples).
select table_name
from all_tables
where (owner_name, table_name) in
(('A','TAB1'), ('A','TAB2'), ('B','TAB2'), ('C','TAB1'))
This would probably be equivalent to usr's answer in terms of performance.

You could arrange the string values you need to match against into a virtual table, then use that table in a join as a filter:
SELECT t.*
FROM all_tables t
INNER JOIN (
SELECT 'A' AS owner_name, 'TAB1' AS table_name FROM DUAL
UNION ALL SELECT 'A', 'TAB2' FROM DUAL
UNION ALL SELECT 'B', 'TAB2' FROM DUAL
UNION ALL SELECT 'C', 'TAB1' FROM DUAL
) s
ON t.owner_name = s.owner_name
AND t.table_name = s.table_name
;
I would expect this to give the query planner more room for optimisation than your present approach gives.

Related

How, in Oracle, to SELECT FROM a table or another depending on certain criteria?

I have a query that must choose from one table if certain criteria is meet or from another if the criteria is different.
In my case i need to select from Table_A if we are on Database_A or from Table_B if we are on Database_B, but the criteria could be a different one.
I want to do something like this:
SELECT
COLUMN_1, COLUMN_2, ..., COLUMN_N
FROM
(if database is Database_A then Table_A
but if database is Database_B then from Table B
else... it could be DUAL if any other situation, it will throw an error, but i can manage it)
WHERE
(my where criteria)
How can i do it with pure SQL not PL-SQL? I can use nested queries or WITH or similar stuff, but not "coding". I cannot create tables or views, i can only query the database for data.
I tried using CASE or other options, with no luck.
You may use the database name in the condition criteria and UNION operator to select from the right table.
SELECT COLUMN_1, COLUMN_2, ..., COLUMN_N
FROM
(
SELECT COLUMN_1, COLUMN_2, ..., COLUMN_N
FROM TableA
WHERE ora_database_name = 'DatabaseA'
UNION ALL
SELECT COLUMN_1, COLUMN_2, ..., COLUMN_N
FROM TableB
WHERE ora_database_name = 'DatabaseB'
UNION ALL
SELECT 'ERROR', null, ..., null
FROM DUAL
WHERE ora_database_name NOT IN ('DatabaseA', 'DatabaseB')
)
WHERE
(my where criteria)
If it is the case of selecting from one of two tables then something like this could help:
WITH
tab_a AS
( Select COL_1, COL_2, COL_3 From TABLE_A ), -- It is possible to do the filter here or in the main SQL or both
tab_b AS
( Select COL_1, COL_2, COL_3 From TABLE_B ) -- It is possible to do the filter here or in the main SQL or both
Select DISTINCT
CASE WHEN 1 = 1 THEN a.COL_1 ELSE b.COL_1 END "COL_1",
CASE WHEN 1 = 1 THEN a.COL_2 ELSE b.COL_2 END "COL_2",
CASE WHEN 1 = 1 THEN a.COL_3 ELSE b.COL_3 END "COL_3"
From
tab_a a
Inner Join
tab_b b ON(1 = 1)
Where
CASE WHEN 1 = 1 THEN b.COL_1 ELSE a.COL_2 END = 'XXX' -- filter could be set on the same columns or on different ones of the same type
You can put the where clause either in the cte definition (WITH clause) or in the main SQL or in both.
WHEN condition within CASE expresions should be put acording to your data context and that is the place that will select/filter the data from one or another table.
INNER JOIN ON condition may stay as is (1=1) for it creates cross link that will be solved by the DISTINCT keyword.
If both tables are not reachable from both databases then you shall have a separate scripts, one for each db. You can, though, check which is the active db and/or if there is a specific table.
-- to get the db name ...
Select NAME "DB_NAME" From V$database
-- ... to check if the table exist ...
Select TABLE_NAME from ALL_TABLES where TABLE_NAME = 'TABLE_A'
-- ... or some additional info besides the table name
Select
u.USER_ID, u.USERNAME,
t.TABLE_NAME, t.OWNER "TABLE_OWNER"
From ALL_USERS u
Inner Join ALL_TABLES t ON(t.OWNER = u.USERNAME)
Where TABLE_NAME = 'TABLE_A'

How to merge two tables with different column number in Snowflake?

I am querying TABLE_SCHEMA,TABLE_NAME,CREATED,LAST_ALTERED columns from Snowflake information schema. VIEWS. Next, I would like to MERGE that table with row count for the view. Below are my queries I am running in Snowflake my issue is I am not sure how to combine these two table in 1 table ?
Note: I am new to Snowflake. Please provide code with explanation.
Thanks in advance for help!
Query 1
SELECT TABLE_SCHEMA,TABLE_NAME,CREATED,LAST_ALTERED FROM DB.SCHEMA.VIEWS
WHERE TABLE_SCHEMA="MY_SHEMA" AND TABLE_NAME IN ('VIEW_TABLE1','VIEW_TABLE2','VIEW_TABLE3')
Query 2
SELECT COUNT(*) FROM DB.SCHEMA.VIEW_TABLE1
UNION ALL SELECT COUNT(*) FROM DB.SCHEMA.VIEW_TABLE2
To get result of the COUNT(*) needs to be built dynamically and attached to the "driving query".
Sample data:
CREATE VIEW VIEW_TABLE1(c)
AS
SELECT 1;
CREATE VIEW VIEW_TABLE2(e)
AS
SELECT 2 UNION ALL SELECT 4;
CREATE VIEW VIEW_TABLE3(f)
AS
SELECT 3;
Full query:
DECLARE
QUERY STRING;
RES RESULTSET;
BEGIN
SELECT
LISTAGG(
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
$$SELECT '<TABLE_SCHEMA>' AS TABLE_SCHEMA,
'<TABLE_NAME>' AS TABLE_NAME,
'<CREATED>' AS CREATED,
'<LAST_ALTERED>' AS LAST_ALTERED,
COUNT(*) AS cnt
FROM <tab_name>
$$,
'<TABLE_SCHEMA>', v.TABLE_SCHEMA),
'<TABLE_NAME>', v.TABLE_NAME),
'<CREATED>', v.CREATED),
'<LAST_ALTERED>', v.LAST_ALTERED),
'<tab_name>', CONCAT_WS('.', v.table_catalog, v.table_schema, v.table_name)),
' UNION ALL ') WITHIN GROUP (ORDER BY CONCAT_WS('.', v.table_catalog, v.table_schema, v.table_name))
INTO :QUERY
FROM INFORMATION_SCHEMA.VIEWS v
WHERE TABLE_SCHEMA='PUBLIC'
AND TABLE_NAME IN ('VIEW_TABLE1','VIEW_TABLE2','VIEW_TABLE3');
RES := (EXECUTE IMMEDIATE :QUERY);
RETURN TABLE(RES);
END;
Output:
Rationale:
The ideal query would be(pseudocode):
SELECT TABLE_SCHEMA,TABLE_NAME,CREATED,LAST_ALTERED,
EVAL('SELECT COUNT(*) FROM ' || view_name) AS row_count
FROM INFORMATION_SCHEMA.VIEWS
WHERE TABLE_SCHEMA='MY_SHEMA'
AND TABLE_NAME IN ('VIEW_TABLE1','VIEW_TABLE2','VIEW_TABLE3');
Such construct EVAL(dynamic query) at SELECT list does not exist as it would require building a query on the fly and execute per each row. Though for some RDBMSes are workaround like dbms_xmlgen.getxmltype
Include table/view names as string in your count(*) queries and then you can join.
Example below -
select * from
(SELECT TABLE_SCHEMA,TABLE_NAME,CREATED FROM information_schema.tables
WHERE TABLE_SCHEMA='PUBLIC' AND TABLE_NAME IN ('D1','D2')) t1
left join
(
SELECT 'D1' table_name, COUNT(*) FROM d1
UNION ALL SELECT 'D2',COUNT(*) FROM d2) t2
on t1.table_name = t2.table_name ;
TABLE_SCHEMA
TABLE_NAME
CREATED
TABLE_NAME
COUNT(*)
PUBLIC
D1
2022-04-06 14:24:56.224 -0700
D1
12
PUBLIC
D2
2022-04-06 14:25:27.276 -0700
D2
5

How to extract the table name from a CREATE/UPDATE/INSERT statement in an SQL query?

I am trying to parse the table being created, inserted into or updated from the following sql queries stored in a table column.
Let's call the table column query. Following is some sample data to demonstrate variations in how the data could look like.
with sample_data as (
select 1 as id, 'CREATE TABLE tbl1 ...' as query union all
select 2 as id, 'CREATE OR REPLACE TABLE tbl1 ...' as query union all
select 3 as id, 'DROP TABLE IF EXISTS tbl1; CREATE TABLE tbl1 ...' as query union all
select 4 as id, 'INSERT /*some comment*/ INTO tbl2 ...' as query union all
select 5 as id, 'INSERT /*some comment*/ INTO tbl2 ...' as query union all
select 6 as id, 'UPDATE tbl3 SET col1 = ...' as query union all
select 7 as id, '/*some garbage comments*/ UPDATE tbl3 SET col1 = ...' as query union all
select 8 as id, 'DELETE tbl4 ...' as query
),
Following are the formats of the queries (we are trying to extract table_name ):
#1
some optional statements like drop table
CREATE some comments or optional statement like OR REPLACE TABLE table_name
everything else
#2
some optional statements like drop table
INSERT some comments INTO some comments table_name
#3
some optional statements like drop table
UPDATE some comments table_name
everything else
Regular Expression
To construct a suitable regex, let's start with the following relatively simple/readable version:
((CREATE( OR REPLACE)?|DROP) TABLE( IF EXISTS)?|UPDATE|DELETE|INSERT INTO) ([^\s\/*]+)
All the spaces above could be replaced with "at least one whitespace character", i.e. \s+. But we also need to allow comments. For a comment that looks like /*anything*/ the regex looks like \/\*.*\*\/ (where the comment characters are escaped with \ and "anything" is the .* in the middle). Given there could be multiple such comments, optionally separated by whitespace, we end up with (\s*\/\*.*\*\/\s*?)*\s+. Plugging this in everywhere there was a space gives:
((CREATE((\s*\/\*.*\*\/\s*?)*\s+OR(\s*\/\*.*\*\/\s*?)*\s+REPLACE)?|DROP)(\s*\/\*.*\*\/\s*?)*\s+TABLE((\s*\/\*.*\*\/\s*?)*\s+IF(\s*\/\*.*\*\/\s*?)*\s+EXISTS)?|UPDATE|DELETE|INSERT(\s*\/\*.*\*\/\s*?)*\s+INTO)(\s*\/\*.*\*\/\s*?)*\s+([^\s\/*]+)
One further refinement needs to be made: Bracketed expressions have been used for choices, e.g. (CHOICE1|CHOICE2). But this syntax includes them as capturing groups. Actually we only require one capturing group for the table name so we can exclude all the other capturing groups via ?:, e.g. (?:CHOICE1|CHOICE2). This gives:
(?:(?:CREATE(?:(?:\s*\/\*.*\*\/\s*?)*\s+OR(?:\s*\/\*.*\*\/\s*?)*\s+REPLACE)?|DROP)(?:\s*\/\*.*\*\/\s*?)*\s+TABLE(?:(?:\s*\/\*.*\*\/\s*?)*\s+IF(?:\s*\/\*.*\*\/\s*?)*\s+EXISTS)?|UPDATE|DELETE|INSERT(?:\s*\/\*.*\*\/\s*?)*\s+INTO)(?:\s*\/\*.*\*\/\s*?)*\s+([^\s\/*]+)
Online Regex Demo
Here's a demo of it working with your examples: Regex101 demo
SQL
The Google BigQuery documentation for REGEXP_EXTRACT says it will return the substring matched by the capturing group. So I'd expect something like this to work:
with sample_data as (
select 1 as id, 'CREATE TABLE tbl1 ...' as query union all
select 2 as id, 'CREATE OR REPLACE TABLE tbl1 ...' as query union all
select 3 as id, 'DROP TABLE IF EXISTS tbl1; CREATE TABLE tbl1 ...' as query union all
select 4 as id, 'INSERT /*some comment*/ INTO tbl2 ...' as query union all
select 5 as id, 'INSERT /*some comment*/ INTO tbl2 ...' as query union all
select 6 as id, 'UPDATE tbl3 SET col1 = ...' as query union all
select 7 as id, '/*some garbage comments*/ UPDATE tbl3 SET col1 = ...' as query union all
select 8 as id, 'DELETE tbl4 ...' as query
)
SELECT
*, REGEXP_EXTRACT(query, r"(?:(?:CREATE(?:(?:\s*\/\*.*\*\/\s*?)*\s+OR(?:\s*\/\*.*\*\/\s*?)*\s+REPLACE)?|DROP)(?:\s*\/\*.*\*\/\s*?)*\s+TABLE(?:(?:\s*\/\*.*\*\/\s*?)*\s+IF(?:\s*\/\*.*\*\/\s*?)*\s+EXISTS)?|UPDATE|DELETE|INSERT(?:\s*\/\*.*\*\/\s*?)*\s+INTO)(?:\s*\/\*.*\*\/\s*?)*\s+([^\s\/*]+)") AS table_name
FROM sample_data;
(The above is untested so please let me know in the comments if there are any issues.)
I think it really depends on your data, but you might find some success using an approach like this:
with data as (
select 1 as id, 'CREATE TABLE tbl1 ...' as query union all
select 2 as id, 'INSERT INTO tbl2 ...' as query union all
select 3 as id, 'UPDATE tbl3 ...' as query union all
select 4 as id, 'DELETE tbl4 ...' as query
),
splitted as (
select id, split(query, ' ') as query_parts from data
)
select
id,
case
when query_parts[safe_offset(0)] in('CREATE', 'INSERT') then query_parts[safe_offset(2)]
when query_parts[safe_offset(0)] in('UPDATE', 'DELETE') then query_parts[safe_offset(1)]
else 'Error'
end as table_name
from splitted
Of course this depends on the cleanliness and syntax in your query column. Also, if your table_name is qualified with project.table.dataset you would need to do further splitting.

Postgres PL/pgSQL To Consolidate Columns Existing Across Various Tables

I am implementing a tool to clean up all customer names across various tables in a schema called stage. The customer names could be coming from columns billing_acc_name or cust_acc_names. I do not know in advance how many tables have these columns, but as long as they do, they will be part of clean up.
However, prior to clean up, I need to select all unique customer names across the tables in the schema.
For better separation of concerns, I am looking at implementing this in PL/pgSQL. Currently, this is how I'm implementing this in Python/pandas/SQLAlchemy etc.
table_name = 'information_schema.columns'
table_schema_src = 'stage'
cols = ['billing_acc_name', 'cust_acc_name']
# get list of all table names and column names to query in stage schema
sql = text(f"""
SELECT table_name, column_name FROM {table_name} WHERE table_schema ='{table_schema_src}'
AND column_name = ANY(ARRAY{cols})
""")
src = pd.read_sql(sql, con=engine)
# explore implementation in pgsql
# establish query string
cnames = []
for i, row in src.iterrows():
s = text(f"""
SELECT DISTINCT upper({row['column_name']}) AS cname FROM stage.{row['table_name']}
""")
cnames.append(str(s).strip())
sql = ' UNION '.join(cnames)
df = pd.read_sql(sql, con=engine)
The auto-generated SQL query string are then as below:
SELECT DISTINCT upper(cust_acc_name) AS cname FROM stage.journal_2017_companyA UNION
SELECT DISTINCT upper(billing_acc_name) AS cname FROM stage.journal_2017_companyA UNION
SELECT DISTINCT upper(cust_acc_name) AS cname FROM stage.journal_2017_companyB UNION
SELECT DISTINCT upper(billing_acc_name) AS cname FROM stage.journal_2017_companyB UNION
SELECT DISTINCT upper(cust_acc_name) AS cname FROM stage.journal_2017_companyC UNION
SELECT DISTINCT upper(billing_acc_name) AS cname FROM stage.journal_2017_companyC UNION
SELECT DISTINCT upper(cust_acc_name) AS cname FROM stage.journal_2017_companyD UNION
SELECT DISTINCT upper(billing_acc_name) AS cname FROM stage.journal_2017_companyD
The plpgsql function may look like this:
create or replace function select_acc_names(_schema text)
returns setof text language plpgsql as $$
declare
rec record;
begin
for rec in
select table_name, column_name
from information_schema.columns
where table_schema = _schema
and column_name = any(array['cust_acc_name', 'billing_acc_name'])
loop
return query
execute format ($fmt$
select upper(%I) as cname
from %I.%I
$fmt$, rec.column_name, _schema, rec.table_name);
end loop;
end $$;
Use:
select *
from select_acc_names('stage');

Check all column name in the tables

Good day!
The database has a table that you want to archive. A copy of the table, with the addition of the prefix name "ARCH_".
e.g. Table: BALANCE. Archiving table: ARCH_BALANCE.
I need to write a query to check: that in the tables "ARCH_%" present all fields of the base tables. *also have a database table that are not archived.
I wrote the following query:
select distinct COLUMN_NAME, TABLE_NAME
from ALL_TAB_COLUMNS res
where TABLE_NAME in(
SELECT TABLE_NAME
FROM all_tables core_t
where TABLE_NAME not like 'ARCH_%' AND
EXISTS (
SELECT TABLE_NAME
FROM all_tables hist_t
WHERE hist_t.TABLE_NAME = concat('ARCH_', core_t.TABLE_NAME)
)
) and COLUMN_NAME NOT IN (
select COLUMN_NAME
from ALL_TAB_COLUMNS
where TABLE_NAME = concat('ARCH_', res.TABLE_NAME)
);
Parts of the code works, but generally runs indefinitely.
Perhaps there is somebody other ideas.
This query joins both tables and columns and displays which fields they have in common and if there is one missing, it shows it in the fourth column:
select orig.column_name
, arch.column_name
, case
when orig.column_name is null
then 'column doesn''t exist in orig'
when arch.column_name is null
then 'column doesn''t exist in arch'
else 'exists in both'
end
status
from ( select table_name
, column_name
from all_tab_columns
where table_name = 'X'
)
orig
full
outer
join ( select table_name
, column_name
from all_tab_columns
where table_name = 'ARCH_X'
)
arch
on orig.column_name = arch.column_name
The problem can be in that part of query:
SELECT TABLE_NAME
FROM all_tables core_t
where TABLE_NAME not like 'ARCH_%' AND
EXISTS (
SELECT TABLE_NAME
FROM all_tables hist_t
WHERE hist_t.TABLE_NAME = concat('ARCH_', core_t.TABLE_NAME)
)
You are trying to fetch records that match 2 condidtion in the same time:
1) TABLE_NAME not like 'ARCH_%'
2) TABLE_NAME = concat ('ARCH_',TABLE_NAME)
Those are two conditions that stays in opposite sides.
Also you could fix prefixes for column names (TABLE_NAME can be from ALL_TAB_COLUMNS table or ALL_TABLES).