SQL query optimization for selecting specific columns from a table - sql

I'm trying to select all tables that have any of the columns OWNERID, L1ID or L1, but none of the following columns: HID, DHID, SPHID, LINKHID, NODEID and SITEID.
The following is my SQL query. It's taking a long time. Can you give me a better solution or optimize this query?
SELECT
TABLE_NAME
FROM
USER_TABLES
WHERE
TABLE_NAME NOT IN (
SELECT TABLE_NAME
FROM USER_TAB_COLUMNS
WHERE COLUMN_NAME IN('HID', 'DHID', 'SPHID', 'LINKID', 'NODEID', 'SITEID')
GROUP BY TABLE_NAME
HAVING COUNT(*) = 1)
AND TABLE_NAME IN (
SELECT TABLE_NAME
FROM USER_TAB_COLUMNS
WHERE COLUMN_NAME IN ('OWNERID', 'L1ID', 'L1')
GROUP BY TABLE_NAME
HAVING COUNT(*) = 1)
Please don't mark this as a duplicate. I have already tried other solutions and they don't seem to work out for me.

Try NOT EXISTS / EXISTS instead. As long as you don't need to select values from the secondary tables, this is typically much faster than IN, and especially NOT IN. The basic reason is that evaluation stops as soon as a match is found.
SELECT
TABLE_NAME
FROM USER_TABLES
WHERE NOT EXISTS
(select 1
FROM USER_TAB_COLUMNS
WHERE COLUMN_NAME IN('HID', 'DHID', 'SPHID', 'LINKID', 'NODEID', 'SITEID')
AND USER_TABLES.TABLE_NAME = USER_TAB_COLUMNS.TABLE_NAME
)
and EXISTS (
SELECT 1
FROM USER_TAB_COLUMNS
WHERE COLUMN_NAME IN ('OWNERID', 'L1ID', 'L1')
AND USER_TABLES.TABLE_NAME = USER_TAB_COLUMNS.TABLE_NAME
)
However, I am not totally sure what your COUNT(*) = 1 means. Do you want exactly one of OWNERID, L1ID or L1, but not, say, both L1ID and L1 in the same table?
My code, as written, works if you only care whether one or more of your condition columns are present, which is how I understood your question in English. If you need exactly one of them to be present, a different query is needed.
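To make the semantics concrete, here is a minimal sketch that runs the EXISTS / NOT EXISTS form against a toy column catalog. It uses Python's sqlite3 module as a stand-in for Oracle, and the tab_columns table, its rows and the function name are made up for illustration. It only shows which tables the predicate selects, not how Oracle's optimizer behaves.

```python
import sqlite3

# Hypothetical stand-in for USER_TAB_COLUMNS (SQLite, not Oracle).
ROWS = [
    ("ORDERS", "OWNERID"), ("ORDERS", "AMOUNT"),   # wanted column, nothing excluded
    ("SITES", "OWNERID"), ("SITES", "SITEID"),     # wanted column plus an excluded one
    ("LEVELS", "L1"), ("LEVELS", "L1ID"),          # two wanted columns
    ("LOGS", "MESSAGE"),                           # neither kind of column
]

QUERY = """
SELECT DISTINCT t.table_name
FROM tab_columns t
WHERE NOT EXISTS (
        SELECT 1 FROM tab_columns x
        WHERE x.table_name = t.table_name
          AND x.column_name IN ('HID', 'DHID', 'SPHID', 'LINKID', 'NODEID', 'SITEID'))
  AND EXISTS (
        SELECT 1 FROM tab_columns i
        WHERE i.table_name = t.table_name
          AND i.column_name IN ('OWNERID', 'L1ID', 'L1'))
ORDER BY t.table_name
"""


def matching_tables(rows):
    """Load the toy catalog into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab_columns (table_name TEXT, column_name TEXT)")
    conn.executemany("INSERT INTO tab_columns VALUES (?, ?)", rows)
    result = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return result


print(matching_tables(ROWS))
```

With these rows, LEVELS is returned even though it has two of the wanted columns, which is the "one or more" reading described above.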

Try doing it this way:
SELECT TABLE_NAME
FROM USER_TAB_COLUMNS
GROUP BY TABLE_NAME
HAVING SUM(CASE WHEN COLUMN_NAME IN ('OWNERID', 'L1ID', 'L1') THEN 1 ELSE 0 END) = 1 AND
SUM(CASE WHEN COLUMN_NAME IN ('HID', 'DHID', 'SPHID', 'LINKID', 'NODEID', 'SITEID') THEN 1 ELSE 0 END) = 0;
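As a rough illustration of how the HAVING test changes the result, the sketch below runs the same conditional aggregation twice, once with = 1 (exactly one wanted column) and once with >= 1 (at least one). It uses Python's sqlite3 as a stand-in for Oracle; the tab_columns table, its rows and the helper name are assumptions made for the example.

```python
import sqlite3

# Hypothetical stand-in for USER_TAB_COLUMNS (SQLite, not Oracle).
ROWS = [
    ("ORDERS", "OWNERID"), ("ORDERS", "AMOUNT"),
    ("SITES", "OWNERID"), ("SITES", "SITEID"),
    ("LEVELS", "L1"), ("LEVELS", "L1ID"),
    ("LOGS", "MESSAGE"),
]

QUERY_TEMPLATE = """
SELECT table_name
FROM tab_columns
GROUP BY table_name
HAVING SUM(CASE WHEN column_name IN ('OWNERID', 'L1ID', 'L1') THEN 1 ELSE 0 END) {wanted_test}
   AND SUM(CASE WHEN column_name IN ('HID', 'DHID', 'SPHID', 'LINKID', 'NODEID', 'SITEID')
            THEN 1 ELSE 0 END) = 0
ORDER BY table_name
"""


def tables_where(wanted_test, rows=ROWS):
    """Run the grouped query; wanted_test is a fixed fragment such as '= 1'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab_columns (table_name TEXT, column_name TEXT)")
    conn.executemany("INSERT INTO tab_columns VALUES (?, ?)", rows)
    # The fragment is a hard-coded literal here, never user input.
    sql = QUERY_TEMPLATE.format(wanted_test=wanted_test)
    result = [row[0] for row in conn.execute(sql)]
    conn.close()
    return result


print(tables_where("= 1"))    # exactly one wanted column
print(tables_where(">= 1"))   # at least one wanted column
```

In this toy data LEVELS has both L1 and L1ID, so it only appears in the >= 1 variant.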

With exclude AS (
select TABLE_NAME
FROM USER_TAB_COLUMNS
WHERE COLUMN_NAME IN('HID', 'DHID', 'SPHID', 'LINKID', 'NODEID', 'SITEID')
GROUP BY TABLE_NAME
)
, preInclude AS (
SELECT TABLE_NAME
FROM USER_TAB_COLUMNS
WHERE COLUMN_NAME IN ('OWNERID', 'L1ID', 'L1')
GROUP BY TABLE_NAME
)
, Include AS (
SELECT preInclude.TABLE_NAME
FROM preInclude
LEFT JOIN exclude ON preInclude.TABLE_NAME = exclude.TABLE_NAME
WHERE exclude.TABLE_NAME is NULL
)
SELECT *
FROM Include

Related

ORACLE: SQL syntax to find table with two columns with names like ID, NUM

My question is based on:
Finding table with two column names
If interested, please read the above as it covers much ground that I will not repeat here.
For the answer given, I commented as follows:
Note that you could replace the IN with = and an OR clause, but generalizing this to LIKE may not work, because LIKE could match more than one column per term, e.g.:
SELECT OWNER, TABLE_NAME, count(DISTINCT COLUMN_NAME) as ourCount
FROM all_tab_cols WHERE ( (column_name LIKE '%ID%') OR (COLUMN_NAME LIKE '%NUM%') )
GROUP BY OWNER, TABLE_NAME
HAVING COUNT(DISTINCT column_name) >= 2
ORDER BY OWNER, TABLE_NAME ;
This code compiles and runs. However, it does not guarantee that the table has both a column with a name containing ID and a column with a name containing NUM, because there may be two or more columns with names like ID.
Is there a way to generalize the answer given in the above link to work with LIKE?
GOAL: Find tables that contain two column names, one like ID (or some string) and one like NUM (or some other string).
Also, after several answers came in, as "extra credit", I re-did an answer by Ahmed to use variables in Toad, so I've added a tag for Toad as well.
You may use conditional aggregation as the following:
SELECT OWNER, TABLE_NAME, COUNT(CASE WHEN COLUMN_NAME LIKE '%ID%' THEN COLUMN_NAME END) as ID_COUNT,
COUNT(CASE WHEN COLUMN_NAME LIKE '%NUM%' THEN COLUMN_NAME END) NUM_COUNT
FROM all_tab_cols
GROUP BY OWNER, TABLE_NAME
HAVING COUNT(CASE WHEN COLUMN_NAME LIKE '%ID%' THEN COLUMN_NAME END)>=1 AND
COUNT(CASE WHEN COLUMN_NAME LIKE '%NUM%' THEN COLUMN_NAME END)>=1
ORDER BY OWNER, TABLE_NAME ;
See a demo.
If you want tables that contain exactly one column like ID and exactly one column like NUM, replace >=1 with =1 in the HAVING clause.
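The difference between the COUNT(DISTINCT ...) >= 2 attempt from the question and the per-pattern conditional counts can be checked on a few toy rows. This is only a sketch: it uses Python's sqlite3 instead of Oracle, and the tab_cols table and its EMP/DEP rows are illustrative assumptions.

```python
import sqlite3

# Illustrative rows: EMP has three ID-like columns and no NUM-like column,
# DEP has one of each.
COLUMNS = [
    ("EMP", "ID_EMP"), ("EMP", "ID_MGR"), ("EMP", "SAL"), ("EMP", "DID_ID"),
    ("DEP", "ID_DEP"), ("DEP", "DNUM"), ("DEP", "LOC"),
]

# Counts matching columns without caring which pattern they matched.
NAIVE = """
SELECT table_name FROM tab_cols
WHERE column_name LIKE '%ID%' OR column_name LIKE '%NUM%'
GROUP BY table_name
HAVING COUNT(DISTINCT column_name) >= 2
ORDER BY table_name
"""

# Counts each pattern separately and requires both to be present.
CONDITIONAL = """
SELECT table_name FROM tab_cols
GROUP BY table_name
HAVING COUNT(CASE WHEN column_name LIKE '%ID%' THEN column_name END) >= 1
   AND COUNT(CASE WHEN column_name LIKE '%NUM%' THEN column_name END) >= 1
ORDER BY table_name
"""


def run(sql, rows=COLUMNS):
    """Execute one of the queries above against the toy catalog."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab_cols (table_name TEXT, column_name TEXT)")
    conn.executemany("INSERT INTO tab_cols VALUES (?, ?)", rows)
    result = [row[0] for row in conn.execute(sql)]
    conn.close()
    return result


print(run(NAIVE))        # includes EMP, which has no NUM-like column
print(run(CONDITIONAL))  # only DEP
```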
If I understood you correctly, you want to return tables that contain two (or more) columns whose names contain both ID and NUM (sub)strings.
My all_tab_cols CTE mimics that data dictionary view, just to illustrate the problem.
EMP table contains 3 columns that have the ID (sub)string, but it should count as 1 (not 3); also, as that table doesn't contain any columns that have the NUM (sub)string in their name, the EMP table shouldn't be part of the result set
DEP table contains one ID and one NUM column, so it should be returned
Therefore: the TEMP CTE counts number of ID and NUM columns (duplicates are ignored). The final query expects that table contains both columns.
Sample data:
SQL> with all_tab_cols (table_name, column_name) as
  (select 'EMP', 'ID_EMP' from dual union all
   select 'EMP', 'ID_MGR' from dual union all
   select 'EMP', 'SAL' from dual union all
   select 'EMP', 'DID_ID' from dual union all
   --
   select 'DEP', 'ID_DEP' from dual union all
   select 'DEP', 'DNUM' from dual union all
   select 'DEP', 'LOC' from dual
  ),
Query begins here:
  temp as
  (select table_name, column_name,
     sum(case when regexp_count(column_name, 'ID') = 0 then 0
              when regexp_count(column_name, 'ID') >= 1 then 1
         end) cnt_id,
     sum(case when regexp_count(column_name, 'NUM') = 0 then 0
              when regexp_count(column_name, 'NUM') >= 1 then 1
         end) cnt_num
   from all_tab_cols
   group by table_name, column_name
  )
  select table_name
  from temp
  group by table_name
  having sum(cnt_id) = sum(cnt_num)
     and sum(cnt_id) = 1;
TABLE_NAME
--------------------
DEP
SQL>
You could use a UNION ALL followed by a GROUP BY with a COUNT over the subquery: split your query into separate result sets, one based on ID and the other based on NUM, and keep the tables that appear in both:
SELECT *
FROM
(
SELECT OWNER, TABLE_NAME
FROM all_tab_cols
WHERE column_name LIKE '%ID%'
GROUP BY OWNER, TABLE_NAME
UNION ALL
SELECT OWNER, TABLE_NAME
FROM all_tab_cols
WHERE column_name LIKE '%NUM%'
GROUP BY OWNER, TABLE_NAME
) x
GROUP BY x.OWNER, x.TABLE_NAME
HAVING COUNT(x.TABLE_NAME) >= 2
ORDER BY x.OWNER, x.TABLE_NAME ;
Create functions so the logic is easy to reuse:
CREATE OR REPLACE FUNCTION get_user_tables_with_collist( i_collist IN VARCHAR2 )
RETURN SYS.ODCIVARCHAR2LIST
AS
w_result SYS.ODCIVARCHAR2LIST := SYS.ODCIVARCHAR2LIST();
w_re VARCHAR2(64) := '[^,;./+=*\.\?%[:space:]-]+' ;
BEGIN
WITH collist(colname) AS (
SELECT REGEXP_SUBSTR( UPPER(i_collist), w_re, 1, LEVEL ) FROM DUAL
CONNECT BY REGEXP_SUBSTR( UPPER(i_collist), w_re, 1, LEVEL ) IS NOT NULL
)
SELECT table_name BULK COLLECT INTO w_result FROM (
SELECT table_name, COUNT(column_name) AS n FROM user_tab_columns
WHERE EXISTS(
SELECT 1 FROM collist
WHERE colname = column_name
)
GROUP BY table_name
) d
WHERE d.n = (SELECT COUNT(*) FROM collist)
;
RETURN w_result;
END ;
/
CREATE OR REPLACE FUNCTION get_all_tables_with_collist( i_owner IN VARCHAR2, i_collist IN VARCHAR2 )
RETURN SYS.ODCIVARCHAR2LIST
AS
w_result SYS.ODCIVARCHAR2LIST := SYS.ODCIVARCHAR2LIST();
w_re VARCHAR2(64) := '[^,;./+=*\.\?%[:space:]-]+' ;
BEGIN
WITH collist(colname) AS (
SELECT REGEXP_SUBSTR( UPPER(i_collist), w_re, 1, LEVEL ) FROM DUAL
CONNECT BY REGEXP_SUBSTR( UPPER(i_collist), w_re, 1, LEVEL ) IS NOT NULL
)
SELECT table_name BULK COLLECT INTO w_result FROM (
SELECT table_name, COUNT(column_name) AS n FROM all_tab_columns
WHERE EXISTS(
SELECT 1 FROM collist
WHERE colname = column_name
)
AND owner = UPPER(i_owner)
GROUP BY table_name
) d
WHERE d.n = (SELECT COUNT(*) FROM collist)
;
RETURN w_result;
END ;
/
select * from get_all_tables_with_collist('sys', 'table_name;column_name') ;
ALL_COL_COMMENTS
ALL_COL_PENDING_STATS
ALL_COL_PRIVS
...
This is essentially an "edit" of Littlefoot's answer, that I believe makes things better. I give due credit, but I was asked to make this a separate answer, so I am doing so.
temp as -- prefix with WITH if not using the sample-data part above
(select table_name, column_name,
   sum(case when regexp_count(column_name, 'ID') = 0 then 0
            when regexp_count(column_name, 'ID') >= 1 then 1
       end) cnt_id,
   sum(case when regexp_count(column_name, 'NUM') = 0 then 0
            when regexp_count(column_name, 'NUM') >= 1 then 1
       end) cnt_num
 from all_tab_cols
 group by table_name, column_name
)
select table_name
from temp
group by table_name
having sum(cnt_id) >= 1
   and sum(cnt_num) >= 1;
This is a variant of Ahmed's answer, which uses conditional aggregation. I just updated it to use variables. This works in Toad. It may not work in other Oracle tools.
I think p3consulting also gave a nice answer, but the code below is shorter and somewhat easier to read (in my opinion).
For how I figured out how to add the variables in Toad, see the answers by Alan in: How do I declare and use variables in PL/SQL like I do in T-SQL?
Also, to use the script variables, run the code in Toad with "Run as script". Otherwise, Toad prompts for the variable values, which, to me, is not very desirable.
var searchVal1 varchar2(20);
var searchVal2 varchar2(20);
exec :searchVal1 := '%ID%';
exec :searchVal2 := '%NUM%';
SELECT OWNER, TABLE_NAME
, COUNT(CASE WHEN COLUMN_NAME LIKE :searchVal1 THEN COLUMN_NAME END) as COUNT_1,
COUNT(CASE WHEN COLUMN_NAME LIKE :searchVal2 THEN COLUMN_NAME END) as COUNT_2
FROM all_tab_cols
GROUP BY OWNER, TABLE_NAME
HAVING COUNT(CASE WHEN COLUMN_NAME LIKE :searchVal1 THEN COLUMN_NAME END)>=1 AND
COUNT(CASE WHEN COLUMN_NAME LIKE :searchVal2 THEN COLUMN_NAME END)>=1
ORDER BY OWNER, TABLE_NAME ;

How to merge two tables with different column number in Snowflake?

I am querying the TABLE_SCHEMA, TABLE_NAME, CREATED and LAST_ALTERED columns from Snowflake's INFORMATION_SCHEMA.VIEWS. Next, I would like to merge that result with the row count of each view. Below are the queries I am running in Snowflake. My issue is that I am not sure how to combine these two results into one table.
Note: I am new to Snowflake. Please provide code with explanation.
Thanks in advance for help!
Query 1
SELECT TABLE_SCHEMA,TABLE_NAME,CREATED,LAST_ALTERED FROM DB.SCHEMA.VIEWS
WHERE TABLE_SCHEMA='MY_SHEMA' AND TABLE_NAME IN ('VIEW_TABLE1','VIEW_TABLE2','VIEW_TABLE3')
Query 2
SELECT COUNT(*) FROM DB.SCHEMA.VIEW_TABLE1
UNION ALL SELECT COUNT(*) FROM DB.SCHEMA.VIEW_TABLE2
To get the result, the COUNT(*) query needs to be built dynamically and attached to the "driving query".
Sample data:
CREATE VIEW VIEW_TABLE1(c)
AS
SELECT 1;
CREATE VIEW VIEW_TABLE2(e)
AS
SELECT 2 UNION ALL SELECT 4;
CREATE VIEW VIEW_TABLE3(f)
AS
SELECT 3;
Full query:
DECLARE
QUERY STRING;
RES RESULTSET;
BEGIN
SELECT
LISTAGG(
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
$$SELECT '<TABLE_SCHEMA>' AS TABLE_SCHEMA,
'<TABLE_NAME>' AS TABLE_NAME,
'<CREATED>' AS CREATED,
'<LAST_ALTERED>' AS LAST_ALTERED,
COUNT(*) AS cnt
FROM <tab_name>
$$,
'<TABLE_SCHEMA>', v.TABLE_SCHEMA),
'<TABLE_NAME>', v.TABLE_NAME),
'<CREATED>', v.CREATED),
'<LAST_ALTERED>', v.LAST_ALTERED),
'<tab_name>', CONCAT_WS('.', v.table_catalog, v.table_schema, v.table_name)),
' UNION ALL ') WITHIN GROUP (ORDER BY CONCAT_WS('.', v.table_catalog, v.table_schema, v.table_name))
INTO :QUERY
FROM INFORMATION_SCHEMA.VIEWS v
WHERE TABLE_SCHEMA='PUBLIC'
AND TABLE_NAME IN ('VIEW_TABLE1','VIEW_TABLE2','VIEW_TABLE3');
RES := (EXECUTE IMMEDIATE :QUERY);
RETURN TABLE(RES);
END;
Rationale:
The ideal query would be(pseudocode):
SELECT TABLE_SCHEMA,TABLE_NAME,CREATED,LAST_ALTERED,
EVAL('SELECT COUNT(*) FROM ' || view_name) AS row_count
FROM INFORMATION_SCHEMA.VIEWS
WHERE TABLE_SCHEMA='MY_SHEMA'
AND TABLE_NAME IN ('VIEW_TABLE1','VIEW_TABLE2','VIEW_TABLE3');
A construct such as EVAL(dynamic query) in the SELECT list does not exist, as it would require building a query on the fly and executing it for each row. Some RDBMSes do offer workarounds, though, such as dbms_xmlgen.getxmltype.
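The general idea (generate one SELECT ... COUNT(*) per view from the catalog, glue the pieces together with UNION ALL, then execute the generated text) can be sketched outside Snowflake as well. The example below is only an illustration using Python's sqlite3: sqlite_master plays the role of INFORMATION_SCHEMA.VIEWS, and the helper names are hypothetical. It does not use Snowflake Scripting or any Snowflake API.

```python
import sqlite3


def quote_ident(name):
    """Quote an identifier for SQLite by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_count_query(view_names):
    """Build one SELECT per view and join them with UNION ALL."""
    parts = [
        "SELECT '{lit}' AS table_name, COUNT(*) AS cnt FROM {ident}".format(
            lit=name.replace("'", "''"), ident=quote_ident(name))
        for name in view_names
    ]
    return " UNION ALL ".join(parts) + " ORDER BY 1"


def view_row_counts(conn, wanted):
    """Look up the wanted views in the catalog, then run the generated query."""
    placeholders = ", ".join("?" for _ in wanted)
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        f"WHERE type = 'view' AND name IN ({placeholders}) ORDER BY name",
        list(wanted))]
    if not names:
        return []
    return conn.execute(build_count_query(names)).fetchall()


def demo_connection():
    """Create the three sample views from the answer in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIEW VIEW_TABLE1 AS SELECT 1 AS c")
    conn.execute("CREATE VIEW VIEW_TABLE2 AS SELECT 2 AS e UNION ALL SELECT 4")
    conn.execute("CREATE VIEW VIEW_TABLE3 AS SELECT 3 AS f")
    return conn


conn = demo_connection()
print(view_row_counts(conn, ["VIEW_TABLE1", "VIEW_TABLE2", "VIEW_TABLE3"]))
```

Because the statement text is generated, identifiers and literals are quoted explicitly; in a real system the view names should come only from the catalog, not from user input.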
Include table/view names as string in your count(*) queries and then you can join.
Example below -
select * from
(SELECT TABLE_SCHEMA,TABLE_NAME,CREATED FROM information_schema.tables
WHERE TABLE_SCHEMA='PUBLIC' AND TABLE_NAME IN ('D1','D2')) t1
left join
(
SELECT 'D1' table_name, COUNT(*) FROM d1
UNION ALL SELECT 'D2',COUNT(*) FROM d2) t2
on t1.table_name = t2.table_name ;
TABLE_SCHEMA  TABLE_NAME  CREATED                        TABLE_NAME  COUNT(*)
PUBLIC        D1          2022-04-06 14:24:56.224 -0700  D1          12
PUBLIC        D2          2022-04-06 14:25:27.276 -0700  D2          5

Check all column name in the tables

Good day!
The database has tables that need to be archived. Each archive table is a copy of the base table, with the prefix "ARCH_" added to its name.
e.g. Table: BALANCE. Archiving table: ARCH_BALANCE.
I need to write a query to check that the "ARCH_%" tables contain all the fields of their base tables. Note that the database also has tables that are not archived.
I wrote the following query:
select distinct COLUMN_NAME, TABLE_NAME
from ALL_TAB_COLUMNS res
where TABLE_NAME in(
SELECT TABLE_NAME
FROM all_tables core_t
where TABLE_NAME not like 'ARCH_%' AND
EXISTS (
SELECT TABLE_NAME
FROM all_tables hist_t
WHERE hist_t.TABLE_NAME = concat('ARCH_', core_t.TABLE_NAME)
)
) and COLUMN_NAME NOT IN (
select COLUMN_NAME
from ALL_TAB_COLUMNS
where TABLE_NAME = concat('ARCH_', res.TABLE_NAME)
);
Parts of the code work, but the query as a whole runs indefinitely.
Perhaps somebody has other ideas.
This query joins the columns of both tables and displays which fields they have in common; if a field is missing from one side, the third column (status) says so:
select orig.column_name
     , arch.column_name
     , case
         when orig.column_name is null
         then 'column doesn''t exist in orig'
         when arch.column_name is null
         then 'column doesn''t exist in arch'
         else 'exists in both'
       end status
from   ( select table_name
              , column_name
         from   all_tab_columns
         where  table_name = 'X'
       ) orig
full outer join
       ( select table_name
              , column_name
         from   all_tab_columns
         where  table_name = 'ARCH_X'
       ) arch
on     orig.column_name = arch.column_name
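To show what such a full outer join on the column name reports, here is a small plain-Python sketch that produces the same three-column result from two lists of column names. The function name and the sample column lists are made up for illustration; it mimics the join's output and does not query a database.

```python
def compare_columns(orig_cols, arch_cols):
    """Emulate a full outer join of two column lists on the column name."""
    orig, arch = set(orig_cols), set(arch_cols)
    report = []
    for name in sorted(orig | arch):
        if name not in orig:
            report.append((None, name, "column doesn't exist in orig"))
        elif name not in arch:
            report.append((name, None, "column doesn't exist in arch"))
        else:
            report.append((name, name, "exists in both"))
    return report


# Hypothetical columns of BALANCE and ARCH_BALANCE.
balance = ["ID", "AMOUNT", "CURRENCY"]
arch_balance = ["ID", "AMOUNT", "ARCHIVED_AT"]

for row in compare_columns(balance, arch_balance):
    print(row)
```

Here CURRENCY would be flagged as missing from the archive table and ARCHIVED_AT as missing from the base table.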
The problem can be in that part of query:
SELECT TABLE_NAME
FROM all_tables core_t
where TABLE_NAME not like 'ARCH_%' AND
EXISTS (
SELECT TABLE_NAME
FROM all_tables hist_t
WHERE hist_t.TABLE_NAME = concat('ARCH_', core_t.TABLE_NAME)
)
You are trying to fetch records that match two conditions at the same time:
1) TABLE_NAME not like 'ARCH_%'
2) TABLE_NAME = concat ('ARCH_',TABLE_NAME)
These two conditions contradict each other.
Also, you could qualify the column names with a table alias (TABLE_NAME could come from either ALL_TAB_COLUMNS or ALL_TABLES).

count number of null and not null rows per attribute in oracle

Is there a way to calculate the number of NULL and NOT NULL records for each attribute in a view?
For example there are 50 views and each one has 20 attributes and the result I'm expecting looks like (for example):
table_name  Column_name  Nulls_count  Not_null_count  count(*)
T1          C1           20           40              60
T1          C2           11           49              60
T1          C3           25           35              60
T2          C1           0            100             100
T2          C2           40           60              100
All of the views are listed in sys.all_views and their columns in sys.all_tab_columns, and the two are linked by the table_name field. But this needs dynamic SQL or PL/SQL, because it would be madness to manually count() the null rows for each attribute and then count() the not-null rows for the same attributes in every view :)
Has anyone faced such a task? I'll appreciate all your comments and help.
Since the number of rows is count(*), you can get null and non-null rows per column with:
select
count(*) total_rows ,
count(col1) col1_nonnull,
count(*) - count(col1) col1_null ,
count(col2) col2_nonnull,
count(*) - count(col2) col2_null ,
...
from
my_view
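Since the question asks for this to be generated rather than typed by hand, the sketch below builds the COUNT(*) / COUNT(col) statement from a relation's column list and derives the null count as the difference. It is an illustration only, written against Python's sqlite3 rather than Oracle; the function names and the sample table and view are assumptions.

```python
import sqlite3


def quote_ident(name):
    """Quote an identifier for SQLite by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def null_counts(conn, relation):
    """Return (relation, column, nulls, not_nulls, total) for every column."""
    rel = quote_ident(relation)
    # A zero-row SELECT is enough to learn the column names.
    cursor = conn.execute(f"SELECT * FROM {rel} LIMIT 0")
    columns = [desc[0] for desc in cursor.description]
    # COUNT(col) skips NULLs, so COUNT(*) - COUNT(col) is the number of NULLs.
    select_list = ", ".join(f"COUNT({quote_ident(c)})" for c in columns)
    row = conn.execute(f"SELECT COUNT(*), {select_list} FROM {rel}").fetchone()
    total = row[0]
    return [(relation, col, total - not_null, not_null, total)
            for col, not_null in zip(columns, row[1:])]


def demo_connection():
    """Create a small table with some NULLs and a view on top of it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (c1 INTEGER, c2 TEXT)")
    conn.executemany("INSERT INTO t1 VALUES (?, ?)",
                     [(1, None), (None, None), (3, "x")])
    conn.execute("CREATE VIEW v1 AS SELECT c1, c2 FROM t1")
    return conn


conn = demo_connection()
for line in null_counts(conn, "v1"):
    print(line)
```

Looping this function over a list of view names would give one row per view and column, in the shape of the table in the question.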
Here is the correct select:
select t.table_name, T.NUM_ROWS, c.column_name, c.num_nulls, T.NUM_ROWS - c.num_nulls num_not_nulls, c.data_type, c.last_analyzed
from all_tab_cols c
join sys.all_all_tables t on C.TABLE_NAME = t.table_name
where c.table_name like 'MV_%' and c.nullable ='Y'
group by t.table_name, T.NUM_ROWS, c.column_name, c.num_nulls, c.data_type, c.last_analyzed
order by t.table_name, c.column_name;

Replacing concat operation in Where clause

We have a requirement of querying the ALL_TABLES view, based on a combination of schema name and table name.
There are two schemas "A" and "B" and they have same table "TAB1" in both of them, here my requirement is to select the table associated with schema A and not the schema B.
Currently, we concatenate the owner name and table name to achieve this, as shown below.
A single query can contain multiple owner and table name combinations:
select table_name from all_tables where concat(owner_name,table_name) in ('ATAB1','ATAB2','BTAB2','CTAB1')
select table_name from all_tables where concat(owner_name,table_name) not in ('ATAB1','ATAB2','BTAB2','CTAB1')
Here there are three schemas A, B and C with their respective table name combinations
How can we achieve the same result without using the CONCAT function ?
WHERE 0=1
OR (owner_name = 'A' AND table_name = 'T1')
OR (owner_name = 'B' AND table_name = 'T2')
OR (owner_name = 'A' AND table_name = 'T3')
The strange 0=1 is just there to make the lines below it syntactically identical, for easy maintenance and/or code generation. The optimizer removes it.
Oracle allows for multiple columns in an IN condition (see the documentation for some more examples).
select table_name
from all_tables
where (owner_name, table_name) in
(('A','TAB1'), ('A','TAB2'), ('B','TAB2'), ('C','TAB1'))
This would probably be equivalent to usr's answer in terms of performance.
You could arrange the string values you need to match against into a virtual table, then use that table in a join as a filter:
SELECT t.*
FROM all_tables t
INNER JOIN (
SELECT 'A' AS owner_name, 'TAB1' AS table_name FROM DUAL
UNION ALL SELECT 'A', 'TAB2' FROM DUAL
UNION ALL SELECT 'B', 'TAB2' FROM DUAL
UNION ALL SELECT 'C', 'TAB1' FROM DUAL
) s
ON t.owner_name = s.owner_name
AND t.table_name = s.table_name
;
I would expect this to give the query planner more room for optimisation than your present approach gives.
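The join against a list of (owner, table) pairs can be tried on toy data as follows. This is a sketch using Python's sqlite3 as a stand-in for Oracle (so there is no DUAL and concatenation is written with ||), and the all_tables rows are invented. As a side observation, the example also includes a schema named AT with a table AB1, to show that a concatenated key such as 'ATAB1' can match more than one owner/table pair, whereas comparing the two columns separately cannot.

```python
import sqlite3

# Invented (owner_name, table_name) rows; AT.AB1 concatenates to 'ATAB1'.
TABLES = [
    ("A", "TAB1"), ("A", "TAB2"), ("B", "TAB1"),
    ("B", "TAB2"), ("C", "TAB1"), ("AT", "AB1"),
]

CONCAT_FILTER = """
SELECT owner_name, table_name
FROM all_tables
WHERE owner_name || table_name IN ('ATAB1', 'ATAB2', 'BTAB2', 'CTAB1')
ORDER BY owner_name, table_name
"""

PAIR_JOIN = """
SELECT t.owner_name, t.table_name
FROM all_tables t
INNER JOIN (
              SELECT 'A' AS owner_name, 'TAB1' AS table_name
    UNION ALL SELECT 'A', 'TAB2'
    UNION ALL SELECT 'B', 'TAB2'
    UNION ALL SELECT 'C', 'TAB1'
) s
   ON t.owner_name = s.owner_name
  AND t.table_name = s.table_name
ORDER BY t.owner_name, t.table_name
"""


def run(sql, rows=TABLES):
    """Load the toy catalog and return the rows selected by the given query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE all_tables (owner_name TEXT, table_name TEXT)")
    conn.executemany("INSERT INTO all_tables VALUES (?, ?)", rows)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


print(run(CONCAT_FILTER))  # also returns ('AT', 'AB1')
print(run(PAIR_JOIN))      # only the four intended pairs
```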