How to merge two tables with different column number in Snowflake?


I am querying the TABLE_SCHEMA, TABLE_NAME, CREATED and LAST_ALTERED columns from the Snowflake INFORMATION_SCHEMA.VIEWS view. Next, I would like to merge that result with the row count of each view. Below are the queries I am running in Snowflake. My issue is that I am not sure how to combine these two results into one table.
Note: I am new to Snowflake. Please provide code with explanation.
Thanks in advance for help!
Query 1
SELECT TABLE_SCHEMA,TABLE_NAME,CREATED,LAST_ALTERED FROM DB.SCHEMA.VIEWS
WHERE TABLE_SCHEMA='MY_SHEMA' AND TABLE_NAME IN ('VIEW_TABLE1','VIEW_TABLE2','VIEW_TABLE3')
Query 2
SELECT COUNT(*) FROM DB.SCHEMA.VIEW_TABLE1
UNION ALL SELECT COUNT(*) FROM DB.SCHEMA.VIEW_TABLE2

To get the row counts, the COUNT(*) queries need to be built dynamically and attached to the "driving query".
Sample data:
CREATE VIEW VIEW_TABLE1(c)
AS
SELECT 1;
CREATE VIEW VIEW_TABLE2(e)
AS
SELECT 2 UNION ALL SELECT 4;
CREATE VIEW VIEW_TABLE3(f)
AS
SELECT 3;
Full query:
DECLARE
  QUERY STRING;
  RES RESULTSET;
BEGIN
  SELECT
    LISTAGG(
      REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
$$SELECT '<TABLE_SCHEMA>' AS TABLE_SCHEMA,
       '<TABLE_NAME>' AS TABLE_NAME,
       '<CREATED>' AS CREATED,
       '<LAST_ALTERED>' AS LAST_ALTERED,
       COUNT(*) AS cnt
FROM <tab_name>
$$,
        '<TABLE_SCHEMA>', v.TABLE_SCHEMA),
        '<TABLE_NAME>', v.TABLE_NAME),
        '<CREATED>', v.CREATED),
        '<LAST_ALTERED>', v.LAST_ALTERED),
        '<tab_name>', CONCAT_WS('.', v.table_catalog, v.table_schema, v.table_name)),
      ' UNION ALL ') WITHIN GROUP (ORDER BY CONCAT_WS('.', v.table_catalog, v.table_schema, v.table_name))
  INTO :QUERY
  FROM INFORMATION_SCHEMA.VIEWS v
  WHERE TABLE_SCHEMA='PUBLIC'
    AND TABLE_NAME IN ('VIEW_TABLE1','VIEW_TABLE2','VIEW_TABLE3');
  RES := (EXECUTE IMMEDIATE :QUERY);
  RETURN TABLE(RES);
END;
Rationale:
The ideal query would be (pseudocode):
SELECT TABLE_SCHEMA,TABLE_NAME,CREATED,LAST_ALTERED,
EVAL('SELECT COUNT(*) FROM ' || view_name) AS row_count
FROM INFORMATION_SCHEMA.VIEWS
WHERE TABLE_SCHEMA='MY_SHEMA'
AND TABLE_NAME IN ('VIEW_TABLE1','VIEW_TABLE2','VIEW_TABLE3');
A construct such as EVAL(dynamic query) in the SELECT list does not exist, as it would require building a query on the fly and executing it for each row. Some RDBMSs do offer workarounds, such as dbms_xmlgen.getxmltype in Oracle.
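The "build the UNION ALL text from the catalog, then execute it" idea can also be tried outside Snowflake. The following is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in engine, reads view names from SQLite's sqlite_master catalog instead of INFORMATION_SCHEMA.VIEWS, and the helper names quote_ident and build_count_query are made up for this illustration.

```python
import sqlite3


def quote_ident(name):
    """Double-quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_count_query(view_names):
    """Build one UNION ALL statement that returns (view_name, cnt) per view."""
    if not view_names:
        raise ValueError("no views to count")
    parts = []
    for name in view_names:
        literal = name.replace("'", "''")  # escape for use as a string literal
        parts.append(
            f"SELECT '{literal}' AS view_name, COUNT(*) AS cnt "
            f"FROM {quote_ident(name)}"
        )
    return " UNION ALL ".join(parts) + " ORDER BY view_name"


# Assumption: an in-memory SQLite database stands in for the Snowflake schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE VIEW VIEW_TABLE1 AS SELECT 1 AS c;
    CREATE VIEW VIEW_TABLE2 AS SELECT 2 AS e UNION ALL SELECT 4;
    CREATE VIEW VIEW_TABLE3 AS SELECT 3 AS f;
""")

# The "driving query": list the views from the catalog.
view_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'view' ORDER BY name"
    )
]

# The dynamic part: generate the statement text, then run it.
query = build_count_query(view_names)
counts = conn.execute(query).fetchall()
for view_name, cnt in counts:
    print(view_name, cnt)
```

The statement text is generated first and executed second, which mirrors the LISTAGG plus EXECUTE IMMEDIATE split in the Snowflake block; only the catalog name and the execution call differ.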

Include the table/view names as string literals in your COUNT(*) queries, and then you can join on them.
Example below:
select * from
(SELECT TABLE_SCHEMA,TABLE_NAME,CREATED FROM information_schema.tables
WHERE TABLE_SCHEMA='PUBLIC' AND TABLE_NAME IN ('D1','D2')) t1
left join
(
SELECT 'D1' table_name, COUNT(*) FROM d1
UNION ALL SELECT 'D2',COUNT(*) FROM d2) t2
on t1.table_name = t2.table_name ;
TABLE_SCHEMA  TABLE_NAME  CREATED                        TABLE_NAME  COUNT(*)
------------  ----------  -----------------------------  ----------  --------
PUBLIC        D1          2022-04-06 14:24:56.224 -0700  D1          12
PUBLIC        D2          2022-04-06 14:25:27.276 -0700  D2          5

Related

ORACLE: SQL syntax to find table with two columns with names like ID, NUM

My question is based on:
Finding table with two column names
If interested, please read the above as it covers much ground that I will not repeat here.
For the answer given, I commented as follows:
NOTE THAT You could replace the IN with = and an OR clause, but generalizing this to like may not work because the like could get more than 1 count per term: e.g.
SELECT OWNER, TABLE_NAME, count(DISTINCT COLUMN_NAME) as ourCount
FROM all_tab_cols WHERE ( (column_name LIKE '%ID%') OR (COLUMN_NAME LIKE '%NUM%') )
GROUP BY OWNER, TABLE_NAME
HAVING COUNT(DISTINCT column_name) >= 2
ORDER BY OWNER, TABLE_NAME ;
This code compiles and runs. However, it does not guarantee that the table has both a column with a name containing ID and a column with a name containing NUM, because there may be two or more columns with names like ID.
Is there a way to generalize the answer given in the above link to a LIKE condition?
GOAL: Find tables that contain two column names, one like ID (or some string) and one like NUM (or some other string).
Also, after several answers came in, as "extra credit", I re-did an answer by Ahmed to use variables in Toad, so I've added a tag for Toad as well.

You may use conditional aggregation as the following:
SELECT OWNER, TABLE_NAME, COUNT(CASE WHEN COLUMN_NAME LIKE '%ID%' THEN COLUMN_NAME END) as ID_COUNT,
COUNT(CASE WHEN COLUMN_NAME LIKE '%NUM%' THEN COLUMN_NAME END) NUM_COUNT
FROM all_tab_cols
GROUP BY OWNER, TABLE_NAME
HAVING COUNT(CASE WHEN COLUMN_NAME LIKE '%ID%' THEN COLUMN_NAME END)>=1 AND
COUNT(CASE WHEN COLUMN_NAME LIKE '%NUM%' THEN COLUMN_NAME END)>=1
ORDER BY OWNER, TABLE_NAME ;
See a demo.
If you want to select only tables that contain exactly one column like ID and exactly one column like NUM, you may replace >=1 with =1 in the HAVING clause.
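To see why conditional aggregation answers the question while COUNT(DISTINCT ...) >= 2 does not, here is a small sketch that runs both against sample rows. It assumes Python's sqlite3 as a stand-in for Oracle, and a hypothetical tab_cols table that mimics all_tab_cols; the owner HR and the function name tables_matching_both are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for Oracle's all_tab_cols dictionary view.
conn.execute(
    "CREATE TABLE tab_cols (owner TEXT, table_name TEXT, column_name TEXT)"
)
conn.executemany(
    "INSERT INTO tab_cols VALUES (?, ?, ?)",
    [
        ("HR", "EMP", "ID_EMP"),
        ("HR", "EMP", "ID_MGR"),
        ("HR", "EMP", "SAL"),
        ("HR", "EMP", "DID_ID"),
        ("HR", "DEP", "ID_DEP"),
        ("HR", "DEP", "DNUM"),
        ("HR", "DEP", "LOC"),
    ],
)


def tables_matching_both(conn, pattern_1, pattern_2):
    """Tables with at least one column LIKE pattern_1 and one LIKE pattern_2."""
    sql = """
        SELECT owner, table_name,
               COUNT(CASE WHEN column_name LIKE :p1 THEN column_name END) AS count_1,
               COUNT(CASE WHEN column_name LIKE :p2 THEN column_name END) AS count_2
        FROM tab_cols
        GROUP BY owner, table_name
        HAVING COUNT(CASE WHEN column_name LIKE :p1 THEN column_name END) >= 1
           AND COUNT(CASE WHEN column_name LIKE :p2 THEN column_name END) >= 1
        ORDER BY owner, table_name
    """
    return conn.execute(sql, {"p1": pattern_1, "p2": pattern_2}).fetchall()


# The approach from the question: EMP slips through because it has
# three ID-like columns and no NUM-like column.
naive = [
    row[0]
    for row in conn.execute("""
        SELECT table_name
        FROM tab_cols
        WHERE column_name LIKE '%ID%' OR column_name LIKE '%NUM%'
        GROUP BY owner, table_name
        HAVING COUNT(DISTINCT column_name) >= 2
        ORDER BY table_name
    """)
]

both = tables_matching_both(conn, "%ID%", "%NUM%")
print("naive:", naive)
print("conditional aggregation:", both)
```

Each pattern gets its own counter, so several ID-like columns can never make up for a missing NUM-like column. A single column whose name matches both patterns (for example a hypothetical IDNUM) would still satisfy both counters on its own.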

If I understood you correctly, you want to return tables that contain two (or more) columns whose names contain both ID and NUM (sub)strings.
My all_tab_cols CTE mimics that data dictionary view, just to illustrate the problem.
EMP table contains 3 columns that have the ID (sub)string, but it should count as 1 (not 3); also, as that table doesn't contain any columns that have the NUM (sub)string in their name, the EMP table shouldn't be part of the result set
DEP table contains one ID and one NUM column, so it should be returned
Therefore: the TEMP CTE counts number of ID and NUM columns (duplicates are ignored). The final query expects that table contains both columns.
Sample data:
SQL> with all_tab_cols (table_name, column_name) as
2 (select 'EMP', 'ID_EMP' from dual union all
3 select 'EMP', 'ID_MGR' from dual union all
4 select 'EMP', 'SAL' from dual union all
5 select 'EMP', 'DID_ID' from dual union all
6 --
7 select 'DEP', 'ID_DEP' from dual union all
8 select 'DEP', 'DNUM' from dual union all
9 select 'DEP', 'LOC' from dual
10 ),
Query begins here:
11 temp as
12 (select table_name, column_name,
13 sum(case when regexp_count(column_name, 'ID') = 0 then 0
14 when regexp_count(column_name, 'ID') >= 1 then 1
15 end) cnt_id,
16 sum(case when regexp_count(column_name, 'NUM') = 0 then 0
17 when regexp_count(column_name, 'NUM') >= 1 then 1
18 end) cnt_num
19 from all_tab_cols
20 group by table_name, column_name
21 )
22 select table_name
23 from temp
24 group by table_name
25 having sum(cnt_id) = sum(cnt_num)
26 and sum(cnt_id) = 1;
TABLE_NAME
--------------------
DEP
SQL>

You could do a UNION ALL and then a GROUP BY with a COUNT on a subquery to determine the tables you want, by separating your query into separate result sets, one based on ID and the other based on NUM:
SELECT *
FROM
(
SELECT OWNER, TABLE_NAME
FROM all_tab_cols
WHERE column_name LIKE '%ID%'
GROUP BY OWNER, TABLE_NAME
UNION ALL
SELECT OWNER, TABLE_NAME
FROM all_tab_cols
WHERE column_name LIKE '%NUM%'
GROUP BY OWNER, TABLE_NAME
) x
GROUP BY x.OWNER, x.TABLE_NAME
HAVING COUNT(x.TABLE_NAME) >= 2
ORDER BY x.OWNER, x.TABLE_NAME ;

Make functions so the logic is easy to reuse:
CREATE OR REPLACE FUNCTION get_user_tables_with_collist( i_collist IN VARCHAR2 )
RETURN SYS.ODCIVARCHAR2LIST
AS
w_result SYS.ODCIVARCHAR2LIST := SYS.ODCIVARCHAR2LIST();
w_re VARCHAR2(64) := '[^,;./+=*\.\?%[:space:]-]+' ;
BEGIN
WITH collist(colname) AS (
SELECT REGEXP_SUBSTR( UPPER(i_collist), w_re, 1, LEVEL ) FROM DUAL
CONNECT BY REGEXP_SUBSTR( UPPER(i_collist), w_re, 1, LEVEL ) IS NOT NULL
)
SELECT table_name BULK COLLECT INTO w_result FROM (
SELECT table_name, COUNT(column_name) AS n FROM user_tab_columns
WHERE EXISTS(
SELECT 1 FROM collist
WHERE colname = column_name
)
GROUP BY table_name
) d
WHERE d.n = (SELECT COUNT(*) FROM collist)
;
RETURN w_result;
END ;
/
CREATE OR REPLACE FUNCTION get_all_tables_with_collist( i_owner IN VARCHAR2, i_collist IN VARCHAR2 )
RETURN SYS.ODCIVARCHAR2LIST
AS
w_result SYS.ODCIVARCHAR2LIST := SYS.ODCIVARCHAR2LIST();
w_re VARCHAR2(64) := '[^,;./+=*\.\?%[:space:]-]+' ;
BEGIN
WITH collist(colname) AS (
SELECT REGEXP_SUBSTR( UPPER(i_collist), w_re, 1, LEVEL ) FROM DUAL
CONNECT BY REGEXP_SUBSTR( UPPER(i_collist), w_re, 1, LEVEL ) IS NOT NULL
)
SELECT table_name BULK COLLECT INTO w_result FROM (
SELECT table_name, COUNT(column_name) AS n FROM all_tab_columns
WHERE EXISTS(
SELECT 1 FROM collist
WHERE colname = column_name
)
AND owner = UPPER(i_owner)
GROUP BY table_name
) d
WHERE d.n = (SELECT COUNT(*) FROM collist)
;
RETURN w_result;
END ;
/
select * from get_all_tables_with_collist('sys', 'table_name;column_name') ;
ALL_COL_COMMENTS
ALL_COL_PENDING_STATS
ALL_COL_PRIVS
...

This is essentially an "edit" of Littlefoot's answer, that I believe makes things better. I give due credit, but I was asked to make this a separate answer, so I am doing so.
temp as -- use WITH if not using the data part above
  (select table_name, column_name,
     sum(case when regexp_count(column_name, 'ID') = 0 then 0
              when regexp_count(column_name, 'ID') >= 1 then 1
         end) cnt_id,
     sum(case when regexp_count(column_name, 'NUM') = 0 then 0
              when regexp_count(column_name, 'NUM') >= 1 then 1
         end) cnt_num
   from all_tab_cols
   group by table_name, column_name
  )
select table_name
from temp
group by table_name
having sum(cnt_id) >= 1
   and sum(cnt_num) >= 1;

This is a variant of the answer by Ahmed that uses conditional aggregation. I just updated it to use variables. This works in Toad. It may not work on other Oracle systems.
I think p3consulting gave a nice answer also, but the code below is shorter and somewhat easier to read (in my opinion).
For how I figured out how to add the variables in Toad, see answers by Alan in: How do I declare and use variables in PL/SQL like I do in T-SQL?
Also, to use the script variables, run the code in Toad with "Run as script"; otherwise Toad prompts for the variable values, which, to me, is not very desirable.
var searchVal1 varchar2(20);
var searchVal2 varchar2(20);
exec :searchVal1 := '%ID%';
exec :searchVal2 := '%NUM%';
SELECT OWNER, TABLE_NAME
, COUNT(CASE WHEN COLUMN_NAME LIKE :searchVal1 THEN COLUMN_NAME END) as COUNT_1,
COUNT(CASE WHEN COLUMN_NAME LIKE :searchVal2 THEN COLUMN_NAME END) as COUNT_2
FROM all_tab_cols
GROUP BY OWNER, TABLE_NAME
HAVING COUNT(CASE WHEN COLUMN_NAME LIKE :searchVal1 THEN COLUMN_NAME END)>=1 AND
COUNT(CASE WHEN COLUMN_NAME LIKE :searchVal2 THEN COLUMN_NAME END)>=1
ORDER BY OWNER, TABLE_NAME ;

Query to count all rows in all Snowflake views

I'm trying to get a count of all the rows in a set of views in my Snowflake database.
The built-in row_count from information_schema.tables is not present in information_schema.views, unfortunately.
It seems I'd need to count all rows in each view, something like:
with view_name as (select table_name
from account_usage.views
where table_schema = 'ACCESS' and RIGHT(table_name,7) = 'CURRENT'
)
select count (*) from view_name;
But that returns only one result, instead of one for each view.
If I change the select to include the view name, i.e.
select concat('Rows in ', view_name), count (*) from view_name;
…it returns the error "invalid identifier 'VIEW_NAME' (line 5)"
How can I show all results and include the view name?

You can create a query that looks at the information_schema to create a query that will go view by view getting its count:
select listagg(xx, ' union all ')
from (
select 'select count(*) c, \'' || x || '\' v from ' || x as xx
from (
select TABLE_CATALOG ||'.'|| TABLE_SCHEMA ||'."'||TABLE_NAME||'"' x
from KNOEMA_FORECAST_DATA_ATLAS.INFORMATION_SCHEMA.VIEWS
where table_schema='FORECAST'
)
)
See also How to find the number of rows for all views in a schema?
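The CTE in the question cannot work because a name taken from query results is data, not an identifier: the FROM clause needs the actual view name in the statement text. An alternative to generating one big UNION ALL string is to loop over the view names client-side and run one COUNT(*) per view. This is only a sketch under assumptions: Python's sqlite3 stands in for Snowflake, sqlite_master stands in for the VIEWS catalog, and the function name count_rows_per_view and the sample views are hypothetical.

```python
import sqlite3


def count_rows_per_view(conn, name_suffix):
    """Return {view_name: row_count} for views whose name ends with name_suffix."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'view' ORDER BY name"
        )
        if row[0].endswith(name_suffix)
    ]
    result = {}
    for name in names:
        # The view name must be spliced into the statement text as a quoted
        # identifier; it cannot be passed as a bind parameter.
        quoted = '"' + name.replace('"', '""') + '"'
        result[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
    return result


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, is_current INTEGER);
    INSERT INTO orders VALUES (1, 1), (2, 1), (3, 1), (4, 0);
    CREATE VIEW ORDERS_CURRENT AS SELECT id FROM orders WHERE is_current = 1;
    CREATE VIEW ORDERS_HISTORY AS SELECT id FROM orders WHERE is_current = 0;
    CREATE VIEW CUSTOMERS_CURRENT AS SELECT 10 AS id UNION ALL SELECT 20;
""")

counts = count_rows_per_view(conn, "CURRENT")
for view_name, cnt in counts.items():
    print("Rows in", view_name, "=", cnt)
```

One query per view costs more round trips than a single UNION ALL statement, but a failure in one view (for example a broken definition) is easier to isolate.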

Check all column name in the tables

Good day!
The database has tables that need to be archived. The archive copy of a table has the same name with the prefix "ARCH_" added.
E.g. table: BALANCE, archive table: ARCH_BALANCE.
I need to write a query that checks that the "ARCH_%" tables contain all fields of their base tables. Note that the database also has tables that are not archived.
I wrote the following query:
select distinct COLUMN_NAME, TABLE_NAME
from ALL_TAB_COLUMNS res
where TABLE_NAME in(
SELECT TABLE_NAME
FROM all_tables core_t
where TABLE_NAME not like 'ARCH_%' AND
EXISTS (
SELECT TABLE_NAME
FROM all_tables hist_t
WHERE hist_t.TABLE_NAME = concat('ARCH_', core_t.TABLE_NAME)
)
) and COLUMN_NAME NOT IN (
select COLUMN_NAME
from ALL_TAB_COLUMNS
where TABLE_NAME = concat('ARCH_', res.TABLE_NAME)
);
Parts of the code work, but the query as a whole runs indefinitely.
Perhaps somebody has other ideas.

This query joins the columns of both tables and displays which fields they have in common; if one is missing, it says so in the third column (status):
select orig.column_name
,      arch.column_name
,      case
         when orig.column_name is null
         then 'column doesn''t exist in orig'
         when arch.column_name is null
         then 'column doesn''t exist in arch'
         else 'exists in both'
       end status
from   ( select table_name
         ,      column_name
         from   all_tab_columns
         where  table_name = 'X'
       ) orig
full outer join
       ( select table_name
         ,      column_name
         from   all_tab_columns
         where  table_name = 'ARCH_X'
       ) arch
on     orig.column_name = arch.column_name
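The query above compares one table pair at a time. To check every base table that has an ARCH_ counterpart in one pass, an anti-join over the column catalog is one option. The sketch below assumes Python's sqlite3 as a stand-in for Oracle and a hypothetical tab_columns table mimicking all_tab_columns; the table and column names in the sample data are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for Oracle's all_tab_columns dictionary view.
conn.execute("CREATE TABLE tab_columns (table_name TEXT, column_name TEXT)")
conn.executemany(
    "INSERT INTO tab_columns VALUES (?, ?)",
    [
        ("BALANCE", "ID"), ("BALANCE", "AMOUNT"), ("BALANCE", "CURRENCY"),
        ("ARCH_BALANCE", "ID"), ("ARCH_BALANCE", "AMOUNT"),
        ("ORDERS", "ID"), ("ORDERS", "TOTAL"),
        ("ARCH_ORDERS", "ID"), ("ARCH_ORDERS", "TOTAL"),
        ("ACCOUNT", "ID"),  # not archived, so it must be ignored
    ],
)

# For every base column whose table has an ARCH_ twin, look up the same
# column in the twin and keep the rows where the lookup finds nothing.
sql = """
    SELECT base.table_name, base.column_name
    FROM tab_columns base
    JOIN (SELECT DISTINCT table_name FROM tab_columns) arch_t
      ON arch_t.table_name = 'ARCH_' || base.table_name
    LEFT JOIN tab_columns arch
      ON arch.table_name = arch_t.table_name
     AND arch.column_name = base.column_name
    WHERE arch.column_name IS NULL
    ORDER BY base.table_name, base.column_name
"""
missing = conn.execute(sql).fetchall()
for table_name, column_name in missing:
    print(f"ARCH_{table_name} is missing column {column_name}")
```

The inner join to the ARCH_ twin drops tables that are not archived, so no separate NOT LIKE 'ARCH_%' filter is needed in this sketch. Against the real dictionary views the join would also have to match on owner, which the sample table leaves out.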

The problem can be in that part of query:
SELECT TABLE_NAME
FROM all_tables core_t
where TABLE_NAME not like 'ARCH_%' AND
EXISTS (
SELECT TABLE_NAME
FROM all_tables hist_t
WHERE hist_t.TABLE_NAME = concat('ARCH_', core_t.TABLE_NAME)
)
You are trying to fetch records that match two conditions at the same time:
1) TABLE_NAME not like 'ARCH_%'
2) TABLE_NAME = concat ('ARCH_',TABLE_NAME)
Those are two conditions that stays in opposite sides.
Also you could fix prefixes for column names (TABLE_NAME can be from ALL_TAB_COLUMNS table or ALL_TABLES).

Oracle SQL : Retrieving non-existing values from IN clause

Having following query:
select table_name
from user_tables
where table_name in ('A','B','C','D','E','F');
Assuming only the user_tables records B, C and F exist, I want to retrieve the non-existing values A, D and E. This is a simple example; in the real world the list can be huge.

A good way to generate fake rows is with a standard collection such as sys.odcivarchar2list:
select
tables_to_check.table_name,
case when user_tables.table_name is null then 'No' else 'Yes' end table_exists
from
(
select column_value table_name
from table(sys.odcivarchar2list('does not exist', 'TEST1'))
) tables_to_check
left join user_tables
on tables_to_check.table_name = user_tables.table_name
order by tables_to_check.table_name;
TABLE_NAME TABLE_EXISTS
---------- ------------
TEST1 Yes
does not exist No

If you have the list of all the tables to be checked in Table1, then you can use a NOT EXISTS clause:
select name
from Table1 T1
where not exists ( select 1 from
user_tables U
where T1.name = U.table_name)

The only way is to use NOT EXISTS after converting the IN clause string into a table of values (a CTE).
This is not a clean solution though, as the string holding the list can be at most 4000 characters long, including the commas.
WITH MY_STRING(str) AS
(
SELECT q'#'A','B','C','D','E','F'#' FROM DUAL
),
VALUES_TABLE AS
(
SELECT TRIM(BOTH '''' FROM REGEXP_SUBSTR(str,'[^,]+',1,level)) as table_name FROM MY_STRING
CONNECT BY LEVEL <= REGEXP_COUNT(str,',') + 1
)
SELECT ME.* FROM VALUES_TABLE ME
WHERE NOT EXISTS
(SELECT 'X' FROM user_tables u
WHERE u.table_name = ME.table_name);
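The same anti-join can be sketched without parsing a delimited string, by binding each wanted name as a parameter of a VALUES list. This is a minimal illustration, assuming Python's sqlite3 as a stand-in for Oracle and sqlite_master as a stand-in for user_tables; the function name missing_tables is hypothetical.

```python
import sqlite3


def missing_tables(conn, wanted):
    """Return the names in `wanted` that do not exist as tables, sorted."""
    if not wanted:
        return []
    # One "(?)" row per wanted name; the names themselves are bound, not spliced.
    values = ", ".join("(?)" for _ in wanted)
    sql = f"""
        WITH wanted(name) AS (VALUES {values})
        SELECT w.name
        FROM wanted w
        WHERE NOT EXISTS (
            SELECT 1
            FROM sqlite_master m
            WHERE m.type = 'table' AND m.name = w.name
        )
        ORDER BY w.name
    """
    return [row[0] for row in conn.execute(sql, list(wanted))]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE B (x INTEGER);
    CREATE TABLE C (x INTEGER);
    CREATE TABLE F (x INTEGER);
""")

missing = missing_tables(conn, ["A", "B", "C", "D", "E", "F"])
print(missing)
```

Engines cap the number of bind parameters per statement, so for a really huge list it is probably better to load the names into a temporary table first and anti-join against that.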

You can't. These values have to be entered into a temporary table at the least to do the desired operation. Also Oracle's IN clause list cannot be huge (i.e, not more than 1000 values).

Are you restricted to receiving those values as a comma-delimited list?
Instead of creating a comma-delimited list with the source values, populate an array (or a table).
Pass the array into a PL/SQL procedure (or open a cursor on the table).
Loop through the array (cursor) and select count(table_name) from user_tables where table_name = value_pulled.
Insert the value into table B when count(table_name) = 0.
Then you can select everything from table B.
select * from tab1;
------------------
A
B
C
D
E
F
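create or replace procedure proc1 as
  cursor c is select col1 from tab1;
  r tab1.col1%type;
  i number;
  v_sql varchar2(200); -- was used below but never declared
begin
  open c;
  loop
    fetch c into r;
    exit when c%notfound;
    -- count matching tables for the current value
    select count(tname) into i from tab where tname = r;
    if i = 0 then
      -- bind the value instead of concatenating it into the statement
      v_sql := 'insert into tab2 values (:1)';
      execute immediate v_sql using r;
      commit;
    end if;
  end loop;
  close c;
end proc1;
/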
select * from tab2;
------------------
A
D
E
if this is not a one-off, then having this proc on hand will be handy.

SQL query optimization for selecting specific columns from a table

I'm trying to select all tables which have the column OWNERID or L1ID or L1, but do not have any of the following columns: HID, DHID, SPHID, LINKHID, NODEID and SITEID.
The following is my SQL query. It's taking a long time. Can you give me a better solution or optimize this query?
SELECT
TABLE_NAME
FROM
USER_TABLES
WHERE
TABLE_NAME NOT IN (
SELECT TABLE_NAME
FROM USER_TAB_COLUMNS
WHERE COLUMN_NAME IN('HID', 'DHID', 'SPHID', 'LINKID', 'NODEID', 'SITEID')
GROUP BY TABLE_NAME
HAVING COUNT(*) = 1)
AND TABLE_NAME IN (
SELECT TABLE_NAME
FROM USER_TAB_COLUMNS
WHERE COLUMN_NAME IN ('OWNERID', 'L1ID', 'L1')
GROUP BY TABLE_NAME
HAVING COUNT(*) = 1)
Please don't mark this as duplicate. I have already tried other solutions and they don't seem to workout for me

Try NOT EXISTS / EXISTS instead. As long as you don't need to select values from the secondary tables, this is often much faster than IN and especially NOT IN. The basic reason is that the evaluation can stop as soon as there is a match.
SELECT
TABLE_NAME
FROM USER_TABLES
WHERE NOT EXISTS
(select 1
FROM USER_TAB_COLUMNS
WHERE COLUMN_NAME IN('HID', 'DHID', 'SPHID', 'LINKID', 'NODEID', 'SITEID')
AND USER_TABLES.TABLE_NAME = USER_TAB_COLUMNS.TABLE_NAME
)
and EXISTS (
SELECT 1
FROM USER_TAB_COLUMNS
WHERE COLUMN_NAME IN ('OWNERID', 'L1ID', 'L1')
AND USER_TABLES.TABLE_NAME = USER_TAB_COLUMNS.TABLE_NAME
)
However, I am not totally sure what your COUNT(*) = 1 means. Only one of OWNERID or L1ID or L1, but not say L1ID and L1 in the same table?
My code, as written will work if you only care if one or more of your condition columns are present, i.e. how I understood your question in English. If you need only 1 of them to be present, different query is needed.

Try doing it this way:
SELECT TABLE_NAME
FROM USER_TAB_COLUMNS
GROUP BY TABLE_NAME
HAVING SUM(CASE WHEN COLUMN_NAME IN ('OWNERID', 'L1ID', 'L1') THEN 1 ELSE 0 END) = 1 AND
SUM(CASE WHEN COLUMN_NAME IN ('HID', 'DHID', 'SPHID', 'LINKID', 'NODEID', 'SITEID') THEN 1 ELSE 0 END) = 0;
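Whether the first SUM should be compared with = 1 or >= 1 depends on the reading of the question: "exactly one of OWNERID, L1ID, L1" versus "at least one". The sketch below runs both variants on sample rows. It assumes Python's sqlite3 as a stand-in for Oracle and a hypothetical tab_columns table mimicking USER_TAB_COLUMNS; the tables T1 to T4 and the function name candidate_tables are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for Oracle's USER_TAB_COLUMNS dictionary view.
conn.execute("CREATE TABLE tab_columns (table_name TEXT, column_name TEXT)")
conn.executemany(
    "INSERT INTO tab_columns VALUES (?, ?)",
    [
        ("T1", "OWNERID"), ("T1", "NAME"),  # one wanted column, none excluded
        ("T2", "OWNERID"), ("T2", "HID"),   # wanted column plus an excluded one
        ("T3", "L1ID"), ("T3", "L1"),       # two wanted columns
        ("T4", "NAME"),                     # no wanted column at all
    ],
)


def candidate_tables(conn, exactly_one):
    """Tables with wanted columns and no excluded ones, in a single pass."""
    comparison = "= 1" if exactly_one else ">= 1"
    sql = f"""
        SELECT table_name
        FROM tab_columns
        GROUP BY table_name
        HAVING SUM(CASE WHEN column_name IN ('OWNERID', 'L1ID', 'L1')
                        THEN 1 ELSE 0 END) {comparison}
           AND SUM(CASE WHEN column_name IN
                        ('HID', 'DHID', 'SPHID', 'LINKID', 'NODEID', 'SITEID')
                        THEN 1 ELSE 0 END) = 0
        ORDER BY table_name
    """
    return [row[0] for row in conn.execute(sql)]


print("exactly one wanted column: ", candidate_tables(conn, exactly_one=True))
print("at least one wanted column:", candidate_tables(conn, exactly_one=False))
```

T3 is the row that separates the two readings: it is returned only by the "at least one" variant, which corresponds to the EXISTS / NOT EXISTS formulation.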

With exclude AS (
select TABLE_NAME
FROM USER_TAB_COLUMNS
WHERE COLUMN_NAME IN('HID', 'DHID', 'SPHID', 'LINKID', 'NODEID', 'SITEID')
GROUP BY TABLE_NAME
)
, preInclude AS (
SELECT TABLE_NAME
FROM USER_TAB_COLUMNS
WHERE COLUMN_NAME IN ('OWNERID', 'L1ID', 'L1')
GROUP BY TABLE_NAME
)
, Include AS (
SELECT preInclude.TABLE_NAME
FROM preInclude
LEFT JOIN exclude ON preInclude.TABLE_NAME = exclude.TABLE_NAME
WHERE exclude.TABLE_NAME is NULL
)
SELECT *
FROM Include
