I am trying to write SQL Server code that dynamically selects all the tables in a specific database and then, for each column in each table, counts the missing values and the non-null values. I also want the result inserted into another table.
Is there any way I can do this without manually changing the table name and the column selection for each one?
I have Teradata code that does the same thing, which I tried to convert to SQL Server, but I am unable to get the dynamic allocation and insertion parts right.
insert into temp
values (select ''CAMP'',
rtrim(''' || tablename || '''),
rtrim(''' || columnname || '''),
rtrim(''' || columnformat || '''),
count(1),
count(rtrim(upper(case when ' || columnname || '='''' then NULL else '|| columnname ||' end))),
(cast (count(rtrim(upper(case when ' || columnname || '='''' then NULL else ' || columnname || ' end))) as float) / (cast (count(1) as float))) * 100,
count(distinct rtrim(upper(case when ' || columnname || '='''' then NULL else '|| columnname ||' end))),
min(rtrim(upper(case when ' || columnname || '='''' then NULL else '|| columnname ||' end))),
max(rtrim(upper(case when ' || columnname || '='''' then NULL else '|| columnname ||' end))),
min(len(rtrim(upper(case when ' || columnname || '='''' then NULL else '|| columnname ||' end)))),
max(len(rtrim(upper(case when ' || columnname || '='''' then NULL else '|| columnname ||' end))))
from ' || tablename ||')
Any help on this front would be great!
Thanks!
Not sure if you need a UNION or a JOIN, but in either case you can just use a three-part name for the object in the other database if you are working across multiple databases:
USE database1; -- your database name
GO
CREATE VIEW dbo.MyView
AS
SELECT columns FROM dbo.Table1
UNION ALL
SELECT columns FROM database2.dbo.Table2; -- second database
GO
SELECT * FROM dbo.MyView; -- get all data from the view
Hope that helps
Does something like this help you:
SELECT [Table] = t.[name]
, [Column] = c.[name]
FROM sys.tables t
INNER JOIN sys.columns c
ON c.[object_id] = t.[object_id]
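As a minimal sketch of the next step, the table/column pairs returned by that catalog query can be turned into one profiling statement per column. Python is used here only to assemble the T-SQL text. The results table `dbo.column_profile`, its column names, and the helper names are hypothetical, and the default schema is assumed for every table.

```python
def quote_name(name: str) -> str:
    """Bracket-quote an identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_profile_sql(table: str, column: str, target: str = "dbo.column_profile") -> str:
    """Return one INSERT ... SELECT that profiles a single column.

    `target` is a hypothetical results table with the columns
    (table_name, column_name, total_rows, non_null_rows, null_rows).
    """
    tbl, col = quote_name(table), quote_name(column)
    # String literals are escaped by doubling single quotes.
    tbl_lit = table.replace("'", "''")
    col_lit = column.replace("'", "''")
    return (
        f"INSERT INTO {target} (table_name, column_name, total_rows, non_null_rows, null_rows) "
        f"SELECT '{tbl_lit}', '{col_lit}', COUNT(*), COUNT({col}), COUNT(*) - COUNT({col}) "
        f"FROM {tbl};"
    )


# Pairs as they might come back from the sys.tables / sys.columns query.
pairs = [("Customers", "Email"), ("Order Details", "Qty")]
statements = [build_profile_sql(t, c) for t, c in pairs]
for stmt in statements:
    print(stmt)
```

COUNT(column) skips NULLs, so COUNT(*) - COUNT(column) gives the number of missing values. Treating empty strings as missing, as the Teradata code does, would need an extra CASE expression around the column.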
We are currently undertaking a testing phase which requires us to see if there is any data in each column for each table. Now, the route that is long and labour-intensive is:
SELECT COUNT(Col1), COUNT(Col2)...FROM TABLE
Is there any easier way to do this? We can go down this route by concatenating each column name from our data lineage document with the COUNT() function, but we have a lot of tables and a lot of columns in each table, making this a bit unfeasible.
Essentially we just need a count of records in each column for each table, without having to write long COUNT(Col) queries.
Thanks
This query lists the columns that contain only NULLs (num_rows equals num_nulls). It returns accurate results only if the table statistics were recently gathered with the default value for ESTIMATE_PERCENT:
SELECT utab.table_name
, tcol.column_name
, utab.num_rows
from user_tables utab,
user_tab_cols tcol
where utab.table_name = tcol.table_name
and utab.num_rows > 0
and utab.num_rows = tcol.num_nulls;
You could use a dynamic query to build the queries. This will generate all the queries.
SELECT 'SELECT COUNT(' || t.column_name || ' ) FROM ' || t.owner || '.' || t.table_name || ';' FROM dba_tab_columns t
You can generate all the select statements like so:
SELECT CASE WHEN column_id = 1 AND column_id_desc != 1 THEN 'SELECT ''' || LOWER(owner) || '.' || LOWER(table_name) || ''' table_name, ' || CHR(10) || 'COUNT(' || LOWER(column_name) || ') ' || SUBSTR(LOWER(column_name), 1, 26) || '_cnt,'
WHEN column_id = 1 AND column_id_desc = 1 THEN 'SELECT ''' || LOWER(owner) || '.' || LOWER(table_name) || ''' table_name, ' || CHR(10) || 'COUNT(' || LOWER(column_name) || ') ' || SUBSTR(LOWER(column_name), 1, 26) || '_cnt FROM ' || LOWER(owner) || '.' || LOWER(table_name) || ';'
WHEN column_id_desc = 1 THEN ' COUNT(' || LOWER(column_name) || ') ' || SUBSTR(LOWER(column_name), 1, 26) || '_cnt' || CHR(10) || 'FROM ' || LOWER(owner) || '.' || LOWER(table_name) || ';'
ELSE ' COUNT(' || LOWER(column_name) || ') ' || SUBSTR(LOWER(column_name), 1, 26) || '_cnt,'
END sql_text
FROM (SELECT owner,
table_name,
column_name,
column_id,
row_number() OVER (PARTITION BY owner, table_name ORDER BY column_id DESC) column_id_desc
FROM all_tab_columns)
WHERE <predicates to filter on the tables you're interested in>
ORDER BY owner,
table_name,
column_id;
This goes through all the tables you're interested in plus their columns and outputs text that will, when taken together, form a select statement for each table.
The text that is output in the sql_text column depends on whether the column in the list is the first or last (or both!); this way you get the full statement which queries each table once, rather than one per table and column.
You can then copy and paste the results and run that as a script.
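The same "one statement per table" idea can also be sketched in a client language, where grouping the columns by table and joining them with commas removes the need for the first/last-column CASE logic. This is only an illustration: the function name and the sample rows are hypothetical, and the input is assumed to be (owner, table_name, column_name) tuples fetched from a view such as all_tab_columns.

```python
from itertools import groupby


def build_count_queries(columns):
    """columns: iterable of (owner, table_name, column_name), grouped by table.

    Returns one SELECT per table that counts the non-null values of every column.
    """
    queries = []
    for (owner, table), group in groupby(columns, key=lambda r: (r[0], r[1])):
        # Aliases are cut to 26 characters plus "_cnt", as in the SQL version.
        counts = ",\n       ".join(
            f"COUNT({col}) AS {col[:26]}_cnt" for _, _, col in group
        )
        queries.append(
            f"SELECT '{owner}.{table}' AS table_name,\n       {counts}\nFROM {owner}.{table}"
        )
    return queries


# Hypothetical catalog rows.
rows = [
    ("hr", "emp", "id"),
    ("hr", "emp", "name"),
    ("hr", "dept", "id"),
]
rows.sort(key=lambda r: (r[0], r[1]))  # groupby only merges adjacent rows
for q in build_count_queries(rows):
    print(q + ";")
```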
This may help you:
SELECT
a.table_name,
a.column_name
FROM
ALL_TAB_COLUMNS a
WHERE owner = '<your user>'
AND a.SAMPLE_SIZE = a.NUM_NULLS
I have a Redshift database that is being updated with new tables, so I can't just manually list the tables I want. I want to get a row count for every table matched by my query. So far I have:
select 'SELECT ''' || table_name || ''' as table_name, count(*) As con ' ||
'FROM ' || table_name ||
CASE WHEN lead(table_name) OVER (order by table_name ) IS NOT NULL
THEN ' UNION ALL ' END
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME LIKE '%results%'
but when I do this I get the error:
Specified types or functions (one per INFO message) not supported on Redshift tables.
I've searched a lot but I can't seem to find a solution for my problem. Any help would be greatly appreciated. Thanks!
EDIT:
I've changed my approach and decided to use a for loop in R to get the row count of each table, but I'm running into the issue that 'row_count' only saves one number, not the count for each table like I want. Here is the code:
schema <- "x"
table_prefix <- "results"
geos <- ad_districts %>% filter(geo != "geo")
row_count <- list()
i = 1
for (geo in geos){
table_name <- paste0(schema, ".", table_prefix, geo)
row_count[[i]] <- dbGetQuery(con,
paste("SELECT COUNT(*) FROM", table_name))
i = i + 1
}
Your query runs a count(*) against every table, which will take a lot of time and resources. Instead, use a system table to get the same information:
select name, sum(rows) as rows
from stv_tbl_perm
where name like '%results%'
group by 1
[EDIT] I think this is the root cause: some SQL functions are only supported on the leader node. Try connecting to that node and re-running your SQL.
https://docs.aws.amazon.com/redshift/latest/dg/c_sql-functions-leader-node.html
Hope this helps.
select 'select count(*) as "' || table_schema || '.' || table_name || '" from ' || table_schema || '.' || table_name || ' ;' as sql_text
from information_schema.tables
;
[EDIT - refined this a bit to generate a series of statements that can be run at once]
select rownum, case when rownum > 1 then sql_text else replace(sql_text, 'union all', '') end as sql_text
from
(
select rank() over (order by sql_text DESC) as rownum,
sql_text
from
(
select 'select ''' || table_schema || ' ' || table_name || ''' , count(*) as "' || table_schema || '.' || table_name || '" from ' || table_schema || '.' || table_name || ' union all ' as sql_text
from information_schema.tables
where table_schema = 'public'
order by table_schema, table_name
)X
)Y
order by rownum desc ;
SELECT ' Select count(*) , '''+ tablename + ''' from '+'"' + tablename +'"' +' Union ALL '
FROM pg_table_def
GROUP BY tablename
The query above double-quotes each table name, which avoids problems with table names that contain spaces. Remove the trailing UNION ALL from the generated text and the query is ready to be executed.
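If the statement is assembled in client code rather than in SQL, the trailing UNION ALL never appears in the first place, because a join only places the separator between items. A minimal sketch, assuming the table names have already been fetched (for example from pg_table_def); the function name and the sample names are hypothetical:

```python
def union_all_counts(table_names):
    """Return a single query that counts the rows of every table."""
    parts = []
    for name in table_names:
        literal = name.replace("'", "''")  # escape for the string literal
        ident = name.replace('"', '""')    # escape for the quoted identifier
        parts.append(
            f"SELECT '{literal}' AS table_name, COUNT(*) AS row_count FROM \"{ident}\""
        )
    # join() puts the separator only between items, so nothing trails the last one.
    return "\nUNION ALL\n".join(parts)


query = union_all_counts(["results_a", "results b"])
print(query)
```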
I'm going mental over this. I'm fairly new to dynamic SQL, so I may just not be asking Google the right question, but here's what I'm trying to do. I have a query that generates dynamic SQL. When I run it, it produces several rows (about 30), which together make up a single union query. I can copy all of those rows, paste them into a new query and run it, which works fine, but I need to do all of this in a single query. I've looked up examples using EXECUTE IMMEDIATE and FETCH, but I cannot get them to actually return the data: they just report something like "Executed Successfully" without producing any rows. The resulting column of the SQL below is named "qry_txt"; instead of returning its text at face value, I want to execute it as a query. I may not be articulating this well, but I'm basically trying to turn two queries (with a manual copy/paste step in between) into a single query. Hope this makes sense.
Here's my SQL:
Select CASE when
lead(ROWNUM) over(order by ROWNUM) is null then
'SELECT '||''''||T.TABLE_NAME||''''||' as TABLE_NAME,'||''''||T.COLUMN_NAME||''''||' as COLUMN_NAME, cast('|| T.COLUMN_NAME ||' as
varchar2(100)) as SAMPLE_DATA ||
from rpt.'||T.TABLE_NAME ||' where '||T.COLUMN_NAME||' is not null and ROWNUM=1;'
else
'SELECT '||''''||T.TABLE_NAME||''''||' as TABLE_NAME,'||''''||T.COLUMN_NAME||''''||' as COLUMN_NAME, cast('|| T.COLUMN_NAME ||' as
varchar2(100)) as SAMPLE_DATA from rpt.'||T.TABLE_NAME ||' where '||T.COLUMN_NAME||' is not null and ROWNUM=1 union ' end as qry_txt
from all_tab_columns t where T.OWNER='rpt' and T.DATA_TYPE != 'BLOB' and T.DATA_TYPE != 'LONG' and T.TABLE_NAME = 'NME_DMN'
ORDER BY ROWNUM asc;
You cannot execute a dynamic query in plain SQL. You need to use a PL/SQL block to accomplish that. Here is how you could do it.
PS: Code is not tested.
declare
var1 <declaration matching the column in the select list> ;
var2 <declaration matching the column in the select list> ;
var3 <declaration matching the column in the select list> ;
....
varn ;
begin
-- the cursor must return table_name and column_name so the loop body can use them
for i in ( SELECT t.table_name,
t.column_name,
LEAD (ROWNUM) OVER (ORDER BY ROWNUM) col1
FROM all_tab_columns t
WHERE T.OWNER = 'rpt'
AND T.DATA_TYPE != 'BLOB'
AND T.DATA_TYPE != 'LONG'
AND T.TABLE_NAME = 'NME_DMN'
ORDER BY ROWNUM ASC)
Loop
-- i.col1 is null only for the last row returned by the cursor
If i.col1 IS NULL Then
execute immediate 'SELECT '
|| ''''
|| i.table_name
|| ''''
|| ' as TABLE_NAME,'
|| ''''
|| i.column_name
|| ''''
|| ' as COLUMN_NAME, cast('
|| i.column_name
|| ' as
varchar2(100)) as SAMPLE_DATA
from rpt.'
|| i.table_name
|| ' where '
|| i.column_name
|| ' is not null and ROWNUM=1' into var1 , var2 ,var3 ....varn;
Else
execute immediate 'SELECT '
|| ''''
|| i.table_name
|| ''''
|| ' as TABLE_NAME,'
|| ''''
|| i.column_name
|| ''''
|| ' as COLUMN_NAME, cast('
|| i.column_name
|| ' as
varchar2(100)) as SAMPLE_DATA from rpt.'
|| i.table_name
|| ' where '
|| i.column_name
|| ' is not null and ROWNUM=1' into var1 , var2 ,var3 ....varn;
end if;
End Loop;
exception
when others then
dbms_output.put_line(sqlcode ||'--'||sqlerrm);
End;
I have a table with mixed types of data (real, integer, character ...) but I would like to retrieve only the columns that have real values.
I can construct this:
SELECT 'SELECT ' || array_to_string(ARRAY(
select 'o' || '.' || c.column_name
from information_schema.columns as c
where table_name = 'final_datas'
and c.data_type = 'real'), ',') || ' FROM final_datas as o' As sqlstmt
which gives:
"SELECT o.random,o.struct2d_pred2_num,o.pfam_num,o.transmb_num [...] FROM final_datas as o"
Then I would like to create a table with these columns. Of course, doing this doesn't work:
create table table2 as (
SELECT 'SELECT ' || array_to_string(ARRAY(
select 'o' || '.' || c.column_name
from information_schema.columns as c
where table_name = 'final_datas'
and c.data_type = 'real'), ',') || ' FROM final_datas as o' As sqlstmt
)
Suggestions?
You need to generate the whole CREATE TABLE statement as dynamic SQL:
SELECT 'CREATE TABLE table2 AS SELECT ' || array_to_string(ARRAY(
select 'o' || '.' || c.column_name
from information_schema.columns as c
where table_name = 'final_datas'
and c.data_type = 'real'), ',') || ' FROM final_datas as o' As sqlstmt
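Assuming this is PostgreSQL, the result can then be run with EXECUTE sqlstmt; from PL/pgSQL (for example inside a DO block or a function), since plain SQL cannot execute a statement held in a string.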
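The generate-then-execute pattern can also be driven from client code. The sketch below uses Python's built-in sqlite3 module purely as a runnable stand-in, so the catalog lookup (PRAGMA table_info) differs from information_schema.columns. The sample table definition is an assumption that borrows the column names from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample table mixing REAL columns with other types.
conn.execute(
    "CREATE TABLE final_datas (id INTEGER, pfam_num REAL, label TEXT, transmb_num REAL)"
)
conn.execute("INSERT INTO final_datas VALUES (1, 0.5, 'x', 2.0)")

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
real_cols = [
    row[1]
    for row in conn.execute("PRAGMA table_info(final_datas)")
    if row[2].upper() == "REAL"
]

# Generate the whole CREATE TABLE ... AS SELECT statement, then execute it.
select_list = ", ".join(f'o."{c}" AS "{c}"' for c in real_cols)
sqlstmt = f"CREATE TABLE table2 AS SELECT {select_list} FROM final_datas AS o"
conn.execute(sqlstmt)

new_cols = [row[1] for row in conn.execute("PRAGMA table_info(table2)")]
print(new_cols)
```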
I need to run a query on dynamically generated column names.
Here's the query:
select 'col_'||4 from MY_TABLE
Note:
"4" is a variable that is passed to this query from within the Java code
MY_TABLE is a table that contain columns with names (col_4, col_5, etc..)
Inside Oracle, you need to use dynamic SQL (YourVariable is 4 in your example):
EXECUTE IMMEDIATE ' select col_' || YourVariable || ' from MY_TABLE ';
From Java, you can build any SQL string and execute it.
To run a dynamic SELECT statement, you have two choices:
For single row selects, you use EXECUTE IMMEDIATE ... INTO:
EXECUTE IMMEDIATE 'select col_' || l_num || ' from MY_TABLE WHERE id = 37' INTO l_result;
For selecting multiple rows, you can use a dynamic cursor:
DECLARE
TYPE MyCurType IS REF CURSOR;
my_cv MyCurType;
BEGIN
OPEN my_cv FOR 'select col_' || l_num || ' from MY_TABLE';
...
END;
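Whichever variant is used, the column name ends up concatenated into the statement text, because identifiers cannot be passed as bind parameters. A small sketch of the client-side version follows, written in Python with sqlite3 as a stand-in for the Java/Oracle setup in the question. The helper name and the sample data are hypothetical. The suffix is converted to an integer first so that only a number can reach the SQL text.

```python
import sqlite3


def select_generated_column(conn, suffix):
    """Build 'col_<suffix>' and select that column from MY_TABLE."""
    # Identifiers cannot be bound as parameters, so the suffix is validated
    # as an integer before it is concatenated into the statement text.
    column = f"col_{int(suffix)}"
    return [row[0] for row in conn.execute(f"SELECT {column} FROM MY_TABLE")]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MY_TABLE (col_4 TEXT, col_5 TEXT)")
conn.executemany("INSERT INTO MY_TABLE VALUES (?, ?)", [("a", "b"), ("c", "d")])
print(select_generated_column(conn, 4))
```

A non-numeric suffix raises ValueError before any SQL is run.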
This code generates a SELECT statement that lists every column of the given tables, each aliased as "TABLE_NAME.COLUMN_NAME":
SELECT
'SELECT ' ||(
SELECT
LISTAGG(
c.TABLE_NAME || '.' || c.COLUMN_NAME || ' AS "' || c.TABLE_NAME || '.' || c.COLUMN_NAME || '"',
', '
) WITHIN GROUP(
ORDER BY
c.TABLE_NAME
) "TABLE_NAMES"
FROM
USER_TAB_COLS c
WHERE
TABLE_NAME IN(
'PESSOA',
'PESSOA_FISICA',
'PESSOA_JURIDICA'
)
)|| ' FROM PERSON;'
FROM
DUAL;