How can I get a count(*) of all the columns in a table? Using PostgreSQL - sql

I have a bunch of tables, several of which have hundreds of columns. I need to get a count of non-null values for each column, and I've been doing it manually. I would like to find a way to get the counts for all the columns in a table at once. I searched Stack Overflow and Google but was unable to find an answer.
I tried this, but it just returns a value of 1 for each column. I know it's counting the column names rather than the values in each column. Any suggestions?
select count(COLUMN_NAME)
from information_schema.columns
where table_schema = 'schema_name'
and table_name = 'table_name'
group by COLUMN_NAME

COUNT(column_name) always gives you the count of NON NULL values.
Create a generic function like this which can take schema name and table name as arguments.
Here I construct SELECT statements joined together by UNION ALL, each returning a column name and its count; executed dynamically, they cover all columns of the table.
CREATE OR REPLACE FUNCTION public.get_count( TEXT, TEXT )
RETURNS TABLE(t_column_name TEXT, t_count BIGINT )
LANGUAGE plpgsql
AS $BODY$
DECLARE
p_schema TEXT := $1;
p_tabname TEXT := $2;
v_sql_statement TEXT;
BEGIN
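-- FORMAT's %I quotes identifiers and %L quotes literals, so mixed-case or unusual names work.
SELECT STRING_AGG( FORMAT( 'SELECT %L::text, count(%I) FROM %I.%I'
, column_name, column_name, table_schema, table_name )
,' UNION ALL ' ORDER BY ordinal_position ) INTO v_sql_statement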
FROM information_schema.columns
WHERE table_schema = p_schema
AND table_name = p_tabname;
IF v_sql_statement IS NOT NULL THEN
RETURN QUERY EXECUTE v_sql_statement;
END IF;
END
$BODY$;
Execution
knayak=# select c.col, c.count from
public.get_count( 'public', 'employees' ) as c(col,count);
col | count
----------------+-------
employee_id | 107
first_name | 107
last_name | 107
email | 107
phone_number | 107
hire_date | 107
job_id | 107
salary | 107
commission_pct | 35
manager_id | 106
department_id | 106
(11 rows)
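When creating a function is not possible, the same UNION ALL text can be assembled on the client. The following is only a minimal sketch in Python, not part of the answer above: `quote_ident`, `quote_literal` and `build_count_query` are hypothetical helpers that imitate PostgreSQL's basic quoting rules (it assumes standard_conforming_strings is on, so backslashes need no special handling), and the column list is assumed to have been read from information_schema.columns beforehand.

```python
def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Single-quote an SQL string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_count_query(schema: str, table: str, columns: list[str]) -> str:
    """Return one SELECT per column, joined by UNION ALL.

    Each branch yields (column name, number of non-NULL values in it).
    Returns an empty string when there are no columns.
    """
    target = f"{quote_ident(schema)}.{quote_ident(table)}"
    branches = [
        f"SELECT {quote_literal(col)} AS column_name, "
        f"count({quote_ident(col)}) AS non_null_count FROM {target}"
        for col in columns
    ]
    # Joining avoids a trailing UNION ALL that would have to be trimmed later.
    return " UNION ALL ".join(branches)


if __name__ == "__main__":
    # Hypothetical input: two of the columns from the employees example.
    print(build_count_query("public", "employees", ["employee_id", "commission_pct"]))
```

The printed text can be pasted into psql or sent through any client library. Quoting every identifier means column names with spaces or capital letters do not break the generated statement.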

There's not really a magic way to do this. If you need to check each of 100 different columns to see how many non-null values there are, you'll have to specify each of the columns of the table.
About the best you can do is use the system catalogs to help write your queries:
select 'SUM(CASE WHEN ' || column_name || ' IS NOT NULL THEN 1 ELSE 0 END) AS ' || column_name
from information_schema.columns
where table_schema = 'schema_name'
and table_name = 'table_name'
and is_nullable = 'YES'
You may need to add quoted identifiers if you've got spaces or other special characters in your column names.
Then you can copy that output to another query and add the missing parts of the query. I've added and is_nullable = 'YES' because it's a waste of time to check NOT NULL columns. As far as I know, that column is present in PostgreSQL.

Try this
SELECT
sum(case when column1 is not null then 1 else 0 end) as col1_not_null_count,
sum(case when column2 is not null then 1 else 0 end) as col2_not_null_count,
sum(case when column3 is not null then 1 else 0 end) as col3_not_null_count
FROM schema_name.table_name

The best way I've found to do this is to write a CASE expression that turns a non-null value into 1 and a null into 0, then sum it to get a count of the non-null values:
SELECT SUM(CASE WHEN COLUMN_NAME1 IS NOT NULL THEN 1 ELSE 0 END) AS COL1_COUNT
, SUM(CASE WHEN COLUMN_NAME2 IS NOT NULL THEN 1 ELSE 0 END) AS COL2_COUNT
FROM TABLE_NAME
I see in your select that you are looking at the information_schema.columns table. You can dynamically generate the code above by a select from that table (PostgreSQL concatenates strings with ||, not +):
SELECT ', SUM(CASE WHEN ' || column_name || ' IS NOT NULL THEN 1 ELSE 0 END) AS ' || column_name || '_COUNT'
FROM information_schema.columns
WHERE table_schema = 'schema_name'
AND table_name = 'table_name'
You can also dynamically create a different select for every column in the table in question:
SELECT 'SELECT SUM(CASE WHEN ' || column_name || ' IS NOT NULL THEN 1 ELSE 0 END) AS ' || column_name || '_COUNT FROM ' || table_schema || '.' || table_name
FROM information_schema.columns
WHERE table_schema = 'schema_name'
AND table_name = 'table_name'
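For non-null counts, SUM(CASE WHEN col IS NOT NULL THEN 1 ELSE 0 END) and COUNT(col) return the same number on a non-empty table, so the generated text can be shortened to COUNT(col). A small check, using Python's built-in sqlite3 module as a stand-in for PostgreSQL; the table `t`, its data and the function name `compare_counts` are made up for illustration:

```python
import sqlite3


def compare_counts() -> tuple:
    """Return (SUM(CASE ...) result, COUNT(col) result) for a column holding one NULL."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (col1 INTEGER)")
    # Three rows, one of them NULL.
    con.executemany("INSERT INTO t VALUES (?)", [(1,), (None,), (3,)])
    row = con.execute(
        "SELECT SUM(CASE WHEN col1 IS NOT NULL THEN 1 ELSE 0 END), COUNT(col1) FROM t"
    ).fetchone()
    con.close()
    return row


if __name__ == "__main__":
    print(compare_counts())
```

One difference to keep in mind: over an empty table SUM returns NULL, while COUNT returns 0.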

I came here looking to answer the same question, but I wanted to do this without making a function, as I don't always have the ability to do that in the databases I'm working with. The databases I work with do have the tablefunc module installed, which provides the crosstab function that takes an SQL string as input. I was able to produce SQL that I could pass to that function to get all the column counts from any table. Here is the code I used, where I put in the schema_name and table_name of what I wanted:
select * from crosstab('select column_name, ''num'' as category, sum(num) from (' || (
select
string_agg(sql, '') as sql_string
from (
select
case when row_number() OVER () = 1 then '' else ' union all ' end ||
'select ''' || column_name || ''' as column_name, ' || 'count(' || column_name || ') as num from ' || table_schema || '.' || table_name as sql
from information_schema.columns
where table_schema = 'schema_name'
and table_name = 'table_name') as sql_query limit 1
) || ') as column_counts group by column_name, category')
AS t(column_name text, num numeric) order by num asc

Related

How to find column name that contains specific string value using oracle

How can I find the name of the column in my table sku_config that contains a particular string value, using Oracle?
For example, if my string is TRP, I need to find the column name that has the value 'TRP' in my table.
The column can be any column belonging to my table.
Here is pseudocode for my requirement:
select column_name from sku_config where contains 'TRP'.
You can use xmlquery as follows:
SELECT column_name FROM
(select column_name,
to_number(xmlquery('/ROWSET/ROW/C/text()'
passing xmltype(dbms_xmlgen.getxml(
'select count(1) as c '
|| 'from ' || table_name || ' WHERE ' || column_name || ' LIKE ''%TRP%'''))
returning content)) as c
from all_tab_columns
where TABLE_NAME = 'SKU_CONFIG')
WHERE C > 0;
Example:
Current data of sample table:
SQL> SELECT * FROM ABC;
NAME DE
--------------- --
TEJASH2 SO
TEJASH3 DO
ABC SO
XXXXXXXXX SO
A A
B B
TEJASH1 SO
7 rows selected.
Searching for TEJASH string
SQL> SELECT column_name FROM
2 (select column_name,
3 to_number(xmlquery('/ROWSET/ROW/C/text()'
4 passing xmltype(dbms_xmlgen.getxml(
5 'select count(1) as c '
6 || 'from ' || table_name || ' WHERE ' || column_name || ' LIKE ''%TEJASH%'''))
7 returning content)) as c
8 from all_tab_columns
9 where TABLE_NAME = 'ABC')
10 WHERE C > 0;
COLUMN_NAME
-------------
NAME
Searching for SO string
SQL>
SQL>
SQL> SELECT column_name FROM
2 (select column_name,
3 to_number(xmlquery('/ROWSET/ROW/C/text()'
4 passing xmltype(dbms_xmlgen.getxml(
5 'select count(1) as c '
6 || 'from ' || table_name || ' WHERE ' || column_name || ' LIKE ''%SO%'''))
7 returning content)) as c
8 from all_tab_columns
9 where TABLE_NAME = 'ABC')
10 WHERE C > 0;
COLUMN_NAME
------------
DEPT
SQL>
If you want to find the names of the columns in a table that look like something, then use user_tab_columns:
select column_name
from user_tab_columns
where table_name = 'SKU_CONFIG' and
column_name like '%TRP%';
You can use the following code to find a column with a specific string.
select COLUMN_NAME
from TABLE_NAME
where COLUMN_NAME = 'STRING'
group by COLUMN_NAME

How to find the number of columns in which records are more than 3?

To solve this problem I take the column names from information_schema.columns and build a query that should count the number of records in each column. But the subquery returns more than one value and I don't know why, since it should return a single value, the number of rows. Please explain what I'm doing wrong.
SELECT count(COLUMN_NAME)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'users' group by COLUMN_NAME having((select count(COLUMN_NAME) from users where COLUMN_NAME is not null)>3)
Columns
name | surname
__________________
a | b
c | d
e | f
e | null
The query should return 1 (only the name column has 4 records).
You can try this solution:
DECLARE dsql STRING;
DECLARE qry STRING;
SET qry = (
SELECT
STRING_AGG(ARRAY_TO_STRING(["SELECT '", column_name, "' AS ColumnName, " || "COUNT(" || column_name || ") AS RowCount FROM mydataset.", table_name, ' union all'], ''), ' ')
FROM
mydataset.INFORMATION_SCHEMA.COLUMNS
WHERE
table_name = 'users' );
SET dsql = 'SELECT * FROM ( ' || SUBSTR(qry, 1, LENGTH(qry) - LENGTH(' union all')) || ') WHERE RowCount > 3';
EXECUTE IMMEDIATE dsql;
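A column name read from the metadata is only a string value; SQL cannot use it as a column reference inside the same statement, which is why the answer builds the query text first and executes it afterwards. The same two-step idea can be tried locally on the sample data from the question. This is only a sketch using Python's built-in sqlite3 module, with `pragma_table_info` standing in for INFORMATION_SCHEMA.COLUMNS; `columns_with_more_than` is a hypothetical helper name.

```python
import sqlite3


def columns_with_more_than(con, table: str, threshold: int) -> list:
    """Return names of columns in `table` holding more than `threshold` non-NULL values."""
    # Step 1: read the column names from the metadata.
    names = [r[0] for r in con.execute("SELECT name FROM pragma_table_info(?)", (table,))]
    if not names:
        return []
    # Step 2: build and run one query; COUNT(col) skips NULLs.
    select_list = ", ".join(f'COUNT("{n}")' for n in names)
    counts = con.execute(f'SELECT {select_list} FROM "{table}"').fetchone()
    return [n for n, c in zip(names, counts) if c > threshold]


if __name__ == "__main__":
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE users (name TEXT, surname TEXT)")
    con.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("a", "b"), ("c", "d"), ("e", "f"), ("e", None)],
    )
    matching = columns_with_more_than(con, "users", 3)
    print(len(matching), matching)
```

With the four sample rows, only name has more than 3 non-null values, so the number of matching columns is 1, as the question expects.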

Get count of rows from multiple tables Redshift SQL?

I have a redshift database that is being updated with new tables so I can't just manually list the tables I want. I want to get a count of the rows of all the tables from my query. So far I have:
select 'SELECT ''' || table_name || ''' as table_name, count(*) As con ' ||
'FROM ' || table_name ||
CASE WHEN lead(table_name) OVER (order by table_name ) IS NOT NULL
THEN ' UNION ALL ' END
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME LIKE '%results%'
but when I do this I get the error:
Specified types or functions (one per INFO message) not supported on Redshift tables.
I've searched a lot but I can't seem to find a solution for my problem. Any help would be greatly appreciated. Thanks!
EDIT:
I've changed my approach and decided to use a for loop in R to get the row count of each table, but I'm running into the issue that 'row_count' only saves one number, not the count for each table as I want. Here is the code:
schema <- "x"
table_prefix <- "results"
geos <- ad_districts %>% filter(geo != "geo")
row_count <- list()
i = 1
for (geo in geos){
table_name <- paste0(schema, ".", table_prefix, geo)
row_count[[i]] <- dbGetQuery(con,
paste("SELECT COUNT(*) FROM", table_name))
i = i + 1
}
Your query runs a count(*) against every table, which takes a lot of time and resources. Instead, use a system table to get the same information:
select name, sum(rows) as rows
from stv_tbl_perm
where name like '%results%'
group by 1
[EDIT] - I think this is the root cause - some sql functions are only supported on the leader node. Try connecting to that node and re-run your SQL.
https://docs.aws.amazon.com/redshift/latest/dg/c_sql-functions-leader-node.html
Hope this helps.
select 'select count(*) as "' || table_schema || '.' || table_name || '" from ' || table_schema || '.' || table_name || ' ;' as sql_text
from information_schema.tables
;
[EDIT - refined this a bit to generate a series of statements that can be run at once]
select rownum, case when rownum > 1 then sql_text else replace(sql_text, 'union all', '') end as sql_text
from
(
select rank() over (order by sql_text DESC) as rownum,
sql_text
from
(
select 'select ''' || table_schema || ' ' || table_name || ''' , count(*) as "' || table_schema || '.' || table_name || '" from ' || table_schema || '.' || table_name || ' union all ' as sql_text
from information_schema.tables
where table_schema = 'public'
order by table_schema, table_name
)X
)Y
order by rownum desc ;
SELECT ' Select count(*) , '''+ tablename + ''' from '+'"' + tablename +'"' +' Union ALL '
FROM pg_table_def
GROUP BY tablename
The query above wraps each table name in double quotes, so table names containing spaces do not break it. Remove the trailing UNION ALL from the generated text and the query is ready to be executed.

Oracle: Count non-null fields for each column in a table

I need a query to count the total number of non-null values for each column in a table. Since my table has hundreds of columns I'm looking for a solution that only requires me to input the table name.
Perhaps using the result of:
select COLUMN_NAME from ALL_TAB_COLUMNS where TABLE_NAME='ORDERS';
to get the column names and then a subquery to put counts against each column name? The additional complication is that I only have read-only access to the DB so I can't create any temp tables.
Slightly out of my league with this one so any help is appreciated.
Construct the query in SQL or using a spreadsheet. Then run the query.
For instance, assuming that your column names are simple and don't have special characters:
select replace('select ''[col]'', count([col]) from orders union all ',
'[col]', COLUMN_NAME
) as sql
from ALL_TAB_COLUMNS
where TABLE_NAME = 'ORDERS';
(Of course, this can be adapted for more complex column names, but I'm trying to show the idea.)
Then copy the code, remove the final union all and run it.
You can put this in one string if there are not too many columns:
select listagg(replace('select ''[col]'', count([col]) from orders',
'[col]', COLUMN_NAME
), ' union all '
) within group (order by column_name) as sql
from ALL_TAB_COLUMNS
where TABLE_NAME = 'ORDERS';
You can also use execute immediate using the same query, but that seems like overkill.
If you're happy with the results row-ar rather than column-ar:
SELECT 'SELECT ''dummy'', 0 FROM DUAL' FROM DUAL
UNION ALL
SELECT
' UNION ALL SELECT ''' ||
column_name ||
''', COUNT(' ||
column_name ||
') FROM ' ||
TABLE_NAME
FROM
all_tab_columns
WHERE
table_name = 'ORDERS'
This is an "SQL that writes an SQL" that you can then copy and run to get your answers. Should make a resultset that looks like:
SELECT 'dummy', 0 FROM dual
UNION ALL SELECT 'col1', COUNT(col1) FROM ORDERS
UNION ALL SELECT 'col2', COUNT(col2) FROM ORDERS
...
If you want your results column-ar:
SELECT 'SELECT ' FROM DUAL
UNION ALL
SELECT
'COUNT(' ||
column_name ||
') as count_' ||
column_name ||
', '
FROM
all_tab_columns
WHERE
table_name = 'ORDERS'
UNION ALL
SELECT 'null as dummy_column FROM ORDERS' FROM DUAL
Should make a resultset that looks like:
SELECT
COUNT(col1) as count_col1,
COUNT(col2) as count_col2,
...
null as dummy_column FROM ORDERS
Caveat: I don't have oracle installed anywhere I can test these, it's written from memory and may need some debugging
This will generate the SQL to get the counts in columns and will handle case sensitive column names and column names with non-alpha-numeric characters:
SELECT 'SELECT '
|| LISTAGG(
'COUNT("' || column_name || '") AS "' || column_name || '"',
', '
) WITHIN GROUP ( ORDER BY column_id )
|| ' FROM "' || table_name || '"' AS sql
FROM ALL_TAB_COLUMNS
WHERE TABLE_NAME = 'ORDERS'
GROUP BY TABLE_NAME;
or, if you have a large number of columns that is generating a string longer than 4000 characters you can use a custom aggregation function to aggregate VARCHAR2s into a CLOB and then do:
SELECT 'SELECT '
|| CLOBAgg( 'COUNT("' || column_name || '") AS "' || column_name || '"' )
|| ' FROM "' || table_name || '"' AS sql
FROM ALL_TAB_COLUMNS
WHERE TABLE_NAME = 'ORDERS'
GROUP BY TABLE_NAME;
In Oracle 19 (I used similar code in Ora 12, maybe that works too), this works without generating another select to execute:
select * from
(
select table_name, column_name,
to_number( extractvalue( xmltype(dbms_xmlgen.getxml('select count(to_char(substr('||column_name||',1,1))) c from '||table_name)) ,'/ROWSET/ROW/C')) count
from all_tab_columns where owner = user
)
--where table_name = 'MY_TABLE'
;
It creates XML containing the count and then extracts the count from it. The substr and to_char functions are used here to take only the first character, so it also works with CLOB columns.

Vertica. Count of Null and Not-Null of all columns of a Table

How can we get null and non-null counts of all columns of a Table in Vertica? Table can have n number of columns and for each column we need to get count of nulls and non-nulls values of that table.
For Example.
Below Table has two columns
column1 | Column2
1 | abc
(null) | pqr
3 | (null)
(null) | asd
5 | (null)
If it's a specific column, then we can check like this:
SELECT COUNT(*) FROM table where column1 is null;
SELECT COUNT(*) FROM table where column1 is not null;
Same query for column2
I checked system tables like projection_storage and others, but I can't figure out a generic query that gives these details with only the TABLE NAME hard-coded in the query.
Hello #user2452689: Here is a dynamically generated VSQL statement which meets your requirement of counting nulls and not-nulls in N columns. Notice that this writes a temporary SQL file to your working directory and then executes it via the \i command. You only need to change the first two variables per table. Hope this helps - good luck! :-D
--CHANGE SCHEMA AND TABLE PARAMETERS ONLY:
\set table_schema '\'public\''
\set table_name '\'dim_promotion\''
---------
\o temp_sql_file
\pset tuples_only
select e'select \'' || :table_schema || e'\.' || :table_name || e'\' as table_source' as txt
union all
select * from (
select
', sum(case when ' || column_name || ' is not null then 1 else 0 end) as ' || column_name || '_NOT_NULL
, sum(case when ' || column_name || ' is null then 1 else 0 end) as ' || column_name || '_NULL' as txt
from columns
where table_schema = :table_schema
and table_name = :table_name
order by ordinal_position
) x
union all
select ' from ' || :table_schema || e'.' || :table_name || ';' as txt ;
\o
\pset tuples_only
\i temp_sql_file
You can use:
select count(*) as cnt,
count(column1) as cnt_column1,
count(column2) as cnt_column2
from t;
count() with a column name or expression counts the number of non-NULL values in the column/expression.
(Obviously, the number of NULL values is cnt - cnt_columnX.)
select column1_not_null
,column2_not_null
,column3_not_null
,cnt - column1_not_null as column1_null
,cnt - column2_not_null as column2_null
,cnt - column3_not_null as column3_null
from (select count(*) as cnt
,count (column1) as column1_not_null
,count (column2) as column2_not_null
,count (column3) as column3_not_null
from mytable
) t
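The subtraction idea (NULLs = total rows minus non-NULL values) can be checked on a small table. The sketch below uses Python's built-in sqlite3 module rather than Vertica; the rows are one reading of the two-column example in the question and are an assumption, as are the table name `mytable` and the helper name `null_report`.

```python
import sqlite3


def null_report(con, table: str, columns: list) -> dict:
    """Map each column to (non_null_count, null_count) using a single table scan.

    `columns` is assumed to be a non-empty list of existing column names.
    """
    select_list = ", ".join(f'COUNT("{c}")' for c in columns)
    row = con.execute(f'SELECT COUNT(*), {select_list} FROM "{table}"').fetchone()
    total, non_null = row[0], row[1:]
    # NULLs per column = total rows - non-NULL values in that column.
    return {c: (n, total - n) for c, n in zip(columns, non_null)}


if __name__ == "__main__":
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mytable (column1 INTEGER, column2 TEXT)")
    con.executemany(
        "INSERT INTO mytable VALUES (?, ?)",
        [(1, "abc"), (None, "pqr"), (3, None), (None, "asd"), (5, None)],
    )
    print(null_report(con, "mytable", ["column1", "column2"]))
```

Because every count comes from one SELECT, the table is read once no matter how many columns it has.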