Count distinct values for every column individually - sql

Can I count the distinct values of every column without enumerating the columns?
Say I have a table with col1, col2, col3, and no other column. Without mentioning these columns explicitly, I would like to have the same result as:
SELECT
count(distinct col1) as col1,
count(distinct col2) as col2,
count(distinct col3) as col3
FROM mytable;
How can I do this?

I think the best you could easily do with plain SQL is to run a query like this to generate the query you want, and then run that.
select 'select count(distinct '
|| listagg(column_name || ') as ' || column_name, ', count(distinct ') within group (order by column_id)
|| ' from ' || max(table_name) || ';' as script
from all_tab_cols
where table_name = 'MYTABLE';
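The same generate-then-run idea can be driven from client code. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for Oracle: `PRAGMA table_info` plays the role of `all_tab_cols`, and the helper name `distinct_count_query` and the sample rows are made up for illustration.

```python
import sqlite3

def distinct_count_query(conn, table):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    # (SQLite-specific stand-in for querying all_tab_cols in Oracle.)
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    parts = [f"count(distinct {c}) as {c}" for c in cols]
    return f"select {', '.join(parts)} from {table}"

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (col1, col2, col3)")
conn.executemany(
    "insert into mytable values (?, ?, ?)",
    [(1, "a", None), (1, "b", None), (2, "b", 5)],
)

query = distinct_count_query(conn, "mytable")   # step 1: generate the query
result = conn.execute(query).fetchone()         # step 2: run it
print(query)
print(result)
```

As in the SQL version, the column names never appear in the code that is written by hand; they only appear in the generated statement. NULLs are not counted by `count(distinct ...)`.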

Related

Oracle: Count non-null fields for each column in a table

I need a query to count the total number of non-null values for each column in a table. Since my table has hundreds of columns I'm looking for a solution that only requires me to input the table name.
Perhaps using the result of:
select COLUMN_NAME from ALL_TAB_COLUMNS where TABLE_NAME='ORDERS';
to get the column names and then a subquery to put counts against each column name? The additional complication is that I only have read-only access to the DB so I can't create any temp tables.
Slightly out of my league with this one so any help is appreciated.
Construct the query in SQL or using a spreadsheet. Then run the query.
For instance, assuming that your column names are simple and don't have special characters:
select replace('select ''[col]'', count([col]) from orders union all ',
'[col]', COLUMN_NAME
) as sql
from ALL_TAB_COLUMNS
where TABLE_NAME = 'ORDERS';
(Of course, this can be adapted for more complex column names, but I'm trying to show the idea.)
Then copy the code, remove the final union all and run it.
You can put this in one string if there are not too many columns:
select listagg(replace('select ''[col]'', count([col]) from orders',
'[col]', COLUMN_NAME
), ' union all '
) within group (order by column_name) as sql
from ALL_TAB_COLUMNS
where TABLE_NAME = 'ORDERS';
You can also use execute immediate using the same query, but that seems like overkill.
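For readers who would rather let a client program do the copy-and-run step, here is a rough sketch of the `union all` construction. It assumes SQLite in place of Oracle (so `PRAGMA table_info` replaces `ALL_TAB_COLUMNS`), simple column names without special characters, and an invented `orders` table; only SELECT statements are issued against the table being inspected, which fits the read-only constraint.

```python
import sqlite3

def non_null_counts(conn, table):
    # One "select '<name>', count(<name>) from <table>" branch per column,
    # glued together with "union all" (assumes simple column names).
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    branches = [
        f"select '{c}' as column_name, count({c}) as non_null from {table}"
        for c in cols
    ]
    return conn.execute(" union all ".join(branches)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("create table orders (id, customer, shipped_on)")
conn.executemany(
    "insert into orders values (?, ?, ?)",
    [(1, "ann", None), (2, None, None), (3, "bob", "2024-01-05")],
)

counts = dict(non_null_counts(conn, "orders"))
print(counts)
```

Because the branches are joined in code, there is no trailing `union all` to remove by hand.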
If you're happy with the results as rows rather than as columns:
SELECT 'SELECT ''dummy'', 0 FROM DUAL' FROM DUAL
UNION ALL
SELECT
' UNION ALL SELECT ''' ||
column_name ||
''', COUNT(' ||
column_name ||
') FROM ' ||
TABLE_NAME
FROM
all_tab_columns
WHERE
table_name = 'ORDERS'
This is an "SQL that writes an SQL" that you can then copy and run to get your answers. Should make a resultset that looks like:
SELECT 'dummy', 0 FROM dual
UNION ALL SELECT 'col1', COUNT(col1) FROM ORDERS
UNION ALL SELECT 'col2', COUNT(col2) FROM ORDERS
...
If you want your results as columns:
SELECT 'SELECT ' FROM DUAL
UNION ALL
SELECT
'COUNT(' ||
column_name ||
') as count_' ||
column_name ||
', '
FROM
all_tab_columns
WHERE
table_name = 'ORDERS'
UNION ALL
SELECT 'null as dummy_column FROM ORDERS' FROM DUAL
Should make a resultset that looks like:
SELECT
COUNT(col1) as count_col1,
COUNT(col2) as count_col2,
...
null as dummy_column FROM ORDERS
Caveat: I don't have Oracle installed anywhere I can test these; they are written from memory and may need some debugging.
This will generate the SQL to get the counts in columns and will handle case sensitive column names and column names with non-alpha-numeric characters:
SELECT 'SELECT '
|| LISTAGG(
'COUNT("' || column_name || '") AS "' || column_name || '"',
', '
) WITHIN GROUP ( ORDER BY column_id )
|| ' FROM "' || table_name || '"' AS sql
FROM ALL_TAB_COLUMNS
WHERE TABLE_NAME = 'ORDERS'
GROUP BY TABLE_NAME;
Or, if you have a large number of columns and the generated string is longer than 4000 characters, you can use a custom aggregation function to aggregate VARCHAR2s into a CLOB and then do:
SELECT 'SELECT '
|| CLOBAgg( 'COUNT("' || column_name || '") AS "' || column_name || '"' )
|| ' FROM "' || table_name || '"' AS sql
FROM ALL_TAB_COLUMNS
WHERE TABLE_NAME = 'ORDERS'
GROUP BY TABLE_NAME;
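To illustrate why the double quotes matter, here is a small sketch using SQLite as a stand-in for Oracle. The helper `quote_ident`, the table `"Orders"` and its mixed-case, space- and hyphen-containing column names are all invented for the example; the quote-doubling rule shown is the SQLite/standard-SQL convention and is an assumption here, not something taken from the answer above.

```python
import sqlite3

def quote_ident(name):
    # Wrap an identifier in double quotes, doubling any embedded double quote.
    return '"' + name.replace('"', '""') + '"'

conn = sqlite3.connect(":memory:")
conn.execute('create table "Orders" ("Order Id", "ship-date", "Total")')
conn.executemany(
    'insert into "Orders" values (?, ?, ?)',
    [(1, None, 9.5), (2, "2024-02-01", None), (3, None, None)],
)

cols = [row[1] for row in conn.execute('PRAGMA table_info("Orders")')]
sql = (
    "SELECT "
    + ", ".join(f"COUNT({quote_ident(c)}) AS {quote_ident(c)}" for c in cols)
    + " FROM " + quote_ident("Orders")
)
cur = conn.execute(sql)
names = [d[0] for d in cur.description]   # column labels of the result
counts = cur.fetchone()
print(sql)
print(dict(zip(names, counts)))
```

Without the quoting, a name such as `ship-date` would be parsed as the expression `ship - date` and the generated statement would fail.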
In Oracle 19 (I used similar code in Oracle 12, so that may work too), this works without generating another select to execute:
select * from
(
select table_name, column_name,
to_number( extractvalue( xmltype(dbms_xmlgen.getxml('select count(to_char(substr('||column_name||',1,1))) c from '||table_name)) ,'/ROWSET/ROW/C')) count
from all_tab_columns where owner = user
)
--where table_name = 'MY_TABLE'
;
It creates an XML document containing the count and then extracts that count from it. The substr and to_char functions are used to take just the first character, so it also works with CLOB columns.

Get column names in subquery, and then return values for those columns?

Seems like this is impossible, but I'm so close - maybe someone can take me the last step...
I have a bunch of dynamic code and I don't always know the tables and columns I'm going to be dealing with, but I do know that VARCHAR2 columns with data_lengths of 2000 result in errors. I'd love to be able to identify these 'bad' columns dynamically, and remove them from my results in 1 shot.
This code:
SELECT LISTAGG(probs.column_name, ', ')
WITHIN GROUP (ORDER BY column_name) FROM
(select 1 grp, column_name
from all_tab_columns
where TABLE_NAME = 'MYTABLE' AND
DATA_TYPE <> 'VARCHAR2' AND
DATA_LENGTH < 2000
) probs
GROUP BY GRP
Gives me a nice comma-separated list of all of my acceptable column names like this:
FIELD1, FIELD2, FIELD3, FIELD4...
And I am hopeful that there's a way I can simply drop that list of field names into a select statement like this:
SELECT (<my subquery, above>)
FROM MYTABLE;
Is this possible?
Assuming this situation:
create table mytable ( a number, b number, c number);
insert into mytable values (10, 20, 30);
insert into mytable values (1, 2, 3);
and that only one table with that name exists (otherwise you should specify the owner in the query on all_tab_columns), your query could be simplified this way:
SELECT 'select ' || LISTAGG(column_name, ', ') WITHIN GROUP (ORDER BY column_name) || ' from ' || table_name
FROM all_tab_columns
WHERE TABLE_NAME = 'MYTABLE'
AND DATA_TYPE <> 'VARCHAR2'
AND DATA_LENGTH < 2000
GROUP BY table_name
This would give: select A, B, C from MYTABLE.
The problem here is that you cannot simply run a statement that returns a variable number of columns; one way to use this could be to build an XML document:
SELECT xmltype(
DBMS_XMLGEN.getxml(
( SELECT 'select ' || LISTAGG(column_name, ', ') WITHIN GROUP (ORDER BY column_name) || ' from ' || table_name
FROM all_tab_columns
WHERE TABLE_NAME = 'MYTABLE'
AND DATA_TYPE <> 'VARCHAR2'
AND DATA_LENGTH < 2000
GROUP BY table_name)
)
)
FROM DUAL
<?xml version="1.0"?>
<ROWSET>
<ROW>
<A>10</A>
<B>20</B>
<C>30</C>
</ROW>
<ROW>
<A>1</A>
<B>2</B>
<C>3</C>
</ROW>
</ROWSET>
Another way could be to use some PL/SQL and dynamic SQL, with a small modification of your query to concatenate the fields, so that each row of the result is built as a single string:
declare
type tTabResults is table of varchar2(1000);
vSQL varchar2(1000);
vTabResults tTabResults;
begin
SELECT 'select ' || LISTAGG( column_name, '|| '', '' ||') WITHIN GROUP (ORDER BY column_name) || ' from ' || table_name
into vSQL
FROM all_tab_columns
WHERE TABLE_NAME = 'MYTABLE'
AND DATA_TYPE <> 'VARCHAR2'
AND DATA_LENGTH < 2000
GROUP BY table_name;
--
execute immediate vSQL bulk collect into vTabResults;
--
for i in 1 .. vTabResults.count loop
dbms_output.put_line(vTabResults(i));
end loop;
end;
10, 20, 30
1, 2, 3
Notice that I oversimplified the problem, treating numbers as strings and not using any conversion, by simply printing the values in your table, no matter their type; in a real solution you should handle the possible types of your columns and modify the initial query to add some type conversions.
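Outside the database, a client cursor can cope with a variable column list because it reports the column names at run time. A minimal sketch under several assumptions: SQLite stands in for Oracle, the declared type string from `PRAGMA table_info` replaces DATA_TYPE and DATA_LENGTH, and the hypothetical helper `is_problem_column` treats "VARCHAR2 with a length of 2000 or more" as the bad case described in the question.

```python
import re
import sqlite3

def is_problem_column(declared_type, limit=2000):
    # Hypothetical rule: VARCHAR2(n) with n >= limit is a "bad" column.
    m = re.fullmatch(r"VARCHAR2\((\d+)\)", declared_type.strip().upper())
    return bool(m) and int(m.group(1)) >= limit

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (a NUMBER, b VARCHAR2(2000), c VARCHAR2(50))")
conn.execute("insert into mytable values (10, 'long text', 'x')")
conn.execute("insert into mytable values (1, 'more text', 'y')")

# table_info rows are (cid, name, declared_type, notnull, default, pk).
info = conn.execute("PRAGMA table_info(mytable)").fetchall()
keep = [name for _, name, decl, *_ in info if not is_problem_column(decl)]

cur = conn.execute(f"select {', '.join(keep)} from mytable order by a")
header = [d[0] for d in cur.description]   # column names known only at run time
rows = cur.fetchall()
print(header)
print(rows)
```

The values keep their native types here, so no string concatenation or type conversion is needed on the SQL side.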

Comparing column by column between two rows in Oracle DB

I need to write a query to compare column by column (ie: find differences) between two rows in the database. For example:
row1: 10 40 sometext 24
row2: 10 25 sometext 24
After the query has executed, it should show only the fields that differ (i.e. the second field).
Here's what I have done so far:
select table1.column1, table1.column2, table1.column3, table1.column4
from table1
where somefield in (field1, field2);
The above query will show me two rows one above another like this:
10 40 sometext 24
10 25 sometext 24
Then I have to do the comparison manually, which takes a lot of time because each row contains a lot of columns.
So again my question is: how can I write a query that shows me only the columns that differ?
Thanks
Use the UNPIVOT clause (see http://www.oracle-developer.net/display.php?id=506) to turn columns into rows, then filter out the rows that are the same (using GROUP BY ... HAVING COUNT), and finally use PIVOT to get a row containing only the columns that differ.
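If doing the comparison in client code is acceptable, the "turn columns into rows, then keep only the differences" idea can be sketched as follows. This is a client-side analogue, not the UNPIVOT query itself; it assumes SQLite as a stand-in database, an invented `id` column to pick the two rows, and a hypothetical helper `diff_rows`.

```python
import sqlite3

def diff_rows(conn, sql, params=()):
    """Return {column_name: (value_in_row_1, value_in_row_2)} for differing columns."""
    cur = conn.execute(sql, params)
    names = [d[0] for d in cur.description]
    row_a, row_b = cur.fetchmany(2)   # raises ValueError if fewer than two rows come back
    # Pair each column name with both values and keep only the mismatches.
    return {n: (a, b) for n, a, b in zip(names, row_a, row_b) if a != b}

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (id, column1, column2, column3, column4)")
conn.executemany(
    "insert into table1 values (?, ?, ?, ?, ?)",
    [(1, 10, 40, "sometext", 24), (2, 10, 25, "sometext", 24)],
)

diffs = diff_rows(
    conn,
    "select column1, column2, column3, column4 from table1 "
    "where id in (?, ?) order by id",
    (1, 2),
)
print(diffs)
```

Two NULLs compare as equal in this sketch, because Python's `None != None` is false.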
To do this easily you need to query the metadata for the table to get each row. You can use the following code as a script.
Set the test_tab_name define to your table name and set yes_drop_it = NO. Put your raw WHERE condition into where_clause. The comparison logic always compares the first two rows returned for the where clause.
whenever sqlerror exit failure rollback;
set linesize 150
define test_tab_name = tst_cf_cols
define yes_drop_it = YES
define order_by = 1, 2
define where_clause = 1 = 1
define tab_owner = user
<<clearfirst>> begin
for clearout in (
select 'drop table ' || table_name as cmd
from all_tables
where owner = &&tab_owner and table_name = upper('&&test_tab_name')
and '&&yes_drop_it' = 'YES'
) loop
execute immediate clearout.cmd;
execute immediate '
create table &&test_tab_name as
select 10 as column1, 40 as column2, ''sometext'' as column3, 24 as column4 from dual
union all
select 10 as column1, 25 as column2, ''sometext'' as column3, 24 as column4 from dual
';
end loop;
end;
/
column cfsynt format a4000 word_wrap new_value comparison_syntax
with parms as (select 'parmquery' as cte_name, 'row_a' as corr_name_1, 'row_b' as corr_name_2 from dual)
select
'select * from (select ' || LISTAGG(cfcol || ' AS cf_' || trim (to_char (column_id, '000')) || '_' || column_name
, chr(13) || ', ') WITHIN GROUP (order by column_id)
|| chr(13) || ' from (select * from parmquery where row_number = 1) ' || corr_name_1
|| chr(13) || ', (select * from parmquery where row_number = 2) ' || corr_name_2
|| chr(13) || ') where ''DIFFERENT'' IN (' || LISTAGG ('cf_' || trim (to_char (column_id, '000')) || '_' || column_name, chr(13) || ', ') within group (order by column_id) || ')'
as cfsynt
from parms, (
select
'decode (' || corr_name_1 || '.' || column_name || ', ' || corr_name_2
|| '.' || column_name || ', ''SAME'', ''DIFFERENT'')'
as cfcol,
column_name,
column_id
from
parms,
all_tab_columns
where
owner = &&tab_owner and table_name = upper ('&&test_tab_name')
);
with parmquery as (select rownum as row_number, vals.* from (
select * from &&test_tab_name
where &&where_clause
order by &&order_by
) vals
) &&comparison_syntax
;

DB2 Dynamic View CREATE statement from SYS tables

Uncertain if this is even possible, but here goes...
In DB2 v9.7, I'm trying to create a view for every table in a schema.
The view should contain each column, but appends one more timestamp column.
i.e.
TABLE1 (COL1, COL2, COL3)
V_TABLE1 (COL1, COL2, COL3, INS_DTTM)
TABLE2 (COL1, COL4)
V_TABLE2 (COL1, COL4, INS_DTTM)
Here's what I've got so far
SELECT 'CREATE VIEW V_' || NAME || ' (' ||
(SELECT LISTAGG(NAME, ', ') FROM SYSIBM.SYSCOLUMNS WHERE TBCREATOR=CREATOR) || ')
AS SELECT ' || (SELECT LISTAGG(NAME, ', ')
FROM SYSIBM.SYSCOLUMNS WHERE TBCREATOR='SCHEMA') || ', CAST(NULL AS TIMESTAMP)
AS INS_DTTM FROM ' || CREATOR || '.' || NAME
FROM SYSIBM.SYSTABLES
WHERE CREATOR = 'SCHEMA' AND NAME LIKE 'T%' ORDER BY NAME;
It's my subselect that is causing me grief...
I get the following:
CREATE VIEW V_TABLE1 (COL1, COL2, COL3) AS SELECT COL1, COL2, COL3, CAST(NULL AS TIMESTAMP) AS INS_DTTM FROM SCHEMA.TABLE1
CREATE VIEW V_TABLE2 (COL1, COL2, COL3) AS SELECT COL1, COL2, COL3, CAST(NULL AS TIMESTAMP) AS INS_DTTM FROM SCHEMA.TABLE2
I'll apologize in advance if my syntax is off...I'm retyping across development environs and the internet...so I'm certain I've made mistakes...
I want the columns listed to relate to the table name of that row...but the LISTAGG appears for just the first row...
Hope this makes sense...and if anyone has any suggestions, please let me know.
Thanks
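One likely reading of the symptom is that the column list is not being computed per table: the subselect is not tied to the table name of the outer row. The sketch below shows the intended per-table behaviour. It uses SQLite rather than DB2, so `sqlite_master` and `PRAGMA table_info` are stand-ins for SYSIBM.SYSTABLES and SYSIBM.SYSCOLUMNS, and it is not DB2 syntax. It also assumes the view's column list should include INS_DTTM, so that it matches the number of selected columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table TABLE1 (COL1, COL2, COL3)")
conn.execute("create table TABLE2 (COL1, COL4)")

tables = [r[0] for r in conn.execute(
    "select name from sqlite_master "
    "where type = 'table' and name like 'T%' order by name"
)]

statements = []
for t in tables:
    # The column list is looked up for this specific table on every iteration.
    cols = ", ".join(r[1] for r in conn.execute(f"PRAGMA table_info({t})"))
    statements.append(
        f"CREATE VIEW V_{t} ({cols}, INS_DTTM) AS "
        f"SELECT {cols}, CAST(NULL AS TIMESTAMP) AS INS_DTTM FROM {t}"
    )

for s in statements:
    print(s)
    conn.execute(s)

view2_cols = [d[0] for d in conn.execute("select * from V_TABLE2").description]
print(view2_cols)
```

In a single SQL statement, the equivalent would be to correlate the column subselect with the table name of the outer row, not only with its schema.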

Oracle select selected columns from a table whose column names are available in another table

Let me give an example :
Suppose I have a table TableA(Col1, Col2, Col3, Col4, Col5)
I have another table, TableB, that holds the names of the TableA columns that need to be fetched, for example Col2 and Col5.
Now I want to write an SQL query that fetches only the columns of TableA that are listed in TableB.
Here is a start.
The idea is to build a concatenated list of column_names as a varchar
'col1, col2, col3, col4'
and to use it in a dynamic sql query.
declare
column_list xmltype;
column_names varchar(10000);
begin
SELECT
XMLAGG (XMLELEMENT (e, t1.column_name || ',')).EXTRACT ('//text()')
column_name
into column_list
FROM all_tab_cols t1
where t1.table_name = 'TABLEA'
and exists (select null
from TableB
where t1.column_name = <the field for the column_name in tableB>);
column_names := RTRIM(column_list.getClobVal(), ',');
--this will just display the sql query, you'll need to execute it to get your results with EXECUTE IMMEDIATE
dbms_output.put_line( 'SELECT '||column_names||' from TableA');
end;
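The same approach, carried through to actually executing the generated query, might look like this in client code. This is only a sketch: SQLite stands in for Oracle, `PRAGMA table_info` replaces `all_tab_cols`, and the TableB column name `col_name` is hypothetical, since the question does not say what that field is called.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table TableA (Col1, Col2, Col3, Col4, Col5)")
conn.execute("insert into TableA values (1, 2, 3, 4, 5)")
conn.execute("create table TableB (col_name text)")   # col_name is a made-up field name
conn.executemany("insert into TableB values (?)", [("Col2",), ("Col5",)])

actual = [r[1] for r in conn.execute("PRAGMA table_info(TableA)")]
wanted = {r[0] for r in conn.execute("select col_name from TableB")}

# Keep only names that really exist in TableA, in table order; this also
# stops arbitrary text stored in TableB from ending up in the statement.
cols = [c for c in actual if c in wanted]

sql = f"SELECT {', '.join(cols)} FROM TableA"
rows = conn.execute(sql).fetchall()
print(sql)
print(rows)
```

Checking the requested names against the table's real columns mirrors the `exists` filter in the PL/SQL block.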