I have a table containing hundreds of columns many of which are null, and I would like have my select statement so that only those columns containing a value are returned. It would help me analyze data better. Something like:
Select (non null columns) from tablename;
I want to select all columns which have at least one non-null value.
Can this be done?
Have a look as statistics information, it may be useful for you:
SQL> exec dbms_stats.gather_table_stats('SCOTT','EMP');
PL/SQL procedure successfully completed.
SQL> select num_rows from all_tables where owner='SCOTT' and table_name='EMP';
NUM_ROWS
----------
14
SQL> select column_name,nullable,num_distinct,num_nulls from all_tab_columns
2 where owner='SCOTT' and table_name='EMP' order by column_id;
COLUMN_NAME N NUM_DISTINCT NUM_NULLS
------------------------------ - ------------ ----------
EMPNO N 14 0
ENAME Y 14 0
JOB Y 5 0
MGR Y 6 1
HIREDATE Y 13 0
SAL Y 12 0
COMM Y 4 10
DEPTNO Y 3 0
8 rows selected.
For example you can check if NUM_NULLS = NUM_ROWS to identify "empty" columns.
Reference: ALL_TAB_COLUMNS, ALL_TABLES.
Use the below:
SELECT *
FROM information_schema.columns
WHERE table_name = 'Table_Name' and is_nullable = 'NO'
Table_Name has to be replaced accordingly...
select column_name
from user_tab_columns
where table_name='Table_name' and num_nulls=0;
Here is simple code to get non null columns..
I don't think this can be done in a single query. You may need some plsql to first test what columns contain data and put together a statement based on that information. Of course, if the data in your table changes you have to recreate the statement.
declare
l_table varchar2(30) := 'YOUR_TABLE';
l_statement varchar2(32767);
l_test_statement varchar2(32767);
l_contains_value pls_integer;
-- select column_names from your table
cursor c is
select column_name
,nullable
from user_tab_columns
where table_name = l_table;
begin
l_statement := 'select ';
for r in c
loop
-- If column is not nullable it will always contain a value
if r.nullable = 'N'
then
-- add column to select list.
l_statement := l_statement || r.column_name || ',';
else
-- check if there is a row that has a value for this column
begin
l_test_statement := 'select 1 from dual where exists (select 1 from ' || l_table || ' where ' ||
r.column_name || ' is not null)';
dbms_output.put_line(l_test_statement);
execute immediate l_test_statement
into l_contains_value;
-- Yes, add column to select list
l_statement := l_statement || r.column_name || ',';
exception
when no_data_found then
null;
end;
end if;
end loop;
-- create a select statement
l_statement := substr(l_statement, 1, length(l_statement) - 1) || ' from ' || l_table;
end;
select rtrim (xmlagg (xmlelement (e, column_name || ',')).extract ('//text()'), ',') col
from (select column_name
from user_tab_columns
where table_name='<table_name>' and low_value is not null)
This block determines all columns in the table, loops through them in dynamic SQL and checks if they are null, then constructs a DBMS output query of the non-null query.
All you have to do is run the returned query.
I've included the exclusion of PKs and BLOB columns.
Obviously, this is quite slow as going through columns one by one, and it's not going to be great for very hot tables, as data may change too quickly, but this works for me as I control traffic in dev env.
DECLARE
l_table_name VARCHAR2(255) := 'XXXX';
l_counter NUMBER;
l_sql CLOB;
BEGIN
FOR r_col IN (SELECT *
FROM user_tab_columns tab_col
WHERE table_name = l_table_name
AND data_type NOT IN ('BLOB')
AND column_name NOT IN (SELECT column_name
FROM user_cons_columns con_col
JOIN user_constraints cons ON con_col.constraint_name = cons.constraint_name AND con_col.table_name = cons.table_name
WHERE con_col.table_name = tab_col.table_name
AND constraint_type = 'P')
ORDER BY column_id)
LOOP
EXECUTE IMMEDIATE 'SELECT COUNT(1) FROM '||l_table_name||' WHERE '||r_col.column_name||' IS NOT NULL'
INTO l_counter;
IF l_counter > 0 THEN
IF l_sql IS NULL THEN
l_sql := r_col.column_name;
ELSE
l_sql := l_sql||','||r_col.column_name;
END IF;
END IF;
END LOOP;
l_sql := 'SELECT '||l_sql||CHR(10)
||'FROM '||l_table_name;
----------
DBMS_OUTPUT.put_line(l_sql);
END;
What you're asking to do is establish a dependency on each row in the whole result. This is in fact not ever what you want. Just think of the ramifications if in one row every column had a value of '0' -- suddenly the schema of your result set grows to include all of those previously "empty" columns. You're effectively growing the badness of '*' exponentially, now your result set is not dependent on just the table's meta-data -- but your whole result set is dependent on the plain data.
What you want to do is just select the fields that have what you want, and not deviate from this simple plan.
Related
I need to know what columns of one table have only null values. I understand that I should do a loop in user_tab_columns. But how detect only columns with null value?
Thanks and sorry for my English
To perform a query where you don't know the column identifies in advance, you need to use dynamic SQL. Assuming you already know the table is not empty, you could do something like:
declare
l_count pls_integer;
begin
for r in (
select table_name, column_name
from user_tab_columns
where table_name = 'T42'
and nullable = 'Y'
)
loop
execute immediate 'select count(*) '
|| ' from "' || r.table_name || '"'
|| ' where "' || r.column_name || '" is not null'
into l_count;
if l_count = 0 then
dbms_output.put_line('Table ' || r.table_name
|| ' column ' || r.column_name || ' only has nulls');
end if;
end loop;
end;
/
Remember to set serveroutput on or your client's equivalent before executing.
The cursor gets the columns from the table which are declared as nullable (if they aren't, not much point checking them; though this won't catch explicit check constraints). For each column it builds a query to count the rows where that column is not null. If that count is zero then it didn't find any that are not null, therefore they all are. Again, assuming you know the table isn't empty before you start.
I've included the table name in the cursor select list and references so you only need to change the name in one place to search a different table, or you could use a variable for that name. Or check multiple tables at once by changing that filter.
You may get better performance by selecting a dummy value from any non-null row, with a rownum stop check - which means it will stop as soon as it finds a non-null value, rather than having to check every row to get an actual count:
declare
l_flag pls_integer;
begin
for r in (
select table_name, column_name
from user_tab_columns
where table_name = 'T42'
and nullable = 'Y'
)
loop
begin -- inner block to allow exception trapping within loop
execute immediate 'select 42 '
|| ' from "' || r.table_name || '"'
|| ' where "' || r.column_name || '" is not null'
|| ' and rownum < 2'
into l_flag;
-- if this foudn anything there is a non-null value
exception
when no_data_found then
dbms_output.put_line('Table ' || r.table_name
|| ' column ' || r.column_name || ' only has nulls');
end;
end loop;
end;
/
or you could do something similar with an exists() check.
If you don't know that the table has data then you can do a simple count(*) from the table before the loop to check if it is empty, and report that instead:
...
begin
if l_count = 0 then
dbms_output.put_line('Table is empty');
return;
end if;
...
Or you could combine it with the cursor query, but this would need some work if you wanted to check multiple tables at once as it would stop as soon as it found any empty one (have to leave you something to do... *8-)
declare
l_count_any pls_integer;
l_count_not_null pls_integer;
begin
for r in (
select table_name, column_name
from user_tab_columns
where table_name = 'T42'
and nullable = 'Y'
)
loop
execute immediate 'select count(*),'
|| ' count(case when "' || r.column_name || '" is not null then 1 end)'
|| ' from "' || r.table_name || '"'
into l_count_any, l_count_not_null;
if l_count_any = 0 then
dbms_output.put_line('Table ' || r.table_name || ' is empty');
exit; -- only report once
elsif l_count_not_null = 0 then
dbms_output.put_line('Table ' || r.table_name
|| ' column ' || r.column_name || ' only has nulls');
end if;
end loop;
end;
/
You could of course populate a collection or make it a pipelined function or whatever if you didn't want to reply on dbms_output, but I assume this is a one-off check so it is probably acceptable.
You can loop though your columns and count null rows. If it is same with your table count, then that column has only null values.
The first question is: one column with zero row could be regarded as only (null) value containing column. But it can remain your decision: the scripts below provide solutions to both ways. (In my opinion: no. The empty columns is not a column with only (null) value)
If you want to know the (null) values about one table you can it with count(column):
select count(column) from table
and when the count(column) = 0 then the column has only (null) value or has no value. (So, you cannot make a correct decision).
E.g. The following three tables (x, y and z) has the following columns:
select * from x;
N_X M_X
---------------
100 (null)
200 (null)
300 (null)
select * from y;
N_Y M_Y
---------------
101 (null)
202 (null)
303 apple
select * from z;
N_Z M_Z
---------------
The count() selects:
select count(n_x), count(m_x) from x;
COUNT(N_X) COUNT(M_X)
-----------------------
3 0
select count(n_y), count(m_y) from y;
COUNT(N_Y) COUNT(M_Y)
-----------------------
3 1
select count(n_z), count(m_Z) from z;
COUNT(N_Z) COUNT(M_Z)
-----------------------
0 0
As you can see, the difference between x and y is appears but you cannot decide that the table z has no rows or only full of (null) values.
The general solution:
I have separeted the schema and the db level but the basic idea is common:
Schema level: the current user’s table
DB level: all users or a chosen schema
The number of (null) in one columns:
all_tab_columns.num_nulls
(Or: user_tab_columns, num_nulls).
And we need the num_rows of the table:
all_all_tables.num_rows
(Or: user_all_tables.num_rows)
Where the num_nulls equals to num_rows there are only (null) values.
Firstly, you need to run the DBMS_STATS for refresh the statistics.
on database level:
exec DBMS_STATS.GATHER_DATABASE_STATS;
(it can use a lot of resources)
on schema level:
EXEC DBMS_STATS.gather_schema_stats('TRANEE',DBMS_STATS.AUTO_SAMPLE_SIZE); (owner = tranee)
-- column with zero row = column has only (null) values -> exclude num_nulls > 0 condition
-- column with zero row <> column has only (null) values -> include num_nulls > 0 condition
the scripts:
-- 1. current user
select
a.table_name,
a.column_name,
a.num_nulls,
b.num_rows
from user_tab_columns a, user_all_tables b
where a.table_name = b.table_name
and num_nulls = num_rows
and num_nulls > 0;
-- 2. chosen user / all user -> exclude the a.owner = 'TRANEE' condition
select
a.owner,
a.table_name,
a.column_name,
a.num_nulls,
b.num_rows
from all_tab_columns a, all_all_tables b
where a.owner = b.owner
and a.table_name = b.table_name
and a.owner = 'TRANEE'
and num_nulls = num_rows
and num_nulls > 0;
TABLE_NAME COLUMN_NAME NUM_NULLS NUM_ROWS
----------------------------------------------------
LEADERS COMM 4 4
EMP_ACTION ACTION 12 12
X M_X 3 3
These tables and columns have only (null) values in tranee schema.
I'm trying to find the proper query to :
Get all the column names of a table on which there is at least one row with a non null value for a specific query.
Meaning : I will see which columns have at least one value set in the record returned by my given query.
I hope I'm clear enough.
I think you need something as the following:
SELECT CASE WHEN MAX(col1) IS NOT NULL THEN 'COL1' END ||','
|| CASE WHEN MAX(col2) IS NOT NULL THEN 'COL2' END ||','
...
FROM T
Then use REGEXP_REPLACE to replace duplicated ,. You can use user_tab_columns to generate this query dynamically as mentioned.
Please check below anonymous block for your case , tablename should be given at 3 places in below query
Declare
v_columnlist varchar2(32627);
v_noofcolumns number;
Columnnm varchar2(1000);
Finalcolumns varchar2(32627);
v_count number:=0;
plsql_block varchar2(32627);
Begin
select LISTAGG(column_name,',') within group (order by column_id),max(column_id)
into v_columnlist , v_noofcolumns
from user_tab_columns
where table_name='tablename';
FOR Lcntr IN 1..v_noofcolumns
LOOP
select REGEXP_SUBSTR(v_columnlist,'[^,]+',1,Lcntr)
into Columnnm
from dual;
plsql_block := 'select count(*) from tablename where '|| Columnnm || ' is not null ';
EXECUTE IMMEDIATE plsql_block into v_count;
IF v_count > 1 THEN
Finalcolumns:= LTRIM(Finalcolumns ||','||Columnnm,',');
END IF;
END LOOP;
plsql_block := 'select ' || Finalcolumns ||' from tablename ';
DBMS_OUTPUT.PUT_LINE(plsql_block);
END;
I have table like this:
Table-1
Table-2
Table-3
Table-4
Table-5
each table is having many columns and one of the column name is employee_id.
Now, I want to write a query which will
1) return all the tables which is having this columns and
2) results should show the tables if the column is having values or empty values by passing employee_id.
e.g. show table name, column name from Table-1, Table-2,Table-3,... where employee_id='1234'.
If one of the table doesn't have this column, then it is not required to show.
I have verified with link, but it shows only table name and column name and not by passing some column values to it.
Also verified this, but here verifies from entire schema which I dont want to do it.
UPDATE:
Found a solution, but by using xmlsequence which is deprecated,
1)how do I make this code as xmltable?
2) If there are no values in the table, then output should have empty/null. or default as "YES" value
WITH char_cols AS
(SELECT /*+materialize */ table_name, column_name
FROM cols
WHERE data_type IN ('CHAR', 'VARCHAR2') and table_name in ('Table-1','Table-2','Table-3','Table-4','Table-5'))
SELECT DISTINCT SUBSTR (:val, 1, 11) "Employee_ID",
SUBSTR (table_name, 1, 14) "Table",
SUBSTR (column_name, 1, 14) "Column"
FROM char_cols,
TABLE (xmlsequence (dbms_xmlgen.getxmltype ('select "'
|| column_name
|| '" from "'
|| table_name
|| '" where upper("'
|| column_name
|| '") like upper(''%'
|| :val
|| '%'')' ).extract ('ROWSET/ROW/*') ) ) t ORDER BY "Table"
/
This query can be done in one step using the (non-deprecated) XMLTABLE.
Sample Schema
--Table-1 and Table-2 match the criteria.
--Table-3 has the right column but not the right value.
--Table-4 does not have the right column.
create table "Table-1" as select '1234' employee_id from dual;
create table "Table-2" as select '1234' employee_id from dual;
create table "Table-3" as select '4321' employee_id from dual;
create table "Table-4" as select 1 id from dual;
Query
--All tables with the column EMPLOYEE_ID, and the number of rows where EMPLOYEE_ID = '1234'.
select table_name, total
from
(
--Get XML results of dynamic query on relevant tables and columns.
select
dbms_xmlgen.getXMLType(
(
--Create a SELECT statement on each table, UNION ALL'ed together.
select listagg(
'select '''||table_name||''' table_name, count(*) total
from "'||table_name||'" where employee_id = ''1234'''
,' union all'||chr(10)) within group (order by table_name) v_sql
from user_tab_columns
where column_name = 'EMPLOYEE_ID'
)
) xml
from dual
) x
cross join
--Convert the XML data to relational.
xmltable('/ROWSET/ROW'
passing x.xml
columns
table_name varchar2(128) path 'TABLE_NAME',
total number path 'TOTAL'
);
Results
TABLE_NAME TOTAL
---------- -----
Table-1 1
Table-2 1
Table-3 0
Just try to use code below.
Pay your attention that may be nessecery clarify scheme name in loop.
This code works for my local db.
set serveroutput on;
DECLARE
ex_query VARCHAR(300);
num NUMBER;
emp_id number;
BEGIN
emp_id := <put your value>;
FOR rec IN
(SELECT table_name
FROM all_tab_columns
WHERE column_name LIKE upper('employee_id')
)
LOOP
num :=0;
ex_query := 'select count(*) from ' || rec.table_name || ' where employee_id = ' || emp_id;
EXECUTE IMMEDIATE ex_query into num;
if (num>0) then
DBMS_OUTPUT.PUT_LINE(rec.table_name);
end if;
END LOOP;
END;
I tried with the xml thing, but I get an error I cannot solve. Something about a zero size result. How difficult is it to solve this instead of raising exception?! Ask Oracle.
Anyway.
What you can do is use the COLS table to know what table has the employee_id column.
1) what table from table TABLE_LIKE_THIS (I assume column with table names is C) has this column?
select *
from COLS, TABLE_LIKE_THIS t
where cols.table_name = t
and cols.column_name = 'EMPLOYEE_ID'
-- think Oracle metadata/ think upper case
2) Which one has the value you are looking for: write a little chunk of Dynamic PL/SQL with EXECUTE IMMEDIATE to count the tables matching above condition
declare
v_id varchar2(10) := 'JP1829'; -- value you are looking for
v_col varchar2(20) := 'EMPLOYEE_ID'; -- column
n_c number := 0;
begin
for x in (
select table_name
from all_tab_columns cols
, TABLE_LIKE_THIS t
where cols.table_name = t.c
and cols.column_name = v_col
) loop
EXECUTE IMMEDIATE
'select count(1) from '||x.table_name
||' where Nvl('||v_col||', ''##'') = ''' ||v_id||'''' -- adding quotes around string is a little specific
INTO n_c;
if n_c > 0 then
dbms_output.put_line(n_C|| ' in ' ||x.table_name||' has '||v_col||'='||v_id);
end if;
-- idem for null values
-- ... ||' where '||v_col||' is null '
-- or
-- ... ||' where Nvl('||v_col||', ''##'') = ''##'' '
end loop;
dbms_output.put_line('done.');
end;
/
Hope this helps
I am working on a system where there are 50/60 tables. Each has the same unique key (call it MEMBID for this example)
Is there a query I can run that will show me the names of all tables that have at least one row where the MEMBID exists?
Or do I need to cursor through the USER_TABLES table and then build a dynamic query to build up an "array"?
Many thanks
Mike
I'd go for dynamic SQL - that's pretty straightforward:
declare
l_cnt_membid number;
l_cnt_overall number;
begin
for cur in (select table_name from user_tab_cols where column_name = 'MEMBID')
loop
execute immediate 'select count(*), count(membid) from ' || cur.table_name
into l_cnt_overall, l_cnt_membid;
dbms_output.put_line(cur.table_name || ', overall: ' || l_cnt_overall ||
', membid: ' || l_cnt_membid);
end loop;
end;
EDIT:
If your table statistics are up-to-date, you can obtain this information from user_tab_cols directly:
select table_name,
(case when num_distinct > 0
then 'YES'
else 'NO' end) has_nonnull_membid
from user_tab_cols
where column_name = 'MEMBID'
I have written 2 separate queries
1)
SELECT COLUMN_NAME
FROM ALL_TAB_COLUMNS
WHERE TABLE_NAME =
(SELECT DISTINCT UT.TABLE_NAME
FROM USER_TABLES UT
WHERE UT.TABLE_NAME = 'MY_TABLE')
AND COLUMN_NAME NOT IN ('AVOID_COLUMN')
2)
SELECT *
FROM MY_TABLE MT
WHERE MT.COL1 = '1'
The 1st query returns the names of all the columns except the one I want to avoid. The 2nd one returns data of all the columns from the table.
Is there some way to merge these queries so that only those column's data is selected from the 2nd query, which are returned from the 1st query?
Thanks in advance
You'll have to use dynamic SQL for this (BTW, I got rid of the subselect for the USER_TABLES query - it's unnecessary):
var cur refcursor
/
declare
v_stmt varchar2(4000);
begin
v_stmt := 'SELECT ';
for cur in (
SELECT COLUMN_NAME
FROM ALL_TAB_COLUMNS
WHERE TABLE_NAME =
'MY_TABLE'
AND COLUMN_NAME NOT IN ('AVOID_COLUMN')
)
loop
v_stmt := v_stmt || cur.column_name || ',';
end loop;
-- get rid of trailing ','
v_stmt := regexp_replace(v_stmt, ',$', '');
v_stmt := v_stmt || ' from my_table MT WHERE MT.COL1 = ''1''';
dbms_output.put_line(v_stmt);
open :cur for v_stmt;
end;