Counting nulls for each column in table - sql

We want to count how many nulls each column in a table has. There are too many columns to do this one by one, so the following PLSQL procedure was created.
In the first part of the procedure, all column names are obtained. This works, as the dbms_output correctly lists them all.
Secondly, a query inserts the count of null values in the variable 'nullscount'. This part does not work, as the output printed for this variable is always 0, even for columns where we know there are nulls.
Does anyone know how to handle the second part correctly?
Many thanks.
CREATE OR REPLACE PROCEDURE COUNTNULLS AS
nullscount int;
BEGIN
for c in (select column_name from all_tab_columns where table_name = upper('gp'))
loop
select count(*) into nullscount from gp where c.column_name is null;
dbms_output.put_line(c.column_name||' '||nullscount);
end loop;
END COUNTNULLS;

You can get it with just one query like this: this query scans table just once:
DBFiddle: https://dbfiddle.uk/asgrCezT
select *
from xmltable(
'/ROWSET/ROW/*'
passing
dbms_xmlgen.getxmltype(
(
select
'select '
||listagg('count(*)-count("'||column_name||'") as "'||column_name||'"',',')
||' from '||upper('gp')
from user_tab_columns
where table_name = upper('gp')
)
)
columns
column_name varchar2(30) path './name()',
cnt_nulls int path '.'
);
Results:
COLUMN_NAME CNT_NULLS
------------------------------ ----------
A 5
B 4
C 3
Dynamic sql in this query uses (24 chars + column name length) so it should work fine for example for 117 columns with average column name length = 10. If you need more, you can rewrite it a bit, for example:
select *
from xmltable(
'let $cnt := /ROWSET/ROW/CNT
for $r in /ROWSET/ROW/*[name() != "CNT"]
return <R name="{$r/name()}"> {$cnt - $r} </R>'
passing
dbms_xmlgen.getxmltype(
(
select
'select count(*) CNT,'
||listagg('count("'||column_name||'") as "'||column_name||'"',',')
||' from '||upper('gp')
from user_tab_columns
where table_name = upper('gp')
)
)
columns
column_name varchar2(30) path '#name',
cnt_nulls int path '.'
);

create table gp (
id number generated by default on null as identity
constraint gp_pk primary key,
c1 number,
c2 number,
c3 number,
c4 number,
c5 number
)
;
-- add some data with NULLS and numbers
DECLARE
BEGIN
FOR r IN 1 .. 20 LOOP
INSERT INTO gp (c1,c2,c3,c4,c5) VALUES
(CASE WHEN mod(r,2) = 0 THEN NULL ELSE mod(r,2) END
,CASE WHEN mod(r,3) = 0 THEN NULL ELSE mod(r,3) END
,CASE WHEN mod(r,4) = 0 THEN NULL ELSE mod(r,4) END
,CASE WHEN mod(r,5) = 0 THEN NULL ELSE mod(r,5) END
,5);
END LOOP;
END;
/
-- check what is in the table
SELECT * FROM gp;
-- do count of each column
DECLARE
l_colcount NUMBER;
l_statement VARCHAR2(100) := 'SELECT COUNT(*) FROM $TABLE_NAME$ WHERE $COLUMN_NAME$ IS NULL';
BEGIN
FOR r IN (SELECT column_name,table_name FROM user_tab_columns WHERE table_name = 'GP') LOOP
EXECUTE IMMEDIATE REPLACE(REPLACE(l_statement,'$TABLE_NAME$',r.table_name),'$COLUMN_NAME$',r.column_name) INTO l_colcount;
dbms_output.put_line('Table: '||r.table_name||', column'||r.column_name||', COUNT: '||l_colcount);
END LOOP;
END;
/
Table created.
Statement processed.
Result Set 4
ID C1 C2 C3 C4 C5
1 1 1 1 1 5
2 - 2 2 2 5
3 1 - 3 3 5
4 - 1 - 4 5
5 1 2 1 - 5
6 - - 2 1 5
7 1 1 3 2 5
8 - 2 - 3 5
9 1 - 1 4 5
10 - 1 2 - 5
11 1 2 3 1 5
12 - - - 2 5
13 1 1 1 3 5
14 - 2 2 4 5
15 1 - 3 - 5
16 - 1 - 1 5
17 1 2 1 2 5
18 - - 2 3 5
19 1 1 3 4 5
20 - 2 - - 5
20 rows selected.
Statement processed.
Table: GP, columnID, COUNT: 0
Table: GP, columnC1, COUNT: 10
Table: GP, columnC2, COUNT: 6
Table: GP, columnC3, COUNT: 5
Table: GP, columnC4, COUNT: 4
Table: GP, columnC5, COUNT: 0

c.column_name is never null because it's the content of the column "column_name" of the table "all_tab_columns"
not the column of which name is the value of c.column_name, in table gp.
You have to use dynamic query and EXECUTE IMMEDIATE to achieve what you want.

Related

SQL: Create Trailer Count

I have created a query that will output a flat file with header and details.
Now, I want to add a trailer record that will contain the total count of the detail records.
I have correctly counted the total of records using row_number, but it displays the every record.
How can I get the last line so that it will reflect the total count in the trailer line.
This is the code I already created for the headers and detail.
SQL> SELECT filerec FROM (
2 SELECT 'FILENAME' AS filerec, 1 col FROM dual
3 UNION ALL
4 SELECT 'FILEDATE: ' || to_char(SYSDATE,'mm/dd/yyyy') as filerec, 2 col FROM dual
5 UNION ALL
6 SELECT empno || ename AS filerec, NULL col FROM emp
7 ORDER BY 2,1
8 );
This is the output I want to get. (added the last rec, 'TRAILER: 0004')
FILENAME
FILEDATE: 02/27/2015
7369SMITH
7499ALLEN
7521WARD
7566JONES
TRAILER: 0004
There are several ways to do this. I'd prefer to use grouping. See how it can be used. Let we have a recordset:
SQL> select
2 'Row ' || rownum
3 from
4 dual
5 connect by
6 level <= 5;
'ROW'||ROWNUM
--------------------------------------------
Row 1
Row 2
Row 3
Row 4
Row 5
Now we wish to add counting:
SQL> select
2 case
3 when grouping_id(rownum) = 0 then 'Row ' || rownum
4 else 'Total: ' || count(*) || ' row(s)'
5 end
6 from
7 dual
8 connect by
9 level <= 5
10 group by rollup (rownum);
CASEWHENGROUPING_ID(ROWNUM)=0T
------------------------------------------------------
Row 1
Row 2
Row 3
Row 4
Row 5
Total: 5 row(s)
6 rows selected

SQL - How to select a column alias from another table?

I have a table in SQL with data and another table that holds the alias for that column. It is used for translation purposes.
I was wondering how can I do a select on those columns but retrieve the alias from another table?
This is the table that holds the real column names:
ID PageID ColName Order Type Width IsDeleted
1 7 CustType 2 NULL NULL 0
2 7 Description 3 NULL NULL 0
3 7 ApplyVAT 4 NULL NULL 0
4 7 ProduceInvoices 5 NULL NULL 0
5 7 PurchaseSale 6 NULL NULL 0
6 7 TermsDays 7 NULL NULL 0
7 7 DateTimeLastUpdated 8 NULL NULL 0
This is the table that holds the alias (text):
ID ColID UserID Text Order Enabled?
50 22 1 id 1 1
51 1 1 CustTypes 2 1
52 2 1 Description 3 1
53 3 1 ApplyVAT NULL 0
54 4 1 ProduceInvoices NULL 0
55 5 1 PurchaseSale NULL 0
56 6 1 TermsDays NULL 0
57 7 1 DateTimeLastUpdated NULL 0
I believe you will need to use dynamic sql to do this, e.g.:
DECLARE #Sql NVARCHAR(MAX);
SELECT TOP 1 #Sql = 'SELECT dt.ID as ' + at.IDAlias + ', dt.Town as ' + at.TownAlias
+ ' FROM DataTable dt'
FROM AliasTable at
WHERE at.LanguageID = 2;
EXEC(#Sql)
Given the example of Data Table
CREATE TABLE DataTable
(
ID INT,
Town NVARCHAR(50)
);
And a table holding language - dependent aliases for the columns in the above:
CREATE TABLE AliasTable
(
LanguageId INT,
IDAlias NVARCHAR(100),
TownAlias NVARCHAR(100)
);
SqlFiddle here
One of the (many) caveats with dynamic Sql is you will need to ensure that the alias data is validated against Sql Injectin attacks.

Calculation of 4 fields with another row

Table1 has 6 columns that code, code1, %ofcode1, calc, code2, %ofcode2.
code code1 %ofcode1 calc code2 %ofcode2
1 a 20 + b 10
2 1 - c
3 2 10 * d 10
Table2 has 2 columns that field, value.
field value
a 50
b 20
c 10
d 20
I need final calculation value using function
calculation might be like tis using table1 format and getting values from table2.
50*20/100 + 20*10/100
12 - 10
2*10/100 * 20*10/100 = 0.4
I need value that 0.4
You can try implementing something along these lines.
DECLARE
var_a VARCHAR2(50);
int_a NUMBER;
BEGIN
var_a := 'a'; -- SELECT code1 FROM Table1 WHERE code="input";
IF REGEXP_LIKE(var_a, '^\d+(\.\d+)?$')
THEN int_a := to_number(var_a, '9999.99');
ELSE int_a := (-16);-- SELECT value FROM Table2 WHERE field=var_a
END IF;
dbms_output.put_line('first int is: ' || int_a);
END;
/

Table transformation / field parsing in PL/SQL

I have de-normalized table, something like
CODES
ID | VALUE
10 | A,B,C
11 | A,B
12 | A,B,C,D,E,F
13 | R,T,D,W,W,W,W,W,S,S
The job is to convert is where each token from VALUE will generate new row. Example:
CODES_TRANS
ID | VALUE_TRANS
10 | A
10 | B
10 | C
11 | A
11 | B
What is the best way to do it in PL/SQL without usage of custom pl/sql packages, ideally with pure SQL?
Obvious solution is to implement it via cursors. Any ideas?
Another alternative is to use the model clause:
SQL> select id
2 , value
3 from codes
4 model
5 return updated rows
6 partition by (id)
7 dimension by (-1 i)
8 measures (value)
9 ( value[for i from 0 to length(value[-1])-length(replace(value[-1],',')) increment 1]
10 = regexp_substr(value[-1],'[^,]+',1,cv(i)+1)
11 )
12 order by id
13 , i
14 /
ID VALUE
---------- -------------------
10 A
10 B
10 C
11 A
11 B
12 A
12 B
12 C
12 D
12 E
12 F
13 R
13 T
13 D
13 W
13 W
13 W
13 W
13 W
13 S
13 S
21 rows selected.
I have written up to 6 alternatives for this type of query in this blogpost: http://rwijk.blogspot.com/2007/11/interval-based-row-generation.html
Regards,
Rob.
I have a pure SQL solution for you.
I adapted a trick I found on an old Ask Tom site, posted by Mihail Bratu. My adaptation uses regex to tokenise the VALUE column, so it requires 10g or higher.
The test data.
SQL> select * from t34
2 /
ID VALUE
---------- -------------------------
10 A,B,C
11 A,B
12 A,B,C,D,E,F
13 R,T,D,W1,W2,W3,W4,W5,S,S
SQL>
The query:
SQL> select t34.id
2 , t.column_value value
3 from t34
4 , table(cast(multiset(
5 select regexp_substr (t34.value, '[^(,)]+', 1, level)
6 from dual
7 connect by level <= length(value)
8 ) as sys.dbms_debug_vc2coll )) t
9 where t.column_value != ','
10 /
ID VALUE
---------- -------------------------
10 A
10 B
10 C
11 A
11 B
12 A
12 B
12 C
12 D
12 E
12 F
13 R
13 T
13 D
13 W1
13 W2
13 W3
13 W4
13 W5
13 S
13 S
21 rows selected.
SQL>
Based on Celko's book, here is what I found and it's working well!
SELECT
TABLE1.ID
, MAX(SEQ1.SEQ) AS START_POS
, SEQ2.SEQ AS END_POS
, COUNT(SEQ2.SEQ) AS PLACE
FROM
TABLE1, V_SEQ SEQ1, V_SEQ SEQ2
WHERE
SUBSTR(',' || TABLE1.VALUE || ',', SEQ1.SEQ, 1) = ','
AND SUBSTR(',' || TABLE1.VALUE || ',', SEQ2.SEQ, 1) = ','
AND SEQ1.SEQ < SEQ2.SEQ
AND SEQ2.SEQ <= LENGTH(TABLE1.VALUE)
GROUP BY TABLE1.ID, TABLE1.VALUE, SEQ2.SEQ
Where V_SEQ is a static table with one field:
SEQ, integer values 1 through N, where N >= MAX_LENGTH(VALUE).
This is based on the fact the the VALUE is wrapped by ',' on both ends, like this:
,A,B,C,D,
If your tokens are fixed length (like in my case) I simply used PLACE field to calculate the actual string. If variable length, use start_pos and end_pos
So, in my case, tokens are 2 char long, so the final SQL is:
SELECT
TABLE1.ID
, SUBSTR(TABLE1.VALUE, T_SUB.PLACE * 3 - 2 , 2 ) AS SINGLE_VAL
FROM
(
SELECT
TABLE1.ID
, MAX(SEQ1.SEQ) AS START_POS
, SEQ2.SEQ AS END_POS
, COUNT(SEQ2.SEQ) AS PLACE
FROM
TABLE1, V_SEQ SEQ1, V_SEQ SEQ2
WHERE
SUBSTR(',' || TABLE1.VALUE || ',', SEQ1.SEQ, 1) = ','
AND SUBSTR(',' || TABLE1.VALUE || ',', SEQ2.SEQ, 1) = ','
AND SEQ1.SEQ < SEQ2.SEQ
AND SEQ2.SEQ <= LENGTH(TABLE1.VALUE)
GROUP BY TABLE1.ID, TABLE1.VALUE, SEQ2.SEQ
) T_SUB
INNER JOIN
TABLE1 ON TABLE1.ID = T_SUB.ID
ORDER BY TABLE1.ID, T_SUB.PLACE
Original Answer
In SQL Server TSQL we parse strings and make a table object. Here is sample code - maybe you can translate it.
http://rbgupta.blogspot.com/2007/10/tsql-parsing-delimited-string-into.html
Second Option
Count the number of commas per row. Get the Max number of commas. Let's say that in the entire table you have a row with 5 commas max. Build a SELECT with 5 substrings. This will make it a set based operation and should be much faster than a rbar.

UPDATE with SUM() in MySQL

My table:
ID NAME COST PAR P_val S_val
1 X 5 0 1 0
1 y 5 0 2 0
1 z 5 0 0 5
2 XY 4 0 4 4
I need to update the PAR field with the SUM(S_val), grouped by ID:
For ID 1 PAR should be SUM(SVAL) WHERE ID=1
For ID 2 PAR should be SUM(SVAL) WHERE ID=2
Expected ouput:
ID NAME COST PAR P_val S_val
1 X 5 5 1 0
1 y 5 5 2 0
1 z 5 5 0 5
2 XY 4 4 4 4
How can I UPDATE the PAR value?
My code:
UPDATE Table_Name SET PAR = (SELECT SUM(S_val) FROM Table_Name WHERE ID=1)
FROM Table_Name
This does not work.
Unfortunately, you cannot update a table joined with itself in MySQL.
You'll need to create a function as a workaround:
DELIMITER $$
CREATE FUNCTION `fn_get_sum`(_id INT) RETURNS int(11)
READS SQL DATA
BEGIN
DECLARE r INT;
SELECT SUM(s_val)
INTO r
FROM table_name
WHERE id = _id;
RETURN r;
END $$
DELIMITER ;
UPDATE table_name
SET par = fn_get_sum(id)
Try:
UPDATE Table_NAme SET PAR= summedValue
FROM TAble_NAME t
JOIN (
SELECT ID, SUM(S_val) as summedvalue
FROM TABLE_NAME GROUP BY ID
) s on t.ID = s.ID
UPDATE Table_Name SET PAR = (SELECT SUM(S_val) FROM Table_Name WHERE ID=1)
FROM Table_Name
Check writing. delete "FROM Table_Name" row.
TRUE command is:
UPDATE Table_Name SET PAR = (SELECT SUM(S_val) FROM Table_Name WHERE ID=1)