Null, Empty and Filled count in a PostgreSQL table

I need to count how many values are null, empty and filled in each column of a table, where each column is a field. For example:
select COUNT(1),
case when table.name is null
then 'Null'
when table.name = ''
then 'Empty'
else 'Filled' end
from table
(this is an example)
FIELD   NULL   EMPTY   FILLED
name    0      2       98
age     10     10      80
heigh   0      50      50
Does anyone have any idea how I can do this? This table has about 30 columns.

You can convert each row to a JSON value and then group by the keys of those JSON values (which are the column names):
select d.col,
count(*) filter (where value is null) as null_count,
count(*) filter (where value is not null) as not_null_count,
count(*) filter (where value = '') as empty
from the_table t
cross join jsonb_each_text(to_jsonb(t)) as d(col, value)
group by d.col;
Note that this is going to be much slower than manually listing every column like you did.
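For comparison, here is a minimal sketch of the "manually listing every column" variant, with the per-column SELECTs generated from a list instead of typed by hand. It runs against an in-memory SQLite database through Python's sqlite3 module only so that it is self-contained; the table contents and the `columns` list are made up for illustration. In PostgreSQL a non-text column would probably need a cast to text before it can be compared with ''.

```python
import sqlite3

# Hypothetical sample data; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?)",
    [("Ann", "Oslo"), ("", None), (None, "Rome"), ("Bob", ""), ("Cy", None)],
)

# Trusted identifiers only: they are interpolated into the SQL text.
columns = ["name", "city"]

# One SELECT per column, glued together with UNION ALL.
parts = [
    f"SELECT '{c}' AS field, "
    f"SUM(CASE WHEN {c} IS NULL THEN 1 ELSE 0 END) AS null_count, "
    f"SUM(CASE WHEN {c} = '' THEN 1 ELSE 0 END) AS empty_count, "
    f"SUM(CASE WHEN {c} <> '' THEN 1 ELSE 0 END) AS filled_count "
    f"FROM the_table"
    for c in columns
]
field_counts = conn.execute(" UNION ALL ".join(parts)).fetchall()
for row in field_counts:
    print(row)
```

Each generated SELECT scans the table once, so for about 30 columns it may be worth folding all the SUM(CASE ...) expressions into a single SELECT instead.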

Related

PostgreSQL return tuples where another column is not null

Let’s say I have a table
Name age
A Null
B Null
B 7
C 9
C 8
How can I write a sql query to return
Name
C
Meaning that only names where there is no null value in age are returned? Specifically using Postgres
Thoughts so far:
I think select name from table where age is not null returns B and C, because B has one age that isn't null. So I thought about grouping by name, but aggregation seems to remove nulls. Any help appreciated!
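Depending on which DBMS you are using, you can use ISNULL. The form below works in MySQL, where ISNULL(age) returns 1 for NULL and 0 otherwise; PostgreSQL has no such function: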
SELECT name FROM table
GROUP BY name
HAVING SUM(ISNULL(age)) = 0
Do a GROUP BY. COUNT(age) counts non-null values. COUNT(*) counts all rows.
SELECT name
FROM table
GROUP BY name
HAVING COUNT(age) = COUNT(*)
Or do an EXCEPT query:
SELECT name FROM table
EXCEPT
SELECT name FROM table WHERE age IS NULL
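As a quick check of the last two answers, the sketch below loads the sample rows from the question into an in-memory SQLite database (used here only to keep the example self-contained) and runs both queries. The table name `people` is an assumption, chosen because `table` is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("A", None), ("B", None), ("B", 7), ("C", 9), ("C", 8)],
)

# COUNT(age) skips NULLs, COUNT(*) does not; equal counts mean no NULL ages.
group_names = [r[0] for r in conn.execute(
    "SELECT name FROM people GROUP BY name "
    "HAVING COUNT(age) = COUNT(*) ORDER BY name"
)]

# Set difference: all names minus the names that have at least one NULL age.
except_names = [r[0] for r in conn.execute(
    "SELECT name FROM people EXCEPT "
    "SELECT name FROM people WHERE age IS NULL"
)]

print(group_names, except_names)
```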

SQL select COUNT issue

I have a table
num
----
NULL
NULL
NULL
NULL
55
NULL
NULL
NULL
99
when I wrote
select COUNT(*)
from tbl
where num is null
the output was 7
but when I wrote
select COUNT(num)
from tbl
where num is null
the output was 0
What's the difference between these two queries?
The difference is in what you count.
COUNT(*) counts every row returned, so rows where num is NULL are included.
COUNT(num) counts only the rows where num is not NULL.
That is standard SQL behavior, whatever the DBMS used.
Source. look at COUNT(DISTINCT expr,[expr...])
count(*) returns number of rows, count(num) returns number of rows where num is not null. Change your last query to select count(*) from test where num is null to get the result you expect.
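In the second case the WHERE clause keeps only the rows where num is NULL, and COUNT(num) then skips every one of them because it ignores NULL values. In the first case COUNT(*) counts rows, so the rows with NULL are not eliminated.
If you are counting on a column which contains NULLs and you want the rows with NULL to be included in the count, then use (SQL Server syntax; COALESCE(col, 0) is the portable equivalent)
Count(ISNULL(col,0))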
Count(*) counts the number of rows, COUNT(num) counts the number of not-null values in column num.
Given the table above, COUNT(num) without the WHERE clause returns 2.
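The three counts discussed in these answers can be reproduced with a small sketch. It uses Python's sqlite3 module with an in-memory database purely for convenience; the helper name `scalar` is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (num INTEGER)")
# Same nine rows as in the question: seven NULLs, plus 55 and 99.
values = [None] * 4 + [55] + [None] * 3 + [99]
conn.executemany("INSERT INTO tbl VALUES (?)", [(v,) for v in values])

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

star_where_null = scalar("SELECT COUNT(*) FROM tbl WHERE num IS NULL")
col_where_null = scalar("SELECT COUNT(num) FROM tbl WHERE num IS NULL")
col_all_rows = scalar("SELECT COUNT(num) FROM tbl")

print(star_where_null, col_where_null, col_all_rows)  # 7 0 2
```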

Get percent of columns that completed by calculating null values

I have a table with a column that allows nulls. If the value is null it is incomplete. I want to calculate the percentage complete.
Can this be done in MySQL through SQL or should I get the total entries and the total null entries and calculate the percentage on the server?
Either way, I'm very confused on how I need to go about separating the variable_value so that I can get its total results and also its total NULL results.
SELECT
games.id
FROM
games
WHERE
games.category_id='10' AND games.variable_value IS NULL
This gives me all the games where the variable_value is NULL. How do I extend this to also get me either the TOTAL games or games NOT NULL along with it?
Table Schema:
id (INT Primary Auto-Inc)
category_id (INT)
variable_value (TEXT Allow Null Default: NULL)
When you use "Count" with a column name, null values are not included. So to get the count or percent not null just do this...
SELECT
count(1) as TotalAll,
count(variable_value) as TotalNotNull,
count(1) - count(variable_value) as TotalNull,
100.0 * count(variable_value) / count(1) as PercentNotNull
FROM
games
WHERE
category_id = '10'
SELECT
SUM(CASE WHEN G.variable_value IS NOT NULL THEN 1 ELSE 0 END)/COUNT(*) AS pct_complete
FROM
Games G
WHERE
G.category_id = '10'
You might need to do some casting on the SUM() so that you get a decimal.
To COUNT the number of entries matching your WHERE statement, use COUNT(*)
SELECT COUNT(*) AS c FROM games WHERE games.variable_value IS NULL
If you want both total number of rows and those with variable_value being NULL in one statement, try GROUP BY
SELECT COUNT(variable_value IS NULL) AS c, (variable_value IS NULL) AS isnull FROM games GROUP BY isnull
Returns something like
c   | isnull
==============
12  | 1
193 | 0
==> 12 entries have NULL in that column, 193 haven't
==> Percentage of NULL entries: 12 / (12 + 193)
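To see the COUNT-based approach end to end, here is a small sketch with made-up rows for the games table from the question. It runs on an in-memory SQLite database via Python's sqlite3 module rather than MySQL, so treat it as an illustration of the idea, not of MySQL-specific behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE games ("
    "id INTEGER PRIMARY KEY, category_id INTEGER, variable_value TEXT)"
)
# Hypothetical data: category 10 has three completed rows and one NULL.
conn.executemany(
    "INSERT INTO games (category_id, variable_value) VALUES (?, ?)",
    [(10, "a"), (10, None), (10, "b"), (10, "c"), (20, None)],
)

total, not_null, pct_complete = conn.execute(
    "SELECT COUNT(*), "
    "       COUNT(variable_value), "                    # NULLs are skipped
    "       100.0 * COUNT(variable_value) / COUNT(*) "  # 100.0 forces decimals
    "FROM games WHERE category_id = ?",
    (10,),
).fetchone()

print(total, not_null, pct_complete)  # 4 3 75.0
```

Multiplying by 100.0 rather than 100 keeps the division in floating point on engines where integer division would otherwise truncate.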

Select values in SQL that do not have other corresponding values except those that i search for

I have a table in my database:
Name | Element
1 2
1 3
4 2
4 3
4 5
I need to make a query that for a number of arguments will select the value of Name that has on the right side these and only these values.
E.g.:
arguments are 2 and 3, the query should return only 1 and not 4 (because 4 also has 5). For arguments 2,3,5 it should return 4.
My query looks like this:
SELECT name FROM aggregations WHERE (element=2 and name in (select name from aggregations where element=3))
What do I have to add to this query to make it not return 4?
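A simple way to do it (note that this also returns names that have additional elements, such as 4, because the WHERE clause filters those rows out before they are counted):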
SELECT name
FROM aggregations
WHERE element IN (2,3)
GROUP BY name
HAVING COUNT(element) = 2
If you want to add more, you'll need to change both the IN (2,3) part and the HAVING part:
SELECT name
FROM aggregations
WHERE element IN (2,3,5)
GROUP BY name
HAVING COUNT(element) = 3
A more robust way would be to also check that the name has no element outside your set:
SELECT name
FROM aggregations
WHERE NOT EXISTS (
SELECT DISTINCT a.element
FROM aggregations a
WHERE a.element NOT IN (2,3,5)
AND a.name = aggregations.name
)
GROUP BY name
HAVING COUNT(element) = 3
It's not very efficient, though.
Create a temporary table, fill it with your values and query like this:
SELECT name
FROM (
SELECT DISTINCT name
FROM aggregations
) n
WHERE NOT EXISTS
(
SELECT 1
FROM (
SELECT element
FROM aggregations aii
WHERE aii.name = n.name
) ai
FULL OUTER JOIN
temptable tt
ON tt.element = ai.element
WHERE ai.element IS NULL OR tt.element IS NULL
)
This is more efficient than using COUNT(*), since it will stop checking a name as soon as it finds the first row that doesn't have a match (either in aggregations or in temptable)
This isn't tested, but usually I would do this with a query in my where clause for a small amount of data. Note that this is not efficient for large record counts.
SELECT ag1.Name FROM aggregations ag1
WHERE ag1.Element IN (2,3)
AND 0 = (select COUNT(ag2.Name)
FROM aggregations ag2
WHERE ag1.Name = ag2.Name
AND ag2.Element NOT IN (2,3)
)
GROUP BY ag1.name;
This says "Give me all of the names that have the elements I want, but have no records with elements I don't want"
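The "these and only these" requirement can also be written as a single GROUP BY, sketched below under the assumption that the table holds no duplicate (name, element) pairs. The function name `names_with_exactly` is hypothetical, and the example uses an in-memory SQLite database through Python's sqlite3 module so that it is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE aggregations (name INTEGER, element INTEGER)")
conn.executemany(
    "INSERT INTO aggregations VALUES (?, ?)",
    [(1, 2), (1, 3), (4, 2), (4, 3), (4, 5)],
)

def names_with_exactly(conn, elements):
    """Return names whose set of elements equals `elements`.

    Assumes no duplicate (name, element) rows in the table.
    """
    marks = ",".join("?" * len(elements))
    sql = (
        "SELECT name FROM aggregations GROUP BY name "
        # The name has exactly n rows in total ...
        "HAVING COUNT(*) = ? "
        # ... and all n of them are among the wanted elements.
        f"AND SUM(CASE WHEN element IN ({marks}) THEN 1 ELSE 0 END) = ? "
        "ORDER BY name"
    )
    params = [len(elements), *elements, len(elements)]
    return [row[0] for row in conn.execute(sql, params)]

print(names_with_exactly(conn, [2, 3]))     # [1]
print(names_with_exactly(conn, [2, 3, 5]))  # [4]
```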

Grouping by intervals

Given a table (mytable) containing a numeric field (mynum), how would one go about writing an SQL query which summarizes the table's data based on ranges of values in that field rather than each distinct value?
For the sake of a more concrete example, let's make it intervals of 3 and just "summarize" with a count(*), such that the results tell the number of rows where mynum is 0-2.99, the number of rows where it's 3-5.99, where it's 6-8.99, etc.
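The idea is to compute some function of the field that has a constant value within each group you want. For intervals of 3 starting at 0 that function is floor (round would move the bucket boundaries to 1.5, 4.5, and so on):
select count(*), floor(mynum/3.0) foo from mytable group by foo;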
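A small sketch of the bucketing idea, with made-up values for mynum. It runs on an in-memory SQLite database via Python's sqlite3 module, and because floor() is not available in every SQLite build it truncates with CAST(... AS INTEGER) instead, which matches floor only for non-negative values. The alias `lower_bound` is an illustrative name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (mynum REAL)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [(v,) for v in [0, 1.5, 2.99, 3, 5.5, 7, 8.99]],
)

# Truncating mynum / 3.0 gives the bucket index; multiplying by 3 turns it
# into the lower bound of the interval (0, 3, 6, ...).
buckets = conn.execute(
    "SELECT CAST(mynum / 3.0 AS INTEGER) * 3 AS lower_bound, COUNT(*) "
    "FROM mytable GROUP BY lower_bound ORDER BY lower_bound"
).fetchall()

print(buckets)  # [(0, 3), (3, 2), (6, 2)]
```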
I do not know if this is applicable to MySQL, but in SQL Server I think you can "simply" use the same CASE expression in both the select list and the group by list.
Something like:
select
CASE
WHEN id <= 20 THEN 'lessthan20'
WHEN id > 20 and id <= 30 THEN '20and30' ELSE 'morethan30' END,
count(*)
from Profiles
where 1=1
group by
CASE
WHEN id <= 20 THEN 'lessthan20'
WHEN id > 20 and id <= 30 THEN '20and30' ELSE 'morethan30' END
returns something like
column1 column2
---------- ----------
20and30 3
lessthan20 3
morethan30 13