Get percent of columns that completed by calculating null values - sql

I have a table with a column that allows nulls. If the value is null it is incomplete. I want to calculate the percentage complete.
Can this be done in MySQL through SQL or should I get the total entries and the total null entries and calculate the percentage on the server?
Either way, I'm very confused on how I need to go about separating the variable_value so that I can get its total results and also its total NULL results.
SELECT
games.id
FROM
games
WHERE
games.category_id='10' AND games.variable_value IS NULL
This gives me all the games where the variable_value is NULL. How do I extend this to also get me either the TOTAL games or games NOT NULL along with it?
Table Schema:
id (INT Primary Auto-Inc)
category_id (INT)
variable_value (TEXT Allow Null Default: NULL)

COUNT(column_name) counts only the rows where that column is not NULL, while COUNT(1) or COUNT(*) counts every row. So to get the counts and the percent not null in one query, just do this...
SELECT
count(1) as TotalAll,
count(variable_value) as TotalNotNull,
count(1) - count(variable_value) as TotalNull,
100.0 * count(variable_value) / count(1) as PercentNotNull
FROM
games
WHERE
category_id = '10'
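A quick way to check the COUNT behaviour described above is to run the same query against a throwaway table. This is only a sketch: it uses Python's built-in sqlite3 as a stand-in for MySQL, and the five sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE games ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "category_id INT, "
    "variable_value TEXT DEFAULT NULL)"
)
# Made-up sample rows: category 10 has three completed values and one NULL.
sample_rows = [(10, "a"), (10, None), (10, "b"), (10, "c"), (11, None)]
conn.executemany(
    "INSERT INTO games (category_id, variable_value) VALUES (?, ?)", sample_rows
)

# COUNT(*) counts rows; COUNT(variable_value) skips NULLs.
total_all, total_not_null, total_null, percent_not_null = conn.execute(
    """
    SELECT COUNT(*),
           COUNT(variable_value),
           COUNT(*) - COUNT(variable_value),
           100.0 * COUNT(variable_value) / COUNT(*)
    FROM games
    WHERE category_id = 10
    """
).fetchone()

print(total_all, total_not_null, total_null, percent_not_null)
```

With these sample rows, category 10 has 4 rows in total, 3 of them not null, so the percentage comes out as 75.0.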

SELECT
SUM(CASE WHEN G.variable_value IS NOT NULL THEN 1 ELSE 0 END)/COUNT(*) AS pct_complete
FROM
Games G
WHERE
G.category_id = '10'
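This returns a fraction between 0 and 1, so multiply by 100.0 if you want a percentage. Depending on the engine, you might also need to do some casting on the SUM() so that the division yields a decimal rather than an integer.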
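Whether the cast is needed depends on the engine. In MySQL the / operator already returns a decimal, but engines such as SQL Server and SQLite do integer division when both operands are integers. The sketch below uses SQLite (through Python's sqlite3, as a stand-in) with made-up rows to show the truncation and one way around it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE games (id INTEGER PRIMARY KEY, category_id INT, variable_value TEXT)"
)
# Made-up sample: 3 of 4 rows in category 10 are complete.
conn.executemany(
    "INSERT INTO games (category_id, variable_value) VALUES (?, ?)",
    [(10, "x"), (10, None), (10, "y"), (10, "z")],
)

truncated, pct_complete = conn.execute(
    """
    SELECT
        -- integer / integer: SQLite truncates 3 / 4 to 0
        SUM(CASE WHEN variable_value IS NOT NULL THEN 1 ELSE 0 END) / COUNT(*),
        -- multiplying by 100.0 first forces floating-point arithmetic
        100.0 * SUM(CASE WHEN variable_value IS NOT NULL THEN 1 ELSE 0 END) / COUNT(*)
    FROM games
    WHERE category_id = 10
    """
).fetchone()

print(truncated, pct_complete)
```

Multiplying by 100.0 (or casting one operand to a decimal type) before dividing is a common way to get the same result on every engine.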

To COUNT the number of entries matching your WHERE statement, use COUNT(*)
SELECT COUNT(*) AS c FROM games WHERE games.variable_value IS NULL
If you want both the number of rows where variable_value is NULL and the number where it is not in one statement, try GROUP BY. COUNT(*) is enough here, because the grouping already separates the two cases:
SELECT COUNT(*) AS c, (variable_value IS NULL) AS isnull FROM games GROUP BY isnull
Returns something like
c | isnull
==============
12 | 1
193 | 0
==> 12 entries have NULL in that column, 193 haven't
==> Percentage incomplete: 12 / (12 + 193), about 5.9 %; percentage complete: 193 / (12 + 193), about 94.1 %
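If you would rather do the arithmetic on the server side of your application, the two grouped rows are all you need. Here is a rough sketch in Python, with sqlite3 standing in for MySQL and synthetic rows that simply reproduce the 12 / 193 split used in the illustration above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games (id INTEGER PRIMARY KEY, variable_value TEXT)")
# Synthetic data mirroring the example counts: 12 NULL rows, 193 filled rows.
conn.executemany(
    "INSERT INTO games (variable_value) VALUES (?)",
    [(None,)] * 12 + [("v",)] * 193,
)

# One row per group: is_null = 1 for NULL values, 0 otherwise.
rows = conn.execute(
    "SELECT (variable_value IS NULL) AS is_null, COUNT(*) AS c "
    "FROM games GROUP BY is_null"
).fetchall()
counts = dict(rows)

null_count = counts.get(1, 0)
not_null_count = counts.get(0, 0)
total = null_count + not_null_count

# Guard against an empty table before dividing.
percent_incomplete = 100.0 * null_count / total if total else 0.0
percent_complete = 100.0 * not_null_count / total if total else 0.0

print(round(percent_incomplete, 2), round(percent_complete, 2))
```

The .get(..., 0) calls matter because a group with no rows is simply missing from the result, for example when nothing is NULL.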

Related

SQL SUM and value conversion

I'm looking to transform data in my SUM query to acknowledge that some numeric values are negative in nature, although not represented as such.
I look for customer balance where the example dataset includes also credit transactions that are not written as negative in the database (although all records that have value C for credit in inv_type column should be treated as negative in the SQL SUM function). As an example:
INVOICES
inv_no inv_type cust_no value
1 D 25 10
2 D 35 30
3 C 25 5
4 D 25 50
5 C 35 2
My simple SUM function would not give me the correct answer:
select cust_no, sum(value) from INVOICES
group by cust_no
This query returns 65 for customer no 25 and 32 for customer no 35, although the expected answers are 10 - 5 + 50 = 55 and 30 - 2 = 28.
Should I perhaps use the CAST function somehow? Unfortunately I don't know the underlying db engine, though there is a good chance it is of IBM origin. Most of the basic SQL code has worked so far.
You can use the case expression inside of a sum(). The simplest syntax would be:
select cust_no,
sum(case when inv_type = 'C' then - value else value end) as total
from invoices
group by cust_no;
Note that value could be a reserved word in your database, so you might need to escape the column name.
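To see the signed sum working on the sample data from the question, here is a small sketch. It runs the query through Python's sqlite3 (a stand-in, since the real engine is unknown) and quotes the column as "value" with standard double quotes in case the name is reserved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE invoices (inv_no INT, inv_type TEXT, cust_no INT, "value" INT)'
)
# Sample rows copied from the question: D = debit, C = credit.
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?, ?)",
    [
        (1, "D", 25, 10),
        (2, "D", 35, 30),
        (3, "C", 25, 5),
        (4, "D", 25, 50),
        (5, "C", 35, 2),
    ],
)

# Credits are negated inside the SUM; everything else is added as-is.
balances = conn.execute(
    """
    SELECT cust_no,
           SUM(CASE WHEN inv_type = 'C' THEN -"value" ELSE "value" END) AS total
    FROM invoices
    GROUP BY cust_no
    ORDER BY cust_no
    """
).fetchall()

print(balances)
```

This gives 55 for customer 25 and 28 for customer 35, matching the expected balances in the question.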
You should be able to write a projection (select) first to obtain a signed value column based on inv_type or whatever, and then do a sum over that.
Like this:
select cust_no, sum(value) from (
select cust_no
, case when inv_type='D' then [value] else -[value] end [value]
from INVOICES
) SUMS
group by cust_no
You can put an expression in the sum that calculates a negative value if the invoice is a credit:
select
cust_no,
sum
(
case inv_type
when 'C' then -[value]
else [value]
end
) as [Total]
from INVOICES
group by cust_no

Calculate percentages of columns in Oracle SQL

I have three columns, all consisting of 1's and 0's. For each of these columns, how can I calculate the percentage of people (one person is one row/ id) who have a 1 in the first column and a 1 in the second or third column in oracle SQL?
For instance:
id marketing_campaign personal_campaign sales
1 1 0 0
2 1 1 0
1 0 1 1
4 0 0 1
So in this case, of all the people who were subjected to a marketing_campaign, 50 percent were subjected to a personal campaign as well, but zero percent is present in sales (no one bought anything).
Ultimately, I want to find out the order in which people get to the sales moment. Do they first go from marketing campaign to a personal campaign and then to sales, or do they buy anyway regardless of these channels.
This is a fictional example, so I realize that in this example there are many other ways to do this, but I hope anyone can help!
The outcome that I'm looking for is something like this:
percentage marketing_campaign/ personal campaign = 50 %
percentage marketing_campaign/sales = 0%
etc (for all the three column combinations)
Use COUNT, SUM and CASE expressions, together with the basic arithmetic operators +, /, *:
COUNT(*) gives the total count of people (rows) in the table
SUM(column) gives the number of rows that have a 1 in the given column
CASE expressions make it possible to implement more complex conditions
The common pattern is X / COUNT(*) * 100, which calculates X as a percentage of the total ( val / total * 100% )
An example:
SELECT
-- percentage of people that have 1 in marketing_campaign column
SUM( marketing_campaign ) / COUNT(*) * 100 As marketing_campaign_percent,
-- percentage of people that have 1 in sales column
SUM( sales ) / COUNT(*) * 100 As sales_percent,
-- complex condition:
-- percentage of people (one person is one row/ id) who have a 1
-- in the first column and a 1 in the second or third column
COUNT(
CASE WHEN marketing_campaign = 1
AND ( personal_campaign = 1 OR sales = 1 )
THEN 1 END
) / COUNT(*) * 100 As complex_condition_percent
FROM the_table;
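The percentages above are taken over all rows. If you want them only among the people who had a marketing campaign, as in the expected outcome in the question, you can divide by SUM(marketing_campaign) instead of COUNT(*). Below is a sketch of that variation, run through Python's sqlite3 as a stand-in for Oracle. The table name campaigns is hypothetical, and each row is treated as one person, as the question states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE campaigns ("
    "id INT, marketing_campaign INT, personal_campaign INT, sales INT)"
)
# Sample rows copied from the question (including the repeated id 1).
conn.executemany(
    "INSERT INTO campaigns VALUES (?, ?, ?, ?)",
    [(1, 1, 0, 0), (2, 1, 1, 0), (1, 0, 1, 1), (4, 0, 0, 1)],
)

# Denominator: rows with marketing_campaign = 1.
# NULLIF avoids a division by zero when no row had a marketing campaign.
pct_personal, pct_sales = conn.execute(
    """
    SELECT
        100.0 * SUM(CASE WHEN marketing_campaign = 1 AND personal_campaign = 1
                         THEN 1 ELSE 0 END)
              / NULLIF(SUM(marketing_campaign), 0),
        100.0 * SUM(CASE WHEN marketing_campaign = 1 AND sales = 1
                         THEN 1 ELSE 0 END)
              / NULLIF(SUM(marketing_campaign), 0)
    FROM campaigns
    """
).fetchone()

print(pct_personal, pct_sales)
```

On the sample data this yields 50 % for marketing_campaign/personal_campaign and 0 % for marketing_campaign/sales, which is what the question asks for.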
You can get your percentages like this :
SELECT COUNT(*),
ROUND(100*(SUM(personal_campaign) / sum(count(*)) over ()),2) perc_personal_campaign,
ROUND(100*(SUM(sales) / sum(count(*)) over ()),2) perc_sales
FROM (
SELECT ID,
CASE
WHEN SUM(personal_campaign) > 0 THEN 1
ELSE 0
end AS personal_campaign,
CASE
WHEN SUM(sales) > 0 THEN 1
ELSE 0
end AS sales
FROM the_table
WHERE ID IN
(SELECT ID FROM the_table WHERE marketing_campaign = 1)
GROUP BY ID
)
I have overcomplicated things a bit because your data is still unclear to me. The subquery ensures that all duplicates are cleaned up, so that each person has a single 1 or 0 in personal_campaign and sales.
About your second question :
Ultimately, I want to find out the order in which people get to the
sales moment. Do they first go from marketing campaign to a personal
campaign and then to sales, or do they buy anyway regardless of these
channels.
This is impossible with the table in its current state, because it has neither:
a unique row identifier that would keep the order in which the rows were inserted, nor
a timestamp column that would tell when the rows were inserted.
Without one of these, the order of rows returned from your table will be unpredictable.

SQL display duplicate field but not nulled field

I have what I think is a simple problem. I have one table in SQL Server with this data:
Name : Sum : CNP
Andrey 100 120
Marius 20 100
George 20 200
Popescu Nulled 300
Antal Nulled 100
I use this command to show duplicates:
SELECT SUM, Name,CNP
FROM dbo.database
where SUM IN ( Select SUM from dbo.asigpag group by SUM HAVING Count(*)> 1)
Everything works OK.
In this case it shows:
Name : Sum : CNP
Marius 20 100
George 20 200
Popescu Nulled 300
Antal Nulled 100
This is the problem: I want to display the duplicates, but not the Nulled ones.
I want to display all the other fields as well, not only Sum:
Name : Sum : CNP
Marius 20 100
George 20 200
You need to add another condition to exclude records that have a null value in the sum field:
SELECT SUM, Name,CNP
FROM dbo.database
where SUM IN ( Select SUM from dbo.asigpag group by SUM HAVING Count(*)> 1)
AND SUM is not NULL
SQL Server treats NULLs differently from values, because they have no value at all. They are a special case that has to be tested with [Field] IS NULL or, as in this case, its reverse [Field] IS NOT NULL; a comparison such as [Field] = NULL does not match them under the default ANSI_NULLS setting.
In SQL, NULL never compares equal to anything, not even to another NULL. You can use IS NULL or IS NOT NULL, or the built-in function ISNULL(SUM, 0), which replaces NULL values with a default value of 0. For example:
SELECT SUM, Name,CNP
FROM dbo.database
where ISNULL(SUM, 0) IN ( Select ISNULL(SUM, 0) from dbo.asigpag group by SUM HAVING Count(*) > 1)
Update:
Sorry, I misunderstood what you want. To eliminate NULL from the list you need to change the sub-query to:
Select SUM from dbo.asigpag where SUM IS NOT NULL group by SUM HAVING Count(*) > 1
Whole query will be:
SELECT SUM, Name, CNP
FROM dbo.database
where SUM IN ( Select SUM from dbo.asigpag Where SUM IS NOT NULL group by SUM HAVING Count(*) > 1)
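Here is a small sketch of the final query on the sample data from the question. It makes a few assumptions: Python's sqlite3 stands in for SQL Server, both table names in the question are taken to mean one table (called payments here, a hypothetical name), "Nulled" is stored as a real NULL, and the Sum column is double-quoted to keep it clearly apart from the SUM() function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE payments (Name TEXT, "Sum" INT, CNP INT)')
# Sample rows from the question; "Nulled" is represented as NULL (None).
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [
        ("Andrey", 100, 120),
        ("Marius", 20, 100),
        ("George", 20, 200),
        ("Popescu", None, 300),
        ("Antal", None, 100),
    ],
)

# The subquery keeps only non-NULL values that occur more than once.
duplicates = conn.execute(
    """
    SELECT Name, "Sum", CNP
    FROM payments
    WHERE "Sum" IN (
        SELECT "Sum"
        FROM payments
        WHERE "Sum" IS NOT NULL
        GROUP BY "Sum"
        HAVING COUNT(*) > 1
    )
    ORDER BY CNP
    """
).fetchall()

print(duplicates)
```

Only Marius and George come back, each with all of their columns.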

SQL select COUNT issue

I have a table
num
----
NULL
NULL
NULL
NULL
55
NULL
NULL
NULL
99
when I wrote
select COUNT(*)
from tbl
where num is null
the output was 7
but when I wrote
select COUNT(num)
from tbl
where num is null
the output was 0
what's the difference between these two queries ??
The difference is in what you count.
COUNT(*) takes NULL values into account (it counts all rows returned).
COUNT(num) does NOT take NULL values into account (it counts only the non-null values of num).
That is standard behavior in SQL, whatever the DBMS used.
Source: look at COUNT(DISTINCT expr,[expr...])
count(*) returns number of rows, count(num) returns number of rows where num is not null. Change your last query to select count(*) from tbl where num is null to get the result you expect.
In the second case the WHERE clause is applied first, leaving only the rows where num is NULL, and COUNT(num) then skips every one of them because it ignores NULL values. In the first case, when you use *, the rows with NULL are not eliminated.
If you are counting on a column which contains NULL and you want rows with NULL to be included in the count, then use
Count(ISNULL(col,0))
Count(*) counts the number of rows, COUNT(num) counts the number of not-null values in column num.
Considering the table given above, COUNT(num) without the WHERE clause returns 2.
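The four cases can be checked side by side on the table from the question. This is just a sketch using Python's sqlite3 as a stand-in; the counting rules are standard SQL, so other engines should behave the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (num INT)")
# Same contents as the table in the question: 7 NULLs, plus 55 and 99.
values = [None, None, None, None, 55, None, None, None, 99]
conn.executemany("INSERT INTO tbl (num) VALUES (?)", [(v,) for v in values])


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


rows_where_null = scalar("SELECT COUNT(*) FROM tbl WHERE num IS NULL")
nums_where_null = scalar("SELECT COUNT(num) FROM tbl WHERE num IS NULL")
all_rows = scalar("SELECT COUNT(*) FROM tbl")
all_nums = scalar("SELECT COUNT(num) FROM tbl")

print(rows_where_null, nums_where_null, all_rows, all_nums)
```

COUNT(*) with the WHERE clause gives 7 and COUNT(num) with it gives 0, as in the question. Without the WHERE clause they give 9 and 2.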

SQL Query Help: Returning distinct values from Count subquery

I've been stuck for quite a while now trying to get this query to work.
Here's the setup:
I have a [Notes] table that contains a nonunique (Number) column and a nonunique (Result) column. I'm looking to create a SELECT statement that will display each distinct (Number) value where the count of the {(Number), (Result)} tuple where Result = 'NA' is > 25.
Number | Result
100 | 'NA'
100 | 'TT'
101 | 'NA'
102 | 'AM'
100 | 'TT'
200 | 'NA'
200 | 'NA'
201 | 'NA'
Basically, we have an autodialer that calls a number and returns a code depending on the results of the call. We want to ignore numbers that have had an 'NA' (no answer) code returned more than 25 times.
My basic attempts so far have been similar to:
SELECT DISTINCT n1.Number
FROM Notes n1
WHERE (SELECT COUNT(*) FROM Notes n2
WHERE n1.Number = n2.Number and n1.Result = 'NA') > 25
I know this query isn't correct, but in general I'm not sure how to relate the DISTINCT n1.Number from the initial select to the Number used in the subquery COUNT. Most examples I see aren't actually doing this by adding a condition to the COUNT returned. I haven't had to touch too much SQL in the past half decade, so I'm quite rusty.
You can do it like this:
SELECT Number
FROM Notes
WHERE Result = 'NA'
GROUP BY Number
HAVING COUNT(Result) > 25
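To try the GROUP BY / HAVING approach on the small sample from the question, the sketch below passes the threshold in as a parameter, since no number in the sample comes anywhere near 25 'NA' results. It uses Python's sqlite3 as a stand-in for the real database, and the helper name numbers_over is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Notes (Number INT, Result TEXT)")
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO Notes VALUES (?, ?)",
    [
        (100, "NA"), (100, "TT"), (101, "NA"), (102, "AM"),
        (100, "TT"), (200, "NA"), (200, "NA"), (201, "NA"),
    ],
)


def numbers_over(threshold):
    """Return each Number whose count of 'NA' results exceeds threshold."""
    rows = conn.execute(
        """
        SELECT Number
        FROM Notes
        WHERE Result = 'NA'      -- filter rows before grouping
        GROUP BY Number
        HAVING COUNT(*) > ?      -- filter groups after counting
        ORDER BY Number
        """,
        (threshold,),
    ).fetchall()
    return [number for (number,) in rows]


print(numbers_over(1))   # only 200 has more than one 'NA'
print(numbers_over(25))  # nothing in the sample exceeds 25
```

The WHERE clause narrows the rows to 'NA' before grouping, and HAVING then filters on the per-number count, so no DISTINCT or correlated subquery is needed.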
Try this:
SELECT Number
FROM (
SELECT Number, Count(Result) as CountNA
FROM Notes
WHERE Result = 'NA'
GROUP BY Number
)
WHERE CountNA > 25
EDIT: depending on SQL product, you may need to give the derived table a table correlation name e.g.
SELECT DT1.Number
FROM (
SELECT Number, Count(Result) as CountNA
FROM Notes
WHERE Result = 'NA'
GROUP BY Number
) AS DT1 (Number, CountNA)
WHERE DT1.CountNA > 25;