Count of group for null is always 0 (zero) - sql

In T-SQL, what is the recommended approach for grouping data that contains nulls?
Example of the type of query:
Select [Group], Count([Group])
From [Data]
Group by [Group]
It appears that count(*) and count([Group]) both result in the null group displaying 0.
Example of the expected table data:
Id, Group
---------
1 , Alpha
2 , Null
3 , Beta
4 , Null
Example of the expected result:
Group, Count
---------
Alpha, 1
Beta, 1
Null, 0
This is the desired result, which can be obtained with count(Id). Is this the best way to get this result, and why do count(*) and count(Group) return an "incorrect" result?
Group, Count
---------
Alpha, 1
Beta, 1
Null, 2
Edit: I don't remember why I thought count(*) did this; it may be the answer I'm looking for.

The best approach is to use count(*), which behaves exactly like count(1) or any other constant.
The * ensures every row is counted, whether or not its columns are null.
Select [Group], Count(*)
From [Data]
Group by [Group]
The null group shows 0 instead of 2 with count([Group]) because count(column) counts only the rows where that column is not null. Every value of [Group] in the null group is null, so nothing is counted and the result is 0.
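The difference between the three count forms can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is self-contained), and the column name grp replaces the reserved word Group. The counting rules shown are standard SQL, so T-SQL is expected to behave the same way.

```python
import sqlite3

def count_variants():
    """Return {group: (count_star, count_group, count_id)} for the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    # "grp" is a stand-in name for the [Group] column in the question.
    conn.execute("CREATE TABLE data (id INTEGER, grp TEXT)")
    conn.executemany(
        "INSERT INTO data VALUES (?, ?)",
        [(1, "Alpha"), (2, None), (3, "Beta"), (4, None)],
    )
    rows = conn.execute(
        "SELECT grp, COUNT(*), COUNT(grp), COUNT(id) FROM data GROUP BY grp"
    ).fetchall()
    conn.close()
    # Key each result row by its group value (None for the null group).
    return {row[0]: tuple(row[1:]) for row in rows}

if __name__ == "__main__":
    for group, counts in count_variants().items():
        print(group, counts)
```

For the null group, COUNT(*) and COUNT(id) count both rows, while COUNT(grp) skips them because the counted expression itself is null.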

Just do
SELECT [group], count([group])
FROM [Data]
GROUP BY [group]
SQL Fiddle Demo
Count(Id) doesn't give the expected result mentioned in the question; it gives a value of 2 for the NULL group.

Try this:
Select [Group], Count(isNull([Group],0))
From [Data]
Group by [Group]

COUNT(*) should work:
SELECT Grp,COUNT(*)
FROM tab
GROUP BY Grp
One more solution could be the following:
SELECT Grp, COUNT(COALESCE(Grp, ' '))
FROM tab
GROUP BY Grp
Here is code at SQL Fiddle

Related

SQL query to allow for latest datasets per items

I have this table in an SQL server database:
I would like a query that returns the values of cw1, cw2 and cw3, restricted by a date condition.
More precisely, it should return the "latest" values of cw1, cw2 and cw3 per item, falling back to the values from earlier dates when they are null for the last plan_date.
So if the condition is plan_date between "02.01.2020" and "04.01.2020" then the result should be
1 04.01.2020 null, 9, 4
2 03.01.2020 30 , 15, 2
where, for example, the "30" is from the last previous date for item_nr 2.
You can get the last value using first_value(). Unfortunately, that is a window function, but select distinct solves that:
select distinct item_nr,
       first_value(cw1) over (partition by item_nr
                              order by (case when cw1 is not null then 1 else 2 end), plan_date desc
                             ) as imputed_cw1,
       first_value(cw2) over (partition by item_nr
                              order by (case when cw2 is not null then 1 else 2 end), plan_date desc
                             ) as imputed_cw2,
       first_value(cw3) over (partition by item_nr
                              order by (case when cw3 is not null then 1 else 2 end), plan_date desc
                             ) as imputed_cw3
from t;
You can add a where clause after the from.
The first_value() window function returns the first value from each partition. The partition is ordered to put the non-NULL values first, and then order by time descending. So, the most recent non-NULL value is first.
The only downside is that it is a window function, so the select distinct is needed to get the most recent value for each item_nr.
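A minimal sketch of the ordering trick described above, run through Python's sqlite3 module (an assumption for portability; window functions need SQLite 3.25 or newer). The original table is not shown in the question, so the rows below are hypothetical and chosen only to reproduce the expected output; dates are stored as ISO strings.

```python
import sqlite3

# Hypothetical rows: (item_nr, plan_date, cw1, cw2, cw3).
SAMPLE_ROWS = [
    (1, "2020-01-02", None, 9, None),
    (1, "2020-01-03", None, None, 7),
    (1, "2020-01-04", None, None, 4),
    (2, "2020-01-02", 30, 11, None),
    (2, "2020-01-03", None, 15, 2),
]

def imputed_values(rows):
    """Return one (item_nr, cw1, cw2, cw3) tuple per item with the latest non-NULL values."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (item_nr INTEGER, plan_date TEXT,"
        " cw1 INTEGER, cw2 INTEGER, cw3 INTEGER)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", rows)
    # Non-NULL values sort first, then newest date first, so FIRST_VALUE
    # picks the most recent non-NULL value in each partition.
    query = """
        SELECT DISTINCT item_nr,
               FIRST_VALUE(cw1) OVER (PARTITION BY item_nr
                   ORDER BY (CASE WHEN cw1 IS NOT NULL THEN 1 ELSE 2 END), plan_date DESC),
               FIRST_VALUE(cw2) OVER (PARTITION BY item_nr
                   ORDER BY (CASE WHEN cw2 IS NOT NULL THEN 1 ELSE 2 END), plan_date DESC),
               FIRST_VALUE(cw3) OVER (PARTITION BY item_nr
                   ORDER BY (CASE WHEN cw3 IS NOT NULL THEN 1 ELSE 2 END), plan_date DESC)
        FROM t
        ORDER BY item_nr
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for row in imputed_values(SAMPLE_ROWS):
        print(row)
```

When a column is NULL on every row of an item, as cw1 is for item 1 here, the imputed value stays NULL.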

if count value of the column is greater than 1, I want to print the count of the column else I want to print value in the field

I am writing a query which fetches details from different tables. In one column I want to print count value of a column. If the count value of the column is greater than 1, I want to print the count of the column else I want to print value in the field.
I want to build a query which will give me count of user_id from table 1 & 2. if the count user_id is greater than 1, then print count (user_id) else print value of user_id
Table:1
| user_id |
| John |
| Bob |
| Kris |
| Tom |
Table:2
| user_id |
| Rob |
The query result should list the count for table 1, as it is greater than 1. Table 2 should list Rob, as its count is less than 2.
You want to select user IDs (names actually) from a table. If it's just one row then show that name, otherwise show the number of entries instead. So, just use a CASE expression to check whether count is 1 or greater than 1.
You probably need CAST or CONVERT to turn the count number into a string, so the CASE expression always returns the same type (this is how CASE works).
select
  case when count(*) > 1
       then cast(count(*) as varchar(100))
       else max(user_id)
  end as name_or_count
from mytable
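The CASE expression above can be tried against both sample tables with a short sketch. It runs the same query through Python's sqlite3 module, which is an assumption made for the sake of a self-contained example; the helper name name_or_count is hypothetical.

```python
import sqlite3

def name_or_count(user_ids):
    """Return the row count as a string if there are several rows, else the single user_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (user_id TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?)", [(u,) for u in user_ids])
    # Both CASE branches yield text, so the column has one consistent type.
    (value,) = conn.execute(
        """
        SELECT CASE WHEN COUNT(*) > 1
                    THEN CAST(COUNT(*) AS VARCHAR(100))
                    ELSE MAX(user_id)
               END AS name_or_count
        FROM mytable
        """
    ).fetchone()
    conn.close()
    return value

if __name__ == "__main__":
    print(name_or_count(["John", "Bob", "Kris", "Tom"]))
    print(name_or_count(["Rob"]))
```

MAX(user_id) is only there because the query is an aggregate; with a single row it simply returns that row's value.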
Window Functions come to mind but since your user_ids are not numbers, you'll run into an issue where you can't have two different data types in the same column. See how this works for you. Make sure to cast the varchar numbers back to integer if this script is part of a larger process.
with cte as
(select 'John' as user_id union all
select 'Bob' as user_id union all
select 'Kris' as user_id union all
select 'Tom' as user_id)
select distinct case when count(*) over() > 1
then cast(count(*) over() as varchar) else user_id end
from cte
with cte as
(select 'Rob' as user_id)
select distinct case when count(*) over() > 1
then cast(count(*) over() as varchar) else user_id end
from cte

sum values per id for a specific condition without group by syntax

I would like to sum the values that have the same id and 'Y' in the YN column, in a case statement, hence I cannot use the group by syntax. Please see below an example and my code. Table T:
ID Value YN
1 4 Y
1 6 Y
2 3 N
Request:
select
case when YN = 'Y'
then ( select sum(Value) from T group by ID)
else Value
end as TotalResult;
Can you help me display only TotalResult?
Just because you use GROUP BY does not mean that you have to include that column in the SELECT...
SELECT
SUM(Value) AS TotalResult
FROM
T
GROUP BY
ID, YN
=>
TotalResult
--------------
10
3
Exactly what query you need, however, is unclear as you have not demonstrated clearly what you want the query to actually do, or what the expected results should be for your sample data.
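As a quick check of the GROUP BY ID, YN query against the sample table T, here is a small sketch using Python's sqlite3 module (an assumption for self-containment). The ORDER BY is added only to make the output order deterministic and is not part of the answer's query.

```python
import sqlite3

def total_results(rows):
    """Return SUM(Value) per (ID, YN) group without selecting the grouping columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T (ID INTEGER, Value INTEGER, YN TEXT)")
    conn.executemany("INSERT INTO T VALUES (?, ?, ?)", rows)
    result = [
        row[0]
        for row in conn.execute(
            # Grouping columns may be used in GROUP BY and ORDER BY
            # without appearing in the SELECT list.
            "SELECT SUM(Value) AS TotalResult FROM T GROUP BY ID, YN ORDER BY ID, YN"
        )
    ]
    conn.close()
    return result

if __name__ == "__main__":
    print(total_results([(1, 4, "Y"), (1, 6, "Y"), (2, 3, "N")]))
```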

SQL aggregate rows with same id , specific value in secondary column

I'm looking to filter out rows in the database (PostgreSQL) if one of the values in the status column occurs. The idea is to sum the amount column if the unique reference only has a status equals to 1. The query should not SELECT the reference at all if it has also a status of 2 or any other status for that matter. status refers to the state of the transaction.
Current data table:
reference | amount | status
1 100 1
2 120 1
2 -120 2
3 200 1
3 -200 2
4 450 1
Result:
amount | status
550 1
I've simplified the data example but I think it gives a good idea of what I'm looking for.
I'm unsuccessful in selecting only references that only have status 1.
I've tried sub-queries, using the HAVING clause and other methods without success.
Thanks
Here's a way using not exists to sum all rows where the status is 1 and other rows with the same reference and a non 1 status do not exist.
select sum(amount) from mytable t1
where status = 1
and not exists (
select 1 from mytable t2
where t2.reference = t1.reference
and t2.status <> 1
)
SELECT SUM(amount)
FROM mytable
WHERE reference NOT IN (
SELECT reference
FROM mytable
WHERE status<>1
)
The subquery selects all references that must be excluded, then the main query sums everything except them.
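The NOT EXISTS and NOT IN variants above can be compared with a short sketch. It uses Python's sqlite3 module rather than PostgreSQL (an assumption for self-containment; both follow the standard NULL semantics relied on here). The extra row with a NULL reference is hypothetical and is included only to show a known difference: if the NOT IN subquery returns a NULL, no row passes the filter.

```python
import sqlite3

# Sample rows from the question: (reference, amount, status).
ROWS = [(1, 100, 1), (2, 120, 1), (2, -120, 2), (3, 200, 1), (3, -200, 2), (4, 450, 1)]

NOT_EXISTS_QUERY = """
    SELECT SUM(amount) FROM mytable t1
    WHERE status = 1
      AND NOT EXISTS (SELECT 1 FROM mytable t2
                      WHERE t2.reference = t1.reference AND t2.status <> 1)
"""

NOT_IN_QUERY = """
    SELECT SUM(amount) FROM mytable
    WHERE reference NOT IN (SELECT reference FROM mytable WHERE status <> 1)
"""

def run_sum(query, rows):
    """Load rows into an in-memory table and return the single SUM value (None if NULL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (reference INTEGER, amount INTEGER, status INTEGER)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    (total,) = conn.execute(query).fetchone()
    conn.close()
    return total

if __name__ == "__main__":
    print(run_sum(NOT_EXISTS_QUERY, ROWS), run_sum(NOT_IN_QUERY, ROWS))
    # Hypothetical extra row: a NULL reference with a non-1 status.
    with_null = ROWS + [(None, 50, 2)]
    print(run_sum(NOT_EXISTS_QUERY, with_null), run_sum(NOT_IN_QUERY, with_null))
```

If reference can be NULL, NOT EXISTS is the safer of the two; when the column is declared NOT NULL, both forms return the same sum.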
select sum (amount) as amount
from (
select sum(amount) as amount
from t
group by reference
having not bool_or(status <> 1)
) s;
amount
--------
550
You could use window functions to count occurrences of a status other than 1 within each group:
SELECT SUM(amount) AS amount
FROM (SELECT *,COUNT(*) FILTER(WHERE status<>1) OVER(PARTITION BY reference) cnt
FROM tc) AS sub
WHERE cnt = 0;
Rextester Demo

SQL Query Help: Returning distinct values from Count subquery

I've been stuck for quite a while now trying to get this query to work.
Here's the setup:
I have a [Notes] table that contains a nonunique (Number) column and a nonunique (Result) column. I'm looking to create a SELECT statement that will display each distinct (Number) value where the count of the {(Number), (Result)} tuple where Result = 'NA' is > 25.
Number | Result
100 | 'NA'
100 | 'TT'
101 | 'NA'
102 | 'AM'
100 | 'TT'
200 | 'NA'
200 | 'NA'
201 | 'NA'
Basically, have an autodialer that calls a number and returns a code depending on the results of the call. We want to ignore numbers that have had an 'NA'(no answer) code returned more than 25 times.
My basic attempts so far have been similar to:
SELECT DISTINCT n1.Number
FROM Notes n1
WHERE (SELECT COUNT(*) FROM Notes n2
WHERE n1.Number = n2.Number and n1.Result = 'NA') > 25
I know this query isn't correct, but in general I'm not sure how to relate the DISTINCT n1.Number from the initial select to the Number used in the subquery COUNT. Most examples I see aren't actually doing this by adding a condition to the COUNT returned. I haven't had to touch too much SQL in the past half decade, so I'm quite rusty.
You can do it like this:
SELECT Number
FROM Notes
WHERE Result = 'NA'
GROUP BY Number
HAVING COUNT(Result) > 25
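To see the WHERE plus HAVING pattern at work on the small sample from the question, the sketch below runs it through Python's sqlite3 module (an assumption for self-containment). The threshold is turned into a parameter, which is not part of the original query, because the sample has far fewer than 25 rows per number.

```python
import sqlite3

# Sample rows from the question: (Number, Result).
NOTES = [
    (100, "NA"), (100, "TT"), (101, "NA"), (102, "AM"),
    (100, "TT"), (200, "NA"), (200, "NA"), (201, "NA"),
]

def numbers_over_na_threshold(rows, threshold):
    """Return the numbers whose count of 'NA' results is greater than threshold."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Notes (Number INTEGER, Result TEXT)")
    conn.executemany("INSERT INTO Notes VALUES (?, ?)", rows)
    result = [
        row[0]
        for row in conn.execute(
            # WHERE filters rows before grouping; HAVING filters the groups.
            "SELECT Number FROM Notes WHERE Result = 'NA' "
            "GROUP BY Number HAVING COUNT(Result) > ? ORDER BY Number",
            (threshold,),
        )
    ]
    conn.close()
    return result

if __name__ == "__main__":
    print(numbers_over_na_threshold(NOTES, 1))
    print(numbers_over_na_threshold(NOTES, 25))
```

Because GROUP BY already yields one row per Number, no DISTINCT is needed.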
Try this:
SELECT Number
FROM (
SELECT Number, Count(Result) as CountNA
FROM Notes
WHERE Result = 'NA'
GROUP BY Number
)
WHERE CountNA > 25
EDIT: depending on SQL product, you may need to give the derived table a table correlation name e.g.
SELECT DT1.Number
FROM (
SELECT Number, Count(Result) as CountNA
FROM Notes
WHERE Result = 'NA'
GROUP BY Number
) AS DT1 (Number, CountNA)
WHERE DT1.CountNA > 25;