Django ORM remove unwanted Group by when annotate multiple aggregate columns - sql

I want to create a query something like this in django ORM.
SELECT COUNT(CASE WHEN myCondition THEN 1 ELSE NULL end) as numyear
FROM myTable
Following is the djang ORM query i have written
year_case = Case(When(added_on__year = today.year, then=1), output_field=IntegerField())
qs = (ProfaneContent.objects
.annotate(numyear=Count(year_case))
.values('numyear'))
This is the query which is generated by django orm.
SELECT COUNT(CASE WHEN "analyzer_profanecontent"."added_on" BETWEEN 2020-01-01 00:00:00+00:00 AND 2020-12-31 23:59:59.999999+00:00 THEN 1 ELSE NULL END) AS "numyear" FROM "analyzer_profanecontent" GROUP BY "analyzer_profanecontent"."id"
All other things are good, but django places a GROUP BY at the end leading to multiple rows and incorrect answer. I don't want that at all. Right now there is just one column but i will place more such columns.
EDIT BASED ON COMMENTS
I will be using the qs variable to get values of how my classifications have been made in the current year, month, week.
UPDATE
On the basis of comments and answers i am getting here let me clarify. I want to do this at the database end only (obviously using Django ORM and not RAW SQL). Its a simple sql query. Doing anything at Python's end will be inefficient since the data can be too large. Thats why i want the database to get me the sum of records based on the CASE condition.
I will be adding more such columns in the future so something like len() or .count will not work.
I just want to create the above mentioned query using Django ORM (without an automatically appended GROUP BY).

When using aggregates in annotations, django needs to have some kind of grouping, if not it defaults to primary key. So, you need to use .values() before .annotate(). Please see django docs.
But to completely remove group by you can use a static value and django is smart enough to remove it completely, so you get your result using ORM query like this:
year_case = Case(When(added_on__year = today.year, then=1), output_field=IntegerField())
qs = (ProfaneContent.objects
.annotate(dummy_group_by = Value(1))
.values('dummy_group_by')
.annotate(numyear=Count(year_case))
.values('numyear'))

If you need to summarize only to one row then you should to use an .aggregate() method instead of annotate().
result = ProfaneContent.objects.aggregate(
numyear=Count(year_case),
# ... more aggregated expressions are possible here
)
You get a simple dictionary of result columns:
>>> result
{'numyear': 7, ...}
The generated SQL query is without groups, exactly how required:
SELECT
COUNT(CASE WHEN myCondition THEN 1 ELSE NULL end) as numyear
-- and more possible aggregated expressions
FROM myTable

What about a list comprehension:
# get all the objects
profane = ProfaneContent.objects.all()
# Something like this
len([pro for pro in profane if pro.numyear=today.year])
if the num years are equal it will add it to the list, so at the and you can check the len()
to get the count
Hopefully this is helpfull!

This is how I would write it in SQL.
SELECT SUM(CASE WHEN myCondition THEN 1 ELSE 0 END) as numyear
FROM myTable
SELECT
SUM(CASE WHEN "analyzer_profanecontent"."added_on"
BETWEEN 2020-01-01 00:00:00+00:00
AND 2020-12-31 23:59:59.999999+00:00
THEN 1
ELSE 0
END) AS "numyear"
FROM "analyzer_profanecontent"
GROUP BY "analyzer_profanecontent"."id"
If you intend to use other items in the SELECT clause I would recommend using a group by as well which would look like this:
SELECT SUM(CASE WHEN myCondition THEN 1 ELSE 0 END) as numyear
FROM myTable
GROUP BY SUM(CASE WHEN myCondition THEN 1 ELSE 0 END)

Related

SQL CASE WHEN- can I do a function within a function? New to SQL

SELECT
SP.SITE,
SYS.COMPANY,
SYS.ADDRESS,
SP.CUSTOMER,
SP.STATUS,
DATEDIFF(MONTH,SP.MEMBERSINCE, SP.EXPIRES) AS MONTH_COUNT
CASE WHEN(MONTH_COUNT = 0 THEN MONTH_COUNT = DATEDIFF(DAY,SP.MEMBERSINCE, SP.EXPIRES) AS DAY_COUNT)
ELSE NULL
END
FROM SALEPASSES AS SP
INNER JOIN SYSTEM AS SYS ON SYS.SITE = SP.SITE
WHERE STATUS IN (7,27,29);
I am still trying to understand SQL. Is this the right order to have everything? I'm assuming my datediff() is unable to work because it's inside case when. What I am trying to do, is get the day count if month_count is less than 1 (meaning it's less than one month and we need to count the days between the dates instead). I need month_count to run first to see if doing the day_count would even be necessary. Please give me feedback, I'm new and trying to learn!
Case is an expression, it returns a value, it looks like you should be doing this:
DAY_COUNT =
CASE WHEN DATEDIFF(MONTH,SP.MEMBERSINCE, SP.EXPIRES) = 0
THEN DATEDIFF(DAY,SP.MEMBERSINCE, SP.EXPIRES))
ELSE NULL END
You shouldn't actually need else null as NULL is the default.
Note also you [usually] cannot refer to a derived column in the same select
It appears that what you are trying to do is define the MonthCount column's value, and then reuse that value in another column's definition. (The Don't Repeat Yourself principle.)
In most dialects of SQL, you can't do that. Including MS SQL Server.
That's because SQL is a "declarative" language. This means that SQL Server is free to calculate the column values in any order that it likes. In turn, that means you're not allowed to do anything that would rely on one column being calculated before another.
There are two basic ways around that...
First, use CTEs or sub-queries to create two different "scopes", allowing you to define MonthCount before DayCount, and so reuse the value without retyping the definition.
SELECT
*,
CASE WHEN MonthCount = 0 THEN foo ELSE NULL END AS DayCount
FROM
(
SELECT
*,
bar AS MonthCount
FROM
x
)
AS derive_month
The second main way is to somehow derive the value Before the SELECT block is evaluated. In this case, using APPLY to 'join' a single value on to each input row...
SELECT
x.*,
MonthCount,
CASE WHEN MonthCount = 0 THEN foo ELSE NULL END AS DayCount
FROM
x
CROSS APPLY
(
SELECT
bar AS MonthCount
)
AS derive_month

SQL multiple constrained counts in query

I am trying to get with 1 query multiple count results where each one is a subset of the previous one.
So my table would be called Recipe and has these columns:
recipe_num(Primary_key) decimal,
recipe_added date,
is_featured bit,
liked decimal
And what I want is to make a query that will return the amount of likes grouped by day for any particular month with
total recipes as total_recipes,
total recipes that were featured as featured_recipes,
total number of recipes that were featured and had more than 100 likes liked_recipes
So as you can see each they are all counts with each being a subset of the previous one.
Ideally I don't want to run separate select count's where that query the whole table but rather get from the previous one.
I am not very good at using count with Where, Having, etc... and not exactly sure how to do it, so far I have the following which I managed via digging around here.
select
recipe_added,
count(*) total_recipes,
count(case is_featured when 1 then 1 else null end) total_featured_recipes
from
RECIPES
group by
recipe_added
I am not exactly sure why I have to use case inside the count but I wasn't able to get it to work using WHERE, would like to know if this is possible as well.
Thanks
With a CASE expression inside COUNT() you are doing conditional aggregation and this is exactly what you need for this requirement:
select recipe_added,
count(*) total_recipes,
count(case when is_featured = 1 then 1 end) total_featured_recipes,
count(case when is_featured = 1 and liked > 100 then 1 end) liked_recipes
from Recipes
group by recipe_added
There is no need for ELSE null because the default behavior of a CASE expression is to return null when no other branch returns a value.
If you want results for a specific month, say October 2020, you can add a WHERE clause before the GROUP BY:
where format(recipe_added, 'yyyyMM') = '202010'
This will work for SQL Server.
If you are using a different database then you can use a similar approach.

Conditional COUNT within CASE statement

Table 1, is a list of clients, what membership they have, what service they used, and the date the service was used
Table 2, is just table 1 grouped by month and membership type, then a count of the service sessions
What I am trying to do is count membership sessions only by particular service types. This is what I have so far, it returns an error saying 'Service_Type' is not in an aggregate function or group by clause, when I put 'Service_Type' in a group by, the query has no errors but the SESSIONS column is all NULL.
SELECT
DATEFROMPARTS(YEAR(t1.Date),MONTH(t1.Date),1)AS 'Draft_Date',
Membership,
CASE
WHEN Membership = 5 AND Service_Type = 'A' THEN COUNT(*)
WHEN Membership = 2 AND Service_Type IN ('J','C')
END AS'SESSIONS'
FROM Table1 t1
GROUP BY DATEFROMPARTS(YEAR(t1.Date),MONTH(t1.Date),1),Membership
The case statement will include all memberships and service types but I think this is enough for my example. Any help would be greatly appreciated! I've been on this for days.
Table 1
Table 2
You were nearly there! I've made a few changes:
SELECT
DATEFROMPARTS(YEAR(t1.Date), MONTH(t1.Date),1) AS Draft_Date,
Membership,
COUNT(CASE WHEN t1.Membership = 5 AND t1.Service_Type = 'A' THEN 1 END) as m5stA,
COUNT(CASE WHEN t1.Membership = 2 AND t1.Service_Type IN ('J','C') THEN 1 END) as m2stJC
FROM Table1 t1
GROUP BY YEAR(t1.Date), MONTH(t1.Date), Membership
Changes:
Avoid using apostrophes to alias column names, use ascii standard " double quotes if you must
When doing a conditional count, put the count outside the CASE WHEN, and have the case when return something (any non null thing will be fine - i used 1, but it could also have been 'x' etc) when the condition is met. Don't put an ELSE - CASE WHEN will return null if there is no ELSE and the condition is not met, and nulls don't COUNT (you could also write ELSE NULL, though it's redundant)
Qualify all your column names, always - this helps keep the query working when more tables are added in future, or even if new columns with the same names are added to existing tables
You forgot a THEN in the second WHEN
You don't necessarily need to GROUP BY the output of DATEFROMPARTS. When a deterministic function is used (always produces the same output from the same inputs) the db is smart enough to know that grouping on the inputs is also fine
Your example data didn't contain any data that would make the COUNT count 1+ by the way, but I'm sure you will have other conditional counts that work out (it just made it harder to test)
use sum
SELECT DATEFROMPARTS(YEAR(t1.Date),MONTH(t1.Date),1) AS Draft_Date , Membership,
sum(CASE WHEN Membership = 5 AND Service_Type = 'A' THEN 1 else 0 end),
sum(case WHEN Membership = 2 AND Service_Type IN ('J','C') then 1 else 0 end)
FROM Table1 t1 group by DATEFROMPARTS(YEAR(t1.Date),MONTH(t1.Date),1)

Return NULL instead of 0 when using COUNT(column) SQL Server

I have query which running fine and its doing two types of work, COUNT and SUM.
Something like
select
id,
Count (contracts) as countcontracts,
count(something1),
count(something1),
count(something1),
sum(cost) as sumCost
from
table
group by
id
My problem is: if there is no contract for a given ID, it will return 0 for COUNT and Null for SUM. I want to see null instead of 0
I was thinking about case when Count (contracts) = 0 then null else Count (contracts) end but I don't want to do it this way because I have more than 12 count positions in query and its prepossessing big amount of records so I think it may slow down query performance.
Is there any other ways to replace 0 with NULL?
Try this:
select NULLIF ( Count(something) , 0)
Here are three methods:
1. (case when count(contracts) > 0 then count(contracts) end) as countcontracts
2. sum(case when contracts is not null then 1 end) as countcontracts
3. nullif(count(contracts), 0)
All three of these require writing more complicated expressions. However, this really isn't that difficult. Just copy the line multiple times, and change the name of the variable on each one. Or, take the current query, put it into a spreadsheet and use spreadsheet functions to make the transformation. Then copy the function down. (Spreadsheets are really good code generators for repeated lines of code.)

Sql Server equivalent of a COUNTIF aggregate function

I'm building a query with a GROUP BY clause that needs the ability to count records based only on a certain condition (e.g. count only records where a certain column value is equal to 1).
SELECT UID,
COUNT(UID) AS TotalRecords,
SUM(ContractDollars) AS ContractDollars,
(COUNTIF(MyColumn, 1) / COUNT(UID) * 100) -- Get the average of all records that are 1
FROM dbo.AD_CurrentView
GROUP BY UID
HAVING SUM(ContractDollars) >= 500000
The COUNTIF() line obviously fails since there is no native SQL function called COUNTIF, but the idea here is to determine the percentage of all rows that have the value '1' for MyColumn.
Any thoughts on how to properly implement this in a MS SQL 2005 environment?
You could use a SUM (not COUNT!) combined with a CASE statement, like this:
SELECT SUM(CASE WHEN myColumn=1 THEN 1 ELSE 0 END)
FROM AD_CurrentView
Note: in my own test NULLs were not an issue, though this can be environment dependent. You could handle nulls such as:
SELECT SUM(CASE WHEN ISNULL(myColumn,0)=1 THEN 1 ELSE 0 END)
FROM AD_CurrentView
I usually do what Josh recommended, but brainstormed and tested a slightly hokey alternative that I felt like sharing.
You can take advantage of the fact that COUNT(ColumnName) doesn't count NULLs, and use something like this:
SELECT COUNT(NULLIF(0, myColumn))
FROM AD_CurrentView
NULLIF - returns NULL if the two passed in values are the same.
Advantage: Expresses your intent to COUNT rows instead of having the SUM() notation.
Disadvantage: Not as clear how it is working ("magic" is usually bad).
I would use this syntax. It achives the same as Josh and Chris's suggestions, but with the advantage it is ANSI complient and not tied to a particular database vendor.
select count(case when myColumn = 1 then 1 else null end)
from AD_CurrentView
How about
SELECT id, COUNT(IF status=42 THEN 1 ENDIF) AS cnt
FROM table
GROUP BY table
Shorter than CASE :)
Works because COUNT() doesn't count null values, and IF/CASE return null when condition is not met and there is no ELSE.
I think it's better than using SUM().
Adding on to Josh's answer,
SELECT COUNT(CASE WHEN myColumn=1 THEN AD_CurrentView.PrimaryKeyColumn ELSE NULL END)
FROM AD_CurrentView
Worked well for me (in SQL Server 2012) without changing the 'count' to a 'sum' and the same logic is portable to other 'conditional aggregates'. E.g., summing based on a condition:
SELECT SUM(CASE WHEN myColumn=1 THEN AD_CurrentView.NumberColumn ELSE 0 END)
FROM AD_CurrentView
It's 2022 and latest SQL Server still doesn't have COUNTIF (along with regex!). Here's what I use:
-- Count if MyColumn = 42
SELECT SUM(IIF(MyColumn = 42, 1, 0))
FROM MyTable
IIF is a shortcut for CASE WHEN MyColumn = 42 THEN 1 ELSE 0 END.
Not product-specific, but the SQL standard provides
SELECT COUNT() FILTER WHERE <condition-1>,
COUNT() FILTER WHERE <condition-2>, ...
FROM ...
for this purpose. Or something that closely resembles it, I don't know off the top of my hat.
And of course vendors will prefer to stick with their proprietary solutions.
Why not like this?
SELECT count(1)
FROM AD_CurrentView
WHERE myColumn=1
I had to use COUNTIF() in my case as part of my SELECT columns AND to mimic a % of the number of times each item appeared in my results.
So I used this...
SELECT COL1, COL2, ... ETC
(1 / SELECT a.vcount
FROM (SELECT vm2.visit_id, count(*) AS vcount
FROM dbo.visitmanifests AS vm2
WHERE vm2.inactive = 0 AND vm2.visit_id = vm.Visit_ID
GROUP BY vm2.visit_id) AS a)) AS [No of Visits],
COL xyz
FROM etc etc
Of course you will need to format the result according to your display requirements.
SELECT COALESCE(IF(myColumn = 1,COUNT(DISTINCT NumberColumn),NULL),0) column1,
COALESCE(CASE WHEN myColumn = 1 THEN COUNT(DISTINCT NumberColumn) ELSE NULL END,0) AS column2
FROM AD_CurrentView