hql conditional expressions in aggregate functions - nhibernate

Does HQL support conditional expressions in aggregate functions?
I would like to do something like this
select
o.id,
count(o),
count(o.something is null and o.otherthing = :other)
from objects o
But I get a MissingTokenException from the Antlr parser.
EDIT: By using a subquery it's working, but it's ugly as I'm grouping by several variables...

You can use expressions in HQL. For this instance, you'll want to use SUM instead of COUNT as follows:
select
o.id,
count(o),
sum(case when o.something is null and o.otherthing = :other then 1 else 0 end)
from objects o
When the conditionals match, the SUM will receive a 1 for that row. When they do not match, it'll receive a zero. This should provide the same functionality as a COUNT.
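To check the idea outside of NHibernate, here is a minimal sketch that runs the same SUM(CASE ...) pattern as plain SQL through Python's sqlite3 module. The table layout, the sample rows, and the helper name conditional_counts are assumptions made for illustration; only the query shape comes from the answer above.

```python
import sqlite3

def conditional_counts(rows, other):
    """Return (total, matching) using SUM(CASE ...) as a conditional count."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE objects (id INTEGER, something TEXT, otherthing TEXT)"
    )
    conn.executemany("INSERT INTO objects VALUES (?, ?, ?)", rows)
    total, matching = conn.execute(
        """
        SELECT COUNT(*),
               SUM(CASE WHEN something IS NULL AND otherthing = :other
                        THEN 1 ELSE 0 END)
        FROM objects
        """,
        {"other": other},
    ).fetchone()
    conn.close()
    return total, matching

# Hypothetical sample data: (id, something, otherthing)
sample = [(1, None, "x"), (2, "set", "x"), (3, None, "y"), (4, None, "x")]
print(conditional_counts(sample, "x"))  # (4, 2)
```

One difference worth knowing: over zero rows, SUM returns NULL where COUNT would return 0.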

Related

SQL selecting with conditioning from a subquery

I am trying to perform two sum functions from a query. However, I want to only perform one of the sum functions if it meets a certain condition without affecting the other sum function.
What I was thinking is to use something similar to select x where condition = 1 from AC which is however not possible.
Here is the sample query where I want the second [sum(t.match)] selection to only calculate if the result in the subquery: match = 1 while still getting the total sum of all qqty.
select
sum(t.qqty), sum(t.qqty)
from
(select
car, cqty, qqty,
case when cqty = qqty then 1 else 0 end as match,
location, state) t
Use conditional aggregation -- that is case as the argument to the sum():
select sum(t.qqty), sum(case when condition = 1 then t.qqty else 0 end)
from t;
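As a rough illustration of the answer above, the sketch below sums qqty twice in one pass, once unconditionally and once only for matching rows, using Python's sqlite3 module. The table name AC is taken from the question; the sample rows and the function name are made up, and the flag column is called is_match here instead of match as a precaution, since MATCH is a keyword in SQLite.

```python
import sqlite3

def total_and_matched(rows):
    """Return (sum of all qqty, sum of qqty where cqty = qqty)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE AC (car TEXT, cqty INTEGER, qqty INTEGER)")
    conn.executemany("INSERT INTO AC VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT SUM(t.qqty),
               SUM(CASE WHEN t.is_match = 1 THEN t.qqty ELSE 0 END)
        FROM (SELECT car, cqty, qqty,
                     CASE WHEN cqty = qqty THEN 1 ELSE 0 END AS is_match
              FROM AC) t
        """
    ).fetchone()
    conn.close()
    return result

# Hypothetical rows: (car, cqty, qqty)
cars = [("a", 2, 2), ("b", 3, 5), ("c", 4, 4)]
print(total_and_matched(cars))  # (11, 6)
```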

Django ORM remove unwanted Group by when annotate multiple aggregate columns

I want to create a query something like this in django ORM.
SELECT COUNT(CASE WHEN myCondition THEN 1 ELSE NULL end) as numyear
FROM myTable
Following is the Django ORM query I have written:
year_case = Case(When(added_on__year = today.year, then=1), output_field=IntegerField())
qs = (ProfaneContent.objects
.annotate(numyear=Count(year_case))
.values('numyear'))
This is the query which is generated by django orm.
SELECT COUNT(CASE WHEN "analyzer_profanecontent"."added_on" BETWEEN 2020-01-01 00:00:00+00:00 AND 2020-12-31 23:59:59.999999+00:00 THEN 1 ELSE NULL END) AS "numyear" FROM "analyzer_profanecontent" GROUP BY "analyzer_profanecontent"."id"
Everything else is fine, but Django places a GROUP BY at the end, leading to multiple rows and an incorrect answer. I don't want that at all. Right now there is just one column, but I will add more such columns.
EDIT BASED ON COMMENTS
I will be using the qs variable to get values of how my classifications have been made in the current year, month, week.
UPDATE
On the basis of the comments and answers I am getting here, let me clarify. I want to do this on the database side only (using the Django ORM, not raw SQL). It's a simple SQL query. Doing anything on the Python side will be inefficient since the data can be too large. That's why I want the database to give me the count of records based on the CASE condition.
I will be adding more such columns in the future, so something like len() or .count() will not work.
I just want to create the above mentioned query using Django ORM (without an automatically appended GROUP BY).
When using aggregates in annotations, Django needs some kind of grouping; if none is given, it defaults to the primary key. So you need to use .values() before .annotate(). Please see the Django docs.
But to remove the GROUP BY completely, you can group by a static value. Django is smart enough to drop it entirely, so you get your result with an ORM query like this:
year_case = Case(When(added_on__year = today.year, then=1), output_field=IntegerField())
qs = (ProfaneContent.objects
.annotate(dummy_group_by = Value(1))
.values('dummy_group_by')
.annotate(numyear=Count(year_case))
.values('numyear'))
If you need to summarize to only one row, you should use the .aggregate() method instead of annotate().
result = ProfaneContent.objects.aggregate(
numyear=Count(year_case),
# ... more aggregated expressions are possible here
)
You get a simple dictionary of result columns:
>>> result
{'numyear': 7, ...}
The generated SQL query has no GROUP BY, exactly as required:
SELECT
COUNT(CASE WHEN myCondition THEN 1 ELSE NULL end) as numyear
-- and more possible aggregated expressions
FROM myTable
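To see what the GROUP BY on the primary key does to the result, here is a minimal sketch that runs both query shapes as plain SQL with Python's sqlite3 module instead of Django. The table name comes from the generated SQL in the question; the sample rows, the simplified date bounds, and the helper name count_rows are assumptions for illustration only.

```python
import sqlite3

# Simplified stand-in for the year filter; ISO strings compare correctly as text.
CONDITION = "added_on BETWEEN '2020-01-01' AND '2020-12-31 23:59:59'"

def count_rows(group_by_id):
    """Run the conditional COUNT with or without GROUP BY on the primary key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE analyzer_profanecontent "
        "(id INTEGER PRIMARY KEY, added_on TEXT)"
    )
    conn.executemany(
        "INSERT INTO analyzer_profanecontent (id, added_on) VALUES (?, ?)",
        [
            (1, "2020-03-01 10:00:00"),
            (2, "2020-07-15 09:30:00"),
            (3, "2019-12-31 23:00:00"),
        ],
    )
    sql = (
        f"SELECT COUNT(CASE WHEN {CONDITION} THEN 1 ELSE NULL END) AS numyear "
        "FROM analyzer_profanecontent"
    )
    if group_by_id:
        sql += " GROUP BY id ORDER BY id"
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows

print(count_rows(group_by_id=True))   # [(1,), (1,), (0,)] one row per record
print(count_rows(group_by_id=False))  # [(2,)] a single summary row
```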
What about a list comprehension:
# get all the objects
profane = ProfaneContent.objects.all()
# keep the ones added this year and count them
len([pro for pro in profane if pro.added_on.year == today.year])
If the years are equal, the object is added to the list, so at the end you can check len() to get the count.
Hopefully this is helpful!
This is how I would write it in SQL.
SELECT SUM(CASE WHEN myCondition THEN 1 ELSE 0 END) as numyear
FROM myTable
SELECT
SUM(CASE WHEN "analyzer_profanecontent"."added_on"
BETWEEN 2020-01-01 00:00:00+00:00
AND 2020-12-31 23:59:59.999999+00:00
THEN 1
ELSE 0
END) AS "numyear"
FROM "analyzer_profanecontent"
GROUP BY "analyzer_profanecontent"."id"
If you intend to use other items in the SELECT clause, I would recommend grouping by those items as well, which would look like this (other_column is a placeholder name):
SELECT other_column, SUM(CASE WHEN myCondition THEN 1 ELSE 0 END) as numyear
FROM myTable
GROUP BY other_column

Rails/SQL - How to filter by distinct value in a group

I'm working in Rails, but an answer in SQL is equally helpful. Let's say I have a table of Users and a table of Purchases. I want to find the Users who have only ever bought Item A. I was hoping to use a query along the lines of:
User.joins(:purchases).group(:id).having("DISTINCT(item) = 'A'").pluck(:id)
This is a simplification of the question I need to answer, but this grouping issue is my main roadblock. For that reason, I'm hoping for an answer that is logically very similar, as other workarounds would likely not apply.
Does this work in Rails?
User.joins(:purchases).group(:id).having("MIN(item) = MAX(item) AND MIN(item) = 'A'").pluck(:id)
This reads as: there is only one distinct value (since MIN() and MAX() are equal), and that value is 'A'.
Alternatively:
having("MAX(CASE WHEN item <> 'A' THEN 1 ELSE 0 END) = 0")
Which would stand for: no other value than 'A'.
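Both HAVING variants can be tried without a Rails app. The sketch below runs them as plain SQL through Python's sqlite3 module; the purchases table with user_id and item columns, the sample data, and the helper name users_with_only are assumptions, not the asker's real schema.

```python
import sqlite3

# Hypothetical purchases: (user_id, item)
PURCHASES = [(1, "A"), (1, "A"), (2, "A"), (2, "B"), (3, "B")]

# The two HAVING conditions from the answer, with the item as a parameter.
MIN_MAX = "MIN(item) = MAX(item) AND MIN(item) = :item"
NO_OTHER = "MAX(CASE WHEN item <> :item THEN 1 ELSE 0 END) = 0"

def users_with_only(item, having):
    """Return ids of users whose purchases all have the given item."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (user_id INTEGER, item TEXT)")
    conn.executemany("INSERT INTO purchases VALUES (?, ?)", PURCHASES)
    rows = conn.execute(
        f"SELECT user_id FROM purchases GROUP BY user_id "
        f"HAVING {having} ORDER BY user_id",
        {"item": item},
    ).fetchall()
    conn.close()
    return [user_id for (user_id,) in rows]

print(users_with_only("A", MIN_MAX))   # [1]
print(users_with_only("A", NO_OTHER))  # [1]
```

User 2 is excluded because one of their purchases is 'B', even though they also bought 'A'.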
In having you can use only aggregate functions (e.g. having count(id) > 2) or expressions on columns you did the grouping on e.g. having("id > 1").
So depending on your db, you may try to find an aggregate function that checks the items in each group per id.
For PostgreSQL that would be something like the following (untested), which requires every item in the group to be 'A':
...
GROUP BY id
HAVING 'A' = ALL(ARRAY_AGG(item))

PostgreSQL where clause not pushed down when using grouping sets

SELECT *
FROM (
SELECT SUM(quantity) AS quantity,
product_location_id,
location_bin_id,
product_lot_id,
product_serial_id,
CASE
WHEN GROUPING (product_location_id, location_bin_id, product_lot_id, product_serial_id) = 0 AND product_serial_id IS NOT NULL THEN
'Serial'
WHEN GROUPING (product_location_id, location_bin_id, product_lot_id, product_serial_id) = 0 THEN
'Lot'
ELSE
'Quantity'
END AS pick_by
FROM product_location_bins
WHERE status != 'Void'
AND has_quantity = 'Yes'
GROUP BY GROUPING SETS (
(product_location_id, location_bin_id, product_lot_id, product_serial_id),
(product_location_id, location_bin_id)
)
HAVING SUM(quantity) > 0
) x
WHERE x.product_serial_id = 5643
I have the above query. With a normal GROUP BY, Postgres is able to "push down" the outer WHERE clause and use the index on product_serial_id. When I use grouping sets, it is unable to do so: it resolves the entire inner query and then filters the results. I'm wondering why this is. Is it a limitation of grouping sets?
Your query is odd. Your outer where clause eliminates the second set of results from grouping sets, because product_serial_id would be NULL for the second set. This gets filtered out in the outer where.
I think you want something like this for the outer query:
WHERE x.product_serial_id = 5643 OR x.product_serial_id IS NULL
I suppose that Postgres could add optimizations for poorly written code -- that is, skip the work for the second grouping set because it is filtered out by the outer where. However, that is not usually the focus of optimizations.
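The point about the second grouping set can be illustrated with a small sketch. SQLite has no GROUPING SETS, so this emulates the two sets with a UNION ALL of two GROUP BY queries, which is an assumption made for the demo and says nothing about how the Postgres planner behaves. The sample rows and the helper name run are invented; the WHERE and HAVING filters of the inner query are left out to keep it short.

```python
import sqlite3

# Hypothetical rows: (quantity, location, bin, lot, serial)
ROWS = [
    (1, 10, 100, 7, 5643),
    (2, 10, 100, 7, 5644),
    (3, 11, 101, 8, 5645),
]

# Emulation of the two grouping sets: detailed rows, then per-bin totals
# whose lot and serial columns are NULL.
EMULATED = """
SELECT * FROM (
    SELECT SUM(quantity) AS quantity, product_location_id, location_bin_id,
           product_lot_id, product_serial_id
    FROM product_location_bins
    GROUP BY product_location_id, location_bin_id,
             product_lot_id, product_serial_id
    UNION ALL
    SELECT SUM(quantity), product_location_id, location_bin_id, NULL, NULL
    FROM product_location_bins
    GROUP BY product_location_id, location_bin_id
) x
WHERE {outer_filter}
"""

def run(outer_filter):
    """Return the set of result rows for a given outer WHERE condition."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product_location_bins (quantity INTEGER, "
        "product_location_id INTEGER, location_bin_id INTEGER, "
        "product_lot_id INTEGER, product_serial_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO product_location_bins VALUES (?, ?, ?, ?, ?)", ROWS
    )
    rows = conn.execute(EMULATED.format(outer_filter=outer_filter)).fetchall()
    conn.close()
    return set(rows)

# Only the detailed row survives; every per-bin total has a NULL serial.
print(run("x.product_serial_id = 5643"))
# Adding the IS NULL branch keeps the per-bin totals as well.
print(run("x.product_serial_id = 5643 OR x.product_serial_id IS NULL"))
```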

Multiple aggregate functions in Hibernate Query

I want to have an HQL query which essentially does this :
select quarter, sum(if(a>1, 1, 0)) as res1, sum(if(b>1, 1, 0)) as res2 from foo group by quarter;
I want a List<Summary> as my output, where Summary is this class:
class Summary
{
long res1;
long res2;
int quarter;
}
How can I achieve this aggregation in HQL? What will be the hibernate mappings for the target Object?
I don't want to use a SQL-style query that returns List<Object[]> and then transform it to List<Summary>.
Since Summary is not an entity, you don't need a mapping for it. You can create an appropriate constructor and use an HQL constructor expression instead. Aggregate functions and conditionals are also possible, though you need to use case syntax instead of if.
So, if Foo is an entity mapped to the table Foo it would look like this:
select new Summary(
f.quarter,
sum(case when f.a > 1 then 1 else 0 end),
sum(case when f.b > 1 then 1 else 0 end)
) from Foo f group by f.quarter
See also:
Chapter 16. HQL: The Hibernate Query Language
It might be possible with a subselect in the mapping. Have a look at More complex association mappings in the hibernate documentation. I've never tried that possibility.
But even the Hibernate guys recommend "... but it is more practical to handle these kinds of cases using HQL or a criteria query." That's what I would do: use the group-by in the HQL statement and work with the List. The extra time for copying this list into a list of suitable objects is negligible compared with the time the group-by takes in the database. But it seems you don't like this possibility.
A third possibility is to define a view in the database containing your group-by and then create a normal mapping for this view.