Multiple aggregate functions in Hibernate Query - sql

I want to have an HQL query which essentially does this :
select quarter, sum(if(a>1, 1, 0)) as res1, sum(if(b>1, 1, 0)) as res2 from foo group by quarter;
I want the output to be a List<Summary>, where Summary is the following class:
class Summary
{
long res1;
long res2;
int quarter;
}
How can I achieve this aggregation in HQL? What will be the hibernate mappings for the target Object?
I don't want to use an SQL-style query that returns List<Object[]> and then transform it into List<Summary>.

Since Summary is not an entity, you don't need a mapping for it. Create an appropriate constructor and use an HQL constructor expression instead; the constructor's parameters must match the select list in order and type. Aggregate functions and conditionals are also possible, though you need to use case syntax instead of if.
So, if Foo is an entity mapped to the table Foo it would look like this:
select new Summary(
f.quarter,
sum(case when f.a > 1 then 1 else 0 end),
sum(case when f.b > 1 then 1 else 0 end)
) from Foo f group by f.quarter
See also:
Chapter 16. HQL: The Hibernate Query Language
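To see what the case-based aggregation computes, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for the real one and runs plain SQL rather than HQL; the sample rows and the summarize helper are made up for illustration.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass
class Summary:
    # Field order matches the column order of the query below.
    quarter: int
    res1: int
    res2: int


def summarize(conn: sqlite3.Connection) -> List[Summary]:
    # Each CASE yields 1 for a matching row and 0 otherwise,
    # so SUM counts the matching rows per quarter in a single pass.
    rows = conn.execute(
        """
        SELECT quarter,
               SUM(CASE WHEN a > 1 THEN 1 ELSE 0 END),
               SUM(CASE WHEN b > 1 THEN 1 ELSE 0 END)
        FROM foo
        GROUP BY quarter
        ORDER BY quarter
        """
    ).fetchall()
    return [Summary(*row) for row in rows]


# Hypothetical sample data: (quarter, a, b).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (quarter INTEGER, a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?)",
    [(1, 2, 0), (1, 3, 5), (1, 0, 0), (2, 1, 2)],
)
summaries = summarize(conn)
```

Building each Summary directly from a row plays the same role as the constructor expression: the select list and the constructor arguments line up positionally.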

It might be possible with a subselect in the mapping. Have a look at More complex association mappings in the hibernate documentation. I've never tried that possibility.
But even the hibernate guys recommend "... but it is more practical to handle these kinds of cases using HQL or a criteria query." That's what I would do: Use the group-by in the HQL statement and work with the List. The extra time for copying this list into a list of suitable objects is negligible compared with the time which the group-by is using in the database. But it seems you don't like this possibility.
A third possibility is to define a view in the database containing your group-by and then create a normal mapping for this view.
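The view approach can be sketched as follows, again with an in-memory SQLite database standing in for the real one. The view name foo_summary and the sample rows are assumptions made for this example; the ORM mapping for the view is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (quarter INTEGER, a INTEGER, b INTEGER)")
# Hypothetical sample data: (quarter, a, b).
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?)",
    [(1, 2, 0), (1, 3, 5), (1, 0, 0), (2, 1, 2)],
)

# The view hides the GROUP BY, so clients can read it like an ordinary table.
conn.execute(
    """
    CREATE VIEW foo_summary AS
    SELECT quarter,
           SUM(CASE WHEN a > 1 THEN 1 ELSE 0 END) AS res1,
           SUM(CASE WHEN b > 1 THEN 1 ELSE 0 END) AS res2
    FROM foo
    GROUP BY quarter
    """
)

view_rows = conn.execute(
    "SELECT quarter, res1, res2 FROM foo_summary ORDER BY quarter"
).fetchall()
```

Since quarter is unique per row of the view, it is a natural candidate for the identifier when mapping the view.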

Related

How do I calculate percentage of one column in Jooq / SQL in only one transaction?

Question:
Grades Table
---------------
Name Score
"Bob" "A"
"Sally" "A"
"Joe" "B"
"Ann" "C"
Suppose I have this table, and I want to calculate what percentage of students have a C. The correct answer would be 25%. How do I do that in one transaction in JOOQ (or raw SQL if I must)? Or is it not possible? Thank you.
Bad solution: Two Transactions:
float numberOfC = database.fetchCountOfStudentsWithGrade("C"); //Transaction
float numberOfStudents = database.fetchCountOfStudents(); //Transaction
float percentage = numberOfC / numberOfStudents;
Good solution attempt: One Transaction - JOOQ
context.select(val(context.selectCount().from(TABLE1))
.div(val(context.selectCount().from(TABLE1)))) // This line has error
.fetch(0, int.class); //One transaction
//Error: Cannot resolve method `div(org.jooq.Param<T>)`
Jooq Docs for Arithmetic Expressions:
https://www.jooq.org/doc/latest/manual/sql-building/column-expressions/arithmetic-expressions/
In raw sql, you can do:
select avg(case when score = 'C' then 1.0 else 0 end) as c_ratio
from t;
The above is standard syntax and should work in all databases. In some databases, you can write this as:
select avg( score = 'C' ) as c_ratio
from t;
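Both forms can be tried against the Grades table from the question. This is a minimal sketch using Python's built-in sqlite3 module; SQLite happens to be one of the databases that accepts the shorter boolean form, and the table name grades is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (name TEXT, score TEXT)")
conn.executemany(
    "INSERT INTO grades VALUES (?, ?)",
    [("Bob", "A"), ("Sally", "A"), ("Joe", "B"), ("Ann", "C")],
)

# Standard form: average a 1.0/0 indicator over all rows.
(c_ratio,) = conn.execute(
    "SELECT AVG(CASE WHEN score = 'C' THEN 1.0 ELSE 0 END) FROM grades"
).fetchone()

# Shorter form: SQLite evaluates the comparison to 1 or 0.
(c_ratio_short,) = conn.execute(
    "SELECT AVG(score = 'C') FROM grades"
).fetchone()
```

Both queries read the table once and run as a single statement, so no second round trip is needed.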
Using SQL Standard FILTER (WHERE ..)
One option in jOOQ would be to use AggregateFunction.filterWhere() as such:
ctx.select(count().filterWhere(T.SCORE.eq("C"))
.cast(BigDecimal.class)
.div(count()))
.from(T)
.fetch();
The above is assuming the following static import:
import static org.jooq.impl.DSL.*;
HSQLDB and PostgreSQL have native support for the COUNT(*) FILTER (WHERE x) syntax. In all other databases, jOOQ will emulate this using COUNT(CASE WHEN x THEN 1 END).
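The emulated form can be checked by hand. The sketch below runs the COUNT(CASE WHEN x THEN 1 END) pattern as plain SQL through Python's sqlite3 module; it does not show what jOOQ actually generates, and the table name grades is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (name TEXT, score TEXT)")
conn.executemany(
    "INSERT INTO grades VALUES (?, ?)",
    [("Bob", "A"), ("Sally", "A"), ("Joe", "B"), ("Ann", "C")],
)

# A CASE without ELSE yields NULL for non-matching rows, and COUNT skips NULLs,
# so the first column counts only the rows where score = 'C'.
c_count, total = conn.execute(
    "SELECT COUNT(CASE WHEN score = 'C' THEN 1 END), COUNT(*) FROM grades"
).fetchone()

ratio = c_count / total
```

The division is done in Python here, which sidesteps the integer division that the cast to BigDecimal avoids in the jOOQ version.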
A note on the approach with correlated subqueries
In your question, you suggested an approach using subqueries that each do a COUNT(*) calculation. It's almost never a good idea to run several such subqueries if there's a solution that runs several aggregations in one step.

Aggregate multiple columns without groupBy in Slick 2.0

I would like to perform an aggregation with Slick that executes SQL like the following:
SELECT MIN(a), MAX(a) FROM table_a;
where table_a has an INT column a
In Slick given the table definition:
class A(tag: Tag) extends Table[Int](tag, "table_a") {
def a = column[Int]("a")
def * = a
}
val A = TableQuery[A]
val as = A.map(_.a)
It seems like I have 2 options:
1. Write something like: Query(as.min, as.max)
2. Write something like:
as
.groupBy(_ => 1)
.map { case (_, as) => (as.map(identity).min, as.map(identity).max) }
However, the generated sql is not good in either case. In 1, there are two separate sub-selects generated, which is like writing two separate queries. In 2, the following is generated:
select min(x2."a"), max(x2."a") from "table_a" x2 group by 1
However, Postgres rejects this syntax: it reads group by 1 as a reference to the first column of the select list, which here is an aggregate and so cannot be grouped on. Indeed AFAIK it is not possible to group by a constant value in Postgres, except by omitting the group by clause.
Is there a way to cause Slick to emit a single query with both aggregates without the GROUP BY?
The syntax error is a bug. I created a ticket: https://github.com/slick/slick/issues/630
The subqueries are a limitation of Slick's SQL compiler currently producing non-optimal code in this case. We are working on improving the situation.
As a workaround, here is a pattern to swap out the generated SQL under the hood and leave everything else intact: https://gist.github.com/cvogt/8054159
I use the following trick in SQL Server, and it seems to work in Postgres:
select min(x2."a"), max(x2."a")
from "table_a" x2
group by (case when x2.a = x2.a then 1 else 1 end);
Referencing a column in the group by expression tricks the compiler into thinking that there could be more than one group.
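One caveat with any GROUP BY workaround is that it does not behave exactly like the plain aggregate when the table is empty. The sketch below checks this with Python's sqlite3 module, using SQLite as a stand-in for Postgres or SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (a INTEGER)")

PLAIN = "SELECT MIN(a), MAX(a) FROM table_a"
GROUPED = (
    "SELECT MIN(a), MAX(a) FROM table_a "
    "GROUP BY (CASE WHEN a = a THEN 1 ELSE 1 END)"
)

# Empty table: the plain aggregate returns one row of NULLs,
# while the grouped query has no groups and returns no rows.
empty_plain = conn.execute(PLAIN).fetchall()
empty_grouped = conn.execute(GROUPED).fetchall()

conn.executemany("INSERT INTO table_a VALUES (?)", [(3,), (7,), (5,)])

# Non-empty table: both queries return the same single row.
full_plain = conn.execute(PLAIN).fetchall()
full_grouped = conn.execute(GROUPED).fetchall()
```

Code that reads the first row of the result needs to handle the empty result if the grouped form is used.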

Select records with highest values for each subset

I have a set of records of which some, but not all, have a 'path' field, and all have a 'value' field. I wish to select only those which either do not have a path, or have the largest value of all the records with a particular path.
That is, given these records:
Name: Path: Value:
A foo 5
B foo 6
C NULL 2
D bar 2
E NULL 4
I want to return B, C, D, and E, but not A (because A has a path and its path is the same as B's, but A has a lower value).
How can I accomplish this, using ActiveRecord, ARel and Postgres? Ideally, I would like a solution which functions as a scope.
You could use something like this by using 2 subqueries (will do only one SQL query which has subqueries). Did not test, but should get you in the right direction. This is for Postgres.
scope :null_ids, -> { where(path: nil).select('id') }
scope :non_null_ids, -> { where('path IS NOT NULL').select('DISTINCT ON (path) id').order('path, value desc, id') }
scope :stuff, -> {
subquery = [null_ids, non_null_ids].map{|q| "(#{q.to_sql})"}.join(' UNION ')
where("#{table_name}.id IN (#{subquery})")
}
If you are using a different DB, you might need to use group/order instead of distinct on for the non_null_ids scope. If the query runs slowly, put an index on path and value.
You get only 1 query and it's a chainable scope.
A straightforward transliteration of your description to SQL would look like this:
select name, path, value
from (
select name, path, value,
row_number() over (partition by path order by value desc) as r
from your_table
where path is not null
) as dt
where r = 1
union all
select name, path, value
from your_table
where path is null
You could wrap that in a find_by_sql and get your objects out the other side.
That query works like this:
The row_number window function allows us to group the rows by path, order each group by value, and then number the rows in each group. Play around with the SQL a bit inside psql and you'll see how this works; there are other window functions available that will allow you to do all sorts of wonderful things.
You're treating NULL path values separately from non-NULL paths, hence the path is not null in the inner query.
We can peel off the first row in each of the path groups by selecting those rows from the derived table that have a row number of one (i.e. where r = 1).
The path is null rows are handled by the second query.
UNION ALL is used to join the result sets of the two queries together.
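The query can be run against the sample records from the question. This is a minimal sketch using Python's sqlite3 module instead of psql; it assumes SQLite 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (name TEXT, path TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        ("A", "foo", 5),
        ("B", "foo", 6),
        ("C", None, 2),
        ("D", "bar", 2),
        ("E", None, 4),
    ],
)

QUERY = """
SELECT name, path, value
FROM (
    SELECT name, path, value,
           ROW_NUMBER() OVER (PARTITION BY path ORDER BY value DESC) AS r
    FROM your_table
    WHERE path IS NOT NULL
) AS dt
WHERE r = 1
UNION ALL
SELECT name, path, value
FROM your_table
WHERE path IS NULL
"""

# The result order is not guaranteed without an ORDER BY, so sort in Python.
selected = sorted(name for name, _path, _value in conn.execute(QUERY))
```

A is dropped because B shares its path and has the larger value; D is the only row for its path, and C and E come from the second branch.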
I can't think of any way to construct such a query using ActiveRecord nor can I think of any way to integrate such a query with ActiveRecord's scope mechanism. If you could easily access just the WHERE component of an ActiveRecord::Relation then you could augment the where path is not null and where path is null components of that query with the WHERE components of a scope chain. I don't know how to do that though.
In truth, I tend to abandon ActiveRecord at the drop of a hat. I find ActiveRecord to be rather cumbersome for most of the complicated things I do and not nearly as expressive as SQL. This applies to every ORM I've ever used so the problem isn't specific to ActiveRecord.
I have no experience with ActiveRecord, but here's a sample with SQLAlchemy to silence the just-use-SQL crowd ;)
q1 = Session.query(Record).filter(Record.path != None)
q1 = q1.distinct(Record.path).order_by(Record.path, Record.value.desc())
q2 = Session.query(Record).filter(Record.path == None)
query = q1.from_self().union(q2)
# Further chaining, e.g. query = query.filter(Record.value > 3) to return B, E
for record in query:
    print(record.name)

NHibernate: row/result counting: Projection.RowCount() VS. Projection.Count()

What is the exact difference between NHibernate's
Projection.RowCount()
and
Projection.Count()
when we are looking for number of rows/results?
Projection.Count expects you to pass the property that you want a count on, i.e.
Projection.Count("propertyName")
which translates to the following in SQL:
select Count(this.whateverNhibernateConvention) from table as this
whereas for Projection.RowCount you don't need to pass anything, which translates to
select Count(1) from table as this
At least, that is what I would expect.
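At the SQL level, the practical difference between counting a column and counting rows is how NULLs are treated. The sketch below shows this general SQL behavior with Python's sqlite3 module; it does not show the SQL that NHibernate itself emits, and the table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (name TEXT, path TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [("A", "foo"), ("B", "foo"), ("C", None)],
)

# COUNT(*) counts every row; COUNT(path) skips rows where path is NULL.
row_count, path_count = conn.execute(
    "SELECT COUNT(*), COUNT(path) FROM records"
).fetchone()
```

So the two counts agree only when the counted column contains no NULLs, for example a primary key.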

hql conditional expressions in aggregate functions

Does HQL support conditional expressions in aggregate functions?
I would like to do something like this
select
o.id,
count(o),
count(o.something is null and o.otherthing = :other)
from objects o
But I get a MissingTokenException from the Antlr parser.
EDIT: By using a subquery it's working, but it's ugly as I'm grouping by several variables...
You can use expressions in HQL. For this instance, you'll want to use SUM instead of COUNT as follows:
select
o.id,
count(o),
sum(case when o.something is null and o.otherthing = :other then 1 else 0 end)
from objects o
When the conditionals match, the SUM will receive a 1 for that row. When they do not match, it'll receive a zero. This should provide the same functionality as a COUNT.
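There is one edge case where the SUM form and a COUNT differ: with no input rows, SUM yields NULL while COUNT yields 0. The sketch below shows this in plain SQL using Python's sqlite3 module rather than HQL; the column types and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (id INTEGER, something TEXT, otherthing TEXT)")

QUERY = """
SELECT COUNT(*),
       SUM(CASE WHEN something IS NULL AND otherthing = :other THEN 1 ELSE 0 END),
       COUNT(CASE WHEN something IS NULL AND otherthing = :other THEN 1 END)
FROM objects
"""

# No rows yet: SUM has nothing to add up and returns NULL, COUNT returns 0.
empty_result = conn.execute(QUERY, {"other": "x"}).fetchone()

# Hypothetical sample data: only the first row satisfies both conditions.
conn.executemany(
    "INSERT INTO objects VALUES (?, ?, ?)",
    [(1, None, "x"), (2, "set", "x"), (3, None, "y")],
)
filled_result = conn.execute(QUERY, {"other": "x"}).fetchone()
```

If a NULL result is a problem, wrapping the SUM in coalesce(..., 0) is one common way to get a zero instead.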