I want to write a query that can be used as a subquery. But I noticed that a query like this:
x = Deal.all_objects.filter(state='created').values('id').annotate(cnt=Count('id')).values('cnt')
produces
SELECT COUNT("deals_deal"."id") AS "cnt"
FROM "deals_deal"
WHERE "deals_deal"."state" = created
GROUP BY "deals_deal"."id"
I don't need the GROUP BY; I just want to count the deals that match the filter.
I don't want to use .count() because it would not let me write a query like this:
Deal.all_objects.filter(
Q(creator=OuterRef('pk')) | Q(taker=OuterRef('pk'), state='completed')
).annotate(cnt=Count('pk')).values('cnt')
How can I modify the above query so that it counts without a GROUP BY?
What you are looking for is done with aggregate, not annotate.
x = Deal.objects.filter(state='created').values('id').aggregate(cnt=Count('id'))
# x is a dictionary {"cnt":9000}, you wouldn't even need the ".values('id')" now
This will result in a query something like this
SELECT COUNT("deals_deal"."id") AS "cnt"
FROM "deals_deal"
WHERE "deals_deal"."state" = created
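The difference between the two SQL shapes can be checked outside Django. The following is a minimal sketch using Python's built-in sqlite3 module; the table contents are made up for illustration, and the SQL is written by hand to mirror the two queries above rather than generated by the ORM.

```python
import sqlite3

# Hypothetical stand-in for the "deals_deal" table; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deals_deal (id INTEGER PRIMARY KEY, state TEXT)")
conn.executemany(
    "INSERT INTO deals_deal (state) VALUES (?)",
    [("created",), ("created",), ("completed",), ("created",)],
)

# Shape of the annotate() query: one group per primary key.
grouped = conn.execute(
    "SELECT COUNT(id) AS cnt FROM deals_deal WHERE state = ? GROUP BY id",
    ("created",),
).fetchall()

# Shape of the aggregate() query: no GROUP BY, so a single row comes back.
total = conn.execute(
    "SELECT COUNT(id) AS cnt FROM deals_deal WHERE state = ?",
    ("created",),
).fetchone()[0]

print(grouped)  # one row per matching deal, each with a count of 1
print(total)    # one number for the whole filtered set
```

Grouping by the primary key puts every row in its own group, which is why the grouped variant can never report more than 1 per row.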
Further cool things you can do with Aggregate
In my app's backend, using Knex with PostgreSQL, I'm trying to count the rows that share the same ID.
The issue is that whatever I do, the count is always 1, when in reality I have 2 rows for the same ID.
My table looks
In that table, I need to count the rows with the same conversation_id, which is 1.
The expected result should be count = 2
What I tried with Knex:
tx(tableName).select(columns.conversationId)
.whereIn(columns.conversationId, conversationIds)
.groupBy(columns.conversationId, columns.createdAt, columns.id);
If I try to remove columns.createdAt and columns.id from the groupBy, it complains that those columns need to be included in the GROUP BY or in an aggregate function.
If I remove those extra GROUP BY elements from the following SQL, I get the right result, but Knex doesn't like it and I'm stuck.
The generated SQL is as follows:
select
conversation_id ,
COUNT(*)
from
message
group by
conversation_id,
created_at ,
id ;
The result of this SQL is as follows
As you can see, the result is wrong, and I can't make it work correctly with Knex, which complains if I remove the elements from the groupBy.
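Tinkering with some expressions in the QueryLab, I wonder if something like the following will work. PostgreSQL requires every selected column that is not inside an aggregate to appear in the GROUP BY, so the query groups by the conversation ID and nothing else:
tx(tableName)
.select(columns.conversationId)
.whereIn(columns.conversationId, conversationIds)
.count()
.groupBy(columns.conversationId)
Which would give something like (these values are placeholders, obviously):
select "columns"."conversationId", count(*) from "tableName" where "columns"."conversationId" in (1, 2, 3) group by "columns"."conversationId"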
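The effect of the extra GROUP BY columns from the question can be reproduced with a small sketch. It uses Python's built-in sqlite3 module rather than Knex or PostgreSQL, and the two message rows are made up to match the situation described (two rows sharing conversation_id 1).

```python
import sqlite3

# Hypothetical "message" table with two rows in the same conversation.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message ("
    "id INTEGER PRIMARY KEY, conversation_id INTEGER, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO message (conversation_id, created_at) VALUES (?, ?)",
    [(1, "2021-01-01 10:00:00"), (1, "2021-01-01 10:05:00")],
)

# Grouping by id (unique per row) puts every row in its own group.
per_row = conn.execute(
    "SELECT conversation_id, COUNT(*) FROM message "
    "GROUP BY conversation_id, created_at, id"
).fetchall()

# Grouping by conversation_id alone yields one group per conversation.
per_conversation = conn.execute(
    "SELECT conversation_id, COUNT(*) FROM message "
    "WHERE conversation_id IN (1, 2, 3) "
    "GROUP BY conversation_id"
).fetchall()

print(per_row)           # two groups, each counting a single row
print(per_conversation)  # one group counting both rows
```

The count only reaches 2 when the selected and grouped columns are limited to conversation_id, because any unique column in the GROUP BY splits the rows apart again.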
I'm trying to write a query to return the town, and the number of runners from each town where the number of runners is greater than 5.
My query right now looks like this:
select hometown, count(hometown) from marathon2016 where count(hometown) > 5 group by hometown order by count(hometown) desc;
but sqlite3 responds with this:
Error: misuse of aggregate: count()
What am I doing wrong? Why can't I use count() here, and what should I use instead?
When you're trying to use an aggregate function (such as count) in a WHERE clause, you're usually looking for HAVING instead of WHERE:
select hometown, count(hometown)
from marathon2016
group by hometown
having count(*) > 5
order by count(*) desc
You can't use an aggregate in a WHERE clause because aggregates are computed across multiple rows (as specified by GROUP BY), but WHERE is used to filter individual rows to determine what row set GROUP BY will be applied to (i.e. WHERE happens before grouping and aggregates apply after grouping).
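The ordering described above (WHERE before grouping, HAVING after) can be observed directly. This is a minimal sketch with Python's built-in sqlite3 module; the runner names and towns are made up, and only the marathon2016 table name and hometown column come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE marathon2016 (runner TEXT, hometown TEXT)")
# Made-up data: six runners from one town, two from another.
rows = [(f"a{i}", "Springfield") for i in range(6)]
rows += [(f"b{i}", "Shelbyville") for i in range(2)]
conn.executemany("INSERT INTO marathon2016 VALUES (?, ?)", rows)

# An aggregate in WHERE is rejected: no groups exist yet at that stage.
try:
    conn.execute(
        "SELECT hometown, COUNT(hometown) FROM marathon2016 "
        "WHERE COUNT(hometown) > 5 GROUP BY hometown"
    )
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)

# HAVING filters the groups after they have been formed and counted.
big_towns = conn.execute(
    "SELECT hometown, COUNT(*) FROM marathon2016 "
    "GROUP BY hometown HAVING COUNT(*) > 5 "
    "ORDER BY COUNT(*) DESC"
).fetchall()

print(where_error)
print(big_towns)
```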
Try the following:
select
hometown,
count(hometown) as hometown_count
from
marathon2016
group by
hometown
having
hometown_count > 5
order by
hometown_count desc;
I have a model Company, and Company has many DailyData.
DailyData has the columns volume and date.
To calculate the average volume of the 10 most recent business days, I wrote:
class Array
def sum
inject(0) { |result, el| result + el }
end
def mean
sum.to_d / size
end
end
company = Company.first
company.daily_data.order(date: :desc).limit(10).pluck(:volume).mean
This code works fine, but I want to use postgres AVG() function.
company.daily_data.select('AVG(volume) as average_volume').order(date: :desc)
This code ends up with error:
PG::GroupingError: ERROR: column "daily_data.date" must appear in the GROUP BY clause or be used in an aggregate function
But if I put .group(:date) in the method chain, the SQL returns multiple results.
How can I get the average of the 10 most recent volumes using the PostgreSQL AVG() function?
An ActiveRecord query like this:
company.daily_data.select('AVG(volume) as average_volume').order(date: :desc)
doesn't really make much sense. avg is an aggregate function in SQL, so it needs to operate on a group of rows. But you're not telling the database how to group the rows; you're telling it to compute the average volume over the entire table and then order that single value by something that doesn't exist in the final result set.
Throwing a limit in:
company.daily_data
.select('AVG(volume) as average_volume')
.order(date: :desc)
.limit(10)
won't help either because the limit is applied after the order and by then you've already confused the database with your attempted avg(volume).
I'd probably use a derived table if I was doing this in SQL, something like:
select avg(volume) as average_volume
from (
select volume
from where_ever...
where what_ever...
order by date desc
limit 10
) dt
The derived table in the FROM clause finds the volumes that you want and then the overall query averages those 10 volumes.
Alternatively, you could use a subquery to grab the rows of interest:
select avg(volume) as average_volume
from where_ever...
where id in (
select id
from where_ever...
where what_ever...
order by date desc
limit 10
)
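Both SQL shapes can be tried against a throwaway table. The sketch below uses Python's built-in sqlite3 module instead of PostgreSQL; the daily_data columns follow the question, while the company_id column and the 15 rows of data are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_data ("
    "id INTEGER PRIMARY KEY, company_id INTEGER, date TEXT, volume INTEGER)"
)
# Made-up data: 15 consecutive days with volumes 1..15 for company 1.
conn.executemany(
    "INSERT INTO daily_data (company_id, date, volume) VALUES (1, ?, ?)",
    [(f"2020-01-{day:02d}", day) for day in range(1, 16)],
)

# Derived table: pick the 10 most recent volumes first, then average them.
derived_avg = conn.execute(
    "SELECT AVG(volume) AS average_volume FROM ("
    "  SELECT volume FROM daily_data WHERE company_id = 1"
    "  ORDER BY date DESC LIMIT 10"
    ") AS dt"
).fetchone()[0]

# Subquery: pick the 10 most recent ids, then average the matching rows.
subquery_avg = conn.execute(
    "SELECT AVG(volume) AS average_volume FROM daily_data WHERE id IN ("
    "  SELECT id FROM daily_data WHERE company_id = 1"
    "  ORDER BY date DESC LIMIT 10"
    ")"
).fetchone()[0]

print(derived_avg, subquery_avg)
```

In both variants the ORDER BY and LIMIT live in the inner query, so the outer AVG only ever sees the ten rows of interest.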
The subquery approach is fairly straightforward to implement with ActiveRecord, something like this:
ten_most_recent = company.daily_data.select(:id).order(:date => :desc).limit(10)
company.daily_data.where(:id => ten_most_recent).average(:volume)
If you throw a to_sql call on the end of the second line you should see something that looks like the subquery SQL.
You can also make the derived table approach work with ActiveRecord but it is a little less natural. There is a from method in ActiveRecord that will take an ActiveRecord query to build the from (select ...) derived table construction but you'll want to be sure to manually name the derived table:
ten_most_recent = company.daily_data.select(:volume).order(:date => :desc).limit(10)
AnyModelAtAll.from(ten_most_recent, 'dt').average('dt.volume')
You have to use a string argument to average and include the dt. prefix to keep ActiveRecord from trying to add its own table name.
Of course, you'd wrap all this in a method somewhere to hide the details. Perhaps an extension method on the daily_data association:
has_many :daily_data, ... do
def average_volume(days)
#...
end
end
so that you could say things like:
company.daily_data.average_volume(11)
This page implies that I can do something like this:
From thing In things
Group By thing.ID Into result = Group, Sum(thing.value)
And then be able to use the Sum of values later on. Perhaps in something like this (not actually working code):
From thing In things
Group By thing.name Into results = Group, Sum(thing.value)
Where results.Sum >= 10
From result In results
Select result
This should select all things where the sum of values for things with the same name is greater than 10.
Through more testing, I can't get results to be anything other than the first item in a so called aggregate list. In other words the following are effectively identical as far as I can tell:
Group By thing.name Into results = Group, Sum(thing.value)
Group By thing.name Into results = Group
What is this aggregate list even supposed to do and how can I use it?
A side note: The example in the link seems to use Count instead of Sum, but it actually uses two different definitions of Count and is worthless as an example.
It helps to understand what exactly is happening when you perform a LINQ query.
When you do a group by in the query syntax, you're creating result objects with property names of the key and aggregates.
Your query:
From thing In things
Group By thing.ID Into result = Group, Sum(thing.value)
Yields a collection of objects with properties:
ID - the key (implicitly taken from the ID property)
result - a collection of items that is in the group (explicit alias to Group)
Sum - the sum of adding the value of the items (implicitly taken from the aggregate name)
Once you understand that, you can do more filtering. To filter results where the sum is less than 10:
From thing In things
Group By thing.ID Into result = Group, Sum(thing.value)
Where Sum < 10
So to fix your other query, you would do this:
From thing In things
Group By thing.name Into results = Group, Sum(thing.value)
Where Sum >= 10
From result In results
Select result
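The shape of the grouped result (key, group members, aggregate) may be easier to see in a language-neutral form. The following is a rough Python analogy, not VB.NET or LINQ itself; the Thing class and the sample data are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Thing:
    name: str
    value: int


# Made-up sample data.
things = [Thing("a", 4), Thing("a", 7), Thing("b", 3), Thing("c", 10)]

# "Group By thing.name": collect the members of each group under its key.
groups = defaultdict(list)
for thing in things:
    groups[thing.name].append(thing)

# "Into results = Group, Sum(thing.value)": each result object carries the
# key, the group members, and the aggregate as separate properties.
grouped = [
    {"name": name, "results": members, "Sum": sum(t.value for t in members)}
    for name, members in groups.items()
]

# "Where Sum >= 10 / From result In results / Select result":
# keep the qualifying groups, then flatten them back into individual things.
selected = [t for g in grouped if g["Sum"] >= 10 for t in g["results"]]

print([(t.name, t.value) for t in selected])
```

The filter refers to the aggregate by its own name, not through the group alias, which mirrors the fix of writing Sum instead of results.Sum.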
I am getting very strange behavior on 2.0-M2. Consider the following against the GratefulDeadConcerts database:
Query 1
SELECT name, in('written_by') AS wrote FROM V WHERE type='artist'
This query returns a list of artists and the songs each has written; a majority of the rows have at least one song.
Query 2
Now try:
SELECT name, count(in('written_by')) AS num_wrote FROM V WHERE type='artist'
On my system (OSX Yosemite; Orient 2.0-M2), I see just one row:
name num_wrote
---------------------------
Willie_Cobb 224
This seems wrong. But I tried to better understand. Perhaps the count() causes the in() to look at all written_by edges...
Query 3
SELECT name, in('written_by') FROM V WHERE type='artist' GROUP BY name
Produces results similar to the first query.
Query 4
Now try count()
SELECT name, count(in('written_by')) FROM V WHERE type='artist' GROUP BY name
Wrong path -- So try LET variables...
Query 5
SELECT name, $wblist, $wbcount FROM V
LET $wblist = in('written_by'),
$wbcount = count($wblist)
WHERE type='artist'
Produces seemingly meaningless results:
You can see that the $wblist and $wbcount columns are inconsistent with one another, and the $wbcount values don't show any obvious progression like a cumulative result.
Note that the strange behavior is not limited to count(). For example, first() does similarly odd things.
count(), as in an RDBMS, is an aggregate function: it collapses all the records into a single value. For your purpose, .size() seems to be the right method to call:
in('written_by').size()
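The distinction between an aggregate over the whole result set and a per-record size can be pictured with a plain-Python analogy. This is not OrientDB code and does not reproduce its exact count() semantics; the artist names and song lists are made up.

```python
# Hypothetical artists mapped to the songs each one wrote.
written_by = {
    "artist_one": ["song_a"],
    "artist_two": ["song_b", "song_c", "song_d"],
    "artist_three": [],
}

# Aggregate style: the whole result set collapses into one value,
# so there is no meaningful per-artist row left to show next to it.
aggregate_count = sum(1 for _ in written_by.values())

# Per-record style (what .size() is after): one value for each artist.
per_artist_size = {name: len(songs) for name, songs in written_by.items()}

print(aggregate_count)
print(per_artist_size)
```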