Rails - get distinct events, sorted by the start date of associated event instances - sql

I've spent several hours going through StackOverflow and playing around with this query, but still can't get it to work! Hopefully an expert here on SO can make the pain go away...
I have two models, Event and EventInstance. An Event has_many EventInstances.
What I want to do is easily get a list of Events (not EventInstances), where:
Events are distinct and not repeated
Events are sorted by the start_date of the nearest EventInstance
Event instances have the attribute :active => true
Only event instances that have a start date in the future are returned
I currently have the query
Event.joins(:event_instances).select('distinct events.*').where('event_instances.start_date >= ?', Time.now).where('event_instances.active = true')
This returns a list of events, but not sorted by date. Excellent - so I am almost there!
If I change the query to add this on the end:
.order('event_instances.start_date')
I get the error:
PG::InvalidColumnReference: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
So I moved it to the select statement:
select('distinct event_instances.start_date, events.*')
Now I get
PG::UndefinedFunction: ERROR: function count(date, events) does not exist
I've tried moving methods around, using includes, everything but I still can't get it to work. Any help would be really appreciated! Thank you.

Try changing
.order('event_instances.start_date')
to
.order(:event_instances.start_date)
Or, if you need descending order, add the .reverse_order method to the end of the query.

This is the exact query which worked for my models Post and PostComments on both MySQL and PostgreSQL:
Post.joins(:post_comments).select('distinct post_comments.body, post_comments.created_at').order('post_comments.created_at desc')
So the equivalent should work for you too. If it still doesn't, please update your post with the fields of your models.
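A commonly suggested way around the SELECT DISTINCT / ORDER BY conflict is to drop DISTINCT, group by event, and order by the earliest matching instance date (GROUP BY events.id ORDER BY MIN(event_instances.start_date)). The plain-Ruby sketch below only illustrates that grouping logic on in-memory data. The Struct names, the helper upcoming_events, and the sample rows are made up for illustration; it is not an ActiveRecord query and has not been run against the asker's schema.

```ruby
require 'date'

# Hypothetical in-memory stand-ins for the Event and EventInstance models.
EventRow = Struct.new(:id, :name)
InstanceRow = Struct.new(:event_id, :start_date, :active)

def upcoming_events(events, instances, today)
  # Keep only active instances starting today or later (the two WHERE clauses).
  upcoming = instances.select { |i| i.active && i.start_date >= today }

  # Earliest upcoming start date per event (GROUP BY event + MIN(start_date)).
  nearest = upcoming.group_by(&:event_id)
                    .transform_values { |rows| rows.map(&:start_date).min }

  # Each event appears once, ordered by its nearest upcoming instance.
  events.select { |e| nearest.key?(e.id) }
        .sort_by { |e| nearest[e.id] }
end

events = [EventRow.new(1, 'Concert'), EventRow.new(2, 'Workshop'), EventRow.new(3, 'Meetup')]
instances = [
  InstanceRow.new(1, Date.new(2030, 5, 10), true),
  InstanceRow.new(1, Date.new(2030, 3, 1), true),
  InstanceRow.new(2, Date.new(2030, 2, 1), true),
  InstanceRow.new(2, Date.new(2030, 1, 1), false), # inactive, ignored
  InstanceRow.new(3, Date.new(2020, 1, 1), true)   # in the past, ignored
]

sorted = upcoming_events(events, instances, Date.new(2025, 1, 1))
puts sorted.map(&:name).inspect
```

Because each event is reduced to a single "nearest date" before sorting, there is no duplicate row to remove, which is why DISTINCT is no longer needed in this formulation.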

Why does Postgres not accept my count column?

I am building a Rails app with the following models:
# vote.rb
class Vote < ApplicationRecord
  belongs_to :person
  belongs_to :show
  scope :fulfilled, -> { where(fulfilled: true) }
  scope :unfulfilled, -> { where(fulfilled: false) }
end
# person.rb
class Person < ApplicationRecord
  has_many :votes, dependent: :destroy
  def self.order_by_votes(show = nil)
    count = 'nullif(votes.fulfilled, true)'
    count = "case when votes.show_id = #{show.id} AND NOT votes.fulfilled then 1 else null end" if show
    people = left_joins(:votes).group(:id).uniq!(:group)
    people = people.select("people.*, COUNT(#{count}) AS people.vote_count")
    people.order('people.vote_count DESC')
  end
end
The idea behind order_by_votes is to sort People by the number of unfulfilled votes, either counting all votes, or counting only votes associated with a given Show.
This seems to work fine when I test against SQLite. But when I switch to Postgres I get this error:
Error:
PeopleControllerIndexTest#test_should_get_previously_on_show:
ActiveRecord::StatementInvalid: PG::UndefinedColumn: ERROR: column people.vote_count does not exist
LINE 1: ...s"."show_id" = $1 GROUP BY "people"."id" ORDER BY people.vot...
^
If I dump the SQL using @people.to_sql, this is what I get:
SELECT people.*, COUNT(nullif(votes.fulfilled, true)) AS people.vote_count FROM "people" LEFT OUTER JOIN "votes" ON "votes"."person_id" = "people"."id" GROUP BY "people"."id" ORDER BY people.vote_count DESC
Why is this failing on Postgres but working on SQLite? And what should I be doing instead to make it work on Postgres?
(PS: I named the field people.vote_count, with a dot, so I can access it in my view without having to do another SQL query to actually view the vote count for each person in the view (not sure if this works) but I get the same error even if I name the field simply vote_count.)
(PS2: I recently added the .uniq!(:group) because of some deprecation warning for Rails 6.2, but I couldn't find any documentation for it so I am not sure I am doing it right, still the error is there without that part.)
Are you sure you're not getting a syntax error from PostgreSQL somewhere? If you do something like this:
select count(*) as t.vote_count from t ... order by t.vote_count
I get a syntax error before PostgreSQL gets to complain about there being no t.vote_count column.
No matter, the solution is to not try to put your vote_count in the people table:
people = people.select("people.*, COUNT(#{count}) AS vote_count")
...
people.order(vote_count: :desc)
You don't need it there; you'll still be able to reference vote_count just like any "normal" column in people. Anything in the select list will appear as an accessor on the resulting model instances, whether it is a column or not. Such values won't show up in the #inspect output (since that's generated from the table's columns), but you can call the accessor methods nonetheless.
Historically there have been quite a few AR problems (and bugs) in getting the right count by just using count on a scope, and I am not sure they are actually all gone.
That depends on the scope (AR version, relations, group, sort, uniq, etc). A default count call that a gem has to use generically on a scope is not a one-size-fits-all solution. For that known reason, Pagy allows you to pass the right count to its pagy method, as explained in the Pagy documentation.
Your scope might become complex and the default pagy collection.count(:all) may not get the actual count. In that case you can get the right count with some custom statement, and pass it to pagy.
@pagy, @records = pagy(collection, count: your_count)
Notice: pagy will efficiently skip its internal count query and will just use the passed :count variable.
So... just get your own calculated count and pass it to pagy, and it will not even try to use the default.
EDIT: I forgot to mention: you may want to try the pagy arel extra that:
adds specialized pagination for collections from sql databases with GROUP BY clauses, by computing the total number of results with COUNT(*) OVER ().
Thanks to all the comments and answers I have finally found a solution which I think is the best way to solve this.
First off, the issue occurred when I called pagy, which tried to count my scope by appending .count(:all). This is what caused the errors. The solution was to stop ordering by a "field" created in select() and to repeat the expression in .order() instead.
So here is the proper code:
def self.order_by_votes(show = nil)
  count = if show
    "case when votes.show_id = #{show.id} AND NOT votes.fulfilled then 1 else null end"
  else
    'nullif(votes.fulfilled, true)'
  end
  left_joins(:votes).group(:id)
    .uniq!(:group)
    .select("people.*, COUNT(#{count}) as vote_count")
    .order(Arel.sql("COUNT(#{count}) DESC"))
end
This sorts people by the number of unfulfilled votes for them, with the ability to count only votes for a given show. It works with pagy() and with pagy_arel(), which in my case is a much better fit, so the results can be properly paginated.
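To make the counting logic concrete: COUNT only counts non-NULL values, so nullif(votes.fulfilled, true) turns fulfilled votes into NULL and leaves the unfulfilled ones to be counted, and people without any votes still appear with a count of 0 because of the left join. Here is a plain-Ruby sketch of the same computation on in-memory data. The Struct names, the helper unfulfilled_vote_counts, and the sample rows are hypothetical, and this is not ActiveRecord code.

```ruby
# Hypothetical in-memory stand-ins for the Person and Vote models.
VotePerson = Struct.new(:id, :name)
VoteRow = Struct.new(:person_id, :show_id, :fulfilled)

def unfulfilled_vote_counts(people, votes, show_id = nil)
  counts = Hash.new(0)
  votes.each do |vote|
    next if vote.fulfilled                           # fulfilled votes are not counted
    next if show_id && vote.show_id != show_id       # optional per-show filter
    counts[vote.person_id] += 1
  end

  # People without matching votes get 0, as with the LEFT JOIN.
  # The name tie-break is added here only to make the output deterministic.
  people.map { |person| [person.name, counts[person.id]] }
        .sort_by { |name, count| [-count, name] }
end

people = [VotePerson.new(1, 'Ann'), VotePerson.new(2, 'Bob'), VotePerson.new(3, 'Cy')]
votes = [
  VoteRow.new(1, 10, false), VoteRow.new(1, 11, false), VoteRow.new(1, 10, true),
  VoteRow.new(2, 10, false), VoteRow.new(2, 10, false), VoteRow.new(2, 10, false)
]

overall = unfulfilled_vote_counts(people, votes)
for_show = unfulfilled_vote_counts(people, votes, 11)
puts overall.inspect
puts for_show.inspect
```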

order_by() method not working in peewee

I am using a SQLite backend with a simple show - season - episode schema:
class Show(BaseModel):
    name = CharField()
class Season(BaseModel):
    show = ForeignKeyField(Show, related_name='seasons')
    season_number = IntegerField()
class Episode(BaseModel):
    season = ForeignKeyField(Season, related_name='episodes')
    episode_number = IntegerField()
and I would need the following query :
seasons = (Season.select(Season, Episode)
.join(Episode)
.where(Season.show == SHOW_ID)
.order_by(Season.season_number.desc(), Episode.episode_number.desc())
.aggregate_rows())
SHOW_ID being the id of the show for which I want the list of seasons.
But when I iterate over the query with the following code :
for season in seasons:
    for episode in season.episodes:
        print(episode.episode_number)
... I get something which is not ordered at all, and which does not even follow the order I would get without using order_by(), i.e. the insertion order.
I activated the debug logs to see the outgoing query, and the query does contain the ORDER BY clause, and manually applying it returns the proper descending order.
I am new to peewee, and I have seen many examples that combine join() with order_by(), but I still cannot find out what I am doing wrong.
This was due to a bug in the processing of nested collections in the aggregate query result wrapper.
The github issue is: https://github.com/coleifer/peewee/issues/519
The fix has been merged here: https://github.com/coleifer/peewee/commit/ec0e87f1a480695d98bf1f0d7f2e63aed8dfc440
So, to get the fix you'll need to either clone master or wait until the next release (2.4.7), which should be out in the next week or two.

Rails Order by frequency of a column in another table

I have a table KmRelationship which associates Keywords and Movies.
In the keyword index I would like to list the keywords that appear most frequently in the KmRelationships table, and only take(20).
.order doesn't seem to work no matter how I use it or where I put it, and the same goes for sort_by.
It sounds relatively straightforward, but I just can't seem to get it to work.
Any ideas?
Assuming your KmRelationship table has keyword_id:
top_keywords = KmRelationship.select('keyword_id, count(keyword_id) as frequency').
order('frequency desc').
group('keyword_id').
take(20)
This may not look right in your console output, but that's because Rails doesn't show the calculated frequency column when it prints the object.
You can see the results like this:
top_keywords.each {|k| puts "#{k.keyword_id} : #{k.frequency}" }
To put this to good use, you can then map out your actual Keyword objects:
class Keyword < ActiveRecord::Base
  # other stuff
  def self.most_popular
    KmRelationship.
      select('keyword_id, count(keyword_id) as frequency').
      order('frequency desc').
      group('keyword_id').
      take(20).
      map(&:keyword)
  end
end
And call with:
Keyword.most_popular
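For intuition, the group / count / order / take(20) pipeline above corresponds to the following plain-Ruby computation on an in-memory list of keyword ids. This is only a sketch of the counting logic: the helper most_frequent and the sample ids are hypothetical, and for real data the SQL version is preferable because the database does the counting.

```ruby
# Count how often each keyword id occurs and keep the most frequent ones.
def most_frequent(keyword_ids, limit = 20)
  keyword_ids.tally                                   # { id => frequency } (Ruby 2.7+)
             .sort_by { |id, frequency| [-frequency, id] } # highest frequency first
             .first(limit)
end

keyword_ids = [3, 1, 3, 2, 3, 1]
top = most_frequent(keyword_ids, 2)
top.each { |id, frequency| puts "#{id} : #{frequency}" }
```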
@posts = Post.select([:id, :title]).order("created_at desc").limit(6)
I have this in my controller's index method; it orders the posts so that the most recent come first, with a limit of 6. It might be similar to what you are trying to do. This code drives the most recent posts on my home page.

Rails select with include statement

I've been trying to find a proper solution for this problem but haven't succeeded. I know that we can't use select with an includes statement.
For example, I have a model called Parent which has many children. I tried the following things:
1) When I tried this
Parent.includes(:children).select("parent.name, children.age")
Rails completely ignores the select clause.
2) Then I tried this
Parent.joins(:children).select("parent.name, children.age")
The select clause works but instead of returning a nested object it returns me a flat array of objects. So I have to again run a group by command on it to make it nested.
3) I found something called preload, but again not enough documentation for it.
I'm tired of searching for a solution to this problem. Can someone point me in the right direction?
===================================================================
By nested object I meant I should be able to do things like
@parents.each do |parent|
  puts parent.name
  parent.children.each do |child|
    puts child.age
  end
end
I can achieve this with include but then it selects all attributes which are not needed.
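The regrouping step mentioned in attempt 2 (turning the flat joined rows back into a nested structure) could look roughly like this in plain Ruby. This is only a sketch under assumptions: the rows are hypothetical hashes standing in for the joined result, the helper nest_children is made up, and grouping by name is a simplification (real code would more likely group by the parent's id).

```ruby
# Hypothetical flat rows, as a joins + select query might return them.
rows = [
  { parent_name: 'Alice', child_age: 4 },
  { parent_name: 'Alice', child_age: 7 },
  { parent_name: 'Bob', child_age: 2 }
]

# Group the flat rows by parent and collect each parent's children.
def nest_children(rows)
  rows.group_by { |row| row[:parent_name] }
      .map do |name, group|
        { name: name, children: group.map { |row| { age: row[:child_age] } } }
      end
end

nested = nest_children(rows)
nested.each do |parent|
  puts parent[:name]
  parent[:children].each { |child| puts child[:age] }
end
```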

Rails 3 selecting only values

In rails 3, I would like to do the following:
SomeModel.where(:some_connection_id => anArrayOfIds).select("some_other_connection_id")
This works, but i get the following from the DB:
[{"some_other_connection_id":254},{"some_other_connection_id":315}]
Now, those IDs are the ones I need, but I am incapable of writing a query that only gives me the IDs. I do not want to have to iterate over the results only to get those numbers out. Is there any way for me to do this with something like:
SomeModel.where(:some_connection_id => anArrayOfIds).select("some_other_connection_id").values()
Or something of that nature?
I have been trying the ".select_values()" found on GitHub, but it only returns "some_other_connection_id".
I am not an expert in rails, so this info might be helpful also:
The "SomeModel" is a connecting table for a many-to-many relation in one of my other models. So what I am actually trying to do is, from the array of IDs, get all the entries from the other side of the connection. Basically, I have the source IDs, and I want to get the data from the models with all the target IDs. If there is a magic way of getting these without me having to write all the SQL myself (with some help from Active Record), it would be really nice!
Thanks :)
Try the pluck method:
SomeModel.where(:some => condition).pluck("some_field")
It works like:
SomeModel.where(:some => condition).select("some_field").map(&:some_field)
SomeModel.where(:some_connection_id => anArrayOfIds).select("some_other_connection_id").map &:some_other_connection_id
This is essentially a shorthand for:
results = SomeModel.where(:some_connection_id => anArrayOfIds).select("some_other_connection_id")
results.map {|row| row.some_other_connection_id}
Look at Array#map for details on map method.
Beware that there is no lazy loading here, as it iterates over the results, but that shouldn't be a problem unless you want to add more constructs to your query or retrieve some associated objects (which should not be the case, as you haven't got the IDs for loading the associated objects).
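As a small self-contained illustration of the shorthand discussed above, the snippet below shows that map(&:some_other_connection_id) and the explicit block form produce the same array. The Struct ConnectionRow and its sample values are hypothetical stand-ins for the records a query would return.

```ruby
# Hypothetical stand-in for the records returned by the query.
ConnectionRow = Struct.new(:some_other_connection_id)
results = [ConnectionRow.new(254), ConnectionRow.new(315)]

# Explicit block form.
ids_block = results.map { |row| row.some_other_connection_id }

# Symbol-to-proc shorthand; calls the same accessor on each element.
ids_short = results.map(&:some_other_connection_id)

puts ids_short.inspect
```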