Rails Order by frequency of a column in another table

I have a table, KmRelationship, which associates Keywords and Movies.
In the keyword index I would like to list the keywords that appear most frequently in the KmRelationships table, and only take(20).
.order doesn't seem to work no matter how or where I use it, and the same goes for sort_by.
It sounds relatively straightforward, but I just can't seem to get it to work.
Any ideas?

Assuming your KmRelationship table has keyword_id:
top_keywords = KmRelationship.select('keyword_id, count(keyword_id) as frequency').
order('frequency desc').
group('keyword_id').
take(20)
This may not look right in your console output, but that's because Rails doesn't show the calculated frequency column when it inspects the object.
You can see the results like this:
top_keywords.each {|k| puts "#{k.keyword_id} : #{k.frequency}" }
To put this to good use, you can then map out your actual Keyword objects:
class Keyword < ActiveRecord::Base
# other stuff
def self.most_popular
KmRelationship.
select('keyword_id, count(keyword_id) as frequency').
order('frequency desc').
group('keyword_id').
take(20).
map(&:keyword)
end
end
And call with:
Keyword.most_popular
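To make the shape of that result concrete, here is a minimal plain-Ruby sketch of what the grouped query computes. The Row struct and the sample ids are assumptions made up for illustration; in the real app the database does this work, not Ruby.

```ruby
# Hypothetical stand-in for KmRelationship rows; only keyword_id matters here.
Row = Struct.new(:keyword_id, :movie_id)

rows = [
  Row.new(1, 10), Row.new(2, 10), Row.new(1, 11),
  Row.new(3, 11), Row.new(1, 12), Row.new(2, 12)
]

# Equivalent of GROUP BY keyword_id with COUNT(keyword_id) AS frequency.
frequencies = rows.map(&:keyword_id).tally

# Equivalent of ORDER BY frequency DESC followed by take(20).
# Ties are broken by keyword_id here so the order is deterministic.
top_keywords = frequencies.sort_by { |id, count| [-count, id] }.first(20)

top_keywords.each { |id, count| puts "#{id} : #{count}" }
```

Each entry is a keyword id paired with its frequency, which is the same information the select/group/order query returns per row.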

@posts = Post.select([:id, :title]).order("created_at desc").limit(6)
I have this in my controller's index method. It orders the posts so the most recent comes first, with a limit of 6. It might be similar to what you are trying to do. This code drives the most recent posts list on my home page.

Related

How can you use distinct in rails while still using ActiveRecord's

I am struggling with the following problem:
I want to have two different tabs, one that displays all recent chugs (Done), and one that displays the chugs that are the fastest per person.
However, the result needs to remain ActiveRecord objects, since I need to use it with link_to and gravatar, which rules out group_by as far as I understand it.
AKA: If there are three users who each have three chugs, I want to show 1 chug per user, which contains the fastest time of that particular user.
The current code looks like this, where chugs_unique should be edited:
def show
@pagy, @chugs_all_newest = pagy(@chugtype.chugs.order('created_at DESC'), items: 10, page: params[:page])
@chugs_unique = @chugtype.chugs.order('secs ASC, milis ASC, created_at DESC').uniq
breadcrumb @chugtype.name, chugtypes_path(@chugtype)
end
In this case, a chug belongs to both a chugtype and user, and the chugtype has multiple chugs.
Thanks in advance!
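One possible direction, sketched in plain Ruby: Enumerable#group_by returns the very same objects it was given, so records that have already been loaded stay ActiveRecord instances and still work with link_to. The Chug struct below is a hypothetical stand-in for the real model, and the sample times are invented. This is only an illustration of the selection logic under those assumptions; it loads every chug into memory and returns an Array, so it cannot be passed to pagy as a relation.

```ruby
# Hypothetical stand-in for Chug records; real ActiveRecord objects respond
# to the same attribute readers, so the selection logic is unchanged.
Chug = Struct.new(:id, :user_id, :secs, :milis)

chugs = [
  Chug.new(1, 1, 5, 300), Chug.new(2, 1, 4, 900), Chug.new(3, 1, 4, 120),
  Chug.new(4, 2, 6, 0),   Chug.new(5, 2, 5, 999),
  Chug.new(6, 3, 3, 500)
]

# One entry per user: the chug with the smallest [secs, milis] pair,
# then the winners sorted from fastest to slowest.
fastest_per_user = chugs
  .group_by(&:user_id)
  .map { |_user_id, user_chugs| user_chugs.min_by { |c| [c.secs, c.milis] } }
  .sort_by { |c| [c.secs, c.milis] }

fastest_per_user.each do |c|
  puts "user #{c.user_id}: #{c.secs}s #{c.milis}ms (chug #{c.id})"
end
```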

Why does Postgres not accept my count column?

I am building a Rails app with the following models:
# vote.rb
class Vote < ApplicationRecord
belongs_to :person
belongs_to :show
scope :fulfilled, -> { where(fulfilled: true) }
scope :unfulfilled, -> { where(fulfilled: false) }
end
# person.rb
class Person < ApplicationRecord
has_many :votes, dependent: :destroy
def self.order_by_votes(show = nil)
count = 'nullif(votes.fulfilled, true)'
count = "case when votes.show_id = #{show.id} AND NOT votes.fulfilled then 1 else null end" if show
people = left_joins(:votes).group(:id).uniq!(:group)
people = people.select("people.*, COUNT(#{count}) AS people.vote_count")
people.order('people.vote_count DESC')
end
end
The idea behind order_by_votes is to sort People by the number of unfulfilled votes, either counting all votes, or counting only votes associated with a given Show.
This seems to work fine when I test against SQLite. But when I switch to Postgres I get this error:
Error:
PeopleControllerIndexTest#test_should_get_previously_on_show:
ActiveRecord::StatementInvalid: PG::UndefinedColumn: ERROR: column people.vote_count does not exist
LINE 1: ...s"."show_id" = $1 GROUP BY "people"."id" ORDER BY people.vot...
^
If I dump the SQL using @people.to_sql, this is what I get:
SELECT people.*, COUNT(nullif(votes.fulfilled, true)) AS people.vote_count FROM "people" LEFT OUTER JOIN "votes" ON "votes"."person_id" = "people"."id" GROUP BY "people"."id" ORDER BY people.vote_count DESC
Why is this failing on Postgres but working on SQLite? And what should I be doing instead to make it work on Postgres?
(PS: I named the field people.vote_count, with a dot, so I can access it in my view without having to do another SQL query to actually view the vote count for each person in the view (not sure if this works) but I get the same error even if I name the field simply vote_count.)
(PS2: I recently added the .uniq!(:group) because of some deprecation warning for Rails 6.2, but I couldn't find any documentation for it so I am not sure I am doing it right, still the error is there without that part.)
Are you sure you're not getting a syntax error from PostgreSQL somewhere? If you do something like this:
select count(*) as t.vote_count from t ... order by t.vote_count
I get a syntax error before PostgreSQL gets to complain about there being no t.vote_count column.
No matter, the solution is to not try to put your vote_count in the people table:
people = people.select("people.*, COUNT(#{count}) AS vote_count")
...
people.order(vote_count: :desc)
You don't need it there; you'll still be able to reference vote_count just like any "normal" column in people. Anything in the select list will appear as an accessor on the resulting model instances, whether it is a column or not. Such values won't show up in the #inspect output (since that's generated from the table's columns), but you can call the accessor methods nonetheless.
Historically there have been quite a few AR problems (and bugs) in getting the right count by just using count on a scope, and I am not sure they are actually all gone.
That depends on the scope (AR version, relations, group, sort, uniq, etc). A default count call that a gem has to use generically on a scope is not a one-size-fits-all solution. For that known reason Pagy allows you to pass the right count to its pagy method, as explained in the Pagy documentation.
Your scope might become complex and the default pagy collection.count(:all) may not get the actual count. In that case you can get the right count with some custom statement, and pass it to pagy.
@pagy, @records = pagy(collection, count: your_count)
Notice: pagy will efficiently skip its internal count query and will just use the passed :count variable.
So... just get your own calculated count and pass it to pagy, and it will not even try to use the default.
EDIT: I forgot to mention: you may want to try the pagy arel extra that:
adds specialized pagination for collections from sql databases with GROUP BY clauses, by computing the total number of results with COUNT(*) OVER ().
Thanks to all the comments and answers I have finally found a solution which I think is the best way to solve this.
First off, the issue occurred when I called pagy, which tried to count my scope by appending .count(:all). This is what caused the errors. The solution was to stop ordering by an alias created in select() and instead repeat the aggregate expression in .order().
So here is the proper code:
def self.order_by_votes(show = nil)
count = if show
"case when votes.show_id = #{show.id} AND NOT votes.fulfilled then 1 else null end"
else
'nullif(votes.fulfilled, true)'
end
left_joins(:votes).group(:id)
.uniq!(:group)
.select("people.*, COUNT(#{count}) as vote_count")
.order(Arel.sql("COUNT(#{count}) DESC"))
end
This sorts people by their number of unfulfilled votes, with the ability to count only votes for a given show. It works with pagy() and with pagy_arel(), which in my case is a much better fit, so the results can be properly paginated.
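For anyone wondering why COUNT(nullif(votes.fulfilled, true)) counts unfulfilled votes: NULLIF returns NULL when its two arguments are equal, and COUNT skips NULLs. The plain-Ruby sketch below mimics that behaviour; the helper names nullif and sql_count are made up for illustration and the sample values are invented.

```ruby
# Mimics SQL NULLIF(a, b): NULL (nil) when the two values are equal, otherwise a.
def nullif(a, b)
  a == b ? nil : a
end

# Mimics SQL COUNT(expr): counts only the non-NULL values.
def sql_count(values)
  values.count { |v| !v.nil? }
end

# The fulfilled values of one person's joined votes rows. A person with no
# votes gets a single row of NULLs from the LEFT JOIN, shown here as [nil].
with_votes    = [true, false, false, true]
without_votes = [nil]

unfulfilled_with    = sql_count(with_votes.map { |f| nullif(f, true) })
unfulfilled_without = sql_count(without_votes.map { |f| nullif(f, true) })

puts unfulfilled_with
puts unfulfilled_without
```

Fulfilled votes become NULL and drop out of the count, unfulfilled ones stay as false (which is not NULL) and are counted, and a person without any votes ends up with a count of zero rather than disappearing from the result.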

How can I get Rails 5 to play nicely with a primary key that has a period in it?

I'm working with a database I have no control over, and cannot make alterations to. This database has a table called warehouse_items. Each warehouse item is uniquely identified by a primary key indicating the item id.
Unfortunately, that primary key attribute is named WAREHOUSE_ITEM.ID
(Note the obnoxious period between "item" and "id")
When I try to run a basic query, such as:
WarehouseItem.find('wh3453')
I get an Undefined Table error.
Fortunately, when looking at what Rails is attempting to do, the problem becomes obvious:
: SELECT "warehouse_items".* FROM "warehouse_items" WHERE "WAREHOUSE_ITEM"."ID" = $1 LIMIT $2
Because of the period in the attribute name, Rails is treating "WAREHOUSE_ITEM.ID" as a table/attribute combination, rather than an attribute name with a period in it.
When I run the following PSQL query by hand, I get exactly what I need:
SELECT "warehouse_items".* FROM "warehouse_items" WHERE "warehouse_items"."WAREHOUSE_ITEM.ID" = 'wh3453'
Why is Rails screwing this up, and how can I fix it?
EDIT:
Also worth noting: I've tried using self.primary_key to override the primary key to no avail.
I've tried both a string and a symbol, as in:
self.primary_key="WAREHOUSE_ITEM.ID"
and
self.primary_key=:"WAREHOUSE_ITEM.ID"
Neither one has worked...
Thanks for all the help, everyone!
A suggestion in the comments to use find_by_sql does work! However, I stumbled onto a different solution that works even better.
First, I aliased the annoying attribute name to something simple: id
alias_attribute :id, :"WAREHOUSE_ITEM.ID"
Notice that it's still a symbol, which is important for the next step.
I then overrode the primary_key method with a custom one:
def self.primary_key
return "id"
end
Now, when I do WarehouseItem.find('wh3453'), Rails defaults to checking id, which is aliased to the correct symbol and it works as intended!!!

Update more record in one query with Active Record in Rails

Is there a better way to update multiple records in one query with different values in Ruby on Rails? I solved it using CASE in SQL, but is there an Active Record solution for that?
Basically I save a new sort order when a new list arrives back from a jQuery AJAX post.
#List of product ids in sorted order. Get from jqueryui sortable plugin.
#product_ids = [3,1,2,4,7,6,5]
# Simple solution which generate a loads of queries. Working but slow.
#product_ids.each_with_index do |id, index|
# Product.where(id: id).update_all(sort_order: index+1)
#end
##CASE syntax example:
##Product.where(id: product_ids).update_all("sort_order = CASE id WHEN 539 THEN 1 WHEN 540 THEN 2 WHEN 542 THEN 3 END")
case_string = "sort_order = CASE id "
product_ids.each_with_index do |id, index|
case_string += "WHEN #{id} THEN #{index+1} "
end
case_string += "END"
Product.where(id: product_ids).update_all(case_string)
This solution is fast and uses only one query, but I build the query string by hand, like in PHP. :) What would you suggest?
You should check out the acts_as_list gem. It does everything you need and it uses 1-3 queries behind the scenes. It's a perfect match for the jQuery sortable plugin. It relies on incrementing/decrementing the position (sort_order) field directly in SQL.
This won't be a good solution for you if your UI/UX relies on the user saving the order manually (the user sorts things out and then clicks update/save). However, I strongly discourage this kind of interface unless there is a specific reason (for example, you cannot have an intermediate state in the database between the old and new order because something else depends on that order).
If that's not the case, then by all means just do an asynchronous update after the user moves one element (and acts_as_list will be a great help in accomplishing that).
Check out:
https://github.com/swanandp/acts_as_list/blob/master/lib/acts_as_list/active_record/acts/list.rb#L324
# This has the effect of moving all the higher items down one.
def increment_positions_on_higher_items
return unless in_list?
acts_as_list_class.unscoped.where(
"#{scope_condition} AND #{position_column} < #{send(position_column).to_i}"
).update_all(
"#{position_column} = (#{position_column} + 1)"
)
end
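If you do stay with the single-query CASE approach from the question, one way to make the hand-built string less worrying is to push it into a small helper that coerces every id through Integer(). The helper name sort_order_case_sql is hypothetical and this is only a sketch; it assumes the ids arrive as integers or integer-like strings, and it does not guard against an empty list (which would produce invalid SQL).

```ruby
# Builds the SET clause for a single UPDATE that renumbers sort_order.
# Integer() raises ArgumentError for strings that are not valid integers,
# so arbitrary text from the request cannot end up inside the SQL string.
def sort_order_case_sql(product_ids)
  whens = product_ids.each_with_index.map do |id, index|
    "WHEN #{Integer(id)} THEN #{index + 1}"
  end
  "sort_order = CASE id #{whens.join(' ')} END"
end

puts sort_order_case_sql([3, 1, 2])

# Intended use, as in the question:
#   Product.where(id: product_ids).update_all(sort_order_case_sql(product_ids))
```

The generated clause has the same form as the CASE syntax example in the question, so the update still runs as one query.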

Rails, Ransack: How to search HABTM relationship for "all" matches instead of "any"

I'm wondering if anyone has experience using Ransack with HABTM relationships. My app has photos which have a habtm relationship with terms (terms are like tags). Here's a simplified explanation of what I'm experiencing:
I have two photos: Photo 1 and Photo 2. They have the following terms:
Photo 1: A, B, C
Photo 2: A, B, D
I built a ransack form, and I make checkboxes in the search form for all the terms, like so:
- terms.each do |t|
= check_box_tag 'q[terms_id_in][]', t.id
If I use: q[terms_id_in][] and I check "A, C" my results are Photo 1 and Photo 2. I only want Photo 1, because I asked for A and C, in this query I don't care about B or D but I want both A and C to be present on a given result.
If I use q[terms_id_in_all][] my results are nil, because neither photo includes only A and C. Or, perhaps, because there's only one term per join, so no join matches both A and C. Regardless, I want just Photo 1 to be returned.
If I use any variety of q[terms_id_eq][] I never get any results, so I don't think that works in this case.
So, given a habtm join, how do you search for models that match the given values while ignoring not given values?
Or, for any rails/sql gurus not familiar with Ransack, how else might you go about creating a search form like I'm describing for a model with a habtm join?
Update: per the answer to related question, I've now gotten as far as constructing an Arel query that correctly matches this. Somehow you're supposed to be able to use Arel nodes as ransackers, or as cdesrosiers pointed out, as custom predicates, but thus far I haven't gotten that working.
Per that answer, I set up the following ransack initializer:
Ransack.configure do |config|
config.add_predicate 'has_terms',
:arel_predicate => 'in',
:formatter => proc {|term_ids| Photo.terms_subquery(term_ids)},
:validator => proc {|v| v.present?},
:compounds => true
end
... and then set up the following method on Photo:
def self.terms_subquery(term_ids)
photos = Arel::Table.new(:photos)
terms = Arel::Table.new(:terms)
photos_terms = Arel::Table.new(:photos_terms)
photos[:id].in(
photos.project(photos[:id])
.join(photos_terms).on(photos[:id].eq(photos_terms[:photo_id]))
.join(terms).on(photos_terms[:term_id].eq(terms[:id]))
.where(terms[:id].in(term_ids))
.group(photos.columns)
.having(terms[:id].count.eq(term_ids.length))
).to_sql
end
Unfortunately this doesn't seem to work. While terms_subquery produces the correct SQL, the result of Photo.search(:has_terms => [2,5]).result.to_sql is just "SELECT \"photos\".* FROM \"photos\" "
With a custom ransack predicate defined as in my answer to your related question, this should work with a simple change to your markup:
- terms.each do |t|
= check_box_tag 'q[id_has_terms][]', t.id
UPDATE
The :formatter doesn't do what I thought, and seeing as how the Ransack repo makes not a single mention of "subquery," you may not be able to use it for what you're trying to do, after all. All available options seem to be exhausted, so there would be nothing left to do but monkey patch.
Why not just skip ransack and query the "photos" table as you normally would with active record (or even with the Arel query you now have)? You already know the query works. Is there a specific benefit you hoped to reap from using Ransack?
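For reference, the "all terms" condition that the subquery in the question expresses (filter the join rows to the wanted terms, group by photo, keep the groups whose count equals the number of wanted terms) can be sketched in plain Ruby. The join rows below are invented sample data matching Photo 1 and Photo 2 from the question, and photo_ids_with_all_terms is a hypothetical helper, not part of Ransack or ActiveRecord.

```ruby
# Hypothetical rows of the photos_terms join table: [photo_id, term].
photos_terms = [
  [1, :a], [1, :b], [1, :c],
  [2, :a], [2, :b], [2, :d]
]

def photo_ids_with_all_terms(join_rows, wanted_terms)
  wanted = wanted_terms.uniq
  join_rows
    .select { |_photo_id, term| wanted.include?(term) }         # WHERE term IN (...)
    .group_by { |photo_id, _term| photo_id }                    # GROUP BY photo
    .select { |_photo_id, rows| rows.uniq.size == wanted.size } # HAVING COUNT = n
    .keys
end

p photo_ids_with_all_terms(photos_terms, [:a, :c])
p photo_ids_with_all_terms(photos_terms, [:a, :b])
```

Asking for A and C returns only Photo 1, while asking for A and B returns both photos, and terms that were not asked for (B or D in the first case) are ignored.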