How to define new instantaneous variable row by row - RAILS3 BEGINNER - ruby-on-rails-3

I was hoping somebody may be able to point me in the right direction...
I have a model called Info and use a find call to select the rows in its table that match certain criteria:
@matching = Info.find( :all, :conditions => ["product_name = ?", distinctproduct], :order => 'Price ASC')
I then pull out the cheapest of these items:
@cheapest = @matching.first
Finally, I would like to build an instance variable array that contains @cheapest for a number of different search criteria, i.e. row 1 in @allcheapest is @cheapest for criterion 1, row 2 in @allcheapest is @cheapest for criterion 2, ...
Any help would be great, thanks in advance

You can use
Info.where(:product_name => distinct_product.to_s).order('Price ASC').first
to select the cheapest row for the given product_name. Without more insight into how your database is structured, it is difficult to suggest how to build the array of cheapest rows, but you could try
Info.where(:product_name => distinct_product.to_s).order('Price ASC').group(:product_name)
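To make the "one cheapest row per search criterion" idea concrete, here is a minimal plain-Ruby sketch. It assumes in-memory rows instead of the real table: InfoRow, cheapest_for and all_cheapest are hypothetical names, not part of Rails. In a Rails app, the lookup inside cheapest_for would be the where/order/first query shown above.

```ruby
# Hypothetical stand-in for an Info record; only the two columns used here.
InfoRow = Struct.new(:product_name, :price)

# Cheapest row for one product name, or nil when nothing matches.
def cheapest_for(rows, product_name)
  rows.select { |row| row.product_name == product_name }.min_by(&:price)
end

# One entry per search criterion, in the order the criteria were given.
def all_cheapest(rows, product_names)
  product_names.map { |name| cheapest_for(rows, name) }
end

rows = [
  InfoRow.new("widget", 9.5),
  InfoRow.new("widget", 7.25),
  InfoRow.new("gadget", 20.0),
]

all_cheapest(rows, ["widget", "gadget"]).each do |row|
  puts "#{row.product_name}: #{row.price}"
end
```

Mapping over the list of criteria keeps row N of the result aligned with criterion N, which is what the question asks for. A criterion with no match yields nil at its position.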

Related

Active Record query to match every subset element

In my RoR application, I've got a database lookup similar to this one:
Client.joins(:products).where({'products.id' => [1,2,3]})
Unfortunately this returns all clients that have bought product 1, 2 or 3, but I only want the clients that bought all three products. In other words, I'd like to write a query that matches every one of the n elements in a given set.
Are there any elegant solutions for this?
This is not really elegant. But it should translate into the needed SQL.
Client.joins(:products).
where({'products.id' => [1,2,3]}).
group('clients.id').
having('COUNT(DISTINCT products.id) >= 3')
The same answer, written more dynamically:
ids = [1,2,3]
Client.joins(:products).
where({'products.id' => ids}).
group('clients.id').
having('COUNT(DISTINCT products.id) >= ?', ids.size)
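The WHERE / GROUP BY / HAVING COUNT(DISTINCT ...) steps of that query can be mirrored in plain Ruby to see why it works. This is only a sketch under assumptions: the purchases are in-memory [client_id, product_id] pairs standing in for the joined rows, and clients_with_all is a hypothetical helper name.

```ruby
# purchases: array of [client_id, product_id] pairs (hypothetical join rows).
# Returns the ids of clients who bought every product in wanted_ids.
def clients_with_all(purchases, wanted_ids)
  wanted = wanted_ids.uniq
  purchases
    .select { |_client_id, product_id| wanted.include?(product_id) }  # WHERE products.id IN (...)
    .group_by { |client_id, _product_id| client_id }                   # GROUP BY the client id
    .select { |_client_id, rows| rows.map(&:last).uniq.size >= wanted.size } # HAVING COUNT(DISTINCT ...) >= n
    .keys
end

purchases = [[1, 1], [1, 2], [1, 3], [2, 1], [2, 2]]
p clients_with_all(purchases, [1, 2, 3])
```

The distinct count matters: a client who bought product 1 three times has three matching rows but only one distinct product, so they must not pass the filter.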

Rails ActiveRecord Join Query With conditions

I have following SQL Query:
SELECT campaigns.* , campaign_countries.points, offers.image
FROM campaigns
JOIN campaign_countries ON campaigns.id = campaign_countries.campaign_id
JOIN countries ON campaign_countries.country_id = countries.id
JOIN offers ON campaigns.offer_id = offers.id
WHERE countries.code = 'US'
This works perfectly well. I want its Rails Active Record version, something like:
Campaign.includes(campaign_countries: :country).where(countries: {code: "US"})
The above code runs a more or less correct query (I did not try to include the offers table). The issue is that the returned result is a collection of Campaign objects, so obviously it does not include points.
My tables are:
campaigns --HAS_MANY--< campaign_countries --BELONGS_TO--< countries
campaigns --BELONGS_TO--> offers
Any suggestions for writing an AR version of this SQL? I don't want to use SQL statements in my code.
I somehow got this working without SQL, but surely it's a poor man's solution:
in my controller I have:
campaigns = Campaign.includes(campaign_countries: :country).where(countries: {code: country.to_s})
render :json => campaigns.to_json(:country => country)
in campaign model:
def points_for_country country
CampaignCountry.joins(:campaign, :country).where(countries: {code: country}, campaigns: {id: self.id}).first
end
def as_json options={}
json = {
id: id,
cid: cid,
name: name,
offer: offer,
points_details: options[:country] ? points_for_country(options[:country]) : ""
}
end
and in campaign_countries model:
def as_json options={}
json = {
face_value: face_value,
actual_value: actual_value,
points: points
}
end
Why is this not a good solution? Because it invokes too many queries:
1. It invokes a query when the first join is performed, to get the list of campaigns specific to the country.
2. For each campaign found in the first query, it invokes one more query on the campaign_countries table to get the points for that campaign and country.
This is a bad, bad, BAD solution. Any suggestions to improve this?
If you have a campaign, you can use campaign.campaign_countries to get the associated campaign_countries and just read the points from them.
> campaign.campaign_countries.map(&:points)
=> [1,2,3,4,5]
Similarly, you will be able to get the image from the offer relation.
EDIT:
Ok, I guess now I know what's going on. You can use joins with select to get objects with the extra fields from the joined tables attached.
cs = Campaign.joins(campaign_countries: :country).joins(:offer).select('campaigns.*, campaign_countries.points, offers.image').where(countries: {code: "US"})
You can then reference the additional fields by name on the Campaign object:
cs.first.points
cs.first.image
But make sure that the additional column names do not clash with fields of the primary table or with methods of the object.
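To illustrate what the joined select hands back (one campaign row per match, with points and image attached), here is a plain-Ruby sketch of the same inner joins. It assumes each table is an array of hashes, and campaigns_for_country is a hypothetical helper, not an Active Record method.

```ruby
# Each argument is an array of hashes standing in for table rows
# (a hypothetical data shape, chosen only for this illustration).
def campaigns_for_country(campaigns, campaign_countries, countries, offers, code)
  country_ids = countries.select { |c| c[:code] == code }.map { |c| c[:id] }
  offers_by_id = offers.each_with_object({}) { |offer, index| index[offer[:id]] = offer }
  campaigns_by_id = campaigns.each_with_object({}) { |c, index| index[c[:id]] = c }

  campaign_countries.each_with_object([]) do |link, result|
    next unless country_ids.include?(link[:country_id])   # WHERE countries.code = ?
    campaign = campaigns_by_id[link[:campaign_id]]
    offer = campaign && offers_by_id[campaign[:offer_id]]
    next unless campaign && offer                         # inner joins drop unmatched rows
    # The extra columns ride along on the campaign row, like the select above.
    result << campaign.merge(points: link[:points], image: offer[:image])
  end
end
```

Everything is resolved in one pass over the link rows, which is the in-memory analogue of a single joined query rather than one extra lookup per campaign.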
EDIT 2:
After some more research I came to the conclusion that my first version was actually correct for this case. I will use my own console as an example.
> u = User.includes(:orders => :cart).where(:carts => { :id => [5168, 5167] }).first
> u.orders.length # no query is performed
=> 2
> u.orders.count # count query is performed
=> 5
So when you use includes with a condition on country, campaign.campaign_countries holds only the campaign_countries that fulfil your condition.
Try this:
Campaign.joins( [{ :campaign_countries => :country}, :offer]).where('`countries`.`code` = ?', "US")

Rails/Sql - order/group search results such that repetition of entities occurs only after appearance of others

In my application, say, animals have many photos. I'm querying photos of animals such that I want all photos of all animals to be displayed. However, I want each animal to appear as a photo before repetition occurs.
Example:
animal instance 1, 'cat', has four photos,
animal instance 2, 'dog', has two photos:
photos should appear ordered as so:
#photo belongs to #animal
tiddles.jpg , cat
fido.jpg dog
meow.jpg cat
rover.jpg dog
puss.jpg cat
felix.jpg, cat (no more dogs so two consecutive cats)
Pagination is required so I can't order on an array.
Filename structure/convention provides no help, though the animal_id exists on each photo.
Though there are two types of animal in this example, this is an active record model with hundreds of records.
Animals may be selectively queried.
If this isn't possible with active_record then I'll happily use sql; I'm using postgresql.
My brain is frazzled so if anyone can come up with a better title, please go ahead and edit it or suggest in comments.
Here is a PostgreSQL specific solution:
batch_id_sql = "RANK() OVER (PARTITION BY animal_id ORDER BY id ASC)"
Photo.paginate(
:select => "DISTINCT photos.*, (#{batch_id_sql}) batch_id",
:order => "batch_id ASC, photos.animal_id ASC",
:page => 1)
Here is a DB agnostic solution:
batch_id_sql = "
SELECT COUNT(*)
FROM photos bm
WHERE bm.animal_id = photos.animal_id AND
bm.id <= photos.id
"
Photo.paginate(
:select => "photos.*, (#{batch_id_sql}) batch_id",
:order => "batch_id ASC, photos.animal_id ASC",
:page => 1)
Both queries work even when you have a where condition. Benchmark the query using expected data set to check if it meets the expected throughput and latency requirements.
Reference
PostgreSQL Window function
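To make the ordering produced by the batch_id queries concrete, here is a plain-Ruby sketch of the same idea: number each photo within its animal by id, then sort by that number and by animal_id. It works on in-memory rows only, so it illustrates the ordering rather than replacing the paginated SQL. PhotoRow and interleave_by_animal are hypothetical names.

```ruby
# Hypothetical stand-in for a photos row.
PhotoRow = Struct.new(:id, :filename, :animal_id)

# Orders photos so every animal appears once before any animal repeats.
def interleave_by_animal(photos)
  numbered = photos.group_by(&:animal_id).flat_map do |_animal_id, group|
    # Position of each photo within its own animal, starting at 1 (the batch id).
    group.sort_by(&:id).each_with_index.map { |photo, index| [index + 1, photo] }
  end
  numbered.sort_by { |batch_id, photo| [batch_id, photo.animal_id] }
          .map { |_batch_id, photo| photo }
end

photos = [
  PhotoRow.new(1, "tiddles.jpg", 1), PhotoRow.new(2, "meow.jpg", 1),
  PhotoRow.new(3, "puss.jpg", 1),    PhotoRow.new(4, "felix.jpg", 1),
  PhotoRow.new(5, "fido.jpg", 2),    PhotoRow.new(6, "rover.jpg", 2),
]
puts interleave_by_animal(photos).map(&:filename)
```

With animal 1 as the cat and animal 2 as the dog, this reproduces the order asked for in the question: cat, dog, cat, dog, then the two remaining cats.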
I have no experience with ActiveRecord, but using plain PostgreSQL I would try something like this:
Define a window function over all previous rows that counts how many times the current animal has appeared, then order by this count.
SELECT
filename,
animal_id,
COUNT(*) OVER (PARTITION BY animal_id ORDER BY filename) AS cnt
FROM
photos
ORDER BY
cnt,
animal_id,
filename
Filtering on certain animal_id's will work. This will always order the same way. I don't know if you want something random in there, but it should be easily added.
New solution
Add an integer column called batch_id to the photos table.
class AddBatchIdToPhotos < ActiveRecord::Migration
def self.up
add_column :photos, :batch_id, :integer
set_batch_id
change_column :photos, :batch_id, :integer, :null => false
add_index :photos, :batch_id
end
def self.down
remove_column :photos, :batch_id
end
def self.set_batch_id
# set the batch id to existing rows
# implement this
end
end
Now add a before_create on the Photo model to set the batch id.
class Photo < ActiveRecord::Base
belongs_to :animal
before_create :batch_photo_add
before_update :batch_photo_update # before, so the new batch_id is saved with the record
after_destroy :batch_photo_remove
private
def batch_photo_add
self.batch_id = next_batch_id_for_animal(animal_id)
true
end
def batch_photo_update
return true unless animal_id_changed?
batch_photo_remove(batch_id, animal_id_was)
batch_photo_add
end
def batch_photo_remove(b_id=batch_id, a_id=animal_id)
Photo.update_all("batch_id = batch_id- 1",
["animal_id = ? AND batch_id > ?", a_id, b_id])
true
end
def next_batch_id_for_animal(a_id)
(Photo.maximum(:batch_id, :conditions => {:animal_id => a_id}) || 0) + 1
end
end
Now you can get the desired result with a simple paginate call:
@animal_photos = Photo.paginate(:page => 1, :per_page => 10,
:order => :batch_id)
How does this work?
Let's consider we have data set as given below:
id  Photo Description  Batch Id
1   Cat_photo_1        1
2   Cat_photo_2        2
3   Dog_photo_1        1
4   Cat_photo_3        3
5   Dog_photo_2        2
6   Lion_photo_1       1
7   Cat_photo_4        4
Now if we were to execute a query ordered by batch_id we get this
# batch 1 (cat, dog, lion)
Cat_photo_1
Dog_photo_1
Lion_photo_1
# batch 2 (cat, dog)
Cat_photo_2
Dog_photo_2
# batch 3,4 (cat)
Cat_photo_3
Cat_photo_4
The batch distribution is not random, the animals are filled from the top. The number of animals displayed in a page is governed by per_page parameter passed to paginate method (not the batch size).
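The bookkeeping done by the callbacks can be tried out without a database. The sketch below assumes photos are plain hashes in an array. next_batch_id, add_photo and remove_photo are hypothetical helpers that mirror next_batch_id_for_animal, batch_photo_add and batch_photo_remove from the model above.

```ruby
# photos: array of hashes with :name, :animal_id and :batch_id.

# Next free batch id for an animal: highest existing one plus 1, or 1 if none.
def next_batch_id(photos, animal_id)
  current = photos.select { |p| p[:animal_id] == animal_id }.map { |p| p[:batch_id] }.max
  (current || 0) + 1
end

# Appends a photo with the next batch id for its animal.
def add_photo(photos, name, animal_id)
  photos << { name: name, animal_id: animal_id, batch_id: next_batch_id(photos, animal_id) }
end

# Removes a photo and closes the gap by shifting later batch ids down by one.
def remove_photo(photos, name)
  removed = photos.find { |p| p[:name] == name }
  return photos unless removed
  photos.delete(removed)
  photos.each do |p|
    same_animal = p[:animal_id] == removed[:animal_id]
    p[:batch_id] -= 1 if same_animal && p[:batch_id] > removed[:batch_id]
  end
end
```

Note the cost on removal: every later photo of the same animal is rewritten, which is the read/write trade-off that comes with a precomputed batch id.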
Old solution
Have you tried this?
If you are using the will_paginate gem:
# assuming you want to order by animal name
animal_photos = Photo.paginate(:include => :animal, :page => 1,
:order => "animals.name")
animal_photos.each do |animal_photo|
puts animal_photo.file_name
puts animal_photo.animal.name
end
I'd recommend something hybrid/corrected based on KandadaBoggu's input.
First off, the correct way to do it on paper is with row_number() over (partition by animal_id order by id). The suggested rank() will generate a global row number, but you want the one within its partition.
Using a window function is also the most flexible solution (in fact, the only solution) if you plan to change the sort order here and there.
Take note that this won't necessarily scale well, however, because in order to sort the results you'll need to:
fetch the whole result set that matches your criteria
sort the whole result set to create the partitions and obtain a rank_id
top-n sort/limit over the result set a second time to get them in their final order
The correct way to do this in practice, if your sort order is immutable, is to maintain a pre-calculated rank_id. KandadaBoggu's other suggestion points in the correct direction in this sense.
When it comes to deletes (and possibly updates, if you don't want them sorted by id), you may run into issues because you end up trading faster reads for slower writes. If deleting the cat with an index of 1 leads to updating the next 50k cats, you're going to be in trouble.
If you have very small sets, the overhead might be perfectly acceptable (don't forget to index animal_id).
If not, there's a workaround if you find the order in which specific animals appear is irrelevant. It goes like this:
Start a transaction.
If the rank_id is going to change (i.e. insert or delete), obtain an advisory lock to ensure that two sessions can't impact the rank_id of the same animal class, e.g.:
SELECT pg_try_advisory_lock('the_table'::regclass, the_animal_id);
(Sleep for .05s if you don't obtain it.)
On insert, find max(rank_id) for that animal_id. Assign it rank_id + 1. Then insert it.
On delete, select the animal with the same animal_id and the largest rank_id. Delete your animal, and assign its old rank_id to the fetched animal (unless you were deleting the last one, of course).
Release the advisory lock.
Commit the work.
Note that the above will make good use of an index on (animal_id, rank_id) and can be done using plpgsql triggers:
create trigger "__animals_rank_id__ins"
before insert on animals
for each row execute procedure lock_animal_id_and_assign_rank_id();
create trigger "_00_animals_rank_id__ins"
after insert on animals
for each row execute procedure unlock_animal_id();
create trigger "__animals_rank_id__del"
before delete on animals
for each row execute procedure lock_animal_id();
create trigger "_00_animals_rank_id__del"
after delete on animals
for each row execute procedure reassign_rank_id_and_unlock_animal_id();
You can then create a multi-column index on your sort criteria if you're not joining all over the place, e.g. (rank_id, name). And you'll end up with a snappy site for reads and writes.
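The insert and delete steps of that workaround can be sketched in plain Ruby to show why a delete touches at most one other row. This assumes in-memory hashes instead of table rows and leaves out the transaction and advisory lock entirely. insert_with_rank and delete_with_swap are hypothetical names.

```ruby
# rows: array of hashes with :id, :animal_id and :rank_id (hypothetical shape).

# On insert: take max(rank_id) for that animal_id and assign the next one.
def insert_with_rank(rows, id, animal_id)
  max_rank = rows.select { |row| row[:animal_id] == animal_id }.map { |row| row[:rank_id] }.max
  rows << { id: id, animal_id: animal_id, rank_id: (max_rank || 0) + 1 }
end

# On delete: hand the freed rank_id to the row that currently has the
# largest rank_id for the same animal, so only one row is updated.
def delete_with_swap(rows, id)
  victim = rows.find { |row| row[:id] == id }
  return rows unless victim
  tail = rows.select { |row| row[:animal_id] == victim[:animal_id] }
             .max_by { |row| row[:rank_id] }
  rows.delete(victim)
  tail[:rank_id] = victim[:rank_id] unless tail.equal?(victim)
  rows
end
```

The price is that the moved row jumps forward in the ordering, which is why this only applies when the order in which specific animals appear is irrelevant.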
You should be able to get the pictures (or filenames, anyway) using ActiveRecord, ordered by name.
Then you can use Enumerable#group_by and Enumerable#zip to zip all the arrays together.
If you give me more information about how your filenames are really arranged (i.e., are they all for sure with an underscore before the number and a constant name before the underscore for each "type"? etc.), then I can give you an example. I'll write one up momentarily showing how you'd do it for your current example.
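As a rough sketch of the group_by and zip idea (assuming the photos are already loaded into an array of hashes with an :animal key, and with round_robin as a hypothetical helper name), it could look something like this:

```ruby
# photos: array of hashes with :file and :animal keys (hypothetical shape).
# Returns the photos interleaved so each animal appears once per round.
def round_robin(photos)
  groups = photos.group_by { |photo| photo[:animal] }.values
  return [] if groups.empty?
  # zip pads with nil only when the receiver is the longest array, so put the
  # largest group first; the index keeps equally sized groups in input order.
  ordered = groups.each_with_index.sort_by { |group, index| [-group.size, index] }.map(&:first)
  first, *rest = ordered
  first.zip(*rest).flatten(1).compact
end

photos = [
  { file: "tiddles.jpg", animal: "cat" }, { file: "meow.jpg", animal: "cat" },
  { file: "puss.jpg", animal: "cat" },    { file: "felix.jpg", animal: "cat" },
  { file: "fido.jpg", animal: "dog" },    { file: "rover.jpg", animal: "dog" },
]
puts round_robin(photos).map { |photo| photo[:file] }
```

Keep in mind that this orders an in-memory array, so by itself it does not address the pagination requirement from the question.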
You could run two sorts and build one array as follows:
result1= The first of each animal type only. use the ruby "find" method for this search.
result2= All animals, sorted by group. Use "find" to again find the first occurrence of each animal and then use "drop" to remove those "first occurrences" from result2.
Then:
markCustomResult = result1 + result2
Then:
You can use will_paginate on markCustomResult

Help optimizing ActiveRecord query (voting system)

I have a voting system with two models: Item(id, name) and Vote(id, item_id, user_id).
Here's the code I have so far:
class Item < ActiveRecord::Base
has_many :votes
def self.most_popular
items = Item.all #where can I optimize here?
items.sort {|x,y| y.votes.length <=> x.votes.length}.first #so I don't need to do anything here?
end
end
There's a few things wrong with this, mainly that I retrieve all the Item records, THEN use Ruby to compute popularity. I am almost certain there is a simple solution to this, but I can't quite put my finger on it.
I'd much rather gather records and run the calculations in the initial query. This way, I can add a simple :limit => 1 (or LIMIT 1) to the query.
Any help would be great, either a rewrite in all ActiveRecord or even in raw SQL. The latter would actually give me a much clearer picture of the nature of the query I want to execute.
Group the votes by item id, order them by count and then take the item of the first one. In rails 3 the code for this is:
Vote.group(:item_id).order("count(*) DESC").first.item
In rails 2, this should work:
Vote.all(:order => "count(*) DESC", :group => :item_id).first.item
sepp2k has the right idea. In case you're not using Rails 3, the equivalent is:
Vote.first(:group => :item_id, :order => "count(*) DESC", :include => :item).item
Probably there's a better way to do this in ruby, but in SQL (mysql at least) you could try something like this to get a top 10 ranking:
SELECT i.id, i.name, COUNT( v.id ) AS total_votes
FROM Item i
LEFT JOIN Vote v ON ( i.id = v.item_id )
GROUP BY i.id
ORDER BY total_votes DESC
LIMIT 10
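For anyone who finds the SQL hard to read, the same ranking can be mirrored in plain Ruby. This sketch assumes items and votes are arrays of hashes, and top_items is a hypothetical helper. Like the LEFT JOIN, it keeps items that have no votes at all.

```ruby
# items: array of hashes with :id and :name; votes: array of hashes with :item_id.
# Returns up to `limit` items, most voted first, each with a :total_votes key.
def top_items(items, votes, limit = 10)
  counts = Hash.new(0)
  votes.each { |vote| counts[vote[:item_id]] += 1 }       # COUNT(v.id) ... GROUP BY i.id
  items.map { |item| item.merge(total_votes: counts[item[:id]]) }
       .sort_by { |item| [-item[:total_votes], item[:id]] } # ORDER BY total_votes DESC
       .first(limit)                                        # LIMIT
end
```

The tie-break on id is an addition of this sketch to make the order deterministic. The SQL above leaves the order of ties unspecified.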
One easy way of handling this is to add a vote count field to Item and update it each time there is a vote. Rails used to do that automatically for you, but I'm not sure if that's still the case in 2.x and 3.0. It's easy enough to do it yourself in any case, using an Observer or else just by putting an after_save callback in the Vote model.
Then your query is very easy: simply add a "VOTE_COUNT DESC" order to it.
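A minimal plain-Ruby sketch of the cached counter idea follows. VotedItem and most_popular are hypothetical names, and the counter lives in memory rather than in a database column. In Rails, the :counter_cache option of belongs_to is the built-in way to keep such a column up to date.

```ruby
# Hypothetical item that carries its own running vote count.
class VotedItem
  attr_reader :name, :votes_count

  def initialize(name)
    @name = name
    @votes_count = 0
  end

  # Called once per new vote; this is the job an after_save hook would do.
  def add_vote
    @votes_count += 1
  end
end

# With the count stored on the item, finding the winner needs no vote rows.
def most_popular(items)
  items.max_by(&:votes_count)
end
```

The trade-off is one extra write per vote in exchange for a read that no longer has to count anything.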

Rails combined ('AND') searches on associated join tables

I can't get Rails to return combined ('AND') searches on associated join tables of an object.
E.g. I have Books that are in Categories. Let's say Book 1 is in categories 5 and 8.
But I can't get 'AND' to filter results using the join table. E.g.:
class Book < ActiveRecord::Base
has_and_belongs_to_many :categories, :join_table => "book_categories"
end
Book.find :all, :conditions => "book_categories.category_id = 5 AND book_categories.category_id = 8", :include => "categories"
... returns nil
(why does it not return all books that are in both 5 & 8 ??)
However: 'OR' does work:
Book.find :all, :conditions => "book_categories.category_id = 5 OR book_categories.category_id = 8"
... returns all books in category 5 and 8
I must be missing something?
The problem is at the SQL level. That condition runs on a link table row, and any individual link table row can never have a category_id of both 5 and 8. You really want separate link table rows to have these IDs.
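The difference between testing one link row and testing all link rows of a book can be shown in plain Ruby. This is only a sketch on in-memory [book_id, category_id] pairs standing in for book_categories rows. row_level_and and books_in_all are hypothetical helpers.

```ruby
# links: array of [book_id, category_id] pairs (stand-in for book_categories).

# What the original condition does: both tests hit the same row, so for two
# different category ids nothing can ever match.
def row_level_and(links, first_id, second_id)
  links.select { |_book_id, category_id| category_id == first_id && category_id == second_id }
end

# What was intended: collect each book's rows, then require every category.
def books_in_all(links, category_ids)
  links.group_by { |book_id, _category_id| book_id }
       .select { |_book_id, rows| (category_ids - rows.map(&:last)).empty? }
       .keys
end

links = [[1, 5], [1, 8], [2, 5]]
p row_level_and(links, 5, 8)
p books_in_all(links, [5, 8])
```

In SQL the grouped version corresponds to a GROUP BY on the book id with a HAVING clause on the number of distinct matching categories, or to joining the link table once per required category.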
Try looking into Rails' named_scope, specifically the part that allows filtering with a lambda (so you can take an argument). I've never tried it out myself, but if I had to implement what you're looking for, that's what I'd look into.