Ruby each loop in order on SQL-ordered query result - sql

I have a query in my Controller that works perfectly:
@klasses_mon = Klass.order(:start).where(day: 'MON').find_each
my result is (shown by <%= @klasses_mon.inspect %> in my view):
#<Enumerator: #<ActiveRecord::Relation
[#<Klass id: 9, name: "Cycling", teacher: "Tomek", day: "MON", start: 510, duration: 45>,
#<Klass id: 8, name: "LBT", teacher: "Monia", day: "MON", start: 600, duration: 60>,
#<Klass id: 11, name: "HIIT", teacher: "Aga", day: "MON", start: 930, duration: 45>]>
:find_each({:start=>nil, :finish=>nil, :batch_size=>1000, :error_on_ignore=>nil})>
But when I try to display it in an each loop, for some reason it is no longer ordered. It looks like the each loop does not keep the order from my query result:
<% @klasses_mon.each do |k| %>
<p><%= k.teacher %>,
<%= k.name %>
START: <%= k.start/60 %>:<%= k.start%60 %></p>
<% end %>
result:
Monia, LBT START: 10:0
Tomek, Cycling START: 8:30
Aga, HIIT START: 15:30
How should I do that?

From the fine manual:
find_each(start: nil, finish: nil, batch_size: 1000, error_on_ignore: nil)
[...]
NOTE: It's not possible to set the order. That is automatically set to ascending on the primary key (“id ASC”) to make the batch ordering work. This also means that this method only works when the primary key is orderable (e.g. an integer or string).
So find_each is explicitly documented to ignore any ordering that you try to use.
find_each doesn't use LIMIT and OFFSET to move the batch window through the result set, because that tends to get very expensive as the OFFSET increases. Instead it orders by the primary key, includes an id > last_id condition in the WHERE clause to set the start of the batch, and uses a LIMIT clause to set the batch size. Ordering by the PK and querying on the PK are both generally inexpensive, as is a LIMIT clause.
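To make this concrete, here is a simplified sketch of roughly what find_each does internally (illustrative only, not the actual ActiveRecord source; the method name is made up):

def each_klass_in_batches(batch_size: 1000)
  last_id = 0
  loop do
    batch = Klass.where(day: 'MON')
                 .where('id > ?', last_id)  # keyset condition, no OFFSET
                 .order(:id)                # the forced primary-key order
                 .limit(batch_size)
                 .to_a
    break if batch.empty?
    batch.each { |klass| yield klass }
    last_id = batch.last.id
  end
end

Whatever order you put on the relation is overwritten by that order(:id), which is exactly what the documentation warns about.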
find_each is the wrong tool for this job: it is for batch work, but you're just displaying a short list of records, so you want a simple:
@klasses_mon = Klass.order(:start).where(day: 'MON')

The method #find_each ignores any scoped order and forces a sort by the primary key (usually id). This is stated in the documentation and is because #find_each needs to make sure that it doesn't repeat any records during iteration.
You can see this in your console if you try:
> @klasses_mon = Klass.order(:start).where(day: 'MON').find_each
> @klasses_mon.map(&:start) # force the relation to execute and return rows.
Scoped order is ignored, it's forced to be batch order.
Klass Load (0.ms) SELECT "klasses".* FROM "klasses" WHERE "klasses"."day" = 'MON' ORDER BY "klasses"."id"
=> [600, 510, 930]
If you're not expecting to run through thousands of rows, you can drop the find_each:
@klasses_mon = Klass.where(day: "MON").order(:start)

Related

Rails: Optimize querying maximum values from associated table

I need to show a list of partners and the maximum value from the reservation_limit column from Klass table.
Partner has_many :klasses
Klass belongs_to :partner
# Partner controller
def index
@partners = Partner.includes(:klasses)
end
# view
<% @partners.each do |partner| %>
Up to <%= partner.klasses.maximum("reservation_limit") %> visits per month
<% end %>
Unfortunately the query below runs for every single Partner.
SELECT MAX("klasses"."reservation_limit") FROM "klasses" WHERE "klasses"."partner_id" = $1 [["partner_id", 1]]
If there are 40 partners then the query will run 40 times. How do I optimize this?
Edit: it looks like there's a limit method in Rails, so I'm renaming the column in question to reservation_limit to prevent confusion.
You can use two forms of SQL to efficiently retrieve this information, and I'm assuming here that you want a result for a partner even where there is no klass record for it.
The first is:
select partners.*,
       max(klasses.limit) as max_klasses_limit
from partners
left join klasses on klasses.partner_id = partners.id
group by partners.id
Some RDBMSs require that you use "group by partners.*", though, which is potentially expensive in terms of the required sort and the possibility of it spilling to disk.
On the other hand you can add a clause such as:
having("max(klasses.limit) > ?", 3)
... to efficiently filter the partners by their value of maximum klass.limit
The other is:
select partners.*,
(Select max(klasses.limit)
from klasses
where klasses.partner_id = partners.id) as max_klasses_limit
from partners
The second one does not rely on a group by, and in some RDBMSs may be transformed internally to the first form, but it may also execute less efficiently, with the subquery running once per row of the partners table (which would still be much faster than the raw Rails way of actually submitting a query per row).
The Rails ActiveRecord forms of these would be:
Partner.joins("left join klasses on klasses.partner_id = partners.id").
select("partners.*, max(klasses.limit) as max_klasses_limit").
group(:id)
... and ...
Partner.select("partners.*, (select max(klasses.limit)
from klasses
where klasses.partner_id = partners.id) as max_klasses_limit")
Which of these is actually the most efficient is probably going to depend on the RDBMS and even the RDBMS version.
If you don't need a result when there is no klass for the partner, or there is always guaranteed to be one, then:
Partner.joins(:klasses).
select("partners.*, max(klasses.limit) as max_klasses_limit").
group(:id)
Either way, you can then reference
partner.max_klasses_limit
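For example, a quick usage sketch (my code; max_klasses_limit is just the SQL alias from the select above, surfaced as a regular attribute on each Partner):

@partners = Partner.joins(:klasses).
  select("partners.*, max(klasses.limit) as max_klasses_limit").
  group(:id)
@partners.each do |partner|
  puts "Up to #{partner.max_klasses_limit} visits per month"
end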
Your initial query brings all the information you need. You only need to work with it as you would work with a regular array of objects.
Change
Up to <%= partner.klasses.maximum("reservation_limit") %> visits per month
to
Up to <%= partner.klasses.empty? ? 0 : partner.klasses.max_by { |k| k.reservation_limit }.reservation_limit %> visits per month
What maximum("reservation_limit") does is trigger an Active Record query, SELECT MAX.... But you don't need this, as you already have all the information needed to compute the maximum in your array.
Note
Using .count on an Active Record result will trigger an extra SELECT COUNT... query!
Using .length will not.
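A minimal example of the difference (assuming the eager-loaded partners from the question):

partners = Partner.includes(:klasses).to_a
partners.first.klasses.length # counts the already-loaded array, no query
partners.first.klasses.count  # fires an extra SELECT COUNT(*) query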
It generally helps if you start writing the query in pure SQL and then extract it into ActiveRecord or Arel code.
ActiveRecord is powerful, but it tends to force you into highly inefficient queries as soon as you stray from standard CRUD operations.
Here's your query
Partner
.select('partners.*, (SELECT MAX(klasses.reservation_limit) FROM klasses WHERE klasses.partner_id = partners.id) AS maximum_limit')
.joins(:klasses).group('partners.id')
It is a single query containing a subquery. The subquery doesn't trigger N+1 queries from Rails: everything is sent to the database in one statement.
The code above fetches all the partners, joins them with the klasses records, and thanks to the join it can compute the aggregate maximum. Since the join effectively creates a cartesian product of the records, you then need to group by partners.id (which is in any case required by the MAX aggregate function).
The key here is the AS maximum_limit, which assigns a new attribute to the returned Partner instances holding the value of the maximum.
partners = Partner.select ...
partners.each do |partner|
puts partner.maximum_limit
end
This will return max limits in one select for an array of partner_ids:
partner_ids = @partners.map { |p| p.id }
data = Klass.select('MAX("limit") as limit', 'partner_id').where(partner_id: partner_ids).group('partner_id')
@limits = data.index_by(&:partner_id)
You can now integrate it into your view:
<% @partners.each do |partner| %>
Up to <%= @limits[partner.id].limit %> visits per month
<% end %>

count users within a given age range - ruby on rails

I want to have a function to group users according to a certain age range. Then get their count which I can use to plot a bar-chart.
This statement is able to plot a chart of their ages, but it charts each user's individual age, which is what I want to eliminate. (In case you're wondering, I am using chartkick to make the charts.)
<%= bar_chart User.group("date_trunc('year', age(dob))").count, {library: {title: "User's Age"}} %>
I want the ages returned as:
<%= bar_chart [["Below 10", users_with_ages_lying_in this_range.count],
["10-19", users_with_ages_lying_in this_range.count],
["20-29", ],
["30-39", ],
["40-49", ],
["above 50",]], {library: {title: "User's Age"}}
%>
In short, I would like this line of code, User.group("date_trunc('year', age(dob))").count, to return counts of users per age range.
If you have any better ideas I would appreciate hearing them.
I'm using a Postgres DB.
I had a similar need to find events within a range.
I made a class method on a model:
def self.in_range(start_range, end_range)
where "((start_date <= ?) and (end_date >= ?))", end_range, start_range
end
which is used like so:
events.in_range(start_range, end_range).order('start_date ASC')
I am sure a similar one could be created using age instead of start_date and end_date.
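Building on that idea, here is a minimal sketch of how the bucketed counts for the chart could be produced (bucket labels taken from the question; assumes a dob date column; runs one COUNT query per bucket):

buckets = {
  "Below 10" => [0, 9],   "10-19" => [10, 19], "20-29" => [20, 29],
  "30-39" => [30, 39], "40-49" => [40, 49], "Above 50" => [50, 150] # 150 is an arbitrary cap
}
today = Date.current
data = buckets.map do |label, (min, max)|
  # someone aged between min and max (inclusive) was born in this dob window
  born_after  = today - (max + 1).years + 1.day
  born_before = today - min.years
  [label, User.where(dob: born_after..born_before).count]
end
# then in the view: <%= bar_chart data, library: { title: "User's Age" } %>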

Remove duplicate records based on multiple columns?

I'm using Heroku to host my Ruby on Rails application and for one reason or another, I may have some duplicate rows.
Is there a way to delete duplicate records based on 2 or more criteria but keep just 1 record of that duplicate collection?
In my use case, I have a Make and Model relationship for cars in my database.
Make      Model
----      -----
Name      Name
          Year
          Trim
          MakeId
I'd like to delete all Model records that have the same Name, Year and Trim but keep 1 of those records (meaning, I need the record but only once). I'm using Heroku console so I can run some active record queries easily.
Any suggestions?
class Model
def self.dedupe
# find all models and group them on keys which should be common
grouped = all.group_by{|model| [model.name,model.year,model.trim,model.make_id] }
grouped.values.each do |duplicates|
# the first one we want to keep right?
first_one = duplicates.shift # or pop for last one
# if there are any more left, they are duplicates
# so delete all of them
duplicates.each{|double| double.destroy} # duplicates can now be destroyed
end
end
end
Model.dedupe
Find all records.
Group them on the keys you need for uniqueness.
Loop over the values of the grouped hash.
Remove the first value from each group, because you want to retain one copy.
Delete the rest.
If your User table data looks like below:
User.all =>
[
#<User id: 15, name: "a", email: "a@gmail.com", created_at: "2013-08-06 08:57:09", updated_at: "2013-08-06 08:57:09">,
#<User id: 16, name: "a1", email: "a@gmail.com", created_at: "2013-08-06 08:57:20", updated_at: "2013-08-06 08:57:20">,
#<User id: 17, name: "b", email: "b@gmail.com", created_at: "2013-08-06 08:57:28", updated_at: "2013-08-06 08:57:28">,
#<User id: 18, name: "b1", email: "b1@gmail.com", created_at: "2013-08-06 08:57:35", updated_at: "2013-08-06 08:57:35">,
#<User id: 19, name: "b11", email: "b1@gmail.com", created_at: "2013-08-06 09:01:30", updated_at: "2013-08-06 09:01:30">,
#<User id: 20, name: "b11", email: "b1@gmail.com", created_at: "2013-08-06 09:07:58", updated_at: "2013-08-06 09:07:58">]
The email ids are duplicated, so our aim is to remove all duplicate email ids from the user table.
Step 1:
To get the ids of all distinct email records:
ids = User.select("MIN(id) as id").group(:email,:name).collect(&:id)
=> [15, 16, 18, 19, 17]
Step 2:
To remove the duplicate records from the user table, keeping only the distinct ids:
Now the ids array holds the following ids.
[15, 16, 18, 19, 17]
User.where("id NOT IN (?)",ids) # To get all duplicate records
User.where("id NOT IN (?)",ids).destroy_all
** RAILS 4 **
ActiveRecord 4 introduces the .not method which allows you to write the following in Step 2:
User.where.not(id: ids).destroy_all
Similar to @Aditya Sanghi's answer, but this way will be more performant because you only select the duplicates, rather than loading every Model object into memory and then iterating over all of them.
# returns only duplicates in the form of [[name1, year1, trim1], [name2, year2, trim2],...]
duplicate_row_values = Model.select('name, year, trim, count(*)').group('name, year, trim').having('count(*) > 1').pluck(:name, :year, :trim)
# load the duplicates, order them however you want, and then destroy all but one
duplicate_row_values.each do |name, year, trim|
Model.where(name: name, year: year, trim: trim).order(id: :desc)[1..-1].map(&:destroy)
end
Also, if you truly don't want duplicate data in this table, you probably want to add a multi-column unique index to the table, something along the lines of:
add_index :models, [:name, :year, :trim], unique: true, name: 'index_unique_models'
You could try the following: (based on previous answers)
ids = Model.group('name, year, trim').pluck('MIN(id)')
to get all valid records. And then:
Model.where.not(id: ids).destroy_all
to remove the unneeded records. And certainly, you can make a migration that adds a unique index for the three columns so this is enforced at the DB level:
add_index :models, [:name, :year, :trim], unique: true
To run it in a migration, I ended up doing the following (based on the answer above by @aditya-sanghi):
class AddUniqueIndexToXYZ < ActiveRecord::Migration
def change
# delete duplicates
dedupe(XYZ, 'name', 'type')
add_index :xyz, [:name, :type], unique: true
end
def dedupe(model, *key_attrs)
model.select(key_attrs).group(key_attrs).having('count(*) > 1').each { |duplicates|
dup_rows = model.where(duplicates.attributes.slice(*key_attrs)).to_a
# the first one we want to keep right?
dup_rows.shift
dup_rows.each{ |double| double.destroy } # duplicates can now be destroyed
}
end
end
Based on @aditya-sanghi's answer, with a more efficient way to find duplicates using SQL.
Add this to your ApplicationRecord to be able to deduplicate any model:
class ApplicationRecord < ActiveRecord::Base
# …
def self.destroy_duplicates_by(*columns)
groups = select(columns).group(columns).having(Arel.star.count.gt(1))
groups.each do |duplicates|
records = where(duplicates.attributes.symbolize_keys.slice(*columns))
records.offset(1).destroy_all
end
end
end
You can then call destroy_duplicates_by to destroy all records (except the first) that have the same values for the given columns. For example:
Model.destroy_duplicates_by(:name, :year, :trim, :make_id)
I chose a slightly safer route (IMHO). I started by getting the ids of all the unique records:
ids = Model.where(other_model_id: 1).to_a.uniq(&:field).map(&:id)
Then I got all the ids
all_ids = Model.where(other_model_id: 1).map(&:id)
This allows me to do an array subtraction to get the duplicates:
dups = all_ids - ids
I then map over the duplicate ids and fetch the model because I want to ensure I have the records I am interested in.
records = dups.map do |id| Model.find(id) end
When I am sure I want to delete, I iterate again to delete.
records.map do |record| record.delete end
When deleting duplicate records on a production system, you want to be very sure you are not deleting important live data, so in this process, I can double-check everything.
So in the case above:
all_ids = Model.all.map(&:id)
uniq_ids = Model.all.group_by do |model|
[model.name, model.year, model.trim]
end.values.map do |duplicates|
duplicates.first.id
end
dups = all_ids - uniq_ids
records = dups.map { |id| Model.find(id) }
records.map { |record| record.delete }
or something like this.
You can try this SQL query, to remove all duplicate records but the latest one (the row with the greatest id survives):
DELETE FROM models USING models m2 WHERE (models.name = m2.name AND models.year = m2.year AND models.trim = m2.trim AND models.id < m2.id);

Improving performance of Rails model

I have the following model that allows Users to cast Votes on Photos.
class Vote < ActiveRecord::Base
attr_accessible :value
belongs_to :photo
belongs_to :user
validates_associated :photo, :user
validates_uniqueness_of :user_id, :scope => :photo_id
validates_uniqueness_of :photo_id, :scope => :user_id
validates_inclusion_of :value, :in => [-2,-1,1,2], :allow_nil => true
after_save :write_photo_data
def self.score
dd = where( :value => -2 ).count
d = where( :value => -1 ).count
u = where( :value => 1 ).count
uu = where( :value => 2 ).count
self.compute_score(dd,d,u,uu)
end
def self.compute_score(dd, d, u, uu)
tot = [dd,d,u,uu].sum.to_f
score = [-5*dd, -2*d, 2*u, 5*uu].sum / [tot,4].sum*20.0
score.round(2)
end
private
def write_photo_data
self.photo.score = self.photo.votes.score
self.photo.save!
end
end
This functions very well, however computing the score for a photo is pretty slow - it seems to take 7-12 seconds on average. I've tried adding indices for photo_id, user_id, and one combined for photo_id and value, but this hasn't really improved the performance as far as I can tell.
I'd be interested in feedback from any serious rails gurus (I'm totally an amateur) as to how this could be optimized / improved. How would you tally up votes for a particular photo and value?
Thanks!
--EDIT--
Note that the scores: -2,-1,1,2 represent "two-thumbs down, one-thumb down, thumb up, two-thumbs up", not specific values. I could match these to the values I've assigned to them in the compute score method, but I haven't done that so far because I may want to tweak the weightings over time after seeing more data accumulated.
Also, regardless of how I represent those four possible votes in the DB, I still need both the COUNT of each kind of vote as well as the weighted value of those votes for each photo to compute the score. Thanks!
You need an index on value by itself. Combined indexes only work when the query uses the leading columns, starting at the left. Since your where clause does not specify a photo id, it's not using your combined index.
Update: see http://dev.mysql.com/doc/refman/5.0/en/multiple-column-indexes.html
One thing you could do is to ask the database once, instead of four times, for the score counts:
Vote.where(photo_id: photo.id).group(:value).count
would result in a single database query and give you a hash like
{-2 => 21, -1 => 48, 1 => 103, 2 => 84}
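As a sketch of how that hash could feed the existing scoring code (compute_score and the -2..2 codes come from the question; the default of 0 is mine, for values with no votes):

counts = Vote.where(photo_id: photo.id).group(:value).count
counts.default = 0
score = Vote.compute_score(counts[-2], counts[-1], counts[1], counts[2])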
Besides that, if you store the actual values of [-5, -2, 2, 5] instead of [-2, -1, 1, 2] in the database, you could just do
Vote.where(photo_id: photo.id).sum(:value)
and get your sum direct from the database (or even use avg to get the average instead)
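If you would rather keep the -2..2 codes in the database (so the weights stay tweakable, as the question's edit mentions), a middle ground is to fold the weighting into the query itself. A sketch, with CASE weights mirroring compute_score (newer Rails may require wrapping the string in Arel.sql):

weighted_sum = Vote.where(photo_id: photo.id).sum(
  "CASE value WHEN -2 THEN -5 WHEN -1 THEN -2 WHEN 1 THEN 2 WHEN 2 THEN 5 END"
)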
Why do you store -2, -1, 1, 2 instead of the actual grade? If you store the grade (-5, for example), you will be able to compute the score in the DB directly without having to run 4 count queries. This will be an improvement for sure.
Putting an index on the value column will speed up the SELECTs if you have lots of records in the DB.
The above posts also bring up some good points on direct optimization. However, as your DB scales, all of these approaches will eventually fall down. Since the score is a derived value, you could cache it in Memcached, Redis, or even SQL which will ensure that fetching the score scales in constant time as the app grows. You can allow the caches to get out of date and keep them updated using a background process. By doing so, your calculation function can take arbitrarily long without impacting the user experience.

Rails/Sql - order/group search results such that repetition of entities occurs only after appearance of others

In my application, say, animals have many photos. I'm querying photos of animals such that I want all photos of all animals to be displayed. However, I want each animal to appear as a photo before repetition occurs.
Example:
animal instance 1, 'cat', has four photos,
animal instance 2, 'dog', has two photos:
photos should appear ordered like so (photo, animal):
tiddles.jpg, cat
fido.jpg, dog
meow.jpg, cat
rover.jpg, dog
puss.jpg, cat
felix.jpg, cat (no more dogs, so two consecutive cats)
Pagination is required, so I can't order on an array.
Filename structure/convention provides no help, though the animal_id exists on each photo.
Though there are two types of animal in this example, this is an active record model with hundreds of records.
Animals may be selectively queried.
If this isn't possible with active_record then I'll happily use sql; I'm using postgresql.
My brain is frazzled so if anyone can come up with a better title, please go ahead and edit it or suggest in comments.
Here is a PostgreSQL specific solution:
batch_id_sql = "RANK() OVER (PARTITION BY animal_id ORDER BY id ASC)"
Photo.paginate(
:select => "DISTINCT photos.*, (#{batch_id_sql}) batch_id",
:order => "batch_id ASC, photos.animal_id ASC",
:page => 1)
Here is a DB agnostic solution:
batch_id_sql = "
SELECT COUNT(bm.*)
FROM photos bm
WHERE bm.animal_id = photos.animal_id AND
bm.id <= photos.id
"
Photo.paginate(
:select => "photos.*, (#{batch_id_sql}) batch_id",
:order => "batch_id ASC, photos.animal_id ASC",
:page => 1)
Both queries work even when you have a where condition. Benchmark the query using expected data set to check if it meets the expected throughput and latency requirements.
Reference
PostgreSQL Window function
Having no experience with ActiveRecord, using plain PostgreSQL I would try something like this:
Define a window function over all previous rows which counts how many times the current animal has appeared, then order by this count.
SELECT
filename,
animal_id,
COUNT(*) OVER (PARTITION BY animal_id ORDER BY filename) AS cnt
FROM
photos
ORDER BY
cnt,
animal_id,
filename
Filtering on certain animal_id's will work. This will always order the same way. I don't know if you want something random in there, but it should be easily added.
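For what it's worth, a rough (untested) ActiveRecord translation of that query might look like this; the cnt alias and column names are taken from the SQL above:

Photo.select("photos.*, COUNT(*) OVER (PARTITION BY animal_id ORDER BY filename) AS cnt").
  order("cnt, animal_id, filename")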
New solution
Add an integer column called batch_id to the photos table.
class AddBatchIdToPhotos < ActiveRecord::Migration
def self.up
add_column :photos, :batch_id, :integer
set_batch_id
change_column :photos, :batch_id, :integer, :null => false
add_index :photos, :batch_id
end
def self.down
remove_column :photos, :batch_id
end
def self.set_batch_id
# set the batch id to existing rows
# implement this
end
end
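One possible (hypothetical) implementation of the set_batch_id stub, reusing the window-function idea from the PostgreSQL answer above to backfill existing rows in a single statement:

def self.set_batch_id
  execute <<-SQL
    UPDATE photos
    SET batch_id = ranked.rn
    FROM (
      SELECT id, ROW_NUMBER() OVER (PARTITION BY animal_id ORDER BY id) AS rn
      FROM photos
    ) ranked
    WHERE photos.id = ranked.id
  SQL
end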
Now add a before_create on the Photo model to set the batch id.
class Photo
belongs_to :animal
before_create :batch_photo_add
after_update :batch_photo_update
after_destroy :batch_photo_remove
private
def batch_photo_add
self.batch_id = next_batch_id_for_animal(animal_id)
true
end
def batch_photo_update
return true unless animal_id_changed?
batch_photo_remove(batch_id, animal_id_was)
batch_photo_add
end
def batch_photo_remove(b_id=batch_id, a_id=animal_id)
Photo.update_all("batch_id = batch_id - 1",
["animal_id = ? AND batch_id > ?", a_id, b_id])
true
end
def next_batch_id_for_animal(a_id)
(Photo.maximum(:batch_id, :conditions => {:animal_id => a_id}) || 0) + 1
end
end
Now you can get the desired result by issuing a simple paginate command:
@animal_photos = Photo.paginate(:page => 1, :per_page => 10,
:order => :batch_id)
How does this work?
Let's consider we have a data set as given below:
id   Photo Description   Batch Id
1    Cat_photo_1         1
2    Cat_photo_2         2
3    Dog_photo_1         1
4    Cat_photo_3         3
5    Dog_photo_2         2
6    Lion_photo_1        1
7    Cat_photo_4         4
Now if we execute a query ordered by batch_id, we get this:
# batch 1 (cat, dog, lion)
Cat_photo_1
Dog_photo_1
Lion_photo_1
# batch 2 (cat, dog)
Cat_photo_2
Dog_photo_2
# batch 3,4 (cat)
Cat_photo_3
Cat_photo_4
The batch distribution is not random; the animals are filled from the top. The number of animals displayed on a page is governed by the per_page parameter passed to paginate (not the batch size).
Old solution
Have you tried this?
If you are using the will_paginate gem:
# assuming you want to order by animal name
animal_photos = Photo.paginate(:include => :animal, :page => 1,
:order => "animals.name")
animal_photos.each do |animal_photo|
puts animal_photo.file_name
puts animal_photo.animal.name
end
I'd recommend something hybrid/corrected based on KandadaBoggu's input.
First off, the way to do it on paper is with row_number() over (partition by animal_id order by id). The suggested rank() happens to give the same result here only because id is unique within each partition (rank differs from row_number only when there are ties); row_number() expresses the intent directly.
Using a window function is also the most flexible solution (in fact, the only solution) if you want to plan to change the sort order here and there.
Take note that this won't necessarily scale well, however, because in order to sort the results you'll need to:
fetch the whole result set that matches your criteria
sort the whole result set to create the partitions and obtain a rank_id
top-n sort/limit over the result set a second time to get them in their final order
The correct way to do this in practice, if your sort order is immutable, is to maintain a pre-calculated rank_id. KandadaBoggu's other suggestion points in the correct direction in this sense.
When it comes to deletes (and possibly updates, if you don't want them sorted by id), you may run into issues because you end up trading faster reads for slower writes. If deleting the cat with an index of 1 leads to updating the next 50k cats, you're going to be in trouble.
If you have very small sets, the overhead might be very acceptable (don't forget to index animal_id).
If not, there's a workaround if you find the order in which specific animals appear is irrelevant. It goes like this:
Start a transaction.
If the rank_id is going to change (i.e. insert or delete), obtain an advisory lock to ensure that two sessions can't impact the rank_id of the same animal class, e.g.:
SELECT pg_try_advisory_lock('the_table'::regclass::int, the_animal_id);
(Sleep for .05s if you don't obtain it.)
On insert, find max(rank_id) for that animal_id. Assign it rank_id + 1. Then insert it.
On delete, select the animal with the same animal_id and the largest rank_id. Delete your animal, and assign its old rank_id to the fetched animal (unless you were deleting the last one, of course).
Release the advisory lock.
Commit the work.
Note that the above will make good use of an index on (animal_id, rank_id) and can be done using plpgsql triggers:
create trigger "__animals_rank_id__ins"
before insert on animals
for each row execute procedure lock_animal_id_and_assign_rank_id();
create trigger "_00_animals_rank_id__ins"
after insert on animals
for each row execute procedure unlock_animal_id();
create trigger "__animals_rank_id__del"
before delete on animals
for each row execute procedure lock_animal_id();
create trigger "_00_animals_rank_id__del"
after delete on animals
for each row execute procedure reassign_rank_id_and_unlock_animal_id();
You can then create a multi-column index on your sort criteria if you're not joining all over the place, e.g. (rank_id, name). And you'll end up with a snappy site for reads and writes.
You should be able to get the pictures (or filenames, anyway) using ActiveRecord, ordered by name.
Then you can use Enumerable#group_by and Enumerable#zip to zip all the arrays together.
If you give me more information about how your filenames are really arranged (i.e., are they all for sure with an underscore before the number and a constant name before the underscore for each "type"? etc.), then I can give you an example. I'll write one up momentarily showing how you'd do it for your current example.
You could run two sorts and build one array as follows:
result1 = the first of each animal type only; use the Ruby "find" method for this search.
result2 = all animals, sorted by group. Use "find" again to locate the first occurrence of each animal, then use "drop" to remove those first occurrences from result2.
Then:
markCustomResult = result1 + result2
Then you can use will_paginate on markCustomResult.
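A minimal sketch of that two-pass build (my naming; assumes will_paginate's array support):

require "will_paginate/array"

photos = Photo.includes(:animal).order("animals.name").to_a
firsts = photos.uniq(&:animal_id)  # the first photo of each animal
rest   = photos - firsts           # everything else, still grouped by name
mark_custom_result = firsts + rest
mark_custom_result.paginate(page: params[:page], per_page: 10)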