Moving MySQL routine to model - ruby-on-rails-3

I have a PHP-driven website that I am converting to Rails. I currently have the following logic in a MySQL routine.
DECLARE FirstWeighinDate DATE;
DECLARE FirstWeighinWeight DECIMAL(5,1);
DECLARE MostRecentWeighinDate DATE;
DECLARE MostRecentWeighinWeight DECIMAL(5,1);
DECLARE NumberOfWeighins INT(11);
DECLARE NumberOfDeficitRecords INT(11);
DECLARE AverageDeficit INT(11);
DECLARE UserHeight INT(11);
DECLARE MostRecentBMI DECIMAL(5,1);
SELECT date INTO FirstWeighinDate FROM tblDeficitEntries WHERE weight IS NOT NULL AND user_id = inUserId ORDER BY date ASC LIMIT 1;
SELECT weight INTO FirstWeighinWeight FROM tblDeficitEntries WHERE weight IS NOT NULL AND user_id = inUserId ORDER BY date ASC LIMIT 1;
SELECT date INTO MostRecentWeighinDate FROM tblDeficitEntries WHERE weight IS NOT NULL AND user_id = inUserId ORDER BY date DESC LIMIT 1;
SELECT weight INTO MostRecentWeighinWeight FROM tblDeficitEntries WHERE weight IS NOT NULL AND user_id = inUserId ORDER BY date DESC LIMIT 1;
SELECT Height INTO UserHeight FROM tblLogins WHERE id = inUserId LIMIT 1;
SELECT COUNT(id) INTO NumberOfWeighins FROM tblDeficitEntries WHERE weight IS NOT NULL AND user_id = inUserId;
SELECT COUNT(id) INTO NumberOfDeficitRecords FROM tblDeficitEntries WHERE user_id = inUserId;
SELECT (SUM(deficit) / NumberOfDeficitRecords) INTO AverageDeficit FROM tblDeficitEntries WHERE user_id = inUserId;
SELECT ((MostRecentWeighinWeight * 703) / (UserHeight * UserHeight)) INTO MostRecentBMI;
SELECT FirstWeighinDate, FirstWeighinWeight, MostRecentWeighinDate, MostRecentWeighinWeight, NumberOfWeighins, NumberOfDeficitRecords, AverageDeficit, MostRecentBMI;
I believe that in an MVC application this logic belongs in the Model, but I am unsure how to achieve this. I was thinking it should be done with a virtual accessor that returns a hash. Is that correct? If it should live somewhere besides the Model, where should it go? I don't expect the entire code to be converted in the answer, but I would greatly appreciate a couple of the lines being converted so that I can get the idea. Thanks!

So, you've got two models: a User, and an Entry, where the User has many associated Entry objects.
class User < ActiveRecord::Base
  has_many :entries
end
class Entry < ActiveRecord::Base
  belongs_to :user
  # attributes include weight, date
end
So, what about these values?
The most basic way to do what I believe you want would be to turn all of these into methods on your User model, using the associated models.
For instance,
SELECT date INTO FirstWeighinDate FROM tblDeficitEntries WHERE weight IS NOT NULL AND user_id = inUserId ORDER BY date ASC LIMIT 1;
could be replaced with the following, in the User class:
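def first_weighin_date
  # Mirror the SQL: only entries that have a weight, oldest first.
  first_entry = entries.where('weight IS NOT NULL').order('date ASC').first
  first_entry && first_entry.date
end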
Then, anywhere you have a User object, you can simply call the method:
u = User.find_by_name('Bob')
u.first_weighin_date
Now, if you're going to be displaying this information for a lot of users at once, you'll want to take advantage of eager loading. This, for instance, is going to be very inefficient:
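User.all.map(&:first_weighin_date)
It will make individual calls to the database to load the associated entries for each user object. 100 users? 101 database hits. So tell Rails to go ahead and load all the objects, because you're going to need them:
User.includes(:entries).map(&:first_weighin_date)
With the entries preloaded, that is two calls to the database, one to load the users and one to load the entries, as long as the method reads the loaded association in memory (for example, entries.sort_by(&:date).first). Chaining where or order onto entries, as first_weighin_date does above, bypasses the preloaded records and still runs one query per user.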
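The remaining lines of the routine could follow the same pattern. As a rough sketch only, the plain-Ruby function below computes the routine's eight return values from entries that are already in memory. Entry here is a stand-in Struct rather than the ActiveRecord model, weighin_summary is a hypothetical name, the height is assumed to be in inches (because the routine uses the 703 factor), and the average is rounded to an integer on the assumption that this is what assigning to the INT variable did in MySQL.

```ruby
require 'date'

# Stand-in for an Entry row; in the Rails app these would be the records
# behind user.entries (assumption for illustration).
Entry = Struct.new(:date, :weight, :deficit)

# Builds the same summary the stored routine returns, from in-memory entries.
# height_in is the user's height in inches (assumed from the 703 factor).
def weighin_summary(entries, height_in)
  weighins = entries.reject { |e| e.weight.nil? }.sort_by(&:date)
  first = weighins.first
  last = weighins.last
  deficit_total = entries.map(&:deficit).compact.sum
  {
    first_weighin_date: first && first.date,
    first_weighin_weight: first && first.weight,
    most_recent_weighin_date: last && last.date,
    most_recent_weighin_weight: last && last.weight,
    number_of_weighins: weighins.size,
    number_of_deficit_records: entries.size,
    # SUM(deficit) / COUNT(id): NULL deficits are skipped in the sum only.
    average_deficit: entries.empty? ? nil : (deficit_total.to_f / entries.size).round,
    most_recent_bmi: last ? (last.weight * 703.0 / (height_in * height_in)).round(1) : nil
  }
end
```

Inside the User model this could be wrapped in a method that passes the user's entries and height, which gives the hash-style accessor the question asks about.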

Related

How to select each model which has the maximum value of an attribute for any given value of another attribute?

I have a Work model with a video_id, a user_id and some other simple fields. I need to display the last 12 works on the page, but only take 1 per user. Currently I'm trying to do it like this:
def self.latest_works_one_per_user(video_id=nil)
  scope = self.includes(:user, :video)
  scope = video_id ? scope.where(video_id: video_id) : scope.where.not(video_id: nil)
  scope = scope.order(created_at: :desc)
  # Two separate arrays; `user_ids = works = []` would alias a single array.
  user_ids, works = [], []
  scope.each do |work|
    next if user_ids.include? work.user_id
    user_ids << work.user_id
    works << work
    break if works.size == 12
  end
  works
end
But I'm damn sure there is a more elegant and faster way of doing it especially when the number of works gets bigger.
Here's a solution that should work for any SQL database with minimal adjustment. Whether one thinks it's elegant or not depends on how much you enjoy SQL.
def self.latest_works_one_per_user(video_id=nil)
  scope = includes(:user, :video)
  scope = video_id ? scope.where(video_id: video_id) : scope.where.not(video_id: nil)
  # Join each work to its user's most recent created_at so only that row survives.
  scope.
    joins("join (select user_id, max(created_at) created_at
           from works group by user_id) most_recent
           on works.user_id = most_recent.user_id and
           works.created_at = most_recent.created_at").
    order(created_at: :desc).limit(12)
end
It only works if the combination of user_id and created_at is unique, however. If that combination isn't unique you'll get more than 12 rows.
It can be done more simply in MySQL. The MySQL solution doesn't work in Postgres, and I don't know a better solution in Postgres, although I'm sure there is one.
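If the table is small enough to load into memory, the loop in the question can also be collapsed with Array#uniq, which keeps the first element seen for each block value. This is only a plain-Ruby sketch: Work is a stand-in Struct rather than the ActiveRecord model, and latest_one_per_user is a hypothetical name. It is not a substitute for the SQL approach once the number of works grows.

```ruby
# Stand-in for a Work row (assumption for illustration).
Work = Struct.new(:id, :user_id, :created_at)

# Newest first, then keep only the first (newest) work seen per user.
def latest_one_per_user(works, limit = 12)
  works.sort_by(&:created_at).reverse.uniq(&:user_id).first(limit)
end
```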

Activerecord or SQL statement to find users where something very specific happens in the join table

I have a User that has_many MyVersions.
A MyVersion is created every time the column "profile_id" or "state" changes in User. MyVersion has these columns:
user_id, object_changed (profile_id or state), before, after
I need to find Users that were active and had a specific profile at a specific time. That means finding all Users for which the following holds in their associated my_versions:
1. Take the my_versions created before a given date where :object_changed is 'state'. Within that set:
1.1 THEN (not AND) find the last one, and only select the user if its value for :after is 'active'.
2. Take the my_versions created before a given date where :object_changed is 'profile_id'. Within that set:
2.1 THEN find the last one, and only select the user if its value for :after is '1'.
3. Select only users that match both 1.1 and 2.1.
EDIT 1: Apparently I'm getting closer but still not sure this is getting what I need:
active_user_ids = User.joins(:my_versions).merge(MyVersion.where(
  "my_versions.created_at = (SELECT MAX(created_at) from my_versions WHERE
   user_id = users.id AND created_at < '2016-01-01' AND object_changed = 'state')
   AND my_versions.after = 'activo'")).pluck(:id)
Now I have all user IDS that were active at the time (do I?). Then I can do the same for the profile, but passing also the previous IDS to combine the results properly:
active_and_right_profile =
  User.joins(:my_versions).merge(MyVersion.where(
    "my_versions.created_at = (SELECT MAX(created_at) from my_versions WHERE
     user_id = users.id AND created_at < '2016-01-01' AND object_changed = 'profile_id')
     AND my_versions.after = 1")).where(id: active_user_ids)
It doesn't look pretty and I'm not sure it does what I describe in the specification above. The first tests appear to be right, but I have doubts because I don't understand some parts of the query:
Apparently when I use "SELECT MAX ... where user_id = users.id" I'm requesting the top value for each user id. Is that right?
If that's true, I'm getting an array of results and passing it to the first created_at =. This means that if I have other versions outside the scope of this query but with the exact same timestamp, they will be in the results. Is that correct? That's relevant to me because a few of those versions.created_at values are being updated manually.
How does it look? Is there a way to make it better with only one query? Is there a way to avoid the problem of searching exact created_at values that I mention above?
Thanks!!
Previous attempts:
I tried this:
Class User...
  scope :active_at, -> (date) {
    joins(:my_versions).merge(MyVersion.on_state.before_date(date)
      .where("my_versions.created_at = (SELECT MAX(created_at) FROM my_versions WHERE user_id = users.id AND after = 'activo')"))
  }
But this creates the following query:
SELECT `users`.* FROM `users` INNER JOIN `my_versions` ON `my_versions`.`user_id` = `users`.`id` WHERE `my_versions`.`object_changed` = 'state' AND (my_versions.created_at < '2016-01-31') AND (my_versions.created_at = (SELECT MAX(created_at) FROM my_versions WHERE user_id = users.id AND after = 'activo'))
This is not what I need.
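One way to check any candidate query against the specification is to compare it with a small in-memory reference. The following plain-Ruby sketch encodes rules 1.1, 2.1 and 3 as stated above. Version is a stand-in Struct rather than the MyVersion model, both method names are hypothetical, and it assumes :after is stored as a string and that the active state is spelled 'active' as in the prose (the queries above use 'activo').

```ruby
# Stand-in for a my_versions row (assumption for illustration).
Version = Struct.new(:user_id, :object_changed, :after, :created_at)

# The :after value of the last change to `field` made before `cutoff`,
# or nil when there was no such change.
def last_value_before(rows, field, cutoff)
  row = rows.select { |v| v.object_changed == field && v.created_at < cutoff }
            .max_by(&:created_at)
  row && row.after
end

# Ids of users whose last state before cutoff is 'active' AND whose last
# profile_id before cutoff equals the given profile id.
def user_ids_matching(versions, cutoff, profile_id)
  matching = versions.group_by(&:user_id).select do |_user_id, rows|
    last_value_before(rows, 'state', cutoff) == 'active' &&
      last_value_before(rows, 'profile_id', cutoff) == profile_id.to_s
  end
  matching.keys
end
```

Because each rule picks its own last row per user, a row that merely shares a timestamp with another object_changed value cannot leak into the result, which is the concern raised about matching on exact created_at values.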

Last doesn't work with default order in ActiveRecord

I have the following code in my Single model:
default_scope order("created_at ASC")
When I call
Single.where("user_id = ? ", dan).last(2)
I get:
SELECT "singles".* FROM "singles" WHERE (user_id = 22 ) ORDER BY created_at ASC, id DESC LIMIT 2
Instead of the DESC being applied to created_at it gets applied to id.
Any idea how to solve it?
You cannot override a default scope with a later ordering; the new ordering gets added to the existing order clause, which is why you see created_at ASC, id DESC.
You can do
Single.unscoped.order('created_at desc').where("user_id = ?", dan).last(2)
I would advise you not to use default_scope and to use a normal named scope instead (default scopes are widely considered evil). That way you won't forget that a hidden default scope is ordering by created_at.
Also reading this would give you some better understanding
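To see why the generated SQL picks the wrong rows, here is a minimal plain-Ruby sketch of the two orderings applied to in-memory data. Row is a stand-in Struct and both method names are made up for illustration; nothing here touches ActiveRecord.

```ruby
# Stand-in for a singles row (assumption for illustration).
Row = Struct.new(:id, :created_at)

# Rows selected by "ORDER BY created_at ASC, id DESC LIMIT n":
# created_at still sorts first, so these are the n oldest rows.
def rows_with_appended_order(rows, n)
  rows.sort_by { |r| [r.created_at, -r.id] }.first(n)
end

# Rows the caller wanted: the n newest by created_at.
def newest_rows(rows, n)
  rows.sort_by(&:created_at).last(n)
end
```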

Rails 3 query matching attribute of has_one association that is a subset of has_many association

The title is confusing, but allow me to explain. I have a Car model that has multiple datapoints with different timestamps. We are almost always concerned with attributes of its latest status. So the model has_many statuses, along with a has_one to easily access its latest one:
class Car < ActiveRecord::Base
  has_many :statuses, class_name: 'CarStatus', order: "timestamp DESC"
  has_one :latest_status, class_name: 'CarStatus', order: "timestamp DESC"
  delegate :location, :timestamp, to: 'latest_status', prefix: 'latest', allow_nil: true
  # ...
end
To give you an idea of what the statuses hold:
loc = Car.first.latest_location # Location object (id = 1 for example)
loc.name # "Miami, FL"
Let's say I wanted to have a (chainable) scope to find all cars with a latest location id of 1. Currently I have a sort of complex method:
# car.rb
def self.by_location_id(id)
  ids = []
  find_each(include: :latest_status) do |car|
    ids << car.id if car.latest_status.try(:location_id) == id.to_i
  end
  where("id in (?)", ids)
end
There may be a quicker way to do this using SQL, but not sure how to only get the latest status for each car. There may be many status records with a location_id of 1, but if that's not the latest location for its car, it should not be included.
To make it harder... let's add another level and be able to scope by location name. I have this method, preloading statuses along with their location objects to be able to access the name:
def self.by_location_name(loc)
  ids = []
  find_each(include: {latest_status: :location}) do |car|
    ids << car.id if car.latest_location.try(:name) =~ /#{loc}/i
  end
  where("id in (?)", ids)
end
This will match the location above with "miami", "fl", "MIA", etc... Does anyone have any suggestions on how I can make this more succinct/efficient? Would it be better to define my associations differently? Or maybe it will take some SQL ninja skills, which I admittedly don't have.
Using Postgres 9.1 (hosted on Heroku cedar stack)
All right. Since you're using postgres 9.1 like I am, I'll take a shot at this. Tackling the first problem first (scope to filter by location of last status):
This solution takes advantage of Postgres's support for window (analytic) functions, as described here: http://explainextended.com/2009/11/26/postgresql-selecting-records-holding-group-wise-maximum/
I think the following gives you part of what you need (replace/interpolate the location id you're interested in for the '?', naturally):
select *
from (
  select cars.id as car_id, statuses.id as status_id, statuses.location_id, statuses.created_at,
         row_number() over (partition by statuses.car_id order by statuses.created_at desc) as rn
  from cars join statuses on cars.id = statuses.car_id
) q
where rn = 1 and location_id = ?
This query will return car_id, status_id, location_id, and a timestamp (called created_at by default, although you could alias it if some other name is easier to work with).
Now to convince Rails to return results based on this. Because you'll probably want to use eager loading with this, find_by_sql is pretty much out. There is a trick I discovered though, using .joins to join to a subquery. Here's approximately what it might look like:
def self.by_location(loc)
  # The join keeps only cars whose latest status (rn = 1 per car) is at loc;
  # ordering goes through .order so the scope stays chainable.
  joins(
    self.escape_sql('join (
      select *
      from (
        select cars.id as car_id, statuses.id as status_id, statuses.location_id, statuses.created_at,
               row_number() over (partition by statuses.car_id order by statuses.created_at desc) as rn
        from cars join statuses on cars.id = statuses.car_id
      ) q
      where rn = 1 and location_id = ?
    ) as subquery on subquery.car_id = cars.id', loc)
  ).order('subquery.created_at desc')
end
Join will act as a filter, giving you only the Car objects that were involved in the subquery.
Note: In order to refer to escape_sql as I do above, you'll need to modify ActiveRecord::Base slightly. I do this by adding this to an initializer in the app (which I place in config/initializers/active_record.rb):
class ActiveRecord::Base
  def self.escape_sql(clause, *rest)
    self.send(:sanitize_sql_array, rest.empty? ? clause : ([clause] + rest))
  end
end
This allows you to call .escape_sql on any of your models that are based on AR::B. I find this profoundly useful, but if you've got some other way to sanitize sql, feel free to use that instead.
For the second part of the question - unless there are multiple locations with the same name, I'd just do a Location.find_by_name to turn it into an id to pass into the above. Basically this:
def self.by_location_name(name)
  loc = Location.find_by_name(name)
  by_location(loc)
end
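For reference, the result the window-function query is meant to produce can be stated in a few lines of plain Ruby: pick each car's latest status, then keep the cars whose latest status is at the requested location. This is a sketch over in-memory data, useful mainly for checking the SQL on a small fixture. Status is a stand-in Struct (the column names follow the query above, not the CarStatus model) and car_ids_latest_at is a hypothetical name.

```ruby
# Stand-in for a statuses row (assumption for illustration).
Status = Struct.new(:id, :car_id, :location_id, :created_at)

# Ids of cars whose most recent status has the given location_id.
def car_ids_latest_at(statuses, location_id)
  statuses.group_by(&:car_id)
          .map { |_car_id, rows| rows.max_by(&:created_at) }
          .select { |latest| latest.location_id == location_id }
          .map(&:car_id)
end
```

A car with an older status at the location but a newer one elsewhere is excluded, which is the case the question singles out.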

Django - Count a subset of related models - Need to annotate count of active Coupons for each Item

I have a Coupon model that has some fields to define if it is active, and a custom manager which returns only live coupons. Coupon has an FK to Item.
In a query on Item, I'm trying to annotate the number of active coupons available. However, the Count aggregate seems to be counting all coupons, not just the active ones.
# models.py
import datetime
from django.db import models
class LiveCouponManager(models.Manager):
    """
    Returns only coupons which are active, and the current
    date is after the active_date (if specified) but before the valid_until
    date (if specified).
    """
    def get_query_set(self):
        today = datetime.date.today()
        passed_active_date = models.Q(active_date__lte=today) | models.Q(active_date=None)
        not_expired = models.Q(valid_until__gte=today) | models.Q(valid_until=None)
        return super(LiveCouponManager,self).get_query_set().filter(is_active=True).filter(passed_active_date, not_expired)
class Item(models.Model):
    # irrelevant fields
    pass
class Coupon(models.Model):
    item = models.ForeignKey(Item)
    is_active = models.BooleanField(default=True)
    active_date = models.DateField(blank=True, null=True)
    valid_until = models.DateField(blank=True, null=True)
    # more fields
    live = LiveCouponManager() # defined first, should be default manager
# views.py
from django.db.models import Count
# this is the part that isn't working right
data = Item.objects.filter(q).distinct().annotate(num_coupons=Count('coupon', distinct=True))
The .distinct() and distinct=True bits are there for other reasons - the query is such that it will return duplicates. That all works fine, just mentioning it here for completeness.
The problem is that Count is including inactive coupons that are filtered out by the custom manager.
Is there any way I can specify that Count should use the live manager?
EDIT
The following SQL query does exactly what I need:
SELECT data_item.title, COUNT(data_coupon.id) FROM data_item LEFT OUTER JOIN data_coupon ON (data_item.id=data_coupon.item_id)
WHERE (
(is_active='1') AND
(active_date <= current_timestamp OR active_date IS NULL) AND
(valid_until >= current_timestamp OR valid_until IS NULL)
)
GROUP BY data_item.title
At least on sqlite. Any SQL guru feedback would be greatly appreciated - I feel like I'm programming by accident here. Or, even better, a translation back to Django ORM syntax would be awesome.
In case anyone else has the same problem, here's how I've gotten it to work:
Items = Item.objects.filter(q).distinct().extra(
    select={"num_coupons":
        """
        SELECT COUNT(data_coupon.id) FROM data_coupon
        WHERE (
            (data_coupon.is_active='1') AND
            (data_coupon.active_date <= current_timestamp OR data_coupon.active_date IS NULL) AND
            (data_coupon.valid_until >= current_timestamp OR data_coupon.valid_until IS NULL) AND
            (data_coupon.item_id = data_item.id)
        )
        """
    },).order_by(order_by)
I don't know that I consider this a 'correct' answer - it completely duplicates my custom manager in a possibly non portable way (I'm not sure how portable current_timestamp is), but it does work.
Are you sure your custom manager actually gets called? You set your manager as Model.live, but you query the normal manager at Model.objects.
Have you tried the following?
data = Data.live.filter(q)...
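Whichever ORM or SQL form is used, it helps to have the intended result written down independently. The sketch below restates the rule from the LiveCouponManager docstring as an in-memory reference and counts live coupons per item, including items with none. It is plain Ruby rather than Django, Coupon is a stand-in Struct, and live? and live_coupon_counts are hypothetical names.

```ruby
require 'date'

# Stand-in for a coupon row (assumption for illustration).
Coupon = Struct.new(:item_id, :is_active, :active_date, :valid_until)

# Active, already started (or no start date), not yet expired (or no end date).
def live?(coupon, today)
  coupon.is_active &&
    (coupon.active_date.nil? || coupon.active_date <= today) &&
    (coupon.valid_until.nil? || coupon.valid_until >= today)
end

# Maps every item id to its number of live coupons; items without any live
# coupon stay in the result with a count of 0.
def live_coupon_counts(item_ids, coupons, today)
  counts = item_ids.each_with_object({}) { |id, h| h[id] = 0 }
  coupons.each do |coupon|
    counts[coupon.item_id] += 1 if counts.key?(coupon.item_id) && live?(coupon, today)
  end
  counts
end
```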