I am building a Rails 3.2.14 app.
The app has a model called Timereport, and in that model I have a class method
that I use to generate statistics.
def self.stats_time_spent(params)
  data = group("date(created_at)")
  data = data.where("backend_user_id = ?", params[:backend_user_id])
  data = data.where("created_at >= ?", params[:date_from])
  data = data.where("created_at <= ?", params[:date_to])
  data = data.select("date(created_at) as timecreated, sum(total_time) as timetotal")
  data
end
This method works, but the rows come back in no particular order: the dates are not sorted.
I tried adding .order("created_at desc"), but then I get this error:
PG::GroupingError: ERROR: column "timereports.created_at" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: ...user_id = '1') GROUP BY date(created_at) ORDER BY created_at...
^
: SELECT COUNT(*) AS count_all, date(created_at) AS date_created_at FROM "timereports" WHERE
I have two questions: is this a good way of aggregating the data, and how do I order the output?
Thankful for all input!
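You should order by the grouped expression, date(created_at), for example .order("date(created_at) desc"). PostgreSQL rejects a bare created_at there because that column is neither in the GROUP BY clause nor inside an aggregate function.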
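To make the advice concrete, here is a minimal sketch of the SQL that the scope would need to produce, run through Python's built-in sqlite3 module rather than Rails. The table layout and sample rows are invented for illustration. SQLite is more permissive than PostgreSQL and would not raise the grouping error, so the sketch only shows that ordering by the grouped expression gives sorted dates.

```python
import sqlite3

# Hypothetical in-memory stand-in for the timereports table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE timereports ("
    "id INTEGER PRIMARY KEY, backend_user_id INTEGER, "
    "created_at TEXT, total_time INTEGER)"
)
conn.executemany(
    "INSERT INTO timereports (backend_user_id, created_at, total_time) "
    "VALUES (?, ?, ?)",
    [
        (1, "2013-09-02 09:00:00", 30),
        (1, "2013-09-01 10:00:00", 60),
        (1, "2013-09-02 15:00:00", 45),
        (2, "2013-09-02 11:00:00", 99),  # another user, filtered out
        (1, "2013-09-03 08:00:00", 20),
    ],
)

# Group by the day and order by the same expression that is grouped on.
rows = conn.execute(
    """
    SELECT date(created_at) AS timecreated, SUM(total_time) AS timetotal
    FROM timereports
    WHERE backend_user_id = ?
      AND created_at >= ?
      AND created_at <= ?
    GROUP BY date(created_at)
    ORDER BY date(created_at) DESC
    """,
    (1, "2013-09-01 00:00:00", "2013-09-30 23:59:59"),
).fetchall()

for row in rows:
    print(row)
```

Each output row is one day with its summed total_time, newest day first.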
Related
I have a scores table that I have to group by attempt_no, taking the sum of the scores.
I want to nest this query using Eloquent and raw SQL, take the max score across each user's attempts, and order the result by score. The final result should be a leaderboard.
$usersByScore = Attempt::where('game_id',$id)
    ->select('user_id','attempt_no','game_id',DB::raw('SUM(score) as total_score'))
    ->groupBy('attempt_no')
    ->orderBy('total_score', 'DESC')
    ->get();
This gives me the leaderboard, but it contains every attempt from each user. I need only the max-score attempt for each user, ordered by score in descending order.
Use the distinct() method for this; I hope it will work.
$usersByScore = Attempt::where('game_id',$id)
    ->select('user_id','attempt_no','game_id',DB::raw('SUM(score) as total_score'))
    ->groupBy('attempt_no')
    ->orderBy('total_score', 'DESC')
    ->distinct()
    ->get();
Got the solution: I used the from method to nest the query.
$usersByScore = Attempt::with('user')
    ->select(DB::raw('MAX(tscore) as total_score, user_id,attempt_no'))
    ->from(DB::raw('(SELECT user_id,attempt_no,SUM(score) as tscore FROM
        attempts WHERE game_id = '.$id.' GROUP BY attempt_no,user_id) AS T'))
    ->groupBy('user_id')
    ->get();
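The nesting idea can be seen more easily in plain SQL. Below is a small sketch using Python's sqlite3 with made-up rows: the inner query sums the score per user and attempt, and the outer query keeps each user's best attempt. It differs from the accepted snippet in two ways that are my own assumptions, not part of the original answer: it adds the ORDER BY that the question asked for, and it binds the game id as a parameter instead of concatenating it into the string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attempts ("
    "user_id INTEGER, attempt_no INTEGER, game_id INTEGER, score INTEGER)"
)
conn.executemany(
    "INSERT INTO attempts VALUES (?, ?, ?, ?)",
    [
        # user 1: attempt 1 totals 30, attempt 2 totals 50
        (1, 1, 7, 10), (1, 1, 7, 20), (1, 2, 7, 50),
        # user 2: attempt 1 totals 40, attempt 2 totals 15
        (2, 1, 7, 40), (2, 2, 7, 15),
        # a different game, ignored by the WHERE clause
        (1, 1, 8, 999),
    ],
)

leaderboard = conn.execute(
    """
    SELECT user_id, MAX(tscore) AS total_score
    FROM (
        -- inner query: one row per user and attempt with the summed score
        SELECT user_id, attempt_no, SUM(score) AS tscore
        FROM attempts
        WHERE game_id = ?
        GROUP BY user_id, attempt_no
    ) AS t
    GROUP BY user_id
    ORDER BY total_score DESC
    """,
    (7,),
).fetchall()

print(leaderboard)
```

The outer query selects only user_id and the aggregate, which keeps it valid on databases that reject non-grouped columns in the select list.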
Example models:
class User(models.Model):
    pass

class UserStatusChange(models.Model):
    user = models.ForeignKey(User, related_name='status_changes')
    status = models.CharField()
    start_date = models.DateField()
I want to annotate a UserStatusChange queryset with an end_date field, where end_date equals the start_date of the next status change for the same user.
Eventually, I want to be able to do this:
qs = UserStatusChange.objects.annotate(end_date=???)
qs = qs.filter(start_date__lte=some_date, end_date__gte=another_date)
Logically that annotation should be something like this:
qs.annotate(
    end_date=qs.filter(
        user=OuterRef('user'),
        start_date__gt=OuterRef('start_date')
    ).order_by('start_date').first().start_date)
But it should be one DB query, if it is possible.
Solution:
subquery = UserStatusChange.objects.filter(
    user=OuterRef('user'),
    start_date__gt=OuterRef('start_date')
).order_by('start_date')
UserStatusChange.objects.annotate(end_date=Subquery(subquery.values('start_date')[:1]))
That works, thanks to @hynekcer's answer. But with aggregate() I got this error:
ValueError: This queryset contains a reference to an outer query and may only be used in a subquery.
UPD: in Django 2.0+ this can be solved with the Lead window function.
In SQL it will be something like this:
select
    user_id, status_id, start_date,
    LEAD(start_date, 1) over (partition by user_id order by start_date)
from user_status_change;
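As a rough illustration of what LEAD returns, the sketch below runs an equivalent query through Python's sqlite3 on invented rows. It assumes the bundled SQLite is version 3.25 or newer, since older versions have no window functions, and it uses a text status column instead of status_id to match the example models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_status_change ("
    "user_id INTEGER, status TEXT, start_date TEXT)"
)
conn.executemany(
    "INSERT INTO user_status_change VALUES (?, ?, ?)",
    [
        (1, "trial", "2018-01-01"),
        (1, "active", "2018-02-01"),
        (1, "closed", "2018-06-01"),
        (2, "active", "2018-03-15"),
    ],
)

# LEAD looks one row ahead within each user's rows, ordered by start_date.
rows = conn.execute(
    """
    SELECT user_id, status, start_date,
           LEAD(start_date, 1) OVER (
               PARTITION BY user_id ORDER BY start_date
           ) AS end_date
    FROM user_status_change
    ORDER BY user_id, start_date
    """
).fetchall()

for row in rows:
    print(row)
```

The last status change of each user has no following row, so its end_date is NULL (None in Python). On the Django side the matching building blocks are Window and Lead, available since Django 2.0.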
You can use Subquery() with OuterRef() in Django 1.11.
from django.db.models import OuterRef, Subquery
from django.db.models.functions import Coalesce
from django.utils.timezone import now  # assumed source of now()

default_end = now()  # or the end of the recorded history
qs = (
    UserStatusChange.objects
    .annotate(
        end_date=Coalesce(
            Subquery(
                UserStatusChange.objects
                .filter(
                    user=OuterRef('user'),
                    start_date__gt=OuterRef('start_date')
                )
                # aggregate() would run immediately and raise the ValueError
                # quoted in the question, so take the earliest later row instead
                .order_by('start_date')
                .values('start_date')[:1]
            ),
            default_end
        )
    )
)
qs = qs.order_by('user', 'start_date')
# an optional filter
qs = qs.filter(start_date__lte=some_date, end_date__gte=another_date, user__in=[...])
The whole expression is compiled into a single query when it is executed, for example when it is combined with a User filter through prefetch_related. If you also want a meaningful end_date for the last item, use Coalesce() with a default value such as the current timestamp.
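For readers who want to see the shape of the SQL behind this annotation, here is a hedged sketch using Python's sqlite3 and invented rows. The correlated subquery plays the role of Subquery() with OuterRef(), COALESCE supplies the default for the last row, and the outer WHERE corresponds to the optional filter. The DEFAULT_END constant and the table layout are assumptions made for the example, not what Django generates verbatim.

```python
import sqlite3

DEFAULT_END = "9999-12-31"  # hypothetical stand-in for default_end

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_status_change ("
    "user_id INTEGER, status TEXT, start_date TEXT)"
)
conn.executemany(
    "INSERT INTO user_status_change VALUES (?, ?, ?)",
    [
        (1, "trial", "2018-01-01"),
        (1, "active", "2018-02-01"),
        (1, "closed", "2018-06-01"),
        (2, "active", "2018-03-15"),
    ],
)

rows = conn.execute(
    """
    SELECT user_id, status, start_date, end_date
    FROM (
        SELECT c.user_id, c.status, c.start_date,
               COALESCE(
                   -- earliest later status change of the same user
                   (SELECT n.start_date
                    FROM user_status_change AS n
                    WHERE n.user_id = c.user_id
                      AND n.start_date > c.start_date
                    ORDER BY n.start_date
                    LIMIT 1),
                   :default_end
               ) AS end_date
        FROM user_status_change AS c
    )
    WHERE start_date <= :some_date AND end_date >= :another_date
    ORDER BY user_id, start_date
    """,
    {
        "default_end": DEFAULT_END,
        "some_date": "2018-04-01",
        "another_date": "2018-04-01",
    },
).fetchall()

for row in rows:
    print(row)
```

With both dates set to 2018-04-01, the query returns the status that was in effect for each user on that day.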
I have a User model that has_many :my_versions.
A MyVersion is created every time the column "profile_id" or "state" changes in User. MyVersion has these columns:
user_id, object_changed (profile_id or state), before, after
I need to find Users that were active and had a specific profile at a specific time. That means finding all Users for which the following holds in their associated my_versions:
1. Take the my_versions with created_at before a date and :object_changed equal to 'state'. Within that set:
1.1 THEN (not AND) find the last one, and select the user only if its :after value is 'active'.
2. Take the my_versions with created_at before a date and :object_changed equal to 'profile_id'. Within that set:
2.1 THEN find the last one, and select the user only if its :after value is '1'.
3. Select only users that match both 1.1 and 2.1.
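The steps above can be sketched as a single SQL statement. The example below uses Python's sqlite3 with invented users and versions. For each user it looks up the most recent 'state' version and the most recent 'profile_id' version before the cutoff and compares their after values. Picking the row with ORDER BY ... LIMIT 1 rather than matching on created_at = MAX(created_at) is my own choice, made so that two versions sharing a timestamp cannot both match. The column layout is assumed from the description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE my_versions (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        object_changed TEXT,
        "before" TEXT,
        "after" TEXT,
        created_at TEXT
    );
    """
)
conn.executemany("INSERT INTO users (id) VALUES (?)", [(1,), (2,), (3,), (4,)])
conn.executemany(
    "INSERT INTO my_versions "
    '(user_id, object_changed, "before", "after", created_at) '
    "VALUES (?, ?, ?, ?, ?)",
    [
        # user 1: active, profile 1 -> should match
        (1, "state", "inactive", "active", "2015-03-01"),
        (1, "profile_id", "2", "1", "2015-05-01"),
        # user 2: became inactive again before the cutoff -> no match
        (2, "state", "inactive", "active", "2015-01-01"),
        (2, "profile_id", "2", "1", "2015-02-01"),
        (2, "state", "active", "inactive", "2015-11-01"),
        # user 3: profile changed only after the cutoff -> still matches
        (3, "state", "inactive", "active", "2015-04-01"),
        (3, "profile_id", "3", "1", "2015-04-01"),
        (3, "profile_id", "1", "2", "2016-02-01"),
        # user 4: active but wrong profile -> no match
        (4, "state", "inactive", "active", "2015-01-01"),
        (4, "profile_id", "1", "2", "2015-06-01"),
    ],
)

matching_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT u.id
        FROM users AS u
        WHERE (SELECT v."after" FROM my_versions AS v
               WHERE v.user_id = u.id
                 AND v.object_changed = 'state'
                 AND v.created_at < :cutoff
               ORDER BY v.created_at DESC, v.id DESC
               LIMIT 1) = 'active'
          AND (SELECT v."after" FROM my_versions AS v
               WHERE v.user_id = u.id
                 AND v.object_changed = 'profile_id'
                 AND v.created_at < :cutoff
               ORDER BY v.created_at DESC, v.id DESC
               LIMIT 1) = '1'
        ORDER BY u.id
        """,
        {"cutoff": "2016-01-01"},
    )
]

print(matching_ids)
```

Users without any matching version are excluded, because the subquery yields NULL and the comparison is then not true.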
EDIT 1: Apparently I'm getting closer but still not sure this is getting what I need:
active_user_ids = User.joins(:my_versions).merge(MyVersion.where(
  "my_versions.created_at = (SELECT MAX(created_at) from my_versions WHERE
  user_id = users.id AND created_at < '2016-01-01' AND object_changed = 'state')
  AND my_versions.after = 'activo'")).pluck(:id)
Now I have all the user IDs that were active at that time (do I?). Then I can do the same for the profile, also passing in the previous IDs to combine the results properly:
active_and_right_profile =
  User.joins(:my_versions).merge(MyVersion.where(
    "my_versions.created_at = (SELECT MAX(created_at) from my_versions WHERE
    user_id = users.id AND created_at < '2016-01-01' AND object_changed = 'profile_id')
    AND my_versions.after = 1")).where(id: active_user_ids)
It doesn't look pretty, and I'm not sure it does what I describe in the specification above. The first tests appear to be right, but I have many doubts because I don't understand some parts of the query:
Apparently, when I use "SELECT MAX ... where user_id = users.id", I'm requesting the top value for each user id. Is that right?
If that's true, I'm getting an array of results and passing it to the first created_at =. This means that if I have other versions outside the scope of this query but with exactly the same timestamp, they will be in the results. Is that correct? That's relevant to me because a few of those versions.created_at values are being updated manually.
How does it look? Is there a way to make it better with only one query? Is there a way to avoid the problem of searching exact created_at values that I mention above?
Thanks!!
Previous attempts:
I tried this:
Class User...
  scope :active_at, -> (date) {
    joins(:my_versions).merge(MyVersion.on_state.before_date(date)
      .where("my_versions.created_at = (SELECT MAX(created_at) FROM my_versions WHERE user_id = users.id AND after = 'activo')"))
  }
But this creates the following query:
SELECT `users`.* FROM `users` INNER JOIN `my_versions` ON `my_versions`.`user_id` = `users`.`id` WHERE `my_versions`.`object_changed` = 'state' AND (my_versions.created_at < '2016-01-31') AND (my_versions.created_at = (SELECT MAX(created_at) FROM my_versions WHERE user_id = users.id AND after = 'activo'))
This is not what I need.
class Man
  has_many :sons
  # id
end
class Son
  belongs_to :man
  # id, man_id, age
end
I want to retrieve men from the DB, ordered by the age of their oldest son. Here's an example.
first_man = Man.create
first_man.sons.create(age: 10)
first_man.sons.create(age: 5)
second_man = Man.create
second_man.sons.create(age: 20)
second_man.sons.create(age: 5)
third_man = Man.create
third_man.sons.create(age: 19)
third_man.sons.create(age: 8)
Man.order('[some order]').to_a
=> [second_man, third_man, first_man]
How do I get ActiveRecord to do this?
Edit
I get invalid SQL when I try to do Man.joins(:sons).order("sons.age DESC").uniq.
ActiveRecord::StatementInvalid:
PG::InvalidColumnReference: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
LINE 1: ...sons"."man_id" = "men"."id" ORDER BY sons...
^
: SELECT DISTINCT "men".* FROM "men" INNER JOIN "sons" ON "sons"."man_id" = "men"."id" ORDER BY sons.age DESC LIMIT 15
I have not tried it, but I guess this should work:
Man.includes(:sons).order("sons.age").to_a
Try this
Man.joins(:sons).order('sons.age DESC').uniq
Updated
Maybe this will help, but it's ugly:
Son.order('age DESC').map(&:man).uniq
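None of the snippets above groups by man, which is what "oldest son" implies. As a sketch of that idea in plain SQL, the example below uses Python's sqlite3 and the ages from the question: it groups the joined rows per man and orders by MAX(sons.age). Grouping instead of SELECT DISTINCT also sidesteps the PostgreSQL error quoted in the question, because the ORDER BY expression is an aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE men (id INTEGER PRIMARY KEY);
    CREATE TABLE sons (
        id INTEGER PRIMARY KEY, man_id INTEGER, age INTEGER
    );
    """
)
conn.executemany("INSERT INTO men (id) VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO sons (man_id, age) VALUES (?, ?)",
    [
        (1, 10), (1, 5),   # first_man
        (2, 20), (2, 5),   # second_man
        (3, 19), (3, 8),   # third_man
    ],
)

# One row per man, ordered by the age of his oldest son.
ordered_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT men.id
        FROM men
        INNER JOIN sons ON sons.man_id = men.id
        GROUP BY men.id
        ORDER BY MAX(sons.age) DESC
        """
    )
]

print(ordered_ids)
```

In ActiveRecord the equivalent should be along the lines of Man.joins(:sons).group("men.id").order("MAX(sons.age) DESC"); that line is untested here and is offered as an assumption.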
The code below is from a Sinatra app (which uses DataMapper) that I am trying to convert to a Rails 3 application. It is a class method in the Visit class.
def self.count_by_date_with(identifier,num_of_days)
  visits = repository(:default).adapter.query("SELECT date(created_at) as date, count(*) as count FROM visits where link_identifier = '#{identifier}' and created_at between CURRENT_DATE-#{num_of_days} and CURRENT_DATE+1 group by date(created_at)")
  dates = (Date.today-num_of_days..Date.today)
  results = {}
  dates.each { |date|
    visits.each { |visit| results[date] = visit.count if visit.date == date }
    results[date] = 0 unless results[date]
  }
  results.sort.reverse
end
My problem is with this part
visits = repository(:default).adapter.query("SELECT date(created_at) as date, count(*) as count FROM visits where link_identifier = '#{identifier}' and created_at between CURRENT_DATE-#{num_of_days} and CURRENT_DATE+1 group by date(created_at)")
Rails (as far as I know) doesn't have this repository method, and I would expect a query to be called on an object of some sort, such as Visit.find
Can anyone give me a hint how this would best be written for a Rails app?
Should I do
Visit.find_by_sql("SELECT date(created_at) as date, count(*) as count FROM visits where link_identifier = '#{identifier}' and created_at between CURRENT_DATE-#{num_of_days} and CURRENT_DATE+1 group by date(created_at)")
Model.connection.execute "YOUR SQL" should help you. Something like
class Visit < ActiveRecord::Base
  class << self
    def trigger(created_at,identifier,num_of_days)
      sql = "SELECT date(created_at) as date, count(*) as count FROM visits where link_identifier = '#{identifier}' and created_at between CURRENT_DATE-#{num_of_days} and CURRENT_DATE+1 group by date(created_at)"
      connection.execute sql
    end
  end
end
I know you already accepted an answer, but you asked for the best way to do what you asked in Rails. I'm providing this answer because Rails does not recommend building conditions as pure query strings.
Building your own conditions as pure strings can leave you vulnerable to SQL injection exploits. For example, Client.where("first_name LIKE '%#{params[:first_name]}%'") is not safe.
Fortunately, Active Record is incredibly powerful and can build very complex queries. For instance, your query can be recreated with four method calls while still being easy to read and safe.
# will return a hash with the structure
# {"__DATE__" => __COUNT__, ...}
def self.count_by_date_with(identifier, num_of_days)
  where("link_identifier = ?", identifier)
    .where(:created_at => (num_of_days.to_i.days.ago)..(1.day.from_now))
    .group('date(created_at)')
    .count
end
Active Record has been built to turn Ruby objects into valid SQL selectors and operators. What makes this so cool is that Rails can turn a Ruby Range into a BETWEEN operator or an Array into an IN expression.
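The same two ideas, bound parameters instead of string interpolation and a range expressed as BETWEEN, can be sketched outside Rails. The example below uses Python's sqlite3 with invented visits. It also fills in zero counts for days without visits, as the original Sinatra method did. The explicit today argument and the sample identifier values are assumptions made so the example is reproducible.

```python
import sqlite3
from datetime import date, timedelta


def count_by_date_with(conn, identifier, num_of_days, today):
    """Return (day, count) pairs, newest first, with zeros for empty days."""
    start = today - timedelta(days=num_of_days)
    end = today + timedelta(days=1)
    # The identifier and the range bounds are bound as parameters,
    # so they are never spliced into the SQL text.
    rows = conn.execute(
        "SELECT date(created_at), COUNT(*) FROM visits "
        "WHERE link_identifier = ? AND created_at BETWEEN ? AND ? "
        "GROUP BY date(created_at)",
        (identifier, start.isoformat(), end.isoformat()),
    ).fetchall()
    counts = dict(rows)
    days = [start + timedelta(days=i) for i in range(num_of_days + 1)]
    return [(d.isoformat(), counts.get(d.isoformat(), 0)) for d in reversed(days)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (link_identifier TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [
        ("abc", "2011-05-01 08:00:00"),
        ("abc", "2011-05-03 09:00:00"),
        ("abc", "2011-05-03 10:00:00"),
        ("xyz", "2011-05-02 12:00:00"),
    ],
)

today = date(2011, 5, 3)
normal = count_by_date_with(conn, "abc", 2, today)
# An injection attempt is treated as a literal identifier and matches nothing.
hostile = count_by_date_with(conn, "abc' OR '1'='1", 2, today)

print(normal)
print(hostile)
```

Because the hostile string is passed as data, it cannot change the structure of the query; with string interpolation it would have matched every row.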
For more information on Active Record check out the guide. It explains what Active Record is capable of and how to use it.