withCount on a relation grouped by created_at returns the total (12) instead of the grouped count (9) in Laravel Eloquent

I have three tables: Users, Tasks, and TasksSession.
I am trying to find the total number of sessions conducted and the total number of days the user was involved in the task, the latter by grouping sessions on created_at.
User::with(['Task', 'TotalSessionsGroupByDate'])
    ->withCount(['TotalSessions', 'TotalSessionsGroupByDate'])
    ->whereId(2)
    ->first();
With the above, total_sessions_group_by_date_count is 12, whereas the eager-loaded TotalSessionsGroupByDate relation contains 9 records.
The TotalSessionsGroupByDate relation is defined as follows:
public function TotalSessionsGroupByDate()
{
    return $this->hasMany(TaskSession::class)->groupBy('created_at');
}
I already have the total number of sessions conducted (attendance), but how do I get the total number of distinct days, regardless of how many sessions took place on each day? In the example, total_sessions_group_by_date_count should be 9, not 12.
The only solution I have come up with so far is this:
$user = User::with(['TaskIssues', 'TotalSessionsGroupByDate'])
    ->withCount(['TotalSessions'])
    ->whereId(16)
    ->first();
$user->TotalSessionsGroupByDate = $user->TotalSessionsGroupByDate->count();
Is there a better way to do this in the model?
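To make the two numbers concrete, here is a minimal plain-Ruby sketch (not Eloquent) of the distinction between counting sessions and counting distinct days. The timestamps are hypothetical, and it assumes "day" means the calendar date of created_at.

```ruby
require "date"

# Hypothetical created_at values: 12 sessions spread over 9 calendar days.
CREATED_AT = (1..9).map { |d| format("2023-05-%02d 09:00:00", d) } +
             (1..3).map { |d| format("2023-05-%02d 15:30:00", d) }

# Every row counts once.
def total_sessions(timestamps)
  timestamps.size
end

# Each calendar date counts once, however many sessions fell on it.
def total_days(timestamps)
  timestamps.map { |ts| Date.parse(ts) }.uniq.size
end
```

In SQL terms the second figure corresponds to a COUNT(DISTINCT ...) over the date part of created_at rather than a plain row count.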

Related

Eloquent: get AVG from all rows that have minimum timestamp

I want to get each user ID and its average score, taken from the row with the minimum timestamp in each category. Here's the table structure:
Skill table
id | user_id | category | score | timestamp
0  | 10      | a        | 11    | 12
1  | 10      | a        | 10    | 9
2  | 10      | b        | 12    | 10
3  | 10      | c        | 11    | 8
4  | 11      | a        | 8     | 9
5  | 11      | b        | 9     | 10
6  | 11      | c        | 10    | 8
7  | 11      | c        | 15    | 14
I want to get a result like this:
user_id | AVG(score)
10      | 11 (average of ids 1, 2, 3)
11      | 9 (average of ids 4, 5, 6)
For now I run a query in a loop for every user:
foreach ($userIds as $id) {
    // in some cases I need only specific IDs, not all of them
    $skillCategory = []; // reset per user so scores do not leak between users
    foreach ($category as $cat) {
        // take the score at the latest timestamp up to $start for this category
        $rawCategory = Skill::where("user_id", $id)
            ->where("timestamp", "<=", $start)
            ->where("category", $cat->id)
            ->orderBy("timestamp", "desc")
            ->first();
        if ($rawCategory) {
            $skillCategory[$cat->cat_name] = $rawCategory->score;
        }
    }
    // get the average score (null when the user has no matching rows)
    $average = $skillCategory ? array_sum($skillCategory) / count($skillCategory) : null;
}
I want to write a better Eloquent query that returns this data with good performance (under 60 seconds). Has anyone faced a similar problem and solved it? If so, could you please share a link? Thanks.
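As a reference for the expected output, the following plain-Ruby sketch (not an Eloquent query) computes the same result in memory from the sample table: for each user it picks the row with the minimum timestamp in each category and averages those scores. The SkillRow struct and the method name are illustrative assumptions.

```ruby
# Illustrative row type mirroring the Skill table columns.
SkillRow = Struct.new(:id, :user_id, :category, :score, :timestamp)

SKILLS = [
  SkillRow.new(0, 10, "a", 11, 12),
  SkillRow.new(1, 10, "a", 10, 9),
  SkillRow.new(2, 10, "b", 12, 10),
  SkillRow.new(3, 10, "c", 11, 8),
  SkillRow.new(4, 11, "a", 8, 9),
  SkillRow.new(5, 11, "b", 9, 10),
  SkillRow.new(6, 11, "c", 10, 8),
  SkillRow.new(7, 11, "c", 15, 14)
].freeze

# Returns { user_id => average score of the earliest row per category }.
def average_of_earliest_scores(skills)
  skills.group_by(&:user_id).transform_values do |rows|
    earliest = rows.group_by(&:category).values.map do |category_rows|
      category_rows.min_by(&:timestamp)
    end
    earliest.sum(&:score).to_f / earliest.size
  end
end
```

A database version would express the same two steps: first select the minimum-timestamp row per (user_id, category), then group those rows by user_id and average the score, so the whole result comes back from one query instead of one query per user and category.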

What is the potential performance issue with the following code and how would you suggest to fix it?

I had an interview with Microsoft and they asked me the following question. I didn't know how to solve it, and I'm very interested to know what the solution is.
P.S.: It's only to improve myself, because I was rejected.
Anyway: please assume that EmployeeRepository and ServiceTicketsRepository implement Entity Framework ORM repositories. The actual storage is a SQL database in the cloud.
Bonus: what is the name of the anti-pattern?
//
// Return overall number of pending work tickets for all employees in the repository
//
public int GetTicketsForEmployees()
{
    EmployeeRepository employeeRepository = new EmployeeRepository();
    ServiceTicketsRepository serviceTicketRepository = new ServiceTicketsRepository();
    int ticketscount = 0;
    var employees = employeeRepository.All.Select(e => new EmployeeSummary { Employee = e }).ToList();
    foreach (var employee in employees)
    {
        var tickets = serviceTicketRepository.AllIncluding(t => t.Customer).Where(t => t.AssignedToID == employee.Employee.ID).ToList();
        ticketscount += tickets.Count();
    }
    return ticketscount;
}
This is called the 1 + N anti-pattern (often written N + 1, or "select N + 1"). It means you make 1 + N round trips to the database, where N is the number of records in the Employee table.
It runs one query to find all employees, then one more query per employee to fetch their tickets, just to count them.
The performance issue is that as N grows, your application makes more and more round trips, each taking a few milliseconds. Even at only 1000 employees this will be slow.
In addition to the round trips, this code fetches every column of every row from the Employee table and from the Ticket table (plus the included Customer). That adds up to a lot of bytes and may eventually cause an out-of-memory exception once the number of employees and tickets grows large.
The fix is to run a single query that counts all tickets belonging to employees and returns only that count. That is one round trip sending only a few bytes over the network.
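The difference in round trips can be illustrated with a small plain-Ruby sketch. FakeDb is a hypothetical in-memory stand-in for the remote database (it is not Entity Framework); its only purpose is to count how many "queries" each approach issues.

```ruby
# Hypothetical stand-in for a remote database that counts round trips.
class FakeDb
  attr_reader :round_trips

  def initialize(employee_ids, tickets)
    @employee_ids = employee_ids
    @tickets = tickets # array of { assigned_to_id: Integer } hashes
    @round_trips = 0
  end

  # One query returning all employees.
  def all_employee_ids
    @round_trips += 1
    @employee_ids.dup
  end

  # One query returning the full ticket rows of a single employee.
  def tickets_for(employee_id)
    @round_trips += 1
    @tickets.select { |t| t[:assigned_to_id] == employee_id }
  end

  # One aggregate query returning only a number.
  def count_assigned_tickets
    @round_trips += 1
    @tickets.count { |t| @employee_ids.include?(t[:assigned_to_id]) }
  end
end

# 1 + N: one query for the employees, then one per employee.
def count_with_n_plus_one(db)
  db.all_employee_ids.sum { |id| db.tickets_for(id).size }
end

# Single round trip: let the database do the counting.
def count_with_single_query(db)
  db.count_assigned_tickets
end

# Sample data: 100 employees with 3 tickets each.
EMPLOYEE_IDS = (1..100).to_a.freeze
TICKETS = EMPLOYEE_IDS.flat_map { |id| Array.new(3) { { assigned_to_id: id } } }.freeze
```

With this sample data both methods return the same total, but the first issues 101 round trips and the second issues one.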
I'm not a C# developer, but what I can see is that you are not using any join.
Suppose you have 1 million employees and about 1000 tickets per employee.
You will run about a million queries and load roughly a billion ticket rows inside the loop, when all you want is a count of the tickets assigned to your employees.
Edit: I assume you are using eager loading, so your Entity Framework context stays open for the entire duration of the loop.
Edit 2: With an inner join you won't have to repeat t => t.AssignedToID == employee.Employee.ID; the join does that for you.

Activerecord sort by column and then created_at?

So I have a table of items. Each item has a status that can be "daily", "monthly", "yearly", or "outstanding".
What I'm trying to do is create a single ActiveRecord (or SQL) query that returns the outstanding items first (ordered by created_at) and then the rest of the items (regardless of their status) ordered by created_at, while limiting the total number of items returned to 15.
So for instance, if I have 30 outstanding items and 30 yearly items, the query returns 15 outstanding items (by their created_at).
If I have 10 outstanding items and 30 yearly items, it returns those 10 outstanding items (by created_at) and then 5 yearly items (by created_at) — the outstanding items returned should be at the beginning of the returned array.
If I do not have any outstanding items and 30 yearly items, it would return 15 yearly items by their created_at date.
Thank you for the help!
Firstly, just define a scope like this:
class Item < ActiveRecord::Base
  scope :ordered, -> {
    joins("LEFT OUTER JOIN ( SELECT id, created_at, status
                             FROM items
                             WHERE items.status = 'outstanding'
                           ) AS temp ON temp.id = items.id AND temp.status = items.status"
    ).order('temp.created_at NULLS LAST, items.status, items.created_at')
  }
end
How it works (assuming your table is named items):
Left join items with the outstanding items. temp.id and temp.created_at will then be NULL for every item whose status is not outstanding.
Order by temp.created_at NULLS LAST first, so that items whose status is not outstanding are sorted last. Then order as usual: by items.status (which keeps items with the same status next to each other) and by items.created_at.
You can run the query with scope ordered for 15 items only:
Item.ordered.limit(15)
This is not a single query (it takes two queries in the worst case), but it solves your problem:
items = Item.where(:status => "outstanding").order('created_at DESC').limit(15)
items = (items.size == 15) ? items : items + Item.where('status != ?', "outstanding").order('created_at DESC').limit(15-items.size)
Not sure about a single query, but here is an adequate solution:
scope :ordered, -> { order(:created_at) }
scope :outstanding, -> { where(status: :outstanding).ordered }

def self.collection
  outstanding_count = outstanding.count
  return ordered.limit(15) if outstanding_count.zero?
  return outstanding.limit(15) if outstanding_count >= 15
  # exclude outstanding items from the filler so they are not returned twice
  outstanding + where("status != ?", "outstanding").ordered.limit(15 - outstanding_count)
end

Item.collection # returns your records, outstanding items first
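The ordering the question asks for can be stated compactly as a two-part sort key. The sketch below shows it in plain Ruby on in-memory data rather than ActiveRecord; ItemRow, first_items, and the integer created_at values are illustrative assumptions.

```ruby
# Illustrative row type; created_at is an integer timestamp for simplicity.
ItemRow = Struct.new(:id, :status, :created_at)

# Outstanding items first (key 0), everything else after (key 1),
# each group ordered by created_at, then cut to the limit.
def first_items(items, limit = 15)
  items
    .sort_by { |item| [item.status == "outstanding" ? 0 : 1, item.created_at] }
    .first(limit)
end

# Sample data: 10 outstanding items created late, 30 yearly items created early.
SAMPLE_ITEMS = (
  (1..10).map { |i| ItemRow.new(i, "outstanding", 100 + i) } +
  (11..40).map { |i| ItemRow.new(i, "yearly", i) }
).freeze
```

The same idea carries over to SQL as an ORDER BY on a CASE expression that maps the outstanding status to 0 and everything else to 1, followed by created_at and a LIMIT.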

Cakephp query to get last single field data from multi user

I have a table called Transaction with relation User, Transaction has a field called balance.
Data looks like:
id user_id balance
1 22 365
2 22 15
3 22 900
4 32 100
4 32 50
I need each user's associated data together with the most recently inserted balance for that user. For example, here id=3 is the last inserted row for user_id=22.
In raw SQL I have tried this:
select * from transactions where id in (select max(id) from transactions group by user_id)
If I add an inner join here, I know I can also retrieve the User data. But how can I do this in CakePHP?
IMHO, subqueries are ugly in CakePHP 2.x. You may as well hard code the SQL statement and execute it through query(), as suggested by @AgRizzo in the comments.
However, when it comes to retrieving the last (largest, oldest, etc.) item in a group, there is a more elegant solution.
In this SQL Fiddle, I've applied the technique described in "Retrieving the last record in each group".
The CakePHP 2.x equivalent would be:
$this->Transaction->contain('User');
$options = array();
$options['fields'] = array('User.id', 'User.name', 'Transaction.balance');
$options['joins'] = array(
    array(
        'table' => 'transactions',
        'alias' => 'Transaction2',
        'type' => 'LEFT',
        'conditions' => array(
            // match each row against later rows of the same user
            'Transaction.user_id = Transaction2.user_id',
            'Transaction.id < Transaction2.id'
        )
    ),
);
// keep only the rows that have no later row for the same user
$options['conditions'] = array('Transaction2.id IS NULL');
$transactions = $this->Transaction->find('all', $options);
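The self-join trick may be easier to follow in a small in-memory form. This plain-Ruby sketch (not CakePHP) keeps a row only when no other row of the same user has a larger id, which is what the LEFT JOIN plus the IS NULL condition expresses. TransactionRow is an illustrative name, and the last row's id is assumed to be 5 because the listing in the question shows id 4 twice.

```ruby
# Illustrative row type mirroring the transactions table.
TransactionRow = Struct.new(:id, :user_id, :balance)

TRANSACTIONS = [
  TransactionRow.new(1, 22, 365),
  TransactionRow.new(2, 22, 15),
  TransactionRow.new(3, 22, 900),
  TransactionRow.new(4, 32, 100),
  TransactionRow.new(5, 32, 50) # id assumed to be 5 (see note above)
].freeze

# Anti-join: drop every row that has a later row for the same user.
def latest_per_user(rows)
  rows.reject do |row|
    rows.any? { |other| other.user_id == row.user_id && row.id < other.id }
  end
end
```

For the sample data this leaves the row with id 3 for user 22 and the last row for user 32, matching the result of the max(id) subquery from the question.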

Issues with DISTINCT when used in conjunction with ORDER

I am trying to build a site that ranks performances for a selection of athletes in a particular event. I previously posted a question that received a few good responses, which helped me identify the key problem with my current code.
I have 2 models - Athlete and Result (Athlete HAS MANY Results)
Each athlete can have a number of recorded times for a particular event. I want to identify the quickest time for each athlete and rank these quickest times across all athletes.
I use the following code:
<% @filtered_names = Result.where(:event_name => params[:justevent]).joins(:athlete).order('performance_time_hours ASC').order('performance_time_mins ASC').order('performance_time_secs ASC').order('performance_time_msecs ASC') %>
This successfully ranks ALL the results across ALL athletes for the event (i.e. one athlete can appear a number of times in different places depending on the times they have recorded).
I now wish to just pull out the best result for each athlete and include them in the rankings. I can select the time corresponding to the best result using:
<% @currentathleteperformance = Result.where(:event_name => params[:justevent]).where(:athlete_id => filtered_name.athlete_id).order('performance_time_hours ASC').order('performance_time_mins ASC').order('performance_time_secs ASC').order('performance_time_msecs ASC').first() %>
However, my problem comes when I try to identify the distinct athlete names listed in @filtered_names. I tried using <% @filtered_names = @filtered_names.select('distinct athlete_id') %> but this doesn't behave as I expected, and on occasion it gets the rankings in the wrong order.
I have discovered that as it stands my code essentially looks for a difference between the distinct athlete results, starting with the hours time and progressing through to mins, secs and msec. As soon as it has found a difference between a result for each of the distinct athletes it orders them accordingly.
For example, if I have 2 athletes:
Time for Athlete 1 = 0:0:10:5
Time for Athlete 2 = 0:0:10:3
This will yield the order Athlete 2, Athlete 1.
However, if I have:
Time for Athlete 1 = 0:0:10:5
Time for Athlete 2 = 0:0:10:3
Time for Athlete 2 = 0:1:11:5
Then the order is given as Athlete 1, Athlete 2 as the first difference is in the mins digit and Athlete 2 is slower...
Can anyone suggest a way around this problem, essentially going down the entries in @filtered_names and pulling out each name the first time it appears (i.e. keeping the names in the order they first appear in @filtered_names)?
Thanks for your time
If you're on Ruby 1.9.2+, you can use Array#uniq and pass a block specifying how to determine uniqueness. For example:
@unique_results = @filtered_names.uniq { |result| result.athlete_id }
That should return only one result per athlete, and that one result should be the first in the array, which in turn will be the quickest time since you've already ordered the results.
One caveat: @filtered_names might still be an ActiveRecord::Relation, which has its own #uniq method. You may first need to call #all to return an Array of the results:
@unique_results = @filtered_names.all.uniq { ... }
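Here is a minimal plain-Ruby sketch of that approach, using the three example times from the question. ResultRow and best_per_athlete are illustrative names; no ActiveRecord is involved, so the sort is done in Ruby rather than by the database.

```ruby
# Illustrative row type holding one recorded time per row.
ResultRow = Struct.new(:athlete_id, :hours, :mins, :secs, :msecs)

RESULTS = [
  ResultRow.new(1, 0, 0, 10, 5),
  ResultRow.new(2, 0, 0, 10, 3),
  ResultRow.new(2, 0, 1, 11, 5)
].freeze

# Sort all results from quickest to slowest, then keep only the first
# (quickest) row seen for each athlete. Array#uniq keeps first occurrences.
def best_per_athlete(results)
  results
    .sort_by { |r| [r.hours, r.mins, r.secs, r.msecs] }
    .uniq(&:athlete_id)
end
```

Because the slower second time for Athlete 2 is discarded after sorting, the ranking comes out as Athlete 2 followed by Athlete 1, which is the order the question expects.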
You should let the database compute each athlete's best time rather than doing it in Ruby. Add a new column called total_time_in_msecs to the results table and set its value every time a row in that table is saved.
class Result < ActiveRecord::Base
  MSEC_IN_SEC = 1000
  MSEC_IN_MIN = 60 * MSEC_IN_SEC
  MSEC_IN_HOUR = 60 * MSEC_IN_MIN

  before_save :init_data

  def init_data
    self.total_time_in_msecs = performance_time_hours * MSEC_IN_HOUR +
      performance_time_mins * MSEC_IN_MIN +
      performance_time_secs * MSEC_IN_SEC +
      performance_time_msecs
  end
end
Now you can write your query as follows:
# min + ASC, because the quickest time is the smallest number of milliseconds
athletes = Athlete.joins(:results).
  select("athletes.id, athletes.name, min(results.total_time_in_msecs) AS best_time").
  where("results.event_name = ?", params[:justevent]).
  group("athletes.id, athletes.name").
  order("best_time ASC")
athletes.first.best_time # the quickest athlete's best time, as a number of milliseconds
Write a simple helper to break the number down into its time parts:
def human_time(time_in_msecs)
  "%d:%02d:%02d:%03d" %
    [Result::MSEC_IN_HOUR, Result::MSEC_IN_MIN,
     Result::MSEC_IN_SEC, 1].map do |interval|
      r = time_in_msecs / interval
      time_in_msecs = time_in_msecs % interval
      r
    end
end
Use the helper in your views to display the broken down time.
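For experimenting outside Rails, here is a self-contained sketch of the same conversion in both directions. It assumes top-level constants instead of the Result model, and to_msecs is an illustrative helper name standing in for the before_save callback.

```ruby
MSEC_IN_SEC  = 1000
MSEC_IN_MIN  = 60 * MSEC_IN_SEC
MSEC_IN_HOUR = 60 * MSEC_IN_MIN

# Collapse the four time columns into a single sortable integer.
def to_msecs(hours, mins, secs, msecs)
  hours * MSEC_IN_HOUR + mins * MSEC_IN_MIN + secs * MSEC_IN_SEC + msecs
end

# Split a millisecond total back into hours:minutes:seconds:milliseconds.
def human_time(time_in_msecs)
  parts = [MSEC_IN_HOUR, MSEC_IN_MIN, MSEC_IN_SEC, 1].map do |interval|
    part, time_in_msecs = time_in_msecs.divmod(interval)
    part
  end
  format("%d:%02d:%02d:%03d", *parts)
end
```

Storing one integer per result is what makes the ranking reliable: comparing totals avoids the column-by-column comparison that caused the wrong order in the question.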