How to retrieve all users with their last post - sql

I am trying to get my users (authors, actually; there will be at most 5-6 of them), each with their single latest post, to show in a sidebar on the homepage. Since they will be listed on the homepage, I want to keep the number of SQL queries low for performance reasons. I have tried:
$users = User::with(array('posts' => function ($query) {
    $query->take(1);
}))->get();
However, it returns only one post in total, not one per user. My SQL knowledge is limited.
How can I solve my problem using Eloquent ORM, Query Builder, or a raw SQL query?

A solution is to define a hasOne relationship on the User model ordering by the posts created_at column.
public function lastPost()
{
    return $this->hasOne('Post')->orderBy('created_at', 'desc');
}
Then your query would be as such.
$users = User::with('lastPost')->get();
To limit the columns you can constrain the query either at the relationship level:
return $this->hasOne('Post')->select('id', 'user_id', 'title', 'created_at')->orderBy('created_at', 'desc');
Or when you use the with method:
$users = User::with(['lastPost' => function ($query) {
    $query->select('id', 'user_id', 'title', 'created_at');
}])->get();
Note that you need the user_id and created_at columns as well, as they're required for the WHERE and ORDER BY clauses in the query.

The general SQL way to do this is:
select p.*
from posts p join
(select p.authorid, max(created_at) as maxdate
from posts p
group by p.authorid
) psum
on p.authorid = psum.authorid and p.created_at = psum.maxdate
This assumes that there are no duplicates, i.e. no two posts by the same author share the same created_at value.
Depending on the database you are using, there are definitely other ways to write this query. That version is standard SQL.
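The join above can be tried out against a throwaway SQLite database. This is only a sketch: the table layout (posts with id, authorid, title, created_at) and the sample rows are assumptions made up for illustration, and Python's sqlite3 module is used simply because it needs no setup.

```python
import sqlite3

# Hypothetical schema mirroring the query above: posts(id, authorid, title, created_at).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, authorid INTEGER, title TEXT, created_at TEXT);
    INSERT INTO posts (authorid, title, created_at) VALUES
        (1, 'a-old', '2024-01-01'),
        (1, 'a-new', '2024-03-01'),
        (2, 'b-only', '2024-02-01');
""")

# Join each post against the per-author maximum created_at.
LATEST_POST_SQL = """
    SELECT p.authorid, p.title
    FROM posts p
    JOIN (SELECT authorid, MAX(created_at) AS maxdate
          FROM posts
          GROUP BY authorid) psum
      ON p.authorid = psum.authorid AND p.created_at = psum.maxdate
    ORDER BY p.authorid, p.id
"""

latest_posts = conn.execute(LATEST_POST_SQL).fetchall()
print(latest_posts)

# A second post with the same timestamp shows why ties matter:
# author 2 now matches the join twice.
conn.execute(
    "INSERT INTO posts (authorid, title, created_at) VALUES (2, 'b-tie', '2024-02-01')"
)
with_tie = conn.execute(LATEST_POST_SQL).fetchall()
print(with_tie)
```

With unique timestamps each author appears once; once two posts share the maximum timestamp, both come back, which is the duplicate caveat mentioned above.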

Related

Active Record - How to perform a nested select on a second table?

I need to list all customers along with their latest order date (plus pagination).
How can I write the following SQL query using Active Record?
select *,
(
select max(created_at)
from orders
where orders.customer_id = customers.id
) as latest_order_date
from customers
limit 25 offset 0
I tried this but it complains missing FROM-clause entry for table "customers":
Customer
.select('*')
.select(
Order
.where('customer_id = customers.id')
.maximum(:created_at)
).page(params[:page])
# Generates this (clearly only the Order query):
SELECT MAX("orders"."created_at")
FROM "orders"
WHERE (customer_id = customers.id)
EDIT: it would be good to keep AR's parameterization and kaminari's pagination goodness.
You haven't given us any information about the relationship between these two tables, so I will assume Customer has_many Orders.
While ActiveRecord doesn't support what you are trying to do, it is built on top of Arel, which does.
Every Rails model has a method named arel_table that will return its corresponding Arel::Table. You might want a helper library to make this cleaner because the default way is a little cumbersome. I will use the plain Arel syntax to maximize compatibility.
ActiveRecord understands Arel objects and can accept them alongside its own syntax.
orders = Order.arel_table
customers = Customer.arel_table
Customer.joins(:orders).group(:id).select([
customers[Arel.star],
orders[:created_at].maximum.as('latest_order_date')
])
Which produces
SELECT "customers".*, MAX("orders"."created_at") AS "latest_order_date"
FROM "customers"
INNER JOIN "orders" ON "orders"."customer_id" = "customers"."id"
GROUP BY "customers"."id"
This is the customary way of doing this, but if you still want to do it as a subquery, you can do this
Customer.select([
customers[Arel.star],
orders.project(orders[:created_at].maximum)
.where(orders[:customer_id].eq(customers[:id]))
.as('latest_order_date')
])
Which gives us
SELECT "customers".*, (
SELECT MAX("orders"."created_at")
FROM "orders"
WHERE "orders"."customer_id" = "customers"."id" ) "latest_order_date"
FROM "customers"
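To see what the correlated-subquery form returns, including the LIMIT/OFFSET from the question, here is a rough sketch run against an in-memory SQLite database from Python. The customers/orders columns, the sample rows, and the customers_page helper are all assumptions for illustration; this is not what kaminari or Arel execute internally.

```python
import sqlite3

# Hypothetical tables: customers(id, name) and orders(id, customer_id, created_at).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, created_at TEXT);
    INSERT INTO customers (id, name) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders (customer_id, created_at) VALUES
        (1, '2024-01-05'), (1, '2024-02-10'), (2, '2024-01-20');
""")


def customers_page(conn, page, per_page=25):
    """Return one page of customers with their latest order date (None if no orders)."""
    offset = (page - 1) * per_page
    sql = """
        SELECT c.id, c.name,
               (SELECT MAX(o.created_at)
                FROM orders o
                WHERE o.customer_id = c.id) AS latest_order_date
        FROM customers c
        ORDER BY c.id
        LIMIT ? OFFSET ?
    """
    # Page size and offset are bound as parameters, not interpolated.
    return conn.execute(sql, (per_page, offset)).fetchall()


first_page = customers_page(conn, page=1, per_page=2)
second_page = customers_page(conn, page=2, per_page=2)
print(first_page)
print(second_page)
```

Note that the subquery version keeps customers without orders (their date is NULL), whereas the inner-join version drops them.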
The most Active Record-ish way I've come up with so far is:
Customer
.page(params[:page])
.select('*')
.select(<<-SQL.squish)
(
SELECT MAX(created_at) AS latest_order_date
FROM orders
WHERE orders.customer_id = customers.id
)
SQL
I still wish I could make the string part more Active Record-ish.
The <<-SQL is just heredoc.
Here is the same answer @adam was giving, but using plain ActiveRecord instead of Arel. I'm not sure it's really much better than @João Marcelo Souza's.
Customer.select("customers.*, max(orders.created_at)").joins(:orders).group("customers.id").page(params[:page])
(Grouping by customers.id alone avoids listing every customers column in the GROUP BY; this relies on a feature of Postgres 9.1 and higher.)
The OP doesn't say, but the query doesn't handle the case where the customer has no orders. This version does that:
Customer.select("customers.*, coalesce(max(orders.created_at),0)").joins("left outer join orders on orders.customer_id=customers.id").group("customers.id").page(params[:page])
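The difference between the inner join and the left outer join can be checked with a small experiment. This is a sketch in Python using an in-memory SQLite database with made-up rows; SQLite, like Postgres 9.1+, accepts selecting customers.name while grouping only by the primary key, but other databases may not.

```python
import sqlite3

# Hypothetical data: Cy is a customer with no orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, created_at TEXT);
    INSERT INTO customers (id, name) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders (customer_id, created_at) VALUES
        (1, '2024-01-05'), (1, '2024-02-10'), (2, '2024-01-20');
""")

QUERY = """
    SELECT customers.name, MAX(orders.created_at) AS latest_order_date
    FROM customers
    {join} orders ON orders.customer_id = customers.id
    GROUP BY customers.id
    ORDER BY customers.id
"""

# INNER JOIN: only customers that have at least one order.
inner_rows = conn.execute(QUERY.format(join="INNER JOIN")).fetchall()

# LEFT OUTER JOIN: every customer; the date is NULL when there are no orders.
left_rows = conn.execute(QUERY.format(join="LEFT OUTER JOIN")).fetchall()

print(inner_rows)
print(left_rows)
```

The left join keeps Cy with a NULL date, so pagination counts stay consistent with the customers table.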

How to join products and their characteristics

I have two tables.
Product (id, title, price, created_at, updated_at etc)
and
ProductCharacteristic (id, product_id, sold_quantity, date, created_at, updated_at etc).
I need to show a products table (columns: product.id, product.title, product.price, sold_quantity) for a given period of time, ordered by any of those columns.
I can't work out how to write the query.
This is the query I have so far:
> current_project.products.includes(:product_characteristics).group('products.id').pluck(:title, 'SUM(product_characteristics.sold_quantity) AS sold_quantity')
(45.4ms) SELECT "products"."title", SUM(product_characteristics.sold_quantity) AS sold_quantity FROM "products" LEFT OUTER JOIN "product_characteristics" ON "product_characteristics"."product_id" = "products"."id" WHERE "products"."project_id" = $1 GROUP BY products.id [["project_id", 20]]
Please help me write this query through the ORM (adding a where on dates and ordering), or as a raw SQL query.
I used pluck. It returns an array of arrays (not an array of hashes), which is not ideal.
product_characteristics.date field is unique by scope product_id. But please give me two examples (with this condition and without it to satisfy my curiosity).
And I use postgresql and rails 4.2.x
P.S. By the way, the ProductCharacteristic table will have a lot of records (more than one million). Should I use PostgreSQL table partitioning? Can it improve performance?
Thank you.
You can use select instead of pluck in that case, and the value will be accessible as product.sold_quantity.
The query becomes
products = current_project.products.joins(:product_characteristics).group('products.id').select(:title, 'SUM(product_characteristics.sold_quantity) AS sold_quantity')
products.first.sold_quantity # => works
To order, you can just add an order clause
products = products.order(id: :asc)
or
products = products.order(id: :desc)
for instance
And for the where
products = products.where("products.created_at > ?", 2.days.ago)
for instance.
You can chain SQL clauses after the first line; the order does not matter, because the query is only executed when you actually use the retrieved set.
And so you can also do stuff like
if params[:foo]
products = products.order(:id)
end
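For the raw SQL side of the question (a date range on product_characteristics.date plus ordering by any header column), a sketch along these lines may help. It uses Python with an in-memory SQLite database; the sample rows, the SORTABLE whitelist, and the sales_report helper are hypothetical names introduced for illustration, and the project_id filter from the question is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be converted to dicts ("array of hashes")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT, price REAL);
    CREATE TABLE product_characteristics (
        id INTEGER PRIMARY KEY, product_id INTEGER, sold_quantity INTEGER, date TEXT);
    INSERT INTO products (id, title, price) VALUES (1, 'Pen', 1.5), (2, 'Ink', 4.0);
    INSERT INTO product_characteristics (product_id, sold_quantity, date) VALUES
        (1, 3, '2024-01-01'), (1, 5, '2024-01-02'), (1, 7, '2024-02-01'),
        (2, 10, '2024-01-02');
""")

# Whitelist of sortable header columns; column names cannot be bound as
# parameters, so user input must never be interpolated directly.
SORTABLE = {
    "id": "p.id",
    "title": "p.title",
    "price": "p.price",
    "sold_quantity": "sold_quantity",
}


def sales_report(conn, start, end, sort_by="id", descending=False):
    """Sum sold_quantity per product for dates in [start, end], sorted by a header column."""
    order_col = SORTABLE[sort_by]  # raises KeyError for unknown columns
    direction = "DESC" if descending else "ASC"
    sql = f"""
        SELECT p.id, p.title, p.price, SUM(pc.sold_quantity) AS sold_quantity
        FROM products p
        JOIN product_characteristics pc ON pc.product_id = p.id
        WHERE pc.date BETWEEN ? AND ?
        GROUP BY p.id
        ORDER BY {order_col} {direction}
    """
    return [dict(row) for row in conn.execute(sql, (start, end))]


report = sales_report(conn, "2024-01-01", "2024-01-31",
                      sort_by="sold_quantity", descending=True)
print(report)
```

Because SUM aggregates whatever rows fall in the range, the same query works whether or not date is unique per product_id.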

Rails ActiveRecord query where relationship does not exist based on third attribute

I have an Adventure model, which is a join table between a Destination and a User (and has additional attributes such as zipcode and time_limit). I want to create a query that will return me all the Destinations where an Adventure between that Destination and the User currently trying to create an Adventure does not exist.
The way the app works when a User clicks to start a new Adventure it will create that Adventure with the user_id being that User's id and then runs a method to provide a random Destination, ex:
Adventure.create(user_id: current_user.id) (it is actually doing current_user.adventures.new ) but same thing
I have tried a few things from writing raw SQL queries to using .joins. Here are a few examples:
Destination.joins(:adventures).where.not('adventures.user_id != ?'), user.id)
Destination.joins('LEFT OUTER JOIN adventure ON destination.id = adventure.destination_id').where('adventure.user_id != ?', user.id)
The below should return all destinations that user has not yet visited in any of his adventures:
destinations = Destination.where('id NOT IN (SELECT destination_id FROM adventures WHERE user_id = ?)', user.id)
To select a random one append one of:
.all.sample
# or
.pluck(:id).sample
Depending on whether you want a full record or just id.
No need for joins, this should do:
Destination.where(['id not in (?)', user.adventures.pluck(:destination_id)])
In your first attempt, I see the problem to be in the usage of equality operator with where.not. In your first attempt:
Destination.joins(:adventures).where.not('adventures.user_id != ?'), user.id)
you're doing where.not('adventures.user_id != ?'), user.id). I understand this is just the opposite of what you want, isn't it? Shouldn't you be calling it as where.not('adventures.user_id = ?', user.id), i.e. with an equals =?
I think the following query would work for the requirement:
Destination.joins(:adventures).where.not(adventures: { user_id: user.id })
The only problem I see in your second attempt is the singular table names (destination, adventure) in both the join and the where conditions. The table names should be plural. The query should have been:
Destination
.joins('LEFT OUTER JOIN adventures on destinations.id = adventures.destination_id')
.where('adventures.user_id != ?', user.id)
ActiveRecord doesn't do join conditions, but you can use your User destinations relation (e.g. a has_many :destinations, through: :adventures) as a subselect, which results in a WHERE NOT IN (SELECT...)
The query is pretty simple to express and doesn't require using sql string shenanigans, multiple queries or pulling back temporary sets of ids:
Destination.where.not(id: user.destinations)
If you want, you can also chain the above relation with additional where terms, ordering and grouping clauses.
I solved this problem with a mix of this answer and this other answer and came out with:
destination = Destination.where
.not(id: Adventure.where(user: user)
.pluck(:destination_id)
)
.sample
The .not(id: Adventure.where(user: user).pluck(:destination_id)) part excludes destinations present in previous adventures of the user.
The .sample part will pick a random destination from the results.
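The NOT IN subquery and the join-with-inequality attempts give different results, which a small experiment makes visible. This is a sketch in Python against an in-memory SQLite database with made-up rows. It also assumes that an adventure can exist with a NULL destination_id (as the "create the Adventure first, pick a Destination later" flow in the question suggests); if the column can never be NULL, the extra filter is unnecessary.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE destinations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE adventures (id INTEGER PRIMARY KEY, user_id INTEGER, destination_id INTEGER);
    INSERT INTO destinations (id, name) VALUES (1, 'Oslo'), (2, 'Lima'), (3, 'Kyoto');
    INSERT INTO adventures (user_id, destination_id) VALUES
        (1, 1), (2, 1), (2, 2), (1, NULL);
""")
# User 1 has been to Oslo and has one new adventure with no destination yet.
# User 2 has been to Oslo and Lima.


def unvisited_ids(conn, user_id, skip_nulls=True):
    """Destinations with no adventure by this user, via NOT IN (subquery)."""
    null_filter = "AND destination_id IS NOT NULL" if skip_nulls else ""
    sql = f"""
        SELECT id FROM destinations
        WHERE id NOT IN (SELECT destination_id FROM adventures
                         WHERE user_id = ? {null_filter})
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql, (user_id,))]


def joined_ids(conn, user_id):
    """The join + 'user_id != ?' variant: destinations visited by someone else."""
    sql = """
        SELECT DISTINCT d.id FROM destinations d
        JOIN adventures a ON a.destination_id = d.id
        WHERE a.user_id != ?
        ORDER BY d.id
    """
    return [row[0] for row in conn.execute(sql, (user_id,))]


candidates = unvisited_ids(conn, 1)
with_nulls = unvisited_ids(conn, 1, skip_nulls=False)
via_join = joined_ids(conn, 1)
random_pick = random.choice(candidates)  # rough equivalent of .sample
print(candidates, with_nulls, via_join, random_pick)
```

In this data the join variant returns Oslo (already visited by user 1) and misses Kyoto (visited by nobody), and a NULL inside the NOT IN list makes the unfiltered subquery return nothing at all.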

Can I sort records by child record count with DataMapper (without using raw SQL)?

What I want to do feels pretty basic to me, but I'm not finding a way to do it using DataMapper without resorting to raw SQL. That would look something like:
select u.id, u.name, count(p.id) as post_count
from posts p
inner join users u on p.user_id = u.id
group by p.user_id
order by post_count desc;
The intention of the above query is to show me all users sorted by how many posts each user has. The closest I've found using DataMapper is aggregate, which doesn't give me back resource objects. What I'd like is some way to generate one query and get standard DM objects back.
Assuming you have relationships
has n, :posts
you should be able to do
User.get(id).posts.count
or
User.first(:some_id => id).posts.count
or
u = User.get(1)
u.posts.count
you can also chain conditions
User.get(1).posts.all(:date.gt => '2012-10-01')
see scopes and chaining here http://datamapper.org/docs/find.html
finally add the ordering
User.get(1).posts.all(:order => [:date.desc])
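The snippets above count or order the posts of a single user. For comparison, the all-users ranking the question describes can be checked in plain SQL. The following is a sketch in Python with an in-memory SQLite database and invented rows; it is not DataMapper code, and it uses a LEFT JOIN (an assumption about the desired result) so that users with zero posts still appear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    INSERT INTO users (id, name) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO posts (user_id, title) VALUES
        (1, 'a1'), (2, 'b1'), (2, 'b2'), (2, 'b3');
""")

# COUNT(p.id) ignores the NULLs produced by the LEFT JOIN, so users
# without posts get a count of 0 instead of disappearing.
users_by_post_count = conn.execute("""
    SELECT u.id, u.name, COUNT(p.id) AS post_count
    FROM users u
    LEFT JOIN posts p ON p.user_id = u.id
    GROUP BY u.id, u.name
    ORDER BY post_count DESC, u.id
""").fetchall()

print(users_by_post_count)
```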

Django: Order a model by a many-to-many field

I am writing a Django application that has a model for People, and I have hit a snag. I am assigning Role objects to people using a Many-To-Many relationship - where Roles have a name and a weight. I wish to order my list of people by their heaviest role's weight. If I do People.objects.order_by('-roles__weight'), then I get duplicates when people have multiple roles assigned to them.
My initial idea was to add a denormalized field called heaviest-role-weight - and sort by that. This could then be updated every time a new role was added or removed from a user. However, it turns out that there is no way to perform a custom action every time a ManyToManyField is updated in Django (yet, anyway).
So, I thought I could then go completely overboard and write a custom field, descriptor and manager to handle this - but that seems extremely difficult when the ManyRelatedManager is created dynamically for a ManyToManyField.
I have been trying to come up with some clever SQL that could do this for me - I'm sure it's possible with a subquery (or a few), but I'd be worried about it not being compatible with all the database backends Django supports.
Has anyone done this before - or have any ideas how it could be achieved?
Django 1.1 (currently beta) adds aggregation support. Your query can be done with something like:
from django.db.models import Max
People.objects.annotate(max_weight=Max('roles__weight')).order_by('-max_weight')
This sorts people by their heaviest roles, without returning duplicates.
The generated query is:
SELECT people.id, people.name, MAX(role.weight) AS max_weight
FROM people LEFT OUTER JOIN people_roles ON (people.id = people_roles.people_id)
LEFT OUTER JOIN role ON (people_roles.role_id = role.id)
GROUP BY people.id, people.name
ORDER BY max_weight DESC
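The effect of the GROUP BY can be reproduced outside Django. The sketch below runs both the plain join ordering (which yields duplicates) and the grouped MAX query against an in-memory SQLite database from Python; the table and column names follow the generated SQL above, while the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE role (id INTEGER PRIMARY KEY, name TEXT, weight INTEGER);
    CREATE TABLE people_roles (people_id INTEGER, role_id INTEGER);
    INSERT INTO people (id, name) VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO role (id, name, weight) VALUES
        (1, 'admin', 10), (2, 'editor', 5), (3, 'viewer', 1);
    INSERT INTO people_roles (people_id, role_id) VALUES (1, 1), (1, 3), (2, 2);
""")

JOINS = """
    FROM people
    LEFT OUTER JOIN people_roles ON people.id = people_roles.people_id
    LEFT OUTER JOIN role ON people_roles.role_id = role.id
"""

# Ordering by the joined column: one row per (person, role) pair,
# so Ann shows up twice.
naive = [row[0] for row in conn.execute(
    "SELECT people.name" + JOINS + "ORDER BY role.weight DESC")]

# Grouping per person and ordering by the heaviest role: one row each.
grouped = conn.execute(
    "SELECT people.name, MAX(role.weight) AS max_weight" + JOINS +
    "GROUP BY people.id, people.name ORDER BY max_weight DESC").fetchall()

print(naive)
print(grouped)
```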
Here's a way to do it without an annotation:
from django.db import models
class Role(models.Model):
    pass
class PersonRole(models.Model):
    weight = models.IntegerField()
    person = models.ForeignKey('Person')
    role = models.ForeignKey(Role)
    class Meta:
        # if you have an inline configured in the admin, this will
        # make the roles order properly
        ordering = ['weight']
class Person(models.Model):
    roles = models.ManyToManyField('Role', through='PersonRole')
    def ordered_roles(self):
        "Return a properly ordered set of roles"
        return self.roles.all().order_by('personrole__weight')
This lets you say something like:
>>> person = Person.objects.get(id=1)
>>> roles = person.ordered_roles()
Something like this in SQL:
select p.*, max (r.Weight) as HeaviestWeight
from persons p
inner join RolePersons rp on p.id = rp.PersonID
inner join Roles r on rp.RoleID = r.id
group by p.*
order by HeaviestWeight desc
Note: group by p.* may be disallowed by your dialect of SQL. If so, just list all the columns in table p that you intend to use in the select clause.
Note: if you just group by p.ID, you won't be able to call for the other columns in p in your select clause.
I don't know how this interacts with Django.