I'm using Rails 4.2 and PostgreSQL 9.4.
I have a basic users, reservations and events schema.
I'd like to return a list of users and the most recent event they attended, along with what date/time this was at.
I've created a query that returns the user and the time of the most recent event. However, I also need to return events.id.
My application does not allow a user to reserve two events with the same start time, but I appreciate that SQL knows nothing about this and assumes there can be multiple events in the result. Hence I am happy for the query to return an arbitrary matching event ID in the case of a hypothetical 'tie' for events.starts_at.
User.all.joins(reservations: :event)
.select('users.*, max(events.starts_at)')
.where('reservations.state = ?', "attended")
.where('events.company_id = ?', 1)
.group('users.id')
The corresponding SQL query is:
SELECT users.*, max(events.starts_at) FROM "users" INNER JOIN "reservations" ON "reservations"."user_id" = "users"."id" INNER JOIN "events" ON "events"."id" = "reservations"."event_id" WHERE (reservations.state = 'attended') AND (events.company_id = 1) GROUP BY users.id
The reservations table is very large so loading the entire set into Rails and processing it via Ruby code is undesirable. I'd like to perform the entire query in SQL if it is possible to do so.
My basic model:
User
has_many :reservations
Reservation
belongs_to :user
belongs_to :event
Event
belongs_to :company
has_many :reservations
The generic sql that returns data for the most recent event looks like this:
select yourfields
from yourtables
join
(select someField
, max(datetimefield) maxDateTime
from table1
where whatever
group by someField ) temp on table1.someField = temp.somefield
and table1.dateTimeField = maxDateTime
where whatever
The two "where whatever" things should be the same. All you have to do is adapt this construct into your app. You might consider putting the query into a stored procedure which you then call from your app.
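To make the intent of that construct concrete for the users, reservations and events schema, here is a small sketch. The SQL constant is a PostgreSQL-specific alternative using DISTINCT ON, which is my assumption and not part of the answer above. The Ruby method only mirrors the semantics over hypothetical in-memory rows (the Attendance struct and sample data are invented for illustration); it is not meant to replace doing the work in the database.

```ruby
# Hypothetical in-memory rows standing in for the reservations/events join.
Attendance = Struct.new(:user_id, :event_id, :starts_at, :state, :company_id)

# PostgreSQL-specific sketch (an assumption, not the answer's construct):
# DISTINCT ON keeps the first row per user after sorting, so a tie on
# starts_at resolves to one arbitrary event.
LATEST_ATTENDED_SQL = <<~SQL
  SELECT DISTINCT ON (users.id)
         users.*, events.id AS event_id, events.starts_at
  FROM users
  JOIN reservations ON reservations.user_id = users.id
  JOIN events ON events.id = reservations.event_id
  WHERE reservations.state = 'attended' AND events.company_id = 1
  ORDER BY users.id, events.starts_at DESC
SQL

# Mirrors the query's meaning: per user, the attended row with the
# latest starts_at for the given company.
def latest_attended(rows, company_id)
  rows
    .select { |r| r.state == 'attended' && r.company_id == company_id }
    .group_by(&:user_id)
    .map { |user_id, group| [user_id, group.max_by(&:starts_at)] }
    .to_h
end

rows = [
  Attendance.new(1, 10, Time.utc(2015, 1, 5, 18), 'attended', 1),
  Attendance.new(1, 11, Time.utc(2015, 3, 2, 9), 'attended', 1),
  Attendance.new(1, 12, Time.utc(2015, 4, 1, 9), 'cancelled', 1),
  Attendance.new(2, 10, Time.utc(2015, 1, 5, 18), 'attended', 1)
]

latest_attended(rows, 1).each do |user_id, row|
  puts "user #{user_id}: event #{row.event_id} at #{row.starts_at}"
end
```

In a Rails app the SQL string could probably be passed to find_by_sql, but that wiring is left out here because it depends on the application.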
I think your query should first retrieve the most recent attended reservation for each user:
SELECT r.user_id, MAX(e.starts_at) AS last_starts_at FROM reservations r JOIN events e ON e.id = r.event_id WHERE r.state = 'attended' GROUP BY r.user_id
Then JOIN the Users and Events.
Assuming the results will include every User and Event, it may be more efficient to retrieve all users and events and store them in two arrays keyed by id.
The logic behind that is rather than a separate lookup into the user and events table for each resulting reservation by the db engine, it is more efficient to get them all in a single query.
SELECT * FROM `Users` WHERE 1 ORDER BY `user_id`
SELECT * FROM `Events` WHERE 1 ORDER BY `event_id`
I am not familiar with Rails syntax so cannot give exact code, but I can show it in PHP, where the results are put into the array with a single line of code.
while ($row = mysql_fetch_array($results, MYSQL_ASSOC)) { $users[$row['user_id']] = $row; }
Then when processing the Reservations you get the user and event data from the arrays.
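Since the question is about Ruby, here is a rough plain-Ruby counterpart of the PHP loop above. The row hashes and the index_by_id helper are hypothetical names chosen for illustration; in Rails the same idea is usually written with ActiveSupport's index_by.

```ruby
# Hypothetical result rows, one hash per fetched row.
user_rows = [
  { id: 1, name: 'Ann' },
  { id: 2, name: 'Bob' }
]
reservations = [
  { user_id: 2, event_id: 7 },
  { user_id: 1, event_id: 7 }
]

# Build a hash keyed by id so each later lookup is a single hash access
# instead of a scan or an extra query.
def index_by_id(rows)
  rows.each_with_object({}) { |row, index| index[row[:id]] = row }
end

users_by_id = index_by_id(user_rows)

reservations.each do |reservation|
  user = users_by_id[reservation[:user_id]]
  puts "#{user[:name]} reserved event #{reservation[:event_id]}"
end
```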
The index on reservations is critical and is worth profiling.
Options to profile include an index with and without the 'attended' state column. The events.starts_at column should come first in the index, followed by user_id, but the column order should itself be profiled.
You may want to use a unique index to enforce the rule against duplicate reservation times.
Related
I have in my Rails (4.1.8) app the following models: Event, User, Box, EventBoxMapping and following associations relevant to the question among them:
User has_many events; Event belongs_to User
Event has_many boxes through event_box_mappings
I'm trying to achieve faster (and hopefully more memory efficient) CSV generation through ActiveAdmin by using PostgreSQL's COPY functionality to stream output of a raw SQL directly into a CSV export. To achieve this, however, I need to pass a raw SQL string, which I'm having some trouble creating for all bits of information generated to populate columns of our Events CSV. In particular, I'd like to pick out counts of distinct values of box_property_id column of the boxes of events.
So far, I have the following SQL, which runs perfectly and maps some values of the Event and User models:
SELECT
events.id,
events.user_id,
events.event_type,
events.promo_code,
events.created_at,
events.transport_fee,
events.boxes_count,
users.email,
users.gender,
users.first_name,
users.last_name
FROM events
LEFT JOIN users ON users.id = events.user_id
I'm stuck at the part I mentioned above - to include counts of "each kind of box" represented by the box_property_id field in boxes of an Event in the table returned by the above SQL.
I come from a NoSQL background and am fairly new to this field, so I apologise if my question is ambiguous or incomplete in some way.
As I understand it, you want the distinct count of box_property_id along with the other columns.
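Event.joins(:user, event_box_mappings: :box)
.select("events.id, events.user_id, events.event_type, events.promo_code,
events.created_at, events.transport_fee, events.boxes_count,
users.email, users.gender, users.first_name, users.last_name,
COUNT(DISTINCT boxes.box_property_id) AS total")
.group("events.id, users.id") # needed because COUNT is an aggregate; primary keys suffice on PostgreSQL 9.1+
total holds the number of distinct box_property_id values for each event. Note that joins produces INNER JOINs, so events without a user or without boxes are left out.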
Hope this would be helpful
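To illustrate what the grouped aggregate computes, here is a plain-Ruby sketch over hypothetical in-memory rows (the Box struct and both method names are invented for this example). It also shows the per-kind breakdown that the question seems to ask for, which in SQL would roughly correspond to grouping by both the event and box_property_id.

```ruby
# Hypothetical rows: one per box, already joined to its event.
Box = Struct.new(:event_id, :box_property_id)

# Mirrors COUNT(DISTINCT boxes.box_property_id) ... GROUP BY events.id
def distinct_property_counts(boxes)
  boxes.group_by(&:event_id).map do |event_id, group|
    [event_id, group.map(&:box_property_id).uniq.size]
  end.to_h
end

# Mirrors COUNT(*) ... GROUP BY events.id, boxes.box_property_id
def per_property_counts(boxes)
  boxes.each_with_object(Hash.new(0)) do |box, counts|
    counts[[box.event_id, box.box_property_id]] += 1
  end
end

boxes = [Box.new(1, 5), Box.new(1, 5), Box.new(1, 6), Box.new(2, 5)]
p distinct_property_counts(boxes)
p per_property_counts(boxes)
```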
New to Sequel and SQL in general. I have two tables, groups and resources, that are associated many_to_many and therefore have a groups_resources join table. I also have a task table that has a foreign_key :group_id, :groups and is associated many_to_one with groups.
I'm trying to figure out what query to use that will allow me to get the resources that are able to do a task, based on a task's group. Do I have to do a complicated query via the groups_resources join table, or is there a more straightforward query or way of setting up my associations?
Thanks!
I would structure the SQL statement as below; it returns the resources associated with a specific task id through the join table.
SELECT r.*
FROM resources r
JOIN groups_resources gr ON gr.resource_id = r.id
JOIN groups g ON gr.group_id = g.id
JOIN task t ON t.group_id = g.id
WHERE t.id = ?
I think following is enough:
select res.* from resources res, task tk, groups_resources gr
where res.id = gr.resource_id and
gr.group_id = tk.group_id and
tk.group_id = ?;
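The join path in both queries is task.group_id to groups_resources.group_id to resources.id; the groups table itself is not strictly needed to connect them. A plain-Ruby sketch of that traversal over hypothetical in-memory rows (the structs and the resources_for_task name are assumptions made for illustration):

```ruby
# Hypothetical stand-ins for the task, groups_resources and resources tables.
Task = Struct.new(:id, :group_id)
GroupResource = Struct.new(:group_id, :resource_id)
Resource = Struct.new(:id, :name)

# Follow task -> group -> join table -> resources.
def resources_for_task(task_id, tasks, groups_resources, resources)
  task = tasks.find { |t| t.id == task_id }
  return [] unless task

  resource_ids = groups_resources
    .select { |gr| gr.group_id == task.group_id }
    .map(&:resource_id)
  resources.select { |r| resource_ids.include?(r.id) }
end

tasks = [Task.new(1, 100), Task.new(2, 200)]
groups_resources = [
  GroupResource.new(100, 10),
  GroupResource.new(100, 11),
  GroupResource.new(200, 11)
]
resources = [Resource.new(10, 'crane'), Resource.new(11, 'truck')]

puts resources_for_task(1, tasks, groups_resources, resources).map(&:name).inspect
```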
The other two answers are helpful for how to structure a SQL query, but I thought I would answer my own question specifically as it relates to Sequel. It turns out there is a many_through_many plugin that makes this sort of querying simple, if you make both tables many_to_many:
Task.plugin :many_through_many
Task.many_through_many :resources,
:through =>[
[:groups_tasks, :task_id, :group_id],
[:groups, :id, :id],
[:groups_resources, :group_id, :resource_id]
]
Now you can just call something like task.resources on a Task instance, even though your tables don't explicitly associate tasks and resources.
I have an Adventure model, which is a join table between a Destination and a User (and has additional attributes such as zipcode and time_limit). I want to create a query that will return me all the Destinations where an Adventure between that Destination and the User currently trying to create an Adventure does not exist.
The way the app works when a User clicks to start a new Adventure it will create that Adventure with the user_id being that User's id and then runs a method to provide a random Destination, ex:
Adventure.create(user_id: current_user.id) (it is actually doing current_user.adventures.new ) but same thing
I have tried a few things from writing raw SQL queries to using .joins. Here are a few examples:
Destination.joins(:adventures).where.not('adventures.user_id != ?'), user.id)
Destination.joins('LEFT OUTER JOIN adventure ON destination.id = adventure.destination_id').where('adventure.user_id != ?', user.id)
The below should return all destinations that user has not yet visited in any of his adventures:
destinations = Destination.where('id NOT IN (SELECT destination_id FROM adventures WHERE user_id = ?)', user.id)
To select a random one append one of:
.all.sample
# or
.pluck(:id).sample
Depending on whether you want a full record or just id.
No need for joins, this should do:
Destination.where('id NOT IN (?)', user.adventures.pluck(:destination_id))
I see the problem as the comparison operator used with where.not. In your first attempt:
Destination.joins(:adventures).where.not('adventures.user_id != ?'), user.id)
you're calling where.not('adventures.user_id != ?'), user.id), which negates a != comparison, and the closing parenthesis right after the SQL string leaves user.id outside the call. Isn't that the opposite of what you want? Shouldn't you be calling it as where.not('adventures.user_id = ?', user.id), i.e. with an equals sign?
I think the following query would work for the requirement:
Destination.joins(:adventures).where.not(adventures: { user_id: user.id })
The only problem I see in your second attempt is the use of singular table names (destination, adventure) in both the join and where conditions. The table names should be plural. The query should have been:
Destination
.joins('LEFT OUTER JOIN adventures on destinations.id = adventures.destination_id')
.where('adventures.user_id != ?', user.id)
ActiveRecord doesn't do join conditions, but you can use your User destinations relation (e.g. has_many :destinations, through: :adventures) as a subselect, which results in a WHERE NOT IN (SELECT ...).
The query is pretty simple to express and doesn't require using sql string shenanigans, multiple queries or pulling back temporary sets of ids:
Destination.where.not(id: user.destinations)
If you want, you can also chain the above relation with additional where terms, ordering and grouping clauses.
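The set logic behind WHERE NOT IN (SELECT ...) can be sketched in plain Ruby over hypothetical in-memory rows. For comparison, the second method mirrors an inner join filtered on adventures.user_id != ?, which asks a different question: it returns destinations that some other user has an adventure with, so it can include places this user already visited and miss places nobody has visited. The struct and method names here are invented for illustration.

```ruby
# Hypothetical stand-in for rows of the adventures table.
Adventure = Struct.new(:user_id, :destination_id)

# Mirrors: WHERE id NOT IN (SELECT destination_id FROM adventures WHERE user_id = ?)
def unvisited(destination_ids, adventures, user_id)
  visited = adventures.select { |a| a.user_id == user_id }.map(&:destination_id)
  destination_ids - visited
end

# Mirrors: INNER JOIN adventures ... WHERE adventures.user_id != ?
# (uniq collapses the duplicate rows such a join would return)
def joined_with_other_users(destination_ids, adventures, user_id)
  adventures
    .select { |a| a.user_id != user_id && destination_ids.include?(a.destination_id) }
    .map(&:destination_id)
    .uniq
end

destination_ids = [1, 2, 3]
adventures = [Adventure.new(1, 1), Adventure.new(2, 1), Adventure.new(2, 2)]

p unvisited(destination_ids, adventures, 1)
p joined_with_other_users(destination_ids, adventures, 1)
```

With this sample data, destination 3 has no adventures at all, so only the NOT IN form can return it for user 1.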
I solved this problem with a mix of this answer and this other answer and came out with:
destination = Destination.where
.not(id: Adventure.where(user: user)
.pluck(:destination_id)
)
.sample
The .not(id: Adventure.where(user: user).pluck(:destination_id)) part excludes destinations present in previous adventures of the user.
The .sample part will pick a random destination from the results.
Ok so I'm doing an inner join on an association in rails like so:
@visits = @customer.visits.joins(:messages).select("distinct(visits.id)")
This returns unique visit ids. However, I want to loop through these visits and access their associations (each visit has a merchant_id attached to it as well). The problem with this inner join is that it only returns the id, so when I do something like this:
@visits.each do |v|
merchant = v.merchant
end
I just end up with a nil class.
How can I select "visits" based on a unique visit.id but also return all the other columns in that row? Group by?
All you are selecting is the id, that's correct. Since you want full objects, select the rows whose ids are in that distinct set. So what you really want is along the lines of:
Visit.where(:id => @customer.visits.select("distinct(id)"))
This is equivalent to:
"SELECT * FROM VISITS WHERE VISITS.ID IN (SELECT DISTINCT(ID) FROM VISITS WHERE VISITS.CUSTOMER_ID IN ?)", <id_list>
I'm not sure where 'messages' factors into your schema, but it's basically just a join in the above query. Sometimes it's helpful to figure out the SQL first and work backwards to ActiveRecord syntax if you're having trouble.
UPDATE: So thanks to @Erwin Brandstetter, I now have this:
def self.unique_users_by_company(company)
users = User.arel_table
cards = Card.arel_table
users_columns = User.column_names.map { |col| users[col.to_sym] }
cards_condition = cards[:company_id].eq(company.id).
and(cards[:user_id].eq(users[:id]))
User.joins(:cards).where(cards_condition).group(users_columns).
order('min(cards.created_at)')
end
... which seems to do exactly what I want. There are two shortcomings that I would still like to have addressed, however:
The order() clause is using straight SQL instead of Arel (couldn't figure it out).
Calling .count on the query above gives me this error:
NoMethodError: undefined method 'to_sym' for
#<Arel::Attributes::Attribute:0x007f870dc42c50> from
/Users/neezer/.rvm/gems/ruby-1.9.3-p0/gems/activerecord-3.1.1/lib/active_record/relation/calculations.rb:227:in
'execute_grouped_calculation'
... which I believe is probably related to how I'm mapping out the users_columns, so I don't have to manually type in all of them in the group clause.
How can I fix those two issues?
ORIGINAL QUESTION:
Here's what I have so far that solves the first part of my question:
def self.unique_users_by_company(company)
users = User.arel_table
cards = Card.arel_table
cards_condition = cards[:company_id].eq(company.id)
.and(cards[:user_id].eq(users[:id]))
User.where(Card.where(cards_condition).exists)
end
This gives me 84 unique records, which is correct.
The problem is that I need those User records ordered by cards[:created_at] (whichever is earliest for that particular user). Appending .order(cards[:created_at]) to the scope at the end of the method above does absolutely nothing.
I tried adding in a .joins(:cards), but that returns 587 records, which is incorrect (duplicate Users). group_by as I understand it is practically useless here as well, because of how PostgreSQL handles it.
I need my result to be an ActiveRecord::Relation (so it's chainable) that returns a list of unique users who have cards that belong to a given company, ordered by the creation date of their first card... with a query that's written in Ruby and is database-agnostic. How can I do this?
class Company
has_many :cards
end
class Card
belongs_to :user
belongs_to :company
end
class User
has_many :cards
end
Please let me know if you need any other information, or if I wasn't clear in my question.
The query you are looking for should look like this one:
SELECT user_id, min(created_at) AS min_created_at
FROM cards
WHERE company_id = 1
GROUP BY user_id
ORDER BY min(created_at)
You can join in the users table if you need columns of that table in the result; otherwise you don't even need it for the query.
If you don't need min_created_at in the SELECT list, you can just leave it out.
Should be easy to translate to Ruby (which I am no good at).
To get the whole user record (as I derive from your comment):
SELECT u.*
FROM users u
JOIN (
SELECT user_id, min(created_at) AS min_created_at
FROM cards
WHERE company_id = 1
GROUP BY user_id
) c ON u.id = c.user_id
ORDER BY min_created_at
Or:
SELECT u.*
FROM users u
JOIN cards c ON u.id = c.user_id
WHERE c.company_id = 1
GROUP BY u.id, u.col1, u.col2, .. -- You have to spell out all columns!
ORDER BY min(c.created_at)
With PostgreSQL 9.1+ you can simply write:
GROUP BY u.id
(like in MySQL) .. provided id is the primary key.
I quote the release notes:
Allow non-GROUP BY columns in the query target list when the primary
key is specified in the GROUP BY clause (Peter Eisentraut)
The SQL standard allows this behavior, and because of the primary key,
the result is unambiguous.
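What the grouped queries above compute can be sketched in plain Ruby over hypothetical in-memory rows: one entry per user who has a card for the company, ordered by that user's earliest card. The Card struct and the user_ids_by_first_card name are assumptions for illustration only.

```ruby
# Hypothetical stand-in for rows of the cards table.
Card = Struct.new(:user_id, :company_id, :created_at)

# Mirrors: SELECT user_id, min(created_at) FROM cards
#          WHERE company_id = ? GROUP BY user_id ORDER BY min(created_at)
def user_ids_by_first_card(cards, company_id)
  cards
    .select { |c| c.company_id == company_id }
    .group_by(&:user_id)
    .map { |user_id, group| [user_id, group.map(&:created_at).min] }
    .sort_by { |_user_id, first_created_at| first_created_at }
    .map(&:first)
end

cards = [
  Card.new(1, 1, Time.utc(2012, 1, 3)),
  Card.new(1, 1, Time.utc(2012, 1, 5)),
  Card.new(2, 1, Time.utc(2012, 1, 1)),
  Card.new(3, 2, Time.utc(2011, 1, 1))
]

p user_ids_by_first_card(cards, 1)
```

Each user appears once no matter how many cards they hold, which is the deduplication the plain join was missing.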
The fact that you need it to be chainable complicates things, otherwise you can either drop down into SQL yourself or only select the column(s) you need via select("users.id") to get around the Postgres issue. Because at the heart of it your query is something like
SELECT users.id
FROM users
INNER JOIN cards ON users.id = cards.user_id
WHERE cards.company_id = 1
GROUP BY users.id, DATE(cards.created_at)
ORDER BY DATE(cards.created_at) DESC
Which in Arel syntax is more or less:
User.select("users.id").joins(:cards).where(:"cards.company_id" => company.id).group("users.id, DATE(cards.created_at)").order("DATE(cards.created_at) DESC")