SQL complicated query with joins - sql

I have a problem with one query.
Let me explain what I want:
For the sake of brevity, let's say that I have three tables:
-Offers
-Ratings
-Users
Now what I want to do is create an SQL query:
I want Offers to be listed with all its fields and an additional temporary column, called AverageUserScore, that IS NOT stored anywhere.
This AverageUserScore is the result of grabbing all offers belonging to a particular user, then grabbing all ratings belonging to those offers, and then averaging those ratings. That average is AverageUserScore.
To explain it even further, I need this query for a Ruby on Rails application. In the browser, inside the application, you can see all offers of other users, with AverageUserScore at the very end, as the last column.
Associations:
Offer has many ratings
Offer belongs to user
Rating belongs to offer
User has many offers

Assumptions made:
You actually have a numeric column (of any type that SQL's AVG is fine with) in your Rating model. I'm using a column ratings.rating in my examples.
AverageUserScore is unconventional, so average_user_score is better.
You don't mind not getting users that have no offers: average rating is not clearly defined for them anyway.
You don't deviate from Rails' conventions far enough to have a primary key other than id.
Displaying offers for each user is a straightforward task: in a loop of @users.each do |user|, you can do user.offers.each do |offer| and be set. The only problem here is that it will execute a separate query for every user. Not good.
The "fetching offers" part is the standard N+1 countermeasure seen even in the guides.
@users = User.includes(:offers).all
The interesting part here is only getting the averages.
For that I'm going to use Arel. It's already part of Rails, ActiveRecord is built on top of it, so you don't need to install anything extra.
You should be able to do a join like this:
User.joins(offers: :ratings)
And this won't get you anything interesting (apart from filtering users that have no offers). Inside though, you'll get a huge set of every rating joined with its corresponding offer and that offer's user. Since we're taking averages per-user we need to group by users.id, effectively making one entry per one users.id value. That is, one per user. A list of users, yes!
Let's stop for a second and make some assignments to make Arel-related code prettier. In fact, we only need two:
users = User.arel_table
ratings = Rating.arel_table
Okay. So. We need to get a list of users (all fields), and for each user fetch an average value seen on his offers' ratings' rating field. So let's compose these SQL expressions:
# users.*
user_fields = users[Arel.star] # Arel.star is a portable SQL "wildcard"
# AVG(ratings.rating) AS average_user_score
average_user_score = ratings[:rating].average.as('average_user_score')
All set. Ready for the final query:
User.includes(:offers) # N+1 counteraction
.joins(offers: :ratings) # dat join
.select(user_fields, average_user_score) # fields we need
.group(users[:id]) # grouping to only get one row per user
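If you would rather have one row per offer, as the question describes, the same idea can be written in plain SQL by joining offers to a per-user average. Below is a minimal sketch, not the Arel query above: it uses Python's built-in sqlite3 only as a throwaway harness, and the sample rows, the title and name columns, and the scores alias are assumptions made for illustration.

```python
import sqlite3

# In-memory database with hypothetical sample data (Rails-style table names).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE offers  (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
CREATE TABLE ratings (id INTEGER PRIMARY KEY, offer_id INTEGER, rating INTEGER);
INSERT INTO users   VALUES (1, 'ann'), (2, 'bob');
INSERT INTO offers  VALUES (1, 1, 'bike'), (2, 1, 'lamp'), (3, 2, 'desk');
INSERT INTO ratings VALUES (1, 1, 4), (2, 2, 2), (3, 3, 5);
""")

# One row per offer; the derived table computes the average rating
# across all offers of the user who owns that offer.
offer_rows = conn.execute("""
    SELECT offers.id, offers.title, scores.average_user_score
    FROM offers
    JOIN (
        SELECT o.user_id, AVG(r.rating) AS average_user_score
        FROM offers o
        JOIN ratings r ON r.offer_id = o.id
        GROUP BY o.user_id
    ) AS scores ON scores.user_id = offers.user_id
    ORDER BY offers.id
""").fetchall()

for row in offer_rows:
    print(row)
```

Because of the inner joins, offers whose owner has no ratings at all are dropped, which matches the assumption above about users without offers.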

Related

Return first 'unsorted' join in Oracle SQL

I have a table 'ACCOUNTS', with fields ACCTNO and ACPARENT. One account can be the parent of another. One account can have many children.
It's been discovered that certain external processes are using the 'first child' in certain reports and outputs - but there's no actual 'reason' for any particular child to be 'first', just an unintended bug in the code.
First step in untangling this - I need a query, that can be re-run (but not often, so optimisation is not really a factor) that will identify, for all accounts that are parents, what their 'first child' is.
Problem - the 'first child' isn't necessarily anything to do with record ID. If I run the following query, for example:
SELECT ACCTNO FROM ACCOUNTS WHERE ACPARENT = '80005217';
I get a result of:
ACCTNO
______
80007325
80007310
80007315
80007298
I can absolutely, 100% confirm that for this particular example, account 80007325 is the account ID being used as the 'first child'.
On the flipside, if I run a naive query of:
SELECT A1.ACCTNO, A2.ACCTNO AS CHILDACCOUNT FROM ACCOUNTS A1
INNER JOIN ACCOUNTS A2 ON A1.ACCTNO = A2.ACPARENT
WHERE A1.ACCTNO IN
(SELECT ACPARENT FROM ACCOUNTS);
then if I scroll down to where 80005217 is the parent account, I see the following list:
CHILDACCOUNT
______
80007298
80007310
80007315
80007325
It's sorted, even though it's exactly not what I want.
Is there a query that will get me a list of what I want in a single query? A list of all parent accounts, and their 'first child' as returned by SQL unsorted?
To guarantee records coming in a fixed order we must provide the database with sort criteria in the ORDER BY clause. If there is no attribute which defines "first-ness" then no guarantee is possible. Without an ORDER BY clause the records are essentially in an uncontrolled order, although because of database internals they often fall into some kind of pattern.
So, what makes account 80007325 the first child WHERE ACPARENT = '80005217'? Clearly not numerical order. Is there some other criterion? Date created? A flag column? Seems like you need to talk to your users. Do they really care which records come first? All the time or just in some specific report?
If your users cannot specify the criteria there's not much you can do...
...although I might be tempted to sort CHILDACCOUNT numerically by ACCTNO whenever it is displayed. At least that would provide consistency, and the users will get used to it.
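If you do settle on ACCTNO as the sort criterion, a deterministic "first child" per parent is a plain aggregate. This is only a sketch under that assumption: it does not reproduce whatever unsorted order the external processes happened to see, and it uses Python's built-in sqlite3 purely as a harness with the account numbers from the question (the FIRSTCHILD alias is made up here).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ACCOUNTS (ACCTNO TEXT PRIMARY KEY, ACPARENT TEXT);
INSERT INTO ACCOUNTS VALUES
  ('80005217', NULL),
  ('80007325', '80005217'),
  ('80007310', '80005217'),
  ('80007315', '80005217'),
  ('80007298', '80005217');
""")

# "First" is defined explicitly as the lowest ACCTNO among the children,
# so the result no longer depends on physical row order.
first_children = conn.execute("""
    SELECT ACPARENT, MIN(ACCTNO) AS FIRSTCHILD
    FROM ACCOUNTS
    WHERE ACPARENT IS NOT NULL
    GROUP BY ACPARENT
    ORDER BY ACPARENT
""").fetchall()

print(first_children)
```

The same SELECT is standard SQL and should carry over to Oracle unchanged; swap MIN for whatever criterion the users finally agree on.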

Selecting specific joined record from findAll() with a hasMany() include

(I tried posting this to the CFWheels Google Group (twice), but for some reason my message never appears. Is that list moderated?)
Here's my problem: I'm working on a social networking app in CF on Wheels, not too dissimilar from the one we're all familiar with in Chris Peters's awesome tutorials. In mine, though, I'm required to display the most recent status message in the user directory. I've got a User model with hasMany("statuses") and a Status model with belongsTo("user"). So here's the code I started with:
users = model("user").findAll(include="userprofile, statuses");
This of course returns one record for every status message in the statuses table. Massive overkill. So next I try:
users = model("user").findAll(include="userprofile, statuses", group="users.id");
Getting closer, but now we're getting the first status record for each user (the lowest status.id), when I want to select the most recent status. I think in straight SQL I would use a subquery to reorder the statuses first, but that's not available to me in the Wheels ORM. So is there another clean way to achieve this, or will I have to drag a huge query result or object of statuses into my CFML and then filter them out while I loop?
You can grab the most recent status using a calculated property:
// models/User.cfc
function init() {
property(
name="mostRecentStatusMessage",
sql="SELECT message FROM statuses WHERE userid = users.id ORDER BY createdat DESC LIMIT 1"
);
}
Of course, the syntax of the SELECT statement will depend on your RDBMS, but that should get you started.
The downside is that you'll need to create a calculated property for each column that you need available in your query.
The other option is to create a method in your model and write custom SQL in <cfquery> tags. That way is perfectly valid as well.
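To see what the calculated property amounts to in SQL, here is a rough sketch of the correlated subquery run against sample data. It uses Python's built-in sqlite3 only as a harness; the column names follow the snippet above, the rows are invented, and LIMIT 1 is used so that the newest row is the one returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users    (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE statuses (id INTEGER PRIMARY KEY, userid INTEGER,
                       message TEXT, createdat TEXT);
INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
INSERT INTO statuses VALUES
  (1, 1, 'first',  '2011-01-01'),
  (2, 1, 'latest', '2011-03-01'),
  (3, 2, 'only',   '2011-02-01');
""")

# One row per user; the subquery picks that user's newest status message.
latest = conn.execute("""
    SELECT users.name,
           (SELECT message
            FROM statuses
            WHERE statuses.userid = users.id
            ORDER BY createdat DESC
            LIMIT 1) AS mostRecentStatusMessage
    FROM users
    ORDER BY users.id
""").fetchall()

print(latest)
```

A user with no statuses would simply get NULL in that column rather than disappearing from the result.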
I don't know your exact DB schema, but shouldn't your findAll() look more like something such as this:
statuses = model("status").findAll(include="userprofile(user)", where="userid = users.id");
That should get all statuses from a specific user...or is it that you need it for all users? I'm finding your question a little tricky to work out. What is it you're exactly trying to get returned?

Rails3: left join aggregate count - how to calculate?

In my application Users register for Events, which belong to a Stream. The registrations are managed in the Registration model, which has a boolean field called 'attended'.
I'm trying to generate a leaderboard and need to know: the total number of registrations for each user, as well as a count for user registrations in each individual event stream.
I'm trying this (in User.rb):
# returns an array of users and their attendance count
def self.attendance_counts
User.all(
:select => "users.*, sum(attended) as attendance_count",
:joins => 'left join `registrations` ON registrations.user_id = users.id',
:group => 'registrations.user_id',
:order => 'attendance_count DESC'
)
end
The generated SQL works for just returning the total attended count for each user when I run it in the database, but all that gets returned is the User record in Rails.
I'm about to give up and hardcode a counter_cache for each stream (they are fairly fixed) into the User table, which gets manually updated whenever the attended attribute changes on a Registration model save.
Still, I'm really curious as to how to perform a query like this. It must come up all the time when calculating statistics and reports on records with relationships.
Your time and consideration is much appreciated. Thanks in advance.
First, a couple of points on style and on Rails functions that help with building DB queries.
1) You're better off writing this as a scope rather than a method, i.e.
scope :attendance_counts, select("users.*, sum(attended) as attendance_count").joins(:registrations).group('registrations.user_id').order('attendance_count DESC')
2) It's better not to call all/find/first on the query you've built up until you actually need it (i.e. in the controller or view). That way if you decide to implement action / fragment caching later on the DB query won't get called if the cached action / fragment is served to the user.
3) Rails has a series of functions to help with aggregating DB data. For example, if you only wanted a user's id and the sum of attended, you could use something like the following code:
Registration.group(:user_id).sum(:attended)
Other functions include count, average, minimum and maximum.
Finally in answer to your question, rails will create an attribute for you to access the value of any custom fields you have in the select part of your query. e.g.
@users = User.attendance_counts
@users[0].attendance_count # The attendance count for the first user returned by the query
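For reference, here is a sketch of the underlying SQL with the left join from the question, run on made-up rows through Python's built-in sqlite3 (used only as a harness). Two details are my own choices rather than anything from the code above: grouping by users.id instead of registrations.user_id, so that users without registrations are not merged into a single NULL group, and COALESCE to turn their missing sum into 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE registrations (id INTEGER PRIMARY KEY, user_id INTEGER,
                            attended INTEGER);
INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
-- 'cy' has no registrations at all
INSERT INTO registrations VALUES (1, 1, 1), (2, 1, 1), (3, 2, 0), (4, 2, 1);
""")

# LEFT JOIN keeps users without registrations; SUM over no rows is NULL,
# which COALESCE replaces with 0.
counts = conn.execute("""
    SELECT users.id, users.name,
           COALESCE(SUM(registrations.attended), 0) AS attendance_count
    FROM users
    LEFT JOIN registrations ON registrations.user_id = users.id
    GROUP BY users.id
    ORDER BY attendance_count DESC, users.id
""").fetchall()

print(counts)
```

A per-stream count would follow the same pattern with an extra join to events and the stream added to the GROUP BY.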

Designing database

I am finding it difficult to decide on an efficient design of the database. My application would get a number of ingredients(table) from the user and check with the database to find the recipe that could be prepared from the list of ingredients that the user provides.
My initial design is
Useringredients(ing_id,ing_name..);
the recipe database would be
recipe(rec_id,rec_text,...);
items_needed(rec_id,item_id,...);
items(item_id,item_name);
Is this a good way? If so, how will I be able to query to retrieve the recipes from the list of user ingredients?
Help would be very much appreciated.
This design could work. You have one table recording recipes, one recording items and one recording the many-to-many relationship between the two (though I would work on your naming conventions to keep things consistent).
To get any recipes that contain at least one item in your list, you could use the following:
Select rec.rec_id,
Count(itn.item_id) as [NumMatches]
From recipe as rec
Join items_needed as itn on itn.rec_id = rec.rec_id
Where itn.item_id in (comma-delimited-list-of-itemIDs)
Group By rec.rec_ID
Having Count(itn.item_id) > 0
Order By Count(itn.item_id) desc
This returns any recipes that contain at least some of the items that are selected, sorted with the first recipes having the highest number of matches.
The following query should give you a list of unique recipes using any one of the ingredients the user searches for
select distinct rec_id,rec_text,ii.item_name
from recipe rr
join items_needed itn on itn.rec_id=rr.rec_id
join items ii on ii.item_id=itn.item_id
join userIngredients ui on ui.ing_id=ii.item_id
Yaakov's query looks like it will handle the situation where you want all ingredients. You might be able to replace (comma-delimited-list-of-itemIDS) with (select ing_id from userIngredients)
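To find recipes that can be prepared entirely from what the user has, one option is to count matched items next to needed items and compare the two. The sketch below assumes, as the join above does, that userIngredients.ing_id holds the same IDs as items.item_id; the sample rows and the num_matches / num_needed aliases are invented, and Python's built-in sqlite3 is only a harness for running the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE recipe (rec_id INTEGER PRIMARY KEY, rec_text TEXT);
CREATE TABLE items (item_id INTEGER PRIMARY KEY, item_name TEXT);
CREATE TABLE items_needed (rec_id INTEGER, item_id INTEGER);
CREATE TABLE userIngredients (ing_id INTEGER PRIMARY KEY, ing_name TEXT);
INSERT INTO recipe VALUES (1, 'omelette'), (2, 'pancakes'), (3, 'salad');
INSERT INTO items VALUES (1, 'egg'), (2, 'milk'), (3, 'flour'), (4, 'lettuce');
INSERT INTO items_needed VALUES (1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 4);
INSERT INTO userIngredients VALUES (1, 'egg'), (2, 'milk');
""")

# COUNT(ui.ing_id) counts only needed items the user has (NULLs are skipped);
# COUNT(*) counts every item the recipe needs.
matches = conn.execute("""
    SELECT rec.rec_id, rec.rec_text,
           COUNT(ui.ing_id) AS num_matches,
           COUNT(*)         AS num_needed
    FROM recipe rec
    JOIN items_needed itn ON itn.rec_id = rec.rec_id
    LEFT JOIN userIngredients ui ON ui.ing_id = itn.item_id
    GROUP BY rec.rec_id
    HAVING COUNT(ui.ing_id) > 0
    ORDER BY num_matches DESC, rec.rec_id
""").fetchall()

# A recipe is fully preparable when every needed item was matched.
makeable = [row for row in matches if row[2] == row[3]]

print(matches)
print(makeable)
```

Changing the HAVING clause to COUNT(ui.ing_id) = COUNT(*) would push the "fully preparable" filter into the query itself.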

Rails Custom Model Functions

I'm in a databases course and the instructor wants us to develop an e-commerce app. She said we can use any framework we like, and now that we're halfway through the semester she decided that Rails does too much and wants me to explicitly write my SQL queries.
So, what I'd like to do is to write my own functions and add them to the models to essentially duplicate already existing functionality (but with SQL that I wrote myself).
So the questions then become:
How do I execute manually created queries inside the model?
How do I stuff the results into an empty object that I can then return and work with inside the view?
Also, I'm aware of what terrible practice this is, I just don't want to start all over in PHP at this point.
I think you only need to know two or three methods to do this.
(Assume we have at least two models, Order and User, where a user is the customer for an order.)
For example, to run an arbitrary query on your database, use this:
Order.connection.execute("DELETE FROM orders WHERE id = '2'")
To get a count from your database, the best way is to use the method count_by_sql; it scales well. I'm using it in my projects, where a table has over 500 thousand records. The application hands all the counting work to the database, which does it much more efficiently than the app would.
Order.count_by_sql("SELECT COUNT(DISTINCT o.user_id) FROM orders o")
This query gets the number of unique users who have an order. You can also JOIN tables, order the results with ORDER BY, and group the results.
The most often used method is find_by_sql:
Order.find_by_sql("SELECT * FROM orders")
It returns an array of Ruby objects.
Let's say you have a Purchase model:
class Purchase < ActiveRecord::Base
def Purchase.find(id)
# find_by_sql returns an array, so take the single matching record
Purchase.find_by_sql(["Select * from purchases where id=?", id]).first
end
end
Maybe you want the products for a particular purchase. You can manually define the purchased_items in your Purchase model.
class Purchase < ActiveRecord::Base
def purchased_items
PurchasedItem.find_by_sql(["Select * from purchased_items where purchase_id=?",self.id])
end
end
So, for example, in your controller, where you now want to get the purchased items for a particular purchase, you can do this:
@purchase = Purchase.find(params[:id])
@purchased_items = @purchase.purchased_items
If you need a more raw connection to the database, you can look into ActiveRecord::Base.connection.execute(sql)