Can I do a NATURAL JOIN is Slick v2? - sql

The title is self-explanatory. Using 2.0.0-M3, I'd like to avoid unnecessary verbosity is the form of explicitly naming the columns to be joined on, since they are appropriately named, and since NATURAL JOIN is part of the SQL standard. Not to mention, Wikipedia itself even says that "The natural join is arguably one of the most important operators since it is the relational counterpart of logical AND."
I think the foregoing ought to be clear enough, but just if not, read on. Suppose I want to know the supplier-name and part-number of each part. Assuming appropriate case classes not shown:
class Suppliers(tag: Tag) extends Table[Supplier](tag, "suppliers") {
def snum = column[String]("snum")
def sname = column[String]("sname")
def * = (snum, sname) <> (Supplier.tupled, Supplier.unapply _)
}
class Shipments(tag: Tag) extends Table[Shipment](tag, "shipments") {
def snum = column[String]("snum")
def pnum = column[String]("pnum")
def * = (snum, pnum) <> (Shipment.tupled, Shipment.unapply _)
}
val suppliers = TableQuery[Suppliers]
val shipments = TableQuery[Shipments]
Given that both tables have the snum column I want to join on, seems as if
( suppliers join shipments ).run
ought to return a Vector with my desired data, but I get a failed attempt at an INNER JOIN, failing (at run-time) since it's missing any join condition.
I know I can do
suppliers.flatMap( s => shipments filter (sp => sp.snum === s.snum) map (sp => (s.sname, sp.pnum)) )
but, even without the names of all the columns I omitted for clarity of this question, it's still quite a lot more typing (and proofreading) than simply
suppliers join shipments
or, for that matter
SELECT * FROM suppliers NATURAL JOIN shipments;
If the Scala code is messier than the SQL code, then I really start questioning things. Is there no way simply to do a natural join in Slick?

Currently not supported by Slick. Please submit a ticket or pull request.
To improve readability of query code, you can put your join conditions into re-usable values. Or you can put the whole join in a function or method extension of Query[Suppliers,Supplier].
Alternatively you could look at the AutoJoin pattern (which basically makes your join conditions implicit) described here http://slick.typesafe.com/docs/#20130612_slick_vs_orm_scaladays_2013 and implemented here https://github.com/cvogt/play-slick/blob/scaladays2013/samples/computer-database/app/util/autojoin.scala

Related

Many to many query joins in aqueduct

I have A -> AB <- B many to many relationship between 2 ManagedObjects (A and B), where AB is the junction table.
When querying A from db, how do i join B values to AB joint objects?
Query<A> query = await Query<A>(context)
..join(set: (a) => a.ab);
It gives me a list of A objects which contains AB joint objects, but AB objects doesn't include full B objects, but only b.id (not other fields from class B).
Cheers
When you call join, a new Query<T> is created and returned from that method, where T is the joined type. So if a.ab is of type AB, Query<A>.join returns a Query<AB> (it is linked to the original query internally).
Since you have a new Query<AB>, you can configure it like any other query, including initiating another join, adding sorting descriptors and where clauses.
There are some stylistic syntax choices to be made. You can condense this query into a one-liner:
final query = Query<A>(context)
..join(set: (a) => a.ab).join(object: (ab) => ab.b);
final results = await query.fetch();
This is OK if the query remains as-is, but as you add more criteria to a query, the difference between the dot operator and the cascade operator becomes harder to track. I often pull the join query into its own variable. (Note that you don't call any execution methods on the join query):
final query = Query<A>(context);
final join = query.join(set: (a) => a.ab)
..join(object: (ab) => ab.b);
final results = await query.fetch();

Union between optional and non-optional tables

I have two queries that select records where a union needs to be taken, one of which is a left join and one of which is a regular (i.e. inner) join.
Here's the left join case:
def regularAccountRecords = for {
(customer, account) <- customers joinLeft accounts on (_.accountId === _.accountId) // + some other special conditions
} yield (customer, account)
Here's the regular join case:
def specialAccountRecords = for {
(customer, account) <- customers join accounts on (_.accountId === _.accountId) // + some other special conditions
} yield (customer, account)
Now I want to take a union of the two record sets:
regularAccountRecords ++ specialAccountRecords
Obviously this doesn't work because in the regular join case it returns Query[(Customer, Account),...] and in the left join case it returns Query[(Customer, Rep[Option[Account]]),...] and this results in a Type Mismatch error.
Now, If this were a regular column type (e.g. Rep[String]) I could convert it to an optional via the ? operator (i.e. record.?) and get Rep[Option[String]] but using it on a table (i.e. the accounts table) causes:
Error:(62, 85) value ? is not a member of com.test.Account
How do I work around this issue and do the union properly?
Okay, looks like this is what the '?' projection is for but I didn't realize it because I disabled the optionEnabled option in the Codegen. Here's what your codegen extension is supposed to look like:
class MyCodegen extends SourceCodeGenerator(inputModel) {
override def TableClass = new TableClassDef {
override def optionEnabled = true
}
}
Alternatively, you can use implicit classes to tack this thing onto the generated TableClass yourself. Here is how that would look:
implicit class AccountExtensions(account:Account) {
def ? = (Rep.Some(account.id), account.name).shaped.<>({r=>r._1.map(_=> Account.tupled((r._2, r._1.get)))}, (_:Any) => throw new Exception("Inserting into ? projection not supported."))
}
NOTE: be sure to check the field ordering, depending on how this
projection is done, the union query might put the ID field in the wrong
place in the output, use
println(query.result.statements.headOption) to debug the output
SQL to be sure.
Once you do that, you will be able to use account.? in the yield statement:
def specialAccountRecords = for {
(customer, account) <- customers join accounts on (_.accountId === _.accountId)
} yield (customer, account.?)
...and then you will be able to unionize the tables correctly
regularAccountRecords ++ specialAccountRecords
I really wish the Slick people would put a note on how the '?' projection is useful in the documentation beyond the vague statement 'useful for outer joins'.

Creating a Rails 3 scope that joins to a subquery

First off, I'm a Ruby/Rails newbie, so I apologize if this question is basic.
I've got a DB that (among other things) looks like this:
organizations { id, name, current_survey_id }
surveys { id, organization_id }
responses { id, survey_id, question_response_integer }
I'm trying to create a scope method that adds the average of the current survey answers to a passed-in Organization relation. In other words, the scope that's getting passed into the method would generate SQL that looks like more-or-less like this:
select * from organizations
And I'd like the scope, after it gets processed by my lambda, to generate SQL that looks like this:
select o.id, o.name, cs.average_responses
from organizations o join
(select r.id, avg(r.question_response_integer) as average_responses
from responses r
group by r.id) cs on cs.id = o.current_survey_id
The best I've got is something like this:
current_survey_average: lambda do |scope, sort_direction|
average_answers = Responses.
select("survey_id, avg(question_response_integer) as average_responses").
group("survey_id")
scope.joins(average_answers).order("average_responses #{sort_direction}")
end
That's mostly just a stab in the dark - among other things, it doesn't specify how the scope could be expected to join to average_answers - but I haven't been able to find any documentation about how to do that sort of join, and I'm running out of things to try.
Any suggestions?
EDIT: Thanks to Sean Hill for the answer. Just to have it on record, here's the code I ended up going with:
current_survey_average: lambda do |scope, sort_direction|
scope_table = scope.arel.froms.first.name
query = <<-QUERY
inner join (
select r.survey_id, avg(r.question_response_integer) as average_responses
from responses r
group by r.survey_id
) cs
on cs.survey_id = #{scope_table}.current_survey_id
QUERY
scope.
joins(query).
order("cs.average_responses #{sort_direction}")
end
That said, I can see the benefit of putting the averaged_answers scope directly onto the Responses class - so I may end up doing that.
I have not been able to test this, but I think the following would work, either as-is or with a little tweaking.
class Response < ActiveRecord::Base
scope :averaged, -> { select('r.id, avg(r.question_response_integer) as average_responses').group('r.id') }
scope :current_survey_average, ->(incoming_scope, sort_direction) do
scope_table = incoming_scope.arel.froms.first.name
query = <<-QUERY
INNER JOIN ( #{Arel.sql(averaged.to_sql)} ) cs
ON cs.id = #{scope_table}.current_survey_id
QUERY
incoming_scope.joins(query).order("average_responses #{sort_direction}")
end
end
So what I've done here is that I have split out the inner query into another scope called averaged. Since you do not know which table the incoming scope in current_survey_average is coming from, I got the scope table name via scope.arel.froms.first.name. Then I created a query string that uses the averaged scope and joined it using the scope_table variable. The rest is pretty self-explanatory.
If you do know that the incoming scope will always be from the organizations table, then you don't need the extra scope_table variable. You can just hardcode it into the join query string.
I would make one suggestion. If you do not have control over sort_direction, then I would not directly input that into the order string.

ActiveRecord/ARel modify `ON` in a left out join from includes

I'm wondering if it's possible to specify additional JOIN ON criteria using ActiveRecord includes?
ex: I'm fetching a record and including an association with some conditions
record.includes(:other_record).where(:other_record => {:something => :another})
This gives me (roughly):
select * from records
left outer join other_records on other_records.records_id = records.id
where other_records.something = another
Does anyone know how I can specify an extra join condition so I could achieve something like.
select * from records
left outer join other_records on other_records.records_id = records.id
and other_records.some_date > now()
where other_records.something = another
I want my includes to pull in the other_records but I need additional criteria in my join. Anything using ARel would also be great, I've just never known how to plug a left outer join from ARel into and ActiveRecord::Relation
I can get you close with ARel. NOTE: My code ends up calling two queries behind the scenes, which I'll explain.
I had to work out LEFT JOINs in ARel myself, recently. Best thing you can do when playing with ARel is to fire up a Rails console or IRB session and run the #to_sql method on your ARel objects to see what kind of SQL they represent. Do it early and often!
Here's your SQL, touched up a bit for consistency:
SELECT *
FROM records
LEFT OUTER JOIN other_records ON other_records.record_id = records.id
AND other_records.some_date > now()
WHERE other_records.something = 'another'
I'll assume your records model is Record and other_records is OtherRecord. Translated to ARel and ActiveRecord:
class Record < ActiveRecord::Base
# Named scope that LEFT JOINs OtherRecords with some_date in the future
def left_join_other_in_future
# Handy ARel aliases
records = Record.arel_table
other = OtherRecord.arel_table
# Join criteria
record_owns_other = other[:record_id].eq(records[:id])
other_in_future = other[:some_date].gt(Time.now)
# ARel's #join method lets you specify the join node type. Defaults to InnerJoin.
# The #join_sources method extracts the ARel join node. You can plug that node
# into ActiveRecord's #joins method. If you call #to_sql on the join node,
# you'll get 'LEFT OUTER JOIN other_records ...'
left_join_other = records.join(other, Arel::Nodes::OuterJoin).
on(record_owns_other.and(other_in_future)).
join_sources
# Pull it together back in regular ActiveRecord and eager-load OtherRecords.
joins(left_join_other).includes(:other_records)
end
end
# MEANWHILE...
# Elsewhere in your app
Record.left_join_other_in_future.where(other_records: {something: 'another'})
I bottled the join in a named scope so you don't need to have all that ARel mixed in with your application logic.
My ARel ends up calling two queries behind the scenes: the first fetches Records using your JOIN and WHERE criteria, the second fetches all OtherRecords "WHERE other_records.record_id IN (...)" using a big list of all the Record IDs from the first query.
Record.includes() definitely gives you the LEFT JOIN you want, but I don't know of a way to inject your own criteria into the join. You could use Record.joins() instead of ARel if you wanted to write the SQL yourself:
Record.joins('LEFT OUTER JOIN other_records' +
' ON other_records.record_id = records.id' +
' AND other_records.some_date > NOW()')
I really, really prefer to let the database adapter write my SQL, so I used ARel.
If it were me, I'd consider putting the additional join criterion in the WHERE clause. I assume you're asking because putting the additional criterion on the join makes the query's EXPLAIN look better or because you don't want to deal with NULLs in the other_records.some_date column when there aren't any related other_records.
If you have a simple (equality) extra join condition it could simply be
record.includes(:other_record).where(:other_record => {:something => :another,
:some_date => Time.now})
But if you need the greater than comparison the following should do it.
record.includes(:other_record).where([
'other_records.something = ? and other_records.some_date > ?',
another, Time.now])
Hope that helps.

How do I write a named scope to filter by all of an array passed in, and not just by matching one element (using IN)

I have two models, Project and Category, which have a many-to-many relationship between them. The Project model is very simple:
class Project < ActiveRecord::Base
has_and_belongs_to_many :categories
scope :in_categories, lambda { |categories|
joins(:categories).
where("categories.id in (?)", categories.collect(&:to_i))
}
end
The :in_categories scope takes an array of Category IDs (as strings), so using this scope I can get back every project that belongs to at least one of the categories passed in.
But what I'm actually trying to do is filter (a better name would be :has_categories). I want to just get the projects that belong to all of the categories passed in. So if I pass in ["1", "3", "4"] I only want to get the projects that belong to all of the categories.
There are two common solutions in SQL to do what you're describing.
Self-join:
SELECT ...
FROM Projects p
JOIN Categories c1 ON c1.project_id = p.id
JOIN Categories c3 ON c3.project_id = p.id
JOIN Categories c4 ON c4.project_id = p.id
WHERE (c1.id, c3.id, c4.id) = (1, 3, 4);
Note I'm using syntax to compare tuples. This is equivalent to:
WHERE c1.id = 1 AND c3.id = 3 AND c4.id = 4;
In general, the self-join solution has very good performance if you have a covering index. Probably Categories.(project_id,id) would be the right index, but analyze the SQL with EXPLAIN to be sure.
The disadvantage of this method is that you need four joins if you're searching for projects that match four different categories. Five joins for five categories, etc.
Group-by:
SELECT ...
FROM Projects p
JOIN Categories cc ON c.project_id = p.id
WHERE c.id IN (1, 3, 4)
GROUP BY p.id
HAVING COUNT(*) = 3;
If you're using MySQL (I assume you are), most GROUP BY queries invoke a temp table and this kills performance.
I'll leave it as an exercise for you to adapt one of these SQL solutions to equivalent Rails ActiveRecord API.
It seems like in ActiveRecord you would do it like so:
scope :has_categories, lambda { |categories|
joins(:categories).
where("categories.id in (?)", categories.collect(&:to_i)).
group("projects.id HAVING COUNT(projects.id) = #{categories.count}")
}