How to combine inner join and left outer join in Rails - sql

I have two models, Invoice and Payment. The relationship between them is: Invoice has_many payments.
I'm using the following left outer join to return all invoices that have not been paid at all:
result1 = Invoice.includes(:payments).where(:payments => { :id => nil })
I'm also interested in all invoices that have been partially paid. To return those I use an inner join:
result2 = Invoice.joins(:payments).group("invoices.id").having("sum(payments.amount) < invoices.amount")
I would now like to combine the two results, i.e. I want all Invoices that have either not been paid, or not been fully paid. I know that I could just do result = result1 + result2. However, this doesn't return an ActiveRecord object. Is there a way to combine those two queries in a single query?
I use Rails 4.1 and PostgreSQL

I believe you're correct that you can't get ActiveRecord to generate anything other than an inner join without writing it yourself. But I wouldn't use includes in this context, for while it does happen to cause ActiveRecord to generate a different join, that is an "implementation detail" -- its fundamental purpose is to load the associated models, which is not necessarily what you want.
In your proposed solution, I don't understand why you'd group by both invoices.id and payments.id -- that seems to defeat the purpose of grouping.
You could do something like the following, although I can't say that this seems much more Rails-ish.
Invoice.joins("LEFT JOIN payments ON payments.transfer_id = invoices.id")
.select("invoices.id, SUM(payments.amount) AS total_paid")
.group("invoices.id")
.having("SUM(payments.amount) IS NULL OR SUM(payments.amount) < invoices.amount")
This will return a list of Invoice objects with only the id field and total_paid field set. If you need other fields available, add them to both the select statement and the group statement.
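To check what the query above selects, here is a minimal sketch that runs the same SQL against an in-memory SQLite database from Python (so not Rails or PostgreSQL). The sample rows are made up; payments.transfer_id is the foreign key name used in the answer. Invoice 1 has no payments, invoice 2 is partially paid, and invoice 3 is fully paid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount INTEGER);
    CREATE TABLE payments (id INTEGER PRIMARY KEY, transfer_id INTEGER, amount INTEGER);
    INSERT INTO invoices VALUES (1, 100), (2, 100), (3, 100);
    INSERT INTO payments VALUES (1, 2, 30), (2, 2, 20), (3, 3, 60), (4, 3, 40);
""")

# One group per invoice; the LEFT JOIN keeps invoices without payments,
# whose SUM is then NULL.
open_invoices = conn.execute("""
    SELECT invoices.id, SUM(payments.amount) AS total_paid
    FROM invoices
    LEFT JOIN payments ON payments.transfer_id = invoices.id
    GROUP BY invoices.id
    HAVING SUM(payments.amount) IS NULL
        OR SUM(payments.amount) < invoices.amount
    ORDER BY invoices.id
""").fetchall()

print(open_invoices)  # [(1, None), (2, 50)]
```

Only the unpaid and the partially paid invoice come back; the fully paid invoice 3 is filtered out by the HAVING clause.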

I ended up doing just a LEFT OUTER JOIN. However, Rails doesn't seem to support group and having with the LEFT OUTER JOIN generated by includes(). Therefore, I had to construct the join manually, which actually doesn't feel like "the Rails way".
My query:
Invoice.joins("LEFT JOIN payments ON payments.transfer_id = invoices.id").group("invoices.id, payments.id").having("((payments.id IS NULL) OR (sum(payments.amount) < invoices.amount))")
Update: This doesn't work as expected. Please see the accepted answer.
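A small sketch of why grouping by payments.id as well does not work, assuming made-up rows in an in-memory SQLite database driven from Python: every payment becomes its own group, so SUM(payments.amount) is only ever that single payment's amount.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount INTEGER);
    CREATE TABLE payments (id INTEGER PRIMARY KEY, transfer_id INTEGER, amount INTEGER);
    INSERT INTO invoices VALUES (1, 100), (2, 100);
    -- invoice 1 is unpaid; invoice 2 is fully paid in two instalments
    INSERT INTO payments VALUES (1, 2, 60), (2, 2, 40);
""")

# Grouping by payments.id too means each group holds at most one payment.
ids = [row[0] for row in conn.execute("""
    SELECT invoices.id
    FROM invoices
    LEFT JOIN payments ON payments.transfer_id = invoices.id
    GROUP BY invoices.id, payments.id
    HAVING (payments.id IS NULL) OR (SUM(payments.amount) < invoices.amount)
    ORDER BY invoices.id
""")]

print(ids)  # [1, 2, 2]
```

Invoice 2 is fully paid, yet it is returned twice, because 60 and 40 are each compared with 100 on their own.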

Related

PostgreSQL: SQL JOIN with "Lookahead" Condition in ON or WHERE Clause

I am not sure how best to describe the problem I have, but it feels very much like an SQL query need for a lookahead condition such as those in a regular expression :). Pardon my verbosity as I try to find a way to express this problem.
I have a relatively complex schema of medical services, patient info, patient payment types (including insurances, workers comp info, etc), and medical sales commission earnings. Naturally there are sales reporting tools. What follows is a very oversimplified version of a query involving varying JOIN conditional cases. Most of this query is just context for the clause my question focuses on, the 2nd-to-last WHERE condition (which defines JOIN conditions):
SELECT vendor_services.*, patients.*, payments.*, vendor_products.*, products.*, vendor_product_commissions.*, vendor_commissions.*, commission_earners.*, users.*
FROM vendor_services
JOIN patients ON patients.vendor_id = vendor_services.vendor_id
JOIN payments ON payments.patient_id = patients.id
JOIN payment_types ON payment_types.id = payments.payment_type_id
JOIN vendor_products ON vendor_products.id = vendor_services.vendor_product_id
JOIN products ON products.id = vendor_products.product_id
JOIN vendor_product_commissions ON vendor_product_commissions.vendor_product_id = vendor_products.id
JOIN vendor_commissions ON vendor_commissions.id = vendor_product_commissions.vendor_commissions_id
LEFT JOIN commission_earners ON commission_earners.id = vendor_commissions.commission_earners_id
JOIN users ON commission_earners.user_id = users.id
WHERE
vendor_services.state != 'In Progress'
AND
vendor_services.date BETWEEN :datetime_1 AND :datetime_2
AND
vendor_commissions.start_date > :datetime_1
AND
vendor_commissions.end_date < :datetime_2
AND
vendor_product_commissions.payment_type = payment_types.type
AND
payments.transaction_type = 'Paid'
GROUP BY
....
Again, this is very oversimplified: the SELECT clause is far more complex, as are the GROUP BY and ORDER clauses, performing CASE statements and aggregate calculations, etc. I have left out many other tables which represent other systems within the overall application, and focused just on the data and clauses that are relevant. My question is in regards to a needed change to this particular WHERE condition regarding the following JOIN:
WHERE ... AND vendor_product_commissions.payment_type = payment_types.type
A new possible vendor_product_commissions.payment_type value has been introduced that is not a member of the payment_types.type values. With the SQL query exactly as is, it no longer selects rows in most cases, as much of the LEFT table will be using this new value. When I add an OR clause, duplicate rows are selected when only one row should be selected:
WHERE ... AND vendor_product_commissions.payment_type = payment_types.type OR vendor_product_commissions.payment_type = 'DEFAULTVALUE'
What I need is to JOIN only on the row where vendor_product_commissions.payment_type = payment_types.type, unless that produces NULL, in which case I need to perform the JOIN on vendor_product_commissions.payment_type = 'DEFAULTVALUE'.
I can do this programmatically with the ORM code surrounding this query, but that is very inefficient for a very large reporting system (essentially, query first for the specific type, then if none returned, query again for the "default" type).
I don't believe this feature exists in PostgreSQL, but that's why I am describing it as a "lookahead JOIN" problem - I need to have a sort of CASE statement that, if the first JOIN condition produces a NULL relation, performs the subsequent JOIN (OR) condition to match against this newly introduced value ('DEFAULTVALUE'). Can this be done in raw SQL? Or do I need to break this whole query apart and perform the selection of services and related data, and then programmatically / iteratively relate it (via ORM/application language code) to the sales commission data? I have a strong hunch that the query can be modified to do this, but without being knowledgeable of a particular label or term for this problem, I am having a hard time searching for a possible SQL-based solution.
This is for a Ruby on Rails 4 application, using ActiveRecord, though the SQL JOIN statements are all in plaintext / strings since AR doesn't provide methods for LEFT JOIN (again, there are more and more types of JOIN statements than those listed above). I am not sure if Rails is relevant to my question, but I figured I would mention it.
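One way such a "prefer the specific match, otherwise fall back to the default" join might be written is to allow the 'DEFAULTVALUE' row only when no specific row exists, using NOT EXISTS inside the join condition. The sketch below is not the schema from the question: payments and commissions are hypothetical, heavily reduced tables, and it runs on in-memory SQLite from Python, although the same SQL shape should carry over to PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payments (id INTEGER PRIMARY KEY, product_id INTEGER, payment_type TEXT);
    CREATE TABLE commissions (product_id INTEGER, payment_type TEXT, rate_pct INTEGER);
    INSERT INTO payments VALUES (1, 10, 'Insurance'), (2, 10, 'Cash'), (3, 20, 'Cash');
    INSERT INTO commissions VALUES
        (10, 'Insurance', 15),
        (10, 'DEFAULTVALUE', 5),
        (20, 'DEFAULTVALUE', 7);
""")

rows = conn.execute("""
    SELECT p.id, c.payment_type, c.rate_pct
    FROM payments p
    JOIN commissions c
      ON c.product_id = p.product_id
     AND (
          c.payment_type = p.payment_type
          -- the default row qualifies only if there is no specific row
          OR (
              c.payment_type = 'DEFAULTVALUE'
              AND NOT EXISTS (
                  SELECT 1
                  FROM commissions exact_match
                  WHERE exact_match.product_id = p.product_id
                    AND exact_match.payment_type = p.payment_type
              )
          )
     )
    ORDER BY p.id
""").fetchall()

print(rows)
# [(1, 'Insurance', 15), (2, 'DEFAULTVALUE', 5), (3, 'DEFAULTVALUE', 7)]
```

Each payment joins to exactly one commission row, so the duplicates produced by a plain OR do not appear. Note also that AND binds tighter than OR in SQL, so any OR added to a long WHERE chain needs its own parentheses.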

outer joins models that are not associations

I have the following SQL I want to create with activerecord. My problem is that I am stuck in a logic loop where I can't LEFT OUTER JOIN a table which has yet to be joined, and I can't find my entry point to the join fiasco
in activerecord I am trying to do
AdMsgs.joins("LEFT OUTER JOIN shows ON ad_msgs.user_id = shows.id OR ad_msgs.user_id = shows.b_id ")
.joins("LEFT OUTER JOIN m ON m.user_id = users.id OR m.m_id = shops.id OR m.m_id = shows.b_id")
.joins("LEFT OUTER JOIN users ON ad_msgs.to = users.email OR ad_msgs.user_id = users.id OR users.id = m.user_id")
.where("shows.id = ?", self.id)
.distinct("ad_msgs.id")
the query outputs an error saying it doesn't know what users is on the second join (probably since I haven't joined it yet) but I need to select the m records according to the users
AdMsgs doesn't have an association with any of the tables.
Is there a way to full outer join these 3 tables and then select the ones relevant (or any better ways?)
Use find_by_sql to implement such scenarios.
Otherwise, as a rule of thumb, if you can't express the joins like this: Blog.joins(articles: :comments), you are probably either doing something wrong or should use find_by_sql instead.
In a complex case, I write my query in SQL first to verify the logic involved. Oftentimes it's trivial to replace one complex query with 2 simple ones (using IN(*ids)).

Combining data from multiple tables issue Oracle 11g

TABLE // FIELD
Customer // Company
Stock // Description
Manufact // Manu_Name
Items // Quantity, total_price
I am using Oracle 11g Application Express. I need to display a list of each stock ordered for EACH CUSTOMER. I need to display the Manufacturer, quantity ordered, and total price paid.
When I run this query within my SQL*PLUS command prompt, it endlessly displays results from the tables mentioned until I force-quit (ctrl+c) the application. This is incredibly frustrating - I've tried joining tables, using the EXISTS clause, I just don't know what the hell to do. Any insight would be wonderful - not looking for someone to simply solve this for me, more-so just guide me.
SELECT c.company, s.description, m.manu_name, i.quantity, i.total_price
FROM db1.customer c JOIN db1.orders o USING (customer_num), db1.stock s, db1.manufact m, db1.items i
WHERE o.order_num = i.order_num;
This causes a never-ending display of what seems like the same results, over, and over, and over.
Essentially, I need to display the required information for EACH ORDER of stock. However, I don't need the order_num in my output display of columns, so I thought I needed to use the order_num (in db1.orders o & db1.items i) to essentially tell Oracle, "For each order_num (an order can't exist without an order_num), display (results)..."
I am incredibly lost - I've tried outer joins, I've tried using an EXIST operator, I am just stumped and I feel like it's something easy that I'm overlooking.
EDIT: So, it seems I finally found it, after an enormous amount of pondering.
This is how I did it, in case anyone else runs into this issue:
SELECT c.company, s.description, m.manu_name, i.quantity, i.total_price
FROM db1.customer c JOIN db1.orders o USING (customer_num)
JOIN db1.items i USING (order_num)
JOIN db1.stock s USING (stock_num)
JOIN db1.manufact m ON m.manu_code = s.manu_code
ORDER BY c.company, s.description;
If you JOIN db1.manufact m USING (manu_code), you get an ambiguously defined column error from Oracle - this is because I already joined the other tables and that column was in one of them (It was the db1.stock table). You can still join them, but you have to use JOIN ON instead.
This displayed the results I needed. Thanks anyways, and cheers if this helped anybody out!
You've only provided two joins (one USING and one in the WHERE) between 5 tables - in this case, you will get the cartesian product of all other rows in all other tables, hence the large number of rows.
(Edit, by implication you need to join all tables together, whether by USING or JOIN)
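The row blow-up described above can be reproduced on a tiny scale. This is only a sketch on in-memory SQLite from Python, with made-up rows and a reduced set of tables (manufact is left out); the column names customer_num, order_num and stock_num are the ones used in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_num INTEGER, company TEXT);
    CREATE TABLE orders (order_num INTEGER, customer_num INTEGER);
    CREATE TABLE items (order_num INTEGER, stock_num INTEGER, quantity INTEGER);
    CREATE TABLE stock (stock_num INTEGER, description TEXT);
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (100, 1), (101, 2);
    INSERT INTO items VALUES (100, 7, 3), (101, 8, 1);
    INSERT INTO stock VALUES (7, 'bolt'), (8, 'nut'), (9, 'washer');
""")

# stock has no join condition, so every matching order row is repeated
# once per stock row.
unjoined = conn.execute("""
    SELECT COUNT(*)
    FROM customer c JOIN orders o USING (customer_num), stock s, items i
    WHERE o.order_num = i.order_num
""").fetchone()[0]

# Every table is tied to the previous one, so each ordered item appears once.
joined = conn.execute("""
    SELECT COUNT(*)
    FROM customer c
    JOIN orders o USING (customer_num)
    JOIN items i USING (order_num)
    JOIN stock s USING (stock_num)
""").fetchone()[0]

print(unjoined, joined)  # 6 2
```

With only three stock rows the result already triples; on real tables with thousands of rows in each unjoined table, the output looks endless.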
In order to use the USING join sugar, the same column must be present on the immediate lhs and rhs tables. For multiple joins, into a hierarchy, you may need to nest the USINGs like so:
SELECT c.company, s.description, m.manu_name, i.quantity, i.total_price
FROM customer c
JOIN orders o
JOIN stock s
JOIN items i
JOIN manufact m USING(manid)
USING(itemid)
USING (stockid)
USING (customer_num);
The WHERE join isn't needed since we already have the USING joins.
I've assumed some columns and relationships between your table in this fiddle here:
You can also drop the USING and use explicit JOIN syntax, which will allow you to avoid the nesting (this is also more portable across the ANSI world):
SELECT c.company, s.description, m.manu_name, i.quantity, i.total_price
FROM customer c
INNER JOIN orders o on c.customer_num = o.customer_num
INNER JOIN stock s on o.stockid = s.stockid
INNER JOIN items i on i.itemid = s.itemid
INNER JOIN manufact m on m.manid = i.manid;
Edit
As OP has demonstrated, no requirement to nest the USING joins, provided that the join ordering is sensible, and provided that the FK JOIN column isn't duplicated across multiple tables.
http://sqlfiddle.com/#!4/91ef6/9

Rails ActiveRecord where or clause

I'm looking to write an ActiveRecord query and this is what I have below. Unfortunately you can't use OR like this. What's the best way to execute? category_ids is an array of integers.
.where(:"categories.id" => category_ids).or.where(:"category_relationships.category_id" => category_ids)
One way is to revert to raw sql...
YourModel.where("categories.id IN (?) OR category_relationships.category_id IN (?)", category_ids, category_ids)
Keep the SQL out of it and use ARel, like this:
.where(Category.arel_table[:id].in(category_ids).
or(CategoryRelationship.arel_table[:category_id].in(category_ids)))
Assuming you want to return Categories, you need to OUTER JOIN category_relationships and put a OR condition on the combined table.
Category.includes(:category_relationships).where("categories.id IN (?) OR category_relationships.category_id IN (?)",category_ids,category_ids )
This query creates an outer join table by combining the columns of categories and category_relationships. Unlike an inner join (e.g. Category.joins(:category_relationships)), the outer join table also contains categories with no associated category_relationship. The conditions in the where clause are then applied to the outer join table to return the matching records.
An includes statement without conditions on the association usually makes two separate SQL queries to retrieve the records and their associations. However, when used with conditions on the associated table, it makes a single query that creates an outer join table and runs the conditions against it. This allows you to retrieve records with no association as well.
See this for a detailed explanation.
What you want to do is manually write the OR part of the query like this:
.where("categories.id in (#{category_ids.join(',')}) OR category_relationships.category_id in (#{category_ids.join(',')})")
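Interpolating the ids into the SQL string works for an array of integers, but bound parameters are the safer habit. The sketch below shows the same OR-over-an-outer-join condition with one placeholder per id, on in-memory SQLite from Python. The schema is an assumption for illustration only: in particular, parent_id as the column linking category_relationships back to categories is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY);
    -- parent_id is a hypothetical link back to categories.id
    CREATE TABLE category_relationships (
        id INTEGER PRIMARY KEY, parent_id INTEGER, category_id INTEGER);
    INSERT INTO categories VALUES (1), (2), (3), (4);
    INSERT INTO category_relationships VALUES (1, 1, 3), (2, 4, 9);
""")

category_ids = [2, 3]
# Only "?" markers are interpolated; the values themselves are bound.
placeholders = ", ".join("?" for _ in category_ids)
sql = f"""
    SELECT DISTINCT c.id
    FROM categories c
    LEFT JOIN category_relationships r ON r.parent_id = c.id
    WHERE c.id IN ({placeholders})
       OR r.category_id IN ({placeholders})
    ORDER BY c.id
"""
matching = [row[0] for row in conn.execute(sql, category_ids + category_ids)]

print(matching)  # [1, 2, 3]
```

Categories 2 and 3 have no relationship rows at all and are still returned because of the LEFT JOIN; category 1 is returned through its relationship to category 3.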

How to implement paging in NHibernate with a left join query

I have an NHibernate query that looks like this:
var query = Session.CreateQuery(@"
select o
from Order o
left join o.Products p
where
(o.CompanyId = :companyId) AND
(p.Status = :processing)
order by o.UpdatedOn desc")
.SetParameter("companyId", companyId)
.SetParameter("processing", Status.Processing)
.SetResultTransformer(Transformers.DistinctRootEntity);
var data = query.List<Order>();
I want to implement paging for this query, so I only return x rows instead of the entire result set.
I know about SetMaxResults() and SetFirstResult(), but because of the left join and DistinctRootEntity, that could return fewer than x Orders.
I tried "select distinct o" as well, but the SQL that is generated for that (using the SQL Server 2008 dialect) seems to ignore the distinct for pages after the first one (I think this is the problem).
What is the best way to accomplish this?
In these cases, it's best to do it in two queries instead of one:
1. Load a page of orders, without joins
2. Load those orders with their products, using the in operator
There's a slightly more complex example at http://ayende.com/Blog/archive/2010/01/16/eagerly-loading-entity-associations-efficiently-with-nhibernate.aspx
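The two-query approach can be sketched outside NHibernate as plain SQL. The example below uses in-memory SQLite from Python with made-up orders and products tables; load_page is a hypothetical helper, not an NHibernate API. The point is that LIMIT/OFFSET is applied to orders alone, so a page always holds page_size orders no matter how many products each has.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, updated_on TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, order_id INTEGER, name TEXT);
    INSERT INTO orders VALUES (1, '2024-01-01'), (2, '2024-01-02'), (3, '2024-01-03');
    INSERT INTO products VALUES (1, 1, 'a'), (2, 3, 'b'), (3, 3, 'c'), (4, 3, 'd');
""")

def load_page(page, page_size):
    """Return [(order_id, [product names])] for one zero-based page."""
    # Query 1: page over orders only, without joins.
    order_ids = [row[0] for row in conn.execute(
        "SELECT id FROM orders ORDER BY updated_on DESC LIMIT ? OFFSET ?",
        (page_size, page * page_size))]
    if not order_ids:
        return []
    # Query 2: fetch the products of just those orders with IN.
    placeholders = ", ".join("?" for _ in order_ids)
    products_by_order = {order_id: [] for order_id in order_ids}
    for order_id, name in conn.execute(
            f"SELECT order_id, name FROM products "
            f"WHERE order_id IN ({placeholders}) ORDER BY id", order_ids):
        products_by_order[order_id].append(name)
    return [(order_id, products_by_order[order_id]) for order_id in order_ids]

print(load_page(0, 2))  # [(3, ['b', 'c', 'd']), (2, [])]
print(load_page(1, 2))  # [(1, ['a'])]
```

A single joined query with LIMIT 2 would have returned two rows that both belong to order 3, which is the shortfall described in the question.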
Use SetResultTransformer(Transformers.AliasToBean()) and get the data that is not the entity.
The other solution is to change the query.
As far as I can see, you're returning Orders that have products which are processing.
So you could use an exists subquery. Check the NHibernate manual, section 13.11, Subqueries.
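As a rough illustration of the exists idea, here is the equivalent plain SQL on in-memory SQLite from Python; the tables, columns and rows are assumptions modelled loosely on the HQL in the question, not NHibernate code. Because the products are only tested inside the subquery, each order yields at most one row, and paging with LIMIT/OFFSET counts orders directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, company_id INTEGER, updated_on TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, order_id INTEGER, status TEXT);
    INSERT INTO orders VALUES
        (1, 1, '2024-01-01'), (2, 1, '2024-01-02'),
        (3, 1, '2024-01-03'), (4, 2, '2024-01-04');
    INSERT INTO products VALUES
        (1, 1, 'processing'), (2, 1, 'processing'),
        (3, 2, 'shipped'),
        (4, 3, 'processing'), (5, 3, 'shipped'),
        (6, 4, 'processing');
""")

# No join in the outer query, so no duplicate orders and no DISTINCT needed.
page = [row[0] for row in conn.execute("""
    SELECT o.id
    FROM orders o
    WHERE o.company_id = ?
      AND EXISTS (
          SELECT 1 FROM products p
          WHERE p.order_id = o.id AND p.status = ?
      )
    ORDER BY o.updated_on DESC
    LIMIT ? OFFSET ?
""", (1, "processing", 2, 0))]

print(page)  # [3, 1]
```

Order 1 has two processing products but appears once; order 2 has none and order 4 belongs to another company, so both are excluded.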