Rails 5 complex joins query - sql

I have a join query that works:
The goal is to retrieve candidates not associated with campaigns with id in id (Array) through campaign_addings
Candidate.joins("LEFT OUTER JOIN campaign_addings ON campaign_addings.candidate_id = candidates.id AND campaign_addings.campaign_id IN (#{id.join(',')})").where('campaign_addings.candidate_id IS NULL')
But well.. This query is really ugly, I would like to recreate this behavior with pure ActiveRecord DSL and no SQL, I have this for now:
Candidate.left_outer_joins(:campaign_addings).where(campaign_addings: { campaign_id: id, candidate_id: nil })
The good query generates:
SELECT COUNT(*) FROM "candidates" LEFT OUTER JOIN campaign_addings ON campaign_addings.candidate_id = candidates.id AND campaign_addings.campaign_id IN (4) WHERE (campaign_addings.candidate_id IS NULL)
And the bad one:
SELECT COUNT(DISTINCT "candidates"."id") FROM "candidates" LEFT OUTER JOIN "campaign_addings" ON "campaign_addings"."candidate_id" = "candidates"."id" WHERE "campaign_addings"."campaign_id" = 4 AND "campaign_addings"."candidate_id" IS NULL
You can see this is really close, but I think I a missing something, can't find the way to join something like the first query.
Edit:
Plus, there is a SQL Injection vulnerability

Related

outer joins models that are not associations

I have the following SQL I want to create with activerecord. My problem is that I am stuck in a logic loop where I can't LEFT OUTER JOIN a table which has yet to be joined, and I can't find my entry point to the join fiasco
in activerecord I am trying to do
AdMsgs.joins("LEFT OUTER JOIN shows ON ad_msgs.user_id = shows.id OR ad_msgs.user_id = shows.b_id ")
.joins("LEFT OUTER JOIN m ON m.user_id = users.id OR m.m_id = shops.id OR m.m_id = shows.b_id")
.joins("LEFT OUTER JOIN users ON ad_msgs.to = users.email OR ad_msgs.user_id = users.id OR users.id = m.user_id")
.where("shows.id = ?", self.id)
.distinct("ad_msgs.id")
the query outputs an error saying it doesn't know what users is on the second join (probably since I haven't joined it yet) but I need to select the m records according the the users
AdMsgs doesn't have an association with neither of the tables.
Is there a way to full outer join these 3 tables and then select the ones relevant (or any better ways?)
use find_by_sql to implement such scenarios.
otherwise as a rule of thumb, if you can't use joins like this Blog.joins(articles: :comments) you are probably doing something bad or use find_by_sql instead.
in a complex case, i'm writing my query in SQL first to verify the logic involved. often times it's trivial to replace one complex query with 2 simple ones (using IN(*ids)).

SQL - Conditional Left Join Count Performance is Slow

I am using SQL Server 2012.
I have three tables. Builders, Addresses and BuilderAddresses.
I have the following query which is used to give me my total count of records during paging:
SELECT COUNT(*) FROM Builders
LEFT JOIN Addresses ON Addresses.AddressId IN
(SELECT AddressId FROM BuilderAddresses WHERE BuilderId = Builders.BuilderId AND IsPrimary = 1)
WHERE Builders.[Email] LIKE '%TEST'%
ORDER BY Builders.[Name]
This query is particularly slow when records in the table approach 100k+. Does any one have any suggestions on how to get this query to execute faster??
On a table with 120K records, it take 452ms to get the count. When it comes to returning the records used in the paging, say 100 rows, it takes 11ms. I would really like to improve this if I can.
If I need to add greater detail, please let me know and I will edit the question.
The ORDER BY is not necessary for the COUNT, and you can remove that IN validation by joining with BuilderAdresses directly.
Try something like this:
SELECT COUNT(*)
FROM Builders b
LEFT JOIN BuilderAddresses ba ON ba.BuilderId = b.BuilderId AND isPrimary = 1
LEFT JOIN Addresses a ON a.AddressId = ba.AddressId
WHERE Builders.[Email] LIKE '%TEST' %
The problem is most likely to be with using IN as part of the join predicate. What it looks like you need to do is first join the junction table BuilderAddresses and then join Addresses, so something like this
SELECT COUNT(*) FROM Builders
JOIN BuilderAddresses ON BuilderAddresses.BuilderId = Builders.BuilderId AND isPrimary = 1
JOIN Addresses ON Addresses.AddressId = BuilderAddresses.AddressId
WHERE Builders.[Email] LIKE '%TEST%'
ORDER BY Builders.[Name]

Writing this SQL in LINQ? (outer apply)

My apologies for my recent SQL/Linq questions, but i need to know what this working SQL query would look like in LINQ?
select *
from CarePlan c
outer apply (select top 1 * from Referral r
where
r.CarePlanId = c.CarePlanId order by r.ReferralDate desc) x
left outer join Specialist s on s.SpecialistId = x.SpecialistId
left outer join [User] u on u.UserId = s.UserId
This basically retrieves a list of Careplans with the newest Referral (if it exists), then joins the Specialist and User table based on any found Referrals
Thanks
Kind advice: target on what you want to express in the environment of your class model and LINQ, in stead of trying to reproduce SQL. If you do something like
context.CarePlans
.Select(cp => new { Plan = cp, FirstReferral = cp.Referrals.FirstOrDefault() }
(provided that it matches your context and ignoring ordering and other joins for clarity)
It would basically do what you want, but it may very well translate to an inline subquery, rather than an OUTER APPLY. To the same effect. And the execution plan probably won't differ very much.

SQL: Chaining Joins Efficiency

I have a query in my WordPress plugin like this:
SELECT users.*, U.`meta_value` AS first_name,M.`meta_value` AS last_name
FROM `nwp_users` AS users
LEFT JOIN `nwp_usermeta` U
ON users.`ID`=U.`user_id`
LEFT JOIN `nwp_usermeta` M
ON users.`ID`=M.`user_id`
LEFT JOIN `nwp_usermeta` C
ON users.`ID`=C.`user_id`
WHERE U.meta_key = 'first_name'
AND M.meta_key = 'last_name'
AND C.meta_key = 'nwp_capabilities'
ORDER BY users.`user_login` ASC
LIMIT 0,10
I'm new to using JOIN and I'm wondering how efficient it is to use so many JOIN in one query. Is it better to split it up into multiple queries?
The database schema can be found here.
JOIN usually isn't so bad if the keys are indexed. LEFT JOIN is almost always a performance hit and you should avoid it if possible. The difference is that LEFT JOIN will join all rows in the joined table even if the column you're joining is NULL. While a regular (straight) JOIN just joins the rows that match.
Post your table structure and we can give you a better query.
See this comment:
http://forums.mysql.com/read.php?24,205080,205274#msg-205274
For what it's worth, to find out what MySQL is doing and to see if you have indexed properly, always check the EXPLAIN plan. You do this by putting EXPLAIN before your query (literally add the word EXPLAIN before the query), then run it.
In your query, you have a filter AND C.meta_key = 'nwp_capabilities' which means that all the LEFT JOINs above it can be equally written as INNER JOINs. Because if the LEFT JOINS fail (LEFT OUTER is intended to preserve the results from the left side), the result will 100% be filtered out by the WHERE clause.
So a more optimal query would be
SELECT users.*, U.`meta_value` AS first_name,M.`meta_value` AS last_name
FROM `nwp_users` AS users
JOIN `nwp_usermeta` U
ON users.`ID`=U.`user_id`
JOIN `nwp_usermeta` M
ON users.`ID`=M.`user_id`
JOIN `nwp_usermeta` C
ON users.`ID`=C.`user_id`
WHERE U.meta_key = 'first_name'
AND M.meta_key = 'last_name'
AND C.meta_key = 'nwp_capabilities'
ORDER BY users.`user_login` ASC
LIMIT 0,10
(note: "JOIN" (alone) = "INNER JOIN")
Try explaining the query to see what is going on and if your select if optimized. If you haven't used explain before read some tutorials:
http://www.learn-mysql-tutorial.com/OptimizeQueries.cfm
http://www.databasejournal.com/features/mysql/article.php/1382791/Optimizing-MySQL-Queries-and-Indexes.htm

How do I remove a nested select from this SQL statement

I have the following SQL:
SELECT * FROM Name
INNER JOIN ( SELECT 2 AS item, NameInAddress.NameID as itemID, NameInAddress.AddressID
FROM NameInAddress
INNER JOIN Address ON Address.AddressID = NameInAddress.AddressID
WHERE (Address.Country != 'UK')
) AS Items ON (Items.itemID = Name .Name ID)
I have been asked to remove the nested select and use INNER JOINS instead, as it will improve performance, but I'm struggling.
Using SQL Server 2008
Can anyone help?
Thanks!
Your query is not correct as you're using Items.itemID while it's not in the subselect
I guess this is what you meant:
SELECT Name.*
FROM Name
INNER JOIN NameInAddress
ON Name.NameID = NameInAddress.NameID
INNER JOIN Address
ON Address.AddressID = NameInAddress.AddressID
WHERE (Address.Country != 'UK')
EDIT: The exact translation of your query would start with a SELECT Name.*, 2 as Item, NameInAddress.NameID, NameInAddress.AddressID though
It is one of those long-lived myths that nested selects are slower than joins. It depends completely on what the nested select says. SQL is just a declarative language to tell what you want done, the database will transform it into completely different things. Both MSSQL and Oracle (and I suspect other major engines as well) are perfectly able to transform correlated subqueries and nested views into joins if it is beneficial (unless you do really complex things which would be very hard, if possible, to describe with normal joins.
SELECT 2 AS Item, *
FROM Name
INNER JOIN NameInAddress
ON Name.NameID = NameInAddress.NameID
INNER JOIN Address
ON Address.AddressID = NameInAddress.AddressID
WHERE Address.Country != 'UK'
PS: Don't use "*". This will increase performance too. :)