outer joins models that are not associations - sql

I have SQL that I want to build with ActiveRecord. My problem is that I am stuck in a logic loop: I can't LEFT OUTER JOIN against a table that has not been joined yet, and I can't find an entry point into the chain of joins.
In ActiveRecord I am trying to do:
AdMsgs.joins("LEFT OUTER JOIN shows ON ad_msgs.user_id = shows.id OR ad_msgs.user_id = shows.b_id ")
.joins("LEFT OUTER JOIN m ON m.user_id = users.id OR m.m_id = shops.id OR m.m_id = shows.b_id")
.joins("LEFT OUTER JOIN users ON ad_msgs.to = users.email OR ad_msgs.user_id = users.id OR users.id = m.user_id")
.where("shows.id = ?", self.id)
.distinct("ad_msgs.id")
The query fails with an error saying it doesn't know what users is in the second join (probably because I haven't joined it yet), but I need to select the m records according to the users.
AdMsgs doesn't have an association with either of the tables.
Is there a way to full outer join these 3 tables and then select the relevant rows (or is there a better way)?

Use find_by_sql to implement such scenarios.
As a rule of thumb, if you can't express the joins like Blog.joins(articles: :comments), you are probably doing something bad, or you should use find_by_sql instead.
In a complex case, I write the query in SQL first to verify the logic involved. Oftentimes it's trivial to replace one complex query with 2 simple ones (using IN(*ids)).
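As a rough illustration of the "2 simple queries with IN" idea, here is a minimal sketch in Python using the standard sqlite3 module instead of ActiveRecord. The tables, columns, sample rows and the messages_for_users helper are hypothetical and only loosely modelled on the question; they are not the asker's real schema.

```python
import sqlite3

# Hypothetical minimal schema, only to illustrate the two-step pattern.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE ad_msgs (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users VALUES (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
    INSERT INTO ad_msgs VALUES (10, 1), (11, 2), (12, 3), (13, 1);
""")

def messages_for_users(conn, emails):
    """Step 1: collect ids with one simple query. Step 2: filter with IN (ids)."""
    if not emails:
        return []
    marks = ",".join("?" for _ in emails)
    user_ids = [row[0] for row in conn.execute(
        f"SELECT id FROM users WHERE email IN ({marks})", emails)]
    if not user_ids:
        return []  # avoid generating an empty IN () list
    marks = ",".join("?" for _ in user_ids)
    rows = conn.execute(
        f"SELECT id FROM ad_msgs WHERE user_id IN ({marks}) ORDER BY id", user_ids)
    return [row[0] for row in rows]

result = messages_for_users(conn, ["a@example.com", "c@example.com"])
print(result)
```

Each step stays easy to read and to verify on its own, at the cost of one extra round trip to the database.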

Related

Access SQL subquery access field from parent

I have a SQL query that works in Access 2016:
SELECT
Count(*) AS total_tests,
Sum(IIf(score>=securing_threshold And score<mastering_threshold,1,0)) AS total_securing,
Sum(IIf(score>=mastering_threshold,1,0)) AS total_mastering,
total_securing/Count(*) AS percent_securing,
total_mastering/Count(*) AS percent_mastering,
(Count(*)-total_securing-total_mastering)/Count(*) AS percent_below,
subjects.subject,
students.year_entered,
IIf(Month(Date())<9,Year(Date())-students.year_entered+6,Year(Date())-students.year_entered+7) AS current_form,
groups.group
FROM
((subjects
INNER JOIN tests ON subjects.ID = tests.subject)
INNER JOIN (students
INNER JOIN test_results ON students.ID = test_results.student) ON tests.ID = test_results.test)
LEFT JOIN
(SELECT * FROM group_membership LEFT JOIN groups ON group_membership.group = groups.ID) As g
ON students.ID = g.student
GROUP BY subjects.subject, students.year_entered, groups.group;
However, I wish to filter out irrelevant groups before joining them to my table. The table groups has a column subject which is a foreign key.
When I try changing ON students.ID = g.student to ON students.ID = g.student And subjects.ID = g.subject I get the error 'JOIN expression not supported'.
Alternatively, when I try adding WHERE subjects.ID = groups.subject to the subquery, it asks me for the parameter value of subjects.ID, although it is a column in the parent query.
Googling reveals many similar errors but they were all resolved by changing the brackets. That didn't help.
Just in case the table relationships help:
Thank you.
EDIT: Sample database at https://www.dropbox.com/s/yh80oooem6gsni7/student%20tracker.ACCDB?dl=0
MS Access queries with many joins are difficult to edit in SQL alone because, unlike in other RDBMSs, parentheses pairings are required and these pairings must follow an order. Moreover, some pairings can even be nested. Hence, beginners are advised to build queries with many complex joins using the query design GUI in the MS Office application and let it build the SQL.
For a simple filter, you could filter subject inside the derived table g, but likely you want all subjects:
...
(SELECT * FROM group_membership
LEFT JOIN groups ON group_membership.group = groups.ID
WHERE groups.subject='Earth Science') As g
...
So for all subjects, consider rebuilding the query from scratch in the GUI so that it nearly mirrors your table relationships; the GUI auto-links the joins for you. Then drop the unneeded tables.
Usually you want to begin with the join table or set like groups and group_membership or tests and test_results. In fact, consider saving the g derived table as its own query.
Then add the distinct record primary source tables like students and subjects.
You may even need to play around with the order of the FROM and JOIN clauses to attain the desired results, and maybe even add the same table to the query more than once. Be careful when adding join tables like group_membership (two one-to-many links) to GROUP BY queries, as this leads to duplicate records being aggregated. So you may need to join aggregate queries by subject.
Unless you can post the content of all tables, it is difficult to help from here.
Your subquery g uses a LEFT JOIN, but there is an enforced 1:n relation between the two tables, so there will always be a matching group. Use an INNER JOIN instead.
With g.subject you are trying to join on a column that is on the right side of a left join; that cannot really work.
Also you shouldn't use SELECT * on a join of tables with identical column names. Include only the qualified column names that you need.
LEFT JOIN
(SELECT group_membership.student, groups.group, groups.subject
FROM group_membership INNER JOIN groups
ON group_membership.group = groups.ID) As g
ON (students.ID = g.student AND subjects.ID = g.subject)
I would call the columns in group_membership group_ID and student_ID to avoid confusion.
I don't have the database to test, but I would use subject table as subquery:
(SELECT * FROM subjects WHERE <filter out what you don't need>) AS Subj
Then INNER JOIN this new Subj Table in your query which would exclude irrelevant groups.
Also, I would never create a join in the WHERE clause (WHERE subjects.ID = groups.subject). What this does is create a cartesian product (a table with all the possible combinations of subjects.ID and groups.subject) and then filter out records to satisfy your join. When dealing with huge data it might take forever or crash.
Regarding the error "JOIN expression not supported": do the data types match in those fields?
I solved it by (a lot of trial and error and) taking the advice here to make queries in the GUI and join them. The end result is 4 queries deep! If the database were bigger, performance would be awful... but now that it's working I can tweak it eventually.
Thank you everybody!

How can I optimise this slow PostgreSQL query

I have tracked the cause of a slow API endpoint to an SQL query and cannot for the life of me optimise it.
The query is as follows:
SELECT "Views".*, "Show".*, "Episode".*, "User".* FROM "Views"
LEFT OUTER JOIN "Shows" AS "Show" ON "Show"."id" = "Views"."ShowId"
LEFT OUTER JOIN "Episodes" AS "Episode" ON "Episode"."id" = "Views"."EpisodeId"
LEFT OUTER JOIN "Users" AS "User" ON "User"."id" = "Views"."UserId"
WHERE "Show"."tvdb_id" IN (259063,82066,258823,265766,261742,82283,205281,182061,121361)
AND "User"."id"=29;
It takes between 3000-4200ms to complete. The result of an EXPLAIN ANALYZE on the query can be found here: http://explain.depesz.com/s/J9R
EDIT: I have also tried separating the IN() into ORs:
SELECT "Views".*, "Show".*, "Episode".*, "User".* FROM "Views"
LEFT OUTER JOIN "Shows" AS "Show" ON "Show"."id" = "Views"."ShowId"
LEFT OUTER JOIN "Episodes" AS "Episode" ON "Episode"."id" = "Views"."EpisodeId"
LEFT OUTER JOIN "Users" AS "User" ON "User"."id" = "Views"."UserId"
WHERE
"Show"."tvdb_id"=259063 OR
"Show"."tvdb_id"=82066 OR
"Show"."tvdb_id"=258823 OR
"Show"."tvdb_id"=265766 OR
"Show"."tvdb_id"=261742 OR
"Show"."tvdb_id"=82283 OR
"Show"."tvdb_id"=205281 OR
"Show"."tvdb_id"=182061 OR
"Show"."tvdb_id"=121361
AND "User"."id"=29;
I have tried creating indexes on the columns referenced in the LEFT OUTER JOIN but that resulted in marginal gains (if anything). What's interesting is I also tried reducing the WHERE x IN to a single WHERE x = y and this instantly improved the situation, responding instead in 200-300ms. So how would I go about optimizing the IN statement?
Thanks!
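Try creating a combined index on table "Views" covering the fields "ShowId", "EpisodeId" and "UserId" together (the identifiers are quoted because your query uses quoted mixed-case names, and PostgreSQL folds unquoted names to lowercase).
CREATE INDEX "Views_idx" ON "Views"
USING btree ("ShowId", "EpisodeId", "UserId");
PS: the explain link is wrong; it's for some other query.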
Best regards,
nele
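A side note on the OR version in the question: in SQL, AND binds more tightly than OR, so without parentheses the "User"."id"=29 condition applies only to the last tvdb_id comparison, and the query is no longer equivalent to the IN (...) form. The sketch below reproduces this on a tiny, made-up table using Python's sqlite3 module (SQLite rather than PostgreSQL, but the precedence rule is the same); the table, column names and rows are assumptions for illustration only.

```python
import sqlite3

# Made-up data: two shows, each viewed by user 29 and by user 7.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE views (id INTEGER PRIMARY KEY, tvdb_id INTEGER, user_id INTEGER);
    INSERT INTO views VALUES
        (1, 259063, 29),
        (2, 259063, 7),
        (3, 121361, 29),
        (4, 121361, 7);
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# IN (...) combined with AND: only rows of user 29.
in_form = ids(
    "SELECT id FROM views "
    "WHERE tvdb_id IN (259063, 121361) AND user_id = 29 ORDER BY id")

# Bare ORs: parsed as  a OR (b AND user_id = 29), so user 7 leaks in.
bare_or = ids(
    "SELECT id FROM views "
    "WHERE tvdb_id = 259063 OR tvdb_id = 121361 AND user_id = 29 ORDER BY id")

# Parenthesised ORs: equivalent to the IN form again.
grouped_or = ids(
    "SELECT id FROM views "
    "WHERE (tvdb_id = 259063 OR tvdb_id = 121361) AND user_id = 29 ORDER BY id")

print(in_form, bare_or, grouped_or)
```

If the OR form is kept for comparison, wrapping the OR chain in parentheses makes it return the same rows as the IN form.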

Writing this SQL in LINQ? (outer apply)

My apologies for my recent SQL/LINQ questions, but I need to know what this working SQL query would look like in LINQ.
select *
from CarePlan c
outer apply (select top 1 * from Referral r
where
r.CarePlanId = c.CarePlanId order by r.ReferralDate desc) x
left outer join Specialist s on s.SpecialistId = x.SpecialistId
left outer join [User] u on u.UserId = s.UserId
This basically retrieves a list of CarePlans with the newest Referral (if it exists), then joins the Specialist and User tables based on any found Referrals.
Thanks
Kind advice: focus on what you want to express in terms of your class model and LINQ, instead of trying to reproduce the SQL. If you do something like
context.CarePlans
.Select(cp => new { Plan = cp, FirstReferral = cp.Referrals.FirstOrDefault() })
(provided that it matches your context and ignoring ordering and other joins for clarity)
it would basically do what you want, but it may very well translate to an inline subquery rather than an OUTER APPLY, to the same effect. The execution plan probably won't differ very much.

SQL: Chaining Joins Efficiency

I have a query in my WordPress plugin like this:
SELECT users.*, U.`meta_value` AS first_name,M.`meta_value` AS last_name
FROM `nwp_users` AS users
LEFT JOIN `nwp_usermeta` U
ON users.`ID`=U.`user_id`
LEFT JOIN `nwp_usermeta` M
ON users.`ID`=M.`user_id`
LEFT JOIN `nwp_usermeta` C
ON users.`ID`=C.`user_id`
WHERE U.meta_key = 'first_name'
AND M.meta_key = 'last_name'
AND C.meta_key = 'nwp_capabilities'
ORDER BY users.`user_login` ASC
LIMIT 0,10
I'm new to using JOIN and I'm wondering how efficient it is to use so many JOIN in one query. Is it better to split it up into multiple queries?
The database schema can be found here.
JOIN usually isn't so bad if the keys are indexed. LEFT JOIN can be a performance hit and you should avoid it when you don't need it. The difference is that LEFT JOIN keeps every row from the left table even when there is no matching row in the joined table (the joined columns are then NULL), while a regular (inner) JOIN returns only the rows that match.
Post your table structure and we can give you a better query.
See this comment:
http://forums.mysql.com/read.php?24,205080,205274#msg-205274
For what it's worth, to find out what MySQL is doing and to see if you have indexed properly, always check the EXPLAIN plan. You do this by putting EXPLAIN before your query (literally add the word EXPLAIN before the query), then run it.
In your query, you have filters such as AND C.meta_key = 'nwp_capabilities', which means that the LEFT JOINs above can equally be written as INNER JOINs: if a LEFT JOIN finds no match (LEFT OUTER is intended to preserve the rows from the left side), the resulting row will always be filtered out by the WHERE clause.
So a more optimal query would be
SELECT users.*, U.`meta_value` AS first_name,M.`meta_value` AS last_name
FROM `nwp_users` AS users
JOIN `nwp_usermeta` U
ON users.`ID`=U.`user_id`
JOIN `nwp_usermeta` M
ON users.`ID`=M.`user_id`
JOIN `nwp_usermeta` C
ON users.`ID`=C.`user_id`
WHERE U.meta_key = 'first_name'
AND M.meta_key = 'last_name'
AND C.meta_key = 'nwp_capabilities'
ORDER BY users.`user_login` ASC
LIMIT 0,10
(note: "JOIN" (alone) = "INNER JOIN")
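To see why the WHERE filter turns the LEFT JOIN into an INNER JOIN, here is a small sketch using Python's sqlite3 module. The two tables are simplified, hypothetical stand-ins for nwp_users and nwp_usermeta, with invented rows; the same behaviour should hold in MySQL.

```python
import sqlite3

# Simplified stand-ins for the WordPress tables; 'bob' has no first_name row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, user_login TEXT);
    CREATE TABLE usermeta (user_id INTEGER, meta_key TEXT, meta_value TEXT);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO usermeta VALUES (1, 'first_name', 'Alice');
""")

# LEFT JOIN, filter in WHERE: bob's NULL meta_key fails the filter.
left_with_where = conn.execute("""
    SELECT u.user_login, m.meta_value
    FROM users u
    LEFT JOIN usermeta m ON u.id = m.user_id
    WHERE m.meta_key = 'first_name'
    ORDER BY u.user_login
""").fetchall()

# INNER JOIN with the same filter: identical result.
inner = conn.execute("""
    SELECT u.user_login, m.meta_value
    FROM users u
    JOIN usermeta m ON u.id = m.user_id
    WHERE m.meta_key = 'first_name'
    ORDER BY u.user_login
""").fetchall()

# LEFT JOIN with the filter moved into ON: bob is kept with a NULL value.
left_filter_in_on = conn.execute("""
    SELECT u.user_login, m.meta_value
    FROM users u
    LEFT JOIN usermeta m ON u.id = m.user_id AND m.meta_key = 'first_name'
    ORDER BY u.user_login
""").fetchall()

print(left_with_where, inner, left_filter_in_on)
```

If users without a given meta key should still appear in the result, the filter belongs in the ON clause, as in the third query; otherwise the INNER JOIN form states the intent more directly.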
Try running EXPLAIN on the query to see what is going on and whether your SELECT is optimized. If you haven't used EXPLAIN before, read some tutorials:
http://www.learn-mysql-tutorial.com/OptimizeQueries.cfm
http://www.databasejournal.com/features/mysql/article.php/1382791/Optimizing-MySQL-Queries-and-Indexes.htm

What's "wrong" with my DQL query?

I have the following SQL query:
select bank.*
from bank
join branch on branch.bank_id = bank.id
join account a on a.branch_id = branch.id
join import i on a.import_id = i.id
It returns exactly what I expect.
Now consider the following two DQL queries:
$q = Doctrine_Query::create()
->select('Bank.*')
->from('Bank')
->leftJoin('Branch')
->leftJoin('Account')
->leftJoin('Import');
-
$q = Doctrine_Query::create()
->select('Bank.*')
->from('Bank')
->innerJoin('Branch')
->innerJoin('Account')
->innerJoin('Import');
It would have been nice to have been able to use a "join()" method but, from the official Doctrine join documentation here, it says, "DQL supports two kinds of joins INNER JOINs and LEFT JOINs." For some reason that thoroughly escapes me, they chose not to support natural joins. Anyway, what that means is that the two queries above are my only options. Well, that's unfortunate because neither of them work.
The first query - the one with left joins - doesn't work because, of course, a left join and a natural join are two different things.
The second query doesn't work, either. Bafflingly, I get an error: "Unknown relation alias."
Why should Doctrine be able to figure out the alias for a LEFT JOIN but not for an INNER JOIN?
By the way, I realize INNER JOIN and JOIN are only nominally different but why implement the more specific one and not the canonical, natural one?
->select('b.*')
->from('Bank b')
->leftJoin('b.Branch h')
->select('b.*')
->from('Bank b')
->innerJoin('b.Branch h')
http://www.doctrine-project.org/documentation/manual/1_1/en/dql-doctrine-query-language:join-syntax