Weird LEFT OUTER JOIN with includes eager loading in Rails 3 - ruby-on-rails-3

I'm not sure whether this is an ActiveRecord bug or not.
Or is there a way to use .includes and explicitly disable the LEFT OUTER JOIN strategy?
Here are my cases:
Given the keyword is 'abc', without a dot:
Post.where(:name => "abc").includes(:author)
Two SQL queries are issued, as normal:
Post Load (0.8ms) SELECT `posts`.* FROM `posts` WHERE `posts`.`name` = 'abc'
Author Load (0.4ms) SELECT `authors`.* FROM `authors` WHERE `authors`.`id` IN (1)
Given the keyword is 'abc.', with a dot:
Post.where(:name => "abc.").includes(:author)
The SQL now uses the LEFT OUTER JOIN strategy, which is confusing:
SELECT `posts`.`id` AS t0_r0, `posts`.`name` AS t0_r1, `posts`.`author_id` AS t0_r2, `posts`.`created_at` AS t0_r3, `posts`.`updated_at` AS t0_r4, `authors`.`id` AS t1_r0, `authors`.`created_at` AS t1_r1, `authors`.`updated_at` AS t1_r2
FROM `posts` LEFT OUTER JOIN `authors` ON `authors`.`id` = `posts`.`author_id`
WHERE `posts`.`name` = 'abc.'
I know eager loading with includes is implemented with the LEFT OUTER JOIN strategy when there are conditions on the association, like
Post.includes(:author).where(:authors => {:name => 'zhougn' })
But in my test cases there is no such condition. Basically, both the multi-SQL strategy and the LEFT OUTER JOIN strategy give me the correct result, but when Posts and Authors are stored in different databases, the LEFT OUTER JOIN strategy will fail.

Yes, this is a bug. See Issue #950: "ActiveRecord query changing when a dot/period is in condition value". It appears to be fixed in Rails 4 by deprecating the feature that was getting hung up on this: there was no solution other than to do it differently and require the programmer to be explicit when they want a left outer join, as well as about which tables are referenced when using string conditions with includes.
My current dirty workaround (in Rails 3) is to encode and decode the dots like this:
.where("posts.name = REPLACE(?, '˙', '.')", somename.gsub(".", "˙"))
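A minimal sketch of why a dot inside the value can matter, assuming the relation chooses between the two strategies by scanning the generated WHERE text for anything that looks like `table.`. The regex and the helper names below are assumptions modelled on that kind of heuristic, not a verified copy of the Rails 3 source; Python is used only so the sketch runs stand-alone.

```python
import re

# Assumed heuristic: treat any identifier that is directly followed by a dot
# (optionally with one quoting character in between) as a table name.
TABLE_LIKE = re.compile(r"([a-zA-Z_][.\w]+).?\.")

def tables_in_string(sql):
    """Return the lower-cased, de-duplicated 'table-like' names found in sql."""
    seen = []
    for name in TABLE_LIKE.findall(sql):
        name = name.lower()
        if name not in seen:
            seen.append(name)
    return seen

def looks_like_it_needs_join(sql, own_table):
    """True if the text appears to reference a table other than own_table."""
    return any(name != own_table for name in tables_in_string(sql))

without_dot = "`posts`.`name` = 'abc'"
with_dot = "`posts`.`name` = 'abc.'"

print(tables_in_string(without_dot), looks_like_it_needs_join(without_dot, "posts"))
print(tables_in_string(with_dot), looks_like_it_needs_join(with_dot, "posts"))
```

Under this assumed pattern, the quoted value 'abc.' is indistinguishable from a reference to a table named abc, which would explain the switch to a single LEFT OUTER JOIN query. It would also explain why the workaround above helps: the dot never appears in the bound value.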

Related

outer joins models that are not associations

I have the following SQL I want to create with ActiveRecord. My problem is that I am stuck in a logic loop where I can't LEFT OUTER JOIN a table which has yet to be joined, and I can't find my entry point into the join fiasco.
In ActiveRecord I am trying to do:
AdMsgs.joins("LEFT OUTER JOIN shows ON ad_msgs.user_id = shows.id OR ad_msgs.user_id = shows.b_id ")
.joins("LEFT OUTER JOIN m ON m.user_id = users.id OR m.m_id = shops.id OR m.m_id = shows.b_id")
.joins("LEFT OUTER JOIN users ON ad_msgs.to = users.email OR ad_msgs.user_id = users.id OR users.id = m.user_id")
.where("shows.id = ?", self.id)
.distinct("ad_msgs.id")
The query fails with an error saying it doesn't know what users is in the second join (probably because I haven't joined it yet), but I need to select the m records according to the users.
AdMsgs doesn't have an association with any of these tables.
Is there a way to full outer join these 3 tables and then select the relevant rows (or is there a better way)?
Use find_by_sql to implement such scenarios.
As a rule of thumb, if you can't express the joins like Blog.joins(articles: :comments), you are probably doing something bad, or you should use find_by_sql instead.
In a complex case, I write the query in SQL first to verify the logic involved. Often it's trivial to replace one complex query with two simple ones (using IN(*ids)).
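The "two simple queries" idea from the last paragraph can be sketched as follows. This is only an illustration, written in Python with an in-memory SQLite database so that it runs stand-alone; the tables and columns (`shows`, `ad_msgs`, `owner_id`) are hypothetical stand-ins, not the asker's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shows   (id INTEGER PRIMARY KEY, owner_id INTEGER);
    CREATE TABLE ad_msgs (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT);
    INSERT INTO shows   VALUES (1, 10), (2, 11), (3, 10);
    INSERT INTO ad_msgs VALUES (100, 1, 'a'), (101, 2, 'b'), (102, 3, 'c'), (103, 9, 'd');
""")

def messages_for_owner(conn, owner_id):
    # Query 1: collect the ids we care about.
    show_ids = [row[0] for row in conn.execute(
        "SELECT id FROM shows WHERE owner_id = ?", (owner_id,))]
    if not show_ids:
        return []
    # Query 2: one placeholder per id keeps the IN (...) list parameterised.
    placeholders = ", ".join("?" for _ in show_ids)
    sql = f"SELECT id FROM ad_msgs WHERE user_id IN ({placeholders}) ORDER BY id"
    return [row[0] for row in conn.execute(sql, show_ids)]

result = messages_for_owner(conn, 10)
print(result)
```

Each query stays simple enough to verify on its own, at the cost of one extra round trip to the database.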

How can I optimise this slow PostgreSQL query

I have tracked the cause of a slow API endpoint to an SQL query and cannot for the life of me optimise it.
The query is as follows:
SELECT "Views".*, "Show".*, "Episode".*, "User".* FROM "Views"
LEFT OUTER JOIN "Shows" AS "Show" ON "Show"."id" = "Views"."ShowId"
LEFT OUTER JOIN "Episodes" AS "Episode" ON "Episode"."id" = "Views"."EpisodeId"
LEFT OUTER JOIN "Users" AS "User" ON "User"."id" = "Views"."UserId"
WHERE "Show"."tvdb_id" IN (259063,82066,258823,265766,261742,82283,205281,182061,121361)
AND "User"."id"=29;
It takes between 3000 and 4200 ms to complete. The result of an EXPLAIN ANALYZE on the query can be found here: http://explain.depesz.com/s/J9R
EDIT: I have also tried separating the IN() into ORs:
SELECT "Views".*, "Show".*, "Episode".*, "User".* FROM "Views"
LEFT OUTER JOIN "Shows" AS "Show" ON "Show"."id" = "Views"."ShowId"
LEFT OUTER JOIN "Episodes" AS "Episode" ON "Episode"."id" = "Views"."EpisodeId"
LEFT OUTER JOIN "Users" AS "User" ON "User"."id" = "Views"."UserId"
WHERE
"Show"."tvdb_id"=259063 OR
"Show"."tvdb_id"=82066 OR
"Show"."tvdb_id"=258823 OR
"Show"."tvdb_id"=265766 OR
"Show"."tvdb_id"=261742 OR
"Show"."tvdb_id"=82283 OR
"Show"."tvdb_id"=205281 OR
"Show"."tvdb_id"=182061 OR
"Show"."tvdb_id"=121361
AND "User"."id"=29;
I have tried creating indexes on the columns referenced in the LEFT OUTER JOIN but that resulted in marginal gains (if anything). What's interesting is I also tried reducing the WHERE x IN to a single WHERE x = y and this instantly improved the situation, responding instead in 200-300ms. So how would I go about optimizing the IN statement?
Thanks!
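Try creating a combined index on the Views table covering ShowId, EpisodeId and UserId together. The identifiers need double quotes here because the table and columns were created with mixed-case names.
CREATE INDEX "Views_idx" ON "Views"
USING btree ("ShowId", "EpisodeId", "UserId");
P.S. The explain link is wrong; it's for some other query.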
Best regards,
nele
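A side note on the OR variant in the question: in SQL, AND binds more tightly than OR, so without parentheses the "User"."id"=29 filter applies only to the last tvdb_id alternative, and that variant is not equivalent to the IN (...) version. A small sketch of the difference, using Python with an in-memory SQLite database and a made-up `views` table (both are assumptions for illustration only):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE views (id INTEGER PRIMARY KEY, tvdb_id INTEGER, user_id INTEGER);
    INSERT INTO views VALUES
        (1, 259063, 29),
        (2, 259063, 7),   -- right show, wrong user
        (3, 121361, 29),
        (4, 121361, 7);   -- right show, wrong user
""")

def ids(sql):
    return [row[0] for row in conn.execute(sql)]

# Parsed as: tvdb_id = 259063 OR (tvdb_id = 121361 AND user_id = 29)
unparenthesised = ids(
    "SELECT id FROM views "
    "WHERE tvdb_id = 259063 OR tvdb_id = 121361 AND user_id = 29 ORDER BY id")

# Parentheses make the user filter apply to every alternative.
parenthesised = ids(
    "SELECT id FROM views "
    "WHERE (tvdb_id = 259063 OR tvdb_id = 121361) AND user_id = 29 ORDER BY id")

# IN (...) is equivalent to the parenthesised form.
with_in = ids(
    "SELECT id FROM views "
    "WHERE tvdb_id IN (259063, 121361) AND user_id = 29 ORDER BY id")

print(unparenthesised, parenthesised, with_in)
```

The unparenthesised form also returns a row belonging to another user, so timings for the two variants in the question are not directly comparable.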

Writing this SQL in LINQ? (outer apply)

My apologies for my recent SQL/LINQ questions, but I need to know what this working SQL query would look like in LINQ.
select *
from CarePlan c
outer apply (select top 1 * from Referral r
where
r.CarePlanId = c.CarePlanId order by r.ReferralDate desc) x
left outer join Specialist s on s.SpecialistId = x.SpecialistId
left outer join [User] u on u.UserId = s.UserId
This basically retrieves a list of CarePlans with the newest Referral (if it exists), then joins the Specialist and User tables based on any Referrals found.
Thanks
A word of advice: focus on what you want to express in terms of your class model and LINQ, instead of trying to reproduce SQL. If you do something like
context.CarePlans
.Select(cp => new { Plan = cp, FirstReferral = cp.Referrals.FirstOrDefault() });
(provided that it matches your context, and ignoring ordering and the other joins for clarity)
it would basically do what you want, but it may very well translate to an inline subquery rather than an OUTER APPLY. The effect is the same, and the execution plan probably won't differ very much.
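To illustrate the "inline subquery instead of OUTER APPLY" shape mentioned above, here is a small sketch in Python with an in-memory SQLite database (SQLite has no OUTER APPLY, so a correlated subquery with ORDER BY ... LIMIT 1 plays that role). The sample rows and the `ReferralId` column are assumptions; only the table names and the other columns come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CarePlan (CarePlanId INTEGER PRIMARY KEY);
    CREATE TABLE Referral (
        ReferralId   INTEGER PRIMARY KEY,
        CarePlanId   INTEGER,
        ReferralDate TEXT
    );
    INSERT INTO CarePlan VALUES (1), (2);
    INSERT INTO Referral VALUES
        (10, 1, '2012-01-05'),
        (11, 1, '2012-03-20');  -- newest referral for care plan 1
""")

# For each care plan, pick the newest referral; plans without one yield NULL.
rows = conn.execute("""
    SELECT c.CarePlanId,
           (SELECT r.ReferralId
              FROM Referral r
             WHERE r.CarePlanId = c.CarePlanId
             ORDER BY r.ReferralDate DESC
             LIMIT 1) AS NewestReferralId
      FROM CarePlan c
     ORDER BY c.CarePlanId
""").fetchall()

print(rows)
```

Care plans without any referral still appear in the result, with a NULL in place of the referral, which matches the "if it exists" behaviour of the OUTER APPLY in the question.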

SQL: Chaining Joins Efficiency

I have a query in my WordPress plugin like this:
SELECT users.*, U.`meta_value` AS first_name,M.`meta_value` AS last_name
FROM `nwp_users` AS users
LEFT JOIN `nwp_usermeta` U
ON users.`ID`=U.`user_id`
LEFT JOIN `nwp_usermeta` M
ON users.`ID`=M.`user_id`
LEFT JOIN `nwp_usermeta` C
ON users.`ID`=C.`user_id`
WHERE U.meta_key = 'first_name'
AND M.meta_key = 'last_name'
AND C.meta_key = 'nwp_capabilities'
ORDER BY users.`user_login` ASC
LIMIT 0,10
I'm new to using JOIN and I'm wondering how efficient it is to use so many JOIN in one query. Is it better to split it up into multiple queries?
The database schema can be found here.
JOIN usually isn't so bad if the keys are indexed. LEFT JOIN can be more expensive, so avoid it when you don't need it. The difference is that LEFT JOIN keeps every row from the left table even when there is no matching row in the joined table (the joined columns are then NULL), while a regular (inner) JOIN returns only the rows that match.
Post your table structure and we can give you a better query.
See this comment:
http://forums.mysql.com/read.php?24,205080,205274#msg-205274
For what it's worth, to find out what MySQL is doing and to see if you have indexed properly, always check the EXPLAIN plan. You do this by putting EXPLAIN before your query (literally add the word EXPLAIN before the query), then run it.
In your query, the WHERE clause filters on U.meta_key, M.meta_key and C.meta_key (for example AND C.meta_key = 'nwp_capabilities'), which means that all the LEFT JOINs above it can equally be written as INNER JOINs. If a LEFT JOIN finds no match (LEFT OUTER is intended to preserve the rows from the left side), the joined columns are NULL and the row will always be filtered out by the WHERE clause.
So a more optimal query would be
SELECT users.*, U.`meta_value` AS first_name,M.`meta_value` AS last_name
FROM `nwp_users` AS users
JOIN `nwp_usermeta` U
ON users.`ID`=U.`user_id`
JOIN `nwp_usermeta` M
ON users.`ID`=M.`user_id`
JOIN `nwp_usermeta` C
ON users.`ID`=C.`user_id`
WHERE U.meta_key = 'first_name'
AND M.meta_key = 'last_name'
AND C.meta_key = 'nwp_capabilities'
ORDER BY users.`user_login` ASC
LIMIT 0,10
(note: "JOIN" (alone) = "INNER JOIN")
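The claim that a WHERE filter on the joined table makes a LEFT JOIN behave like an INNER JOIN can be checked on a toy dataset. This is a sketch in Python with an in-memory SQLite database; the simplified `users` and `usermeta` tables are assumptions standing in for the WordPress tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users    (id INTEGER PRIMARY KEY, login TEXT);
    CREATE TABLE usermeta (user_id INTEGER, meta_key TEXT, meta_value TEXT);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    -- bob has no first_name row at all
    INSERT INTO usermeta VALUES (1, 'first_name', 'Alice');
""")

def logins(sql):
    return [row[0] for row in conn.execute(sql)]

# LEFT JOIN plus a WHERE filter on the joined table.
left_where = logins("""
    SELECT u.login FROM users u
    LEFT JOIN usermeta m ON u.id = m.user_id
    WHERE m.meta_key = 'first_name' ORDER BY u.login""")

# Plain (inner) JOIN with the same filter.
inner = logins("""
    SELECT u.login FROM users u
    JOIN usermeta m ON u.id = m.user_id
    WHERE m.meta_key = 'first_name' ORDER BY u.login""")

# Moving the filter into ON keeps users that have no matching meta row.
left_on = logins("""
    SELECT u.login FROM users u
    LEFT JOIN usermeta m ON u.id = m.user_id AND m.meta_key = 'first_name'
    ORDER BY u.login""")

print(left_where, inner, left_on)
```

If users without a given meta key should still be listed, the filter belongs in the ON clause, as in the last query; otherwise the inner join states the intent more directly.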
Try running EXPLAIN on the query to see what is going on and whether your SELECT is optimized. If you haven't used EXPLAIN before, read some tutorials:
http://www.learn-mysql-tutorial.com/OptimizeQueries.cfm
http://www.databasejournal.com/features/mysql/article.php/1382791/Optimizing-MySQL-Queries-and-Indexes.htm

What's "wrong" with my DQL query?

I have the following SQL query:
select bank.*
from bank
join branch on branch.bank_id = bank.id
join account a on a.branch_id = branch.id
join import i on a.import_id = i.id
It returns exactly what I expect.
Now consider the following two DQL queries:
$q = Doctrine_Query::create()
->select('Bank.*')
->from('Bank')
->leftJoin('Branch')
->leftJoin('Account')
->leftJoin('Import');
-
$q = Doctrine_Query::create()
->select('Bank.*')
->from('Bank')
->innerJoin('Branch')
->innerJoin('Account')
->innerJoin('Import');
It would have been nice to have been able to use a "join()" method but, from the official Doctrine join documentation here, it says, "DQL supports two kinds of joins INNER JOINs and LEFT JOINs." For some reason that thoroughly escapes me, they chose not to support natural joins. Anyway, what that means is that the two queries above are my only options. Well, that's unfortunate because neither of them work.
The first query - the one with left joins - doesn't work because, of course, a left join and a natural join are two different things.
The second query doesn't work, either. Bafflingly, I get an error: "Unknown relation alias."
Why should Doctrine be able to figure out the alias for a LEFT JOIN but not for an INNER JOIN?
By the way, I realize INNER JOIN and JOIN are only nominally different but why implement the more specific one and not the canonical, natural one?
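// Left join: qualify the relation with the alias of its parent component
->select('b.*')
->from('Bank b')
->leftJoin('b.Branch h')
// Inner join: same syntax
->select('b.*')
->from('Bank b')
->innerJoin('b.Branch h')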
http://www.doctrine-project.org/documentation/manual/1_1/en/dql-doctrine-query-language:join-syntax