What is .merge() doing in this query? - sql

I'm a little confused about how to interpret this query, all because of the merge call, even after reading the documentation.
I would like to know the corresponding SQL query for the code below:
Analise.joins(dape: [empresa: :area_atuacao])
.merge(@dapes)
.where(analises: { atual: true })
.pluck('analises.img')
Output from calling to_sql on this query:
=> "SELECT \"analises\".*
FROM \"analises\"
INNER JOIN \"dapes\" ON \"dapes\".\"id\" = \"analises\".\"dape_id\"
INNER JOIN \"empresas\" ON \"empresas\".\"id\" = \"dapes\".\"empresa_id\"
INNER JOIN \"areas_atuacao\" ON \"areas_atuacao\".\"id\" = \"empresas\".\"area_atuacao_id\"
WHERE \"analises\".\"atual\" = 't'"

merge(other)
Merges in the conditions from other, if other is an ActiveRecord::Relation.
- Rails API Docs
A common use is merging search conditions together:
@cities = City.all
@cities = @cities.merge(City.where(country: params[:country])) if params[:country]
@cities = @cities.merge(City.where(name: params[:name])) if params[:name]
You can also use it to create conditions on joined tables like in this example from the docs:
Post.where(published: true)
.joins(:comments)
.merge( Comment.where(spam: false) )
This creates the same query as:
Post.where(published: true)
.joins(:comments)
.where(comments: { spam: false })
The exact query in your example depends on the scope held in the instance variable @dapes. But judging from the generated SQL, .merge(@dapes) seems to do nothing. This could be the case if @dapes = Dape.all, for example.
Merging a relation that has no where clause does nothing:
irb(main):003:0> User.merge(User.all)
User Load (0.6ms) SELECT "users".* FROM "users" LIMIT ? [["LIMIT", 11]]
=> #<ActiveRecord::Relation []>
irb(main):004:0>
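To make that behaviour concrete without a database, here is a minimal sketch in plain Ruby. ToyRelation is a hypothetical stand-in invented for this illustration; it is not how ActiveRecord is implemented, and it only models "conditions per table" as a nested hash. It shows a merge that adds a condition on the joined table and a merge of an unfiltered relation that changes nothing.

```ruby
# Hypothetical toy model, NOT ActiveRecord: a relation is a table name
# plus a hash of { table => { column => value } } conditions.
ToyRelation = Struct.new(:table, :conditions) do
  # Returns a new relation with an extra condition on this relation's table.
  def where(extra)
    own = (conditions[table] || {}).merge(extra)
    ToyRelation.new(table, conditions.merge({ table => own }))
  end

  # Returns a new relation that also carries the other relation's conditions.
  def merge(other)
    combined = conditions.merge(other.conditions) do |_table, mine, theirs|
      mine.merge(theirs)
    end
    ToyRelation.new(table, combined)
  end
end

posts    = ToyRelation.new(:posts, {}).where(published: true)
comments = ToyRelation.new(:comments, {})

# Like Post.where(published: true).joins(:comments).merge(Comment.where(spam: false))
with_filter = posts.merge(comments.where(spam: false))

# Like merging Comment.all: there are no conditions to bring over.
no_filter = posts.merge(comments)

p with_filter.conditions
p no_filter.conditions == posts.conditions
```

In the question's case, a @dapes that carries no conditions would behave like no_filter here, which would be consistent with the generated SQL showing no extra WHERE terms.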

Here is your query, formatted for better readability. The merge is used to carry over the other relation's conditions so that nothing gets overridden.
SELECT analises.*
FROM analises
INNER JOIN dapes ON dapes.id = analises.dape_id
INNER JOIN empresas ON empresas.id = dapes.empresa_id
INNER JOIN areas_atuacao ON areas_atuacao.id = empresas.area_atuacao_id
WHERE analises.atual = 't'
It seems .merge() is used when you are joining tables, to be more specific about what exactly you are joining.
In this case, .merge(@dapes) seems to be merging your tables on all values of @dapes.
One way to get a better understanding of the impact .merge(@dapes) has on your query is to run to_sql with and without it and compare how the SQL changes.
Footnote
I took the sql that was generated from the first to_sql you ran and entered it into the Scuttle Editor and got the following rails commands. I don't know if this helps but I just thought it was food for thought!
Analise.select(Analise.arel_table[Arel.star]).where(Analise.arel_table[:atual].eq('t')).joins(
Analise.arel_table.join(Dape.arel_table).on(
Dape.arel_table[:id].eq(Analise.arel_table[:dape_id])
).join_sources
).joins(
Analise.arel_table.join(Empresa.arel_table).on(
Empresa.arel_table[:id].eq(Dape.arel_table[:empresa_id])
).join_sources
).joins(
Analise.arel_table.join(AreasAtuacao.arel_table).on(
AreasAtuacao.arel_table[:id].eq(Empresa.arel_table[:area_atuacao_id])
).join_sources
)

Related

Query with a sub query that requires multiple values

I can't really think of a title so let me explain the problem:
Problem: I want to return an array of Posts, with each Post containing a Like Count. The Like Count is for a specific post, but across all users who have liked it.
For example:
const posts = [
{
post_id: 1,
like_count: 100
},
{
post_id: 2,
like_count: 50
}
]
Now with my current solution, I don't think it's possible but here is what I have so far.
My query currently looks like this (produced by TypeORM):
SELECT
"p"."uid" AS "p_uid",
"p"."created_at" AS "post_created_at",
"l"."uid" AS "like_uid",
"l"."post_liked" AS "post_liked",
"ph"."path" AS "path",
"ph"."title" AS "photo_title",
"u"."name" AS "post_author",
(
SELECT
COUNT(like_id) AS "like_count"
FROM
"likes" "l"
INNER JOIN
"posts" "p"
ON "p"."post_id" = "l"."post_id"
WHERE
"l"."post_liked" = true
AND l.post_id = $1
)
AS "like_count"
FROM
"posts" "p"
LEFT JOIN
"likes" "l"
ON "l"."post_id" = "p"."post_id"
INNER JOIN
"photos" "ph"
ON "ph"."photo_id" = "p"."photo_id"
INNER JOIN
"users" "u"
ON "u"."user_id" = "p"."user_id"
At $1 is where the post.post_id should go (but for the sake of testing I stuck the first post's id in there), assuming I have an array of post_ids ready to put in there.
My TypeORM query looks like this
async findAll(): Promise<Post[]> {
return await getRepository(Post)
.createQueryBuilder('p')
.select(['p.uid'])
.addSelect(subQuery =>
subQuery
.select('COUNT(like_id)', 'like_count')
.from(Like, 'l')
.innerJoin('l.post', 'p')
.where('l.post_liked = true AND l.post_id = :post_id', {post_id: 'a16f0c3e-5aa0-4cf8-82da-dfe27d3f991a'}), 'like_count'
)
.addSelect('p.created_at', 'post_created_at')
.addSelect('u.name', 'post_author')
.addSelect('l.uid', 'like_uid')
.addSelect('l.post_liked', 'post_liked')
.addSelect('ph.title', 'photo_title')
.addSelect('ph.path', 'path')
.leftJoin('p.likes', 'l')
.innerJoin('p.photo', 'ph')
.innerJoin('p.user', 'u')
.getRawMany()
}
Why am I doing this? What I am trying to avoid is calling count for every single post on my page to return the number of likes for each post. I thought I could somehow do this in a subquery but now I am not sure if it's possible.
Can someone suggest a more efficient way of doing something like this? Or is this approach completely wrong?
I find working with ORMs terrible and cannot help you with this. But the query itself has flaws:
You want one row per post, but you are joining likes, thus getting one row per post and like.
Your subquery is not related to your main query. It should instead relate to the main query's post.
The corrected query:
SELECT
p.uid,
p.created_at,
ph.path AS photo_path,
ph.title AS photo_title,
u.name AS post_author,
(
SELECT COUNT(*)
FROM likes l
WHERE l.post_id = p.post_id
AND l.post_liked = true
) AS like_count
FROM posts p
JOIN photos ph ON ph.photo_id = p.photo_id
JOIN users u ON u.user_id = p.user_id
ORDER BY p.uid;
I suppose it's quite easy for you to convert this to TypeORM. There is nothing wrong with counting for every single post, by the way. It is even necessary to get the result you are after.
The subquery could also be moved to the FROM clause using GROUP BY l.post_id within. As is, you are getting all posts, regardless of them having likes or not. By moving the subquery to the FROM clause, you could instead decide between INNER JOIN and LEFT OUTER JOIN.
The query would benefit from the following index:
CREATE INDEX idx ON likes (post_id, post_liked);
Provide this index, if the query seems too slow.
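As a rough illustration of the GROUP BY variant mentioned above, here is a plain-Ruby sketch over in-memory rows. The sample data is made up for this example; only the column names post_id and post_liked come from the question. It counts likes once per post_id and then attaches the counts to the posts, which is the same shape as joining a grouped subquery in the FROM clause.

```ruby
# Made-up sample rows standing in for the posts and likes tables.
posts = [
  { post_id: 1, uid: "a" },
  { post_id: 2, uid: "b" },
  { post_id: 3, uid: "c" }
]
likes = [
  { post_id: 1, post_liked: true },
  { post_id: 1, post_liked: true },
  { post_id: 1, post_liked: false },
  { post_id: 2, post_liked: true }
]

# Roughly: SELECT post_id, COUNT(*) FROM likes
#          WHERE post_liked = true GROUP BY post_id
like_counts = likes.select { |l| l[:post_liked] }
                   .map { |l| l[:post_id] }
                   .tally

# LEFT OUTER JOIN flavour: every post is kept, missing counts become 0.
all_posts = posts.map do |p|
  { post_id: p[:post_id], like_count: like_counts.fetch(p[:post_id], 0) }
end

# INNER JOIN flavour: only posts that have at least one like survive.
liked_posts = all_posts.select { |row| like_counts.key?(row[:post_id]) }

p all_posts
p liked_posts
```

Either way the likes are scanned once rather than once per post, which is the point of moving the aggregation into a grouped subquery.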

Find_by_sql as a Rails scope

r937 from Sitepoint was kind enough to help me figure out the query I need to return correct results from my database.
What I need is to be able to use this query as a scope and to be able to chain other scopes onto this one.
The query is:
SELECT coasters.*
FROM (
SELECT order_ridden,
MAX(version) AS max_version
FROM coasters
GROUP BY order_ridden
) AS m
INNER JOIN coasters
ON coasters.order_ridden = m.order_ridden
AND COALESCE(coasters.version,0) = COALESCE(m.max_version,0)
I tried making a scope like so:
scope :uniques, lambda {
find_by_sql('SELECT coasters.*
FROM (
SELECT order_ridden,
MAX(version) AS max_version
FROM coasters
GROUP BY order_ridden
) AS m
INNER JOIN coasters
ON coasters.order_ridden = m.order_ridden
AND COALESCE(coasters.version,0) = COALESCE(m.max_version,0)')
}
But when I tried chaining another one of my scopes onto it, it failed. Is there a way I can run this query like a normal scope?
find_by_sql returns an Array. But you need an ActiveRecord::Relation to chain additional scopes.
One way to rewrite your query using ActiveRecord methods that will return an ActiveRecord::Relation would be to rearrange it a little bit so that the nesting happens in the INNER JOIN portion.
You may want to try something like:
scope :uniques, lambda {
max_rows = select("order_ridden, MAX(version) AS max_version").group(:order_ridden)
joins("INNER JOIN (#{max_rows.to_sql}) AS m
ON coasters.order_ridden = m.order_ridden
AND COALESCE(coasters.version,0) = COALESCE(m.max_version,0)")
}
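For readers who want to check which rows this scope is meant to return, here is a small plain-Ruby sketch of the same logic over an in-memory array. The sample rows are invented; only the column names order_ridden and version come from the question. It mirrors the SQL: take the highest version per order_ridden, then keep the rows whose version matches it, treating nil as 0 the way COALESCE(..., 0) does.

```ruby
# Invented sample rows standing in for the coasters table.
coasters = [
  { id: 1, order_ridden: 1, version: nil },
  { id: 2, order_ridden: 2, version: 1 },
  { id: 3, order_ridden: 2, version: 2 },
  { id: 4, order_ridden: 3, version: 1 }
]

# Inner query: MAX(version) per order_ridden. SQL's MAX skips NULLs and
# yields NULL for an all-NULL group; compact.max behaves the same with nil.
max_versions = coasters
  .group_by { |c| c[:order_ridden] }
  .transform_values { |rows| rows.map { |c| c[:version] }.compact.max }

# Join back: COALESCE(coasters.version, 0) = COALESCE(m.max_version, 0)
uniques = coasters.select do |c|
  (c[:version] || 0) == (max_versions[c[:order_ridden]] || 0)
end

p uniques.map { |c| c[:id] }
```

This only models the result set; unlike the scope, the array returned here is of course not chainable with other scopes.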

ActiveRecord left outer join with and clause

I am using Rails 3 with ActiveRecord and I cannot get it to generate the query I want without inserting almost plain SQL in it.
I simply want to execute this SQL query (actually the query is a little more complex but it's really this part that I cannot get right).
SELECT DISTINCT users.*, possible_dates.*
FROM users
LEFT OUTER JOIN possible_dates
ON possible_dates.user_id = users.id
AND possible_dates.event_id = MY_EVENT_ID;
which I managed to do using
User.includes(:possible_dates)
.joins("LEFT OUTER JOIN possible_dates
ON possible_dates.user_id = users.id
AND possible_dates.event_id = #{ActiveRecord::Base.sanitize(self.id)}"
)
.uniq
but I feel that I am missing something to simply add the AND condition of the left join using ActiveRecord query methods.
I also tried to do something like this
User.includes(:possible_dates)
.where('possible_dates.event_id' => self.id)
.uniq
but this yields (as expected), a query with a WHERE clause at the end and not the AND clause I want on the join.
By the way, self in the two snippets above is an instance of my Event class.
Thank you for your help.
(1 year later...)
After looking around, I found out that the best approach was probably to use Arel tables.
For the above example, the following code would work.
ut = User.arel_table
pt = PossibleDate.arel_table
et = Event.arel_table
User.joins(
ut.join(pt, Arel::Nodes::OuterJoin)
.on(pt[:user_id].eq(ut[:id])
.and(pt[:event_id].eq(1))
).join_sql
).includes(:possible_dates).uniq
The 1 in eq(1) should be replaced by the correct event id.
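The reason the condition has to sit in the ON clause rather than in WHERE can be sketched in plain Ruby. The left_outer_join helper and the sample rows below are hypothetical and written only for this illustration; they are not part of ActiveRecord or Arel.

```ruby
# Invented sample rows standing in for users and possible_dates.
users = [{ id: 1, name: "ann" }, { id: 2, name: "bob" }]
possible_dates = [
  { user_id: 1, event_id: 10 },
  { user_id: 2, event_id: 99 }
]

# Hypothetical helper: keeps every left row, pairing it with each right
# row for which the block (the ON condition) is true, or with nil if none.
def left_outer_join(left, right)
  left.flat_map do |l|
    matches = right.select { |r| yield(l, r) }
    matches.empty? ? [[l, nil]] : matches.map { |r| [l, r] }
  end
end

event_id = 10

# Event condition inside ON: users without a date for this event are kept.
on_rows = left_outer_join(users, possible_dates) do |u, pd|
  pd[:user_id] == u[:id] && pd[:event_id] == event_id
end

# Event condition in WHERE: applied after the join, so such users vanish.
joined     = left_outer_join(users, possible_dates) { |u, pd| pd[:user_id] == u[:id] }
where_rows = joined.select { |_u, pd| pd && pd[:event_id] == event_id }

p on_rows.map { |u, pd| [u[:name], pd && pd[:event_id]] }
p where_rows.map { |u, pd| [u[:name], pd[:event_id]] }
```

With the condition in ON, bob still appears with no matching date; with it in WHERE, only ann remains, which is the behaviour the question describes for the includes/where attempt.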

How to join on subqueries using ARel?

I have a few massive SQL requests involving joins across various models in my Rails application.
A single request can involve 6 to 10 tables.
To run the request faster I want to use sub-queries in the joins (that way I can filter these tables before the join and reduce the columns to the ones I need). I'm trying to achieve this using ARel.
I thought I found the solution to my problem there: How to do joins on subqueries in AREL within Rails,
but things must have changed because I get undefined method '[]' for Arel::SelectManager.
Does anybody have any idea how to achieve this (without using strings) ?
Pierre, I thought a better solution could be the following (inspiration from this gist):
a = A.arel_table
b = B.arel_table
subquery = b.project(b[:a_id].as('A_id')).where{c > 4}
subquery = subquery.as('intm_table')
query = A.join(subquery).on(subquery[:A_id].eq(a[:id]))
No particular reason for naming the alias as "intm_table", I just thought it would be less confusing.
OK, so my main problem was that you can't join an Arel::SelectManager ... BUT you can join a table alias.
So to generate the request in my comment above:
a = A.arel_table
b = B.arel_table
subquery = B.select(:a_id).where{c > 4}
query = A.join(subquery.as('B')).on(b[:a_id].eq(a[:id]))
query.to_sql # SELECT A.* INNER JOIN (SELECT B.a_id FROM B WHERE B.c > 4) B ON A.id = B.a_id
I was looking for this and was helped by the other answers, but there are some errors in both, e.g. A.join(... should be a.join(....
I was also missing how to build an ActiveRecord::Relation.
Here is how to build an ActiveRecord::Relation in Rails 4:
a = A.arel_table
b = B.arel_table
subsel = B.select(b[:a_id]).where(b[:c].gt('4')).as('sub_select')
joins = a.join(subsel).on(subsel[:a_id].eq(a[:id])).join_sources
rel = A.joins(joins)
rel.to_sql
#=> "SELECT `a`.* FROM `a` INNER JOIN (SELECT `b`.`a_id` FROM `b` WHERE (`b`.`c` > 4)) sub_select ON sub_select.`a_id` = `a`.`id`"

Need help in SQL and Sequel involving inner join and where/filter

I need help translating SQL to Sequel:
SQL:
SELECT table_t.curr_id FROM table_t
INNER JOIN table_c ON table_c.curr_id = table_t.curr_id
INNER JOIN table_b ON table_b.bic = table_t.bic
WHERE table_c.alpha_id = 'XXX' AND table_b.name='Foo';
I'm stuck on the Sequel version: I don't know how to filter. So far I have this:
cid= table_t.select(:curr_id).
join(:table_c, :curr_id=>:curr_id).
join(:table_b, :bic=>:bic).
filter( ????? )
An answer with a better idiom than the above is appreciated as well. Thanks.
UPDATE:
I had to modify it a little to make it work:
cid = DB[:table_t].select(:table_t__curr_id).
join(:table_c, :curr_id=>:curr_id).
join(:table_b, :bic=>:table_t__bic). #add table_t or else ERROR: column table_c.bic does not exist
filter(:table_c__alpha_id => 'XXX',
:table_b__name => 'Foo')
Or, without filter:
cid = DB[:table_t].select(:table_t__curr_id).
join(:table_c, :curr_id=>:curr_id, :alpha_id=>'XXX').
join(:table_b, :bic=>:table_t__bic, :name=>'Foo')
btw I use pgsql 9.0
This is the pure Sequel way:
cid = DB[:table_t].select(:table_t__curr_id).
join(:table_c, :curr_id=>:curr_id).
join(:table_b, :bic=>:bic).
filter(:table_c__alpha_id => 'XXX',
:table_b__name => 'Foo')
Note that you can also do this without a WHERE, since you are using INNER JOIN:
cid = DB[:table_t].select(:table_t__curr_id).
join(:table_c, :curr_id=>:curr_id, :alpha_id=>'XXX').
join(:table_b, :bic=>:bic, :name=>'Foo')
I think you can always use something like
.filter('table_c.alpha_id = ? AND table_b.name = ?', 'XXX', 'Foo')
One thing to remember is that Sequel is plenty happy to let you use raw SQL. If you find it easier to express the query as SQL go ahead, just be sure to comment it so you can find it later. Then you can return to that line and adjust it so it's taking advantage of Sequel's awesomeness.
Try to avoid anything that is specific to a particular DBMS though, because you'll be reducing portability, which is one of the big reasons for using an ORM to generate the queries.