Sequel -- can I alias subqueries in a join? - sql

Using Sequel I'd like to join two subqueries together that share some column names, and then table-qualify those columns in the select.
I understand how to do this if the two datasets are just tables. E.g. if I have a users table and an items table, with items belonging to users, and I want to list the items' names and their owners' names:
@db[:items].join(:users, :id => :user_id).
select{[items__name, users__name.as(user_name)]}
produces
SELECT "items"."name", "users"."name" AS "user_name"
FROM "items"
INNER JOIN "users" ON ("users"."id" = "items"."user_id")
as desired.
However, I'm unsure how to do this if I'm joining two arbitrary datasets representing subqueries (call them my_items and my_users).
The syntax would presumably take the form
my_items.join(my_users, :id => :user_id).
select{[ ... , ... ]}
where I would supply qualified column names to access my_users.name and my_items.name. What's the appropriate syntax to do this?
A partial solution is to use t1__name for the first argument, as it seems that the dataset supplied to a join is aliased with t1, t2, etc. But that doesn't help me qualify the item name, which I need to supply to the second argument.
I think the most desirable solution would let me provide aliases for the datasets in a join, e.g. like the following (though of course this doesn't work, for a number of reasons):
my_items.as(alias1).join(my_users.as(alias2), :id => :user_id).
select{[alias1__name, alias2__name ]}
Is there any way to do this?
Thanks!
Update
I think from_self gets me part of the way there, e.g.
my_items.from_self(:alias => :alias1).join(my_users, :id => :user_id).
select{[alias1__name, t1__name]}
seems to do the right thing.

OK, thanks to Ronald Holshausen's hint, got it. The key is to use .from_self on the first dataset, and provide the :table_alias option in the join:
my_items.from_self(:alias => :alias1).
join(my_users, {:id => :user_id}, :table_alias => :alias2).
select(:alias1__name, :alias2__name)
yields the SQL
SELECT "alias1"."name", "alias2"."name"
FROM ( <my_items dataset> ) AS "alias1"
INNER JOIN ( <my_users dataset> ) AS "alias2"
ON ("alias2"."id" = "alias1"."user_id")
Note that the join hash (the second argument of join) needs explicit curly braces to distinguish it from the option hash that includes :table_alias.
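To check that the generated SQL shape does what the question asks, here is a minimal sketch that runs an equivalent statement against an in-memory SQLite database from Python. The table contents and the `id > 10` filter are made up for illustration; only the alias1/alias2 structure comes from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, user_id INTEGER);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO items VALUES (10, 'lamp', 1), (11, 'desk', 2), (12, 'chair', 1);
""")

# Both subqueries expose a "name" column; the aliases keep them apart.
# The inner SELECTs stand in for the my_items and my_users datasets.
sql = """
    SELECT alias1.name, alias2.name AS user_name
    FROM (SELECT * FROM items WHERE id > 10) AS alias1
    INNER JOIN (SELECT * FROM users) AS alias2
        ON alias2.id = alias1.user_id
    ORDER BY alias1.id
"""
rows = conn.execute(sql).fetchall()
print(rows)  # [('desk', 'bob'), ('chair', 'alice')]
```

Without the two aliases there would be no way to say which subquery's name column is meant.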

The only way I found was to use the from method on the DB, and the :table_alias on the join method, but these don't work with models so I had to use the table_name from the model class. I.e.,
1.9.3p125 :018 > @db.from(Dw::Models::Contract.table_name => 'C1')
=> #<Sequel::SQLite::Dataset: "SELECT * FROM `vDimContract` AS 'C1'">
1.9.3p125 :019 > @db.from(Dw::Models::Contract.table_name => 'C1').join(Dw::Models::Contract.table_name, {:c1__id => :c2__id}, :table_alias => 'C2')
=> #<Sequel::SQLite::Dataset: "SELECT * FROM `vDimContract` AS 'C1' INNER JOIN `vDimContract` AS 'C2' ON (`C1`.`Id` = `C2`.`Id`)">
1.9.3p125 :020 > @db.from(Dw::Models::Contract.table_name => 'C1').join(Dw::Models::Product.table_name, {:product_id => :c1__product_id}, :table_alias => 'P1')
=> #<Sequel::SQLite::Dataset: "SELECT * FROM `vDimContract` AS 'C1' INNER JOIN `vDimProduct` AS 'P1' ON (`P1`.`ProductId` = `C1`.`ProductId`)">
The only thing I don't like about from_self is it uses a subquery:
1.9.3p125 :021 > Dw::Models::Contract.from_self(:alias => 'C1')
=> #<Sequel::SQLite::Dataset: "SELECT * FROM (SELECT * FROM `vDimContract`) AS 'C1'">

Related

Is it the right way to do union query after joins in rails 3.2?

There are 3 models log (which belongs to customer), customer and project in rails 3.2 app. Both customer and project have sales_id field. Here is the query we want to do:
We want to return the logs for two groups of customers: 1) customers whose sales_id is equal to session[:user_id], and 2) customers who have a project whose sales_id is equal to session[:user_id].
The rails query for 1) could be:
Log.joins(:customer).where(:customers => {:sales_id => session[:user_id]})
Rails query for 2) could be:
Log.joins(:customer => :projects).where(:projects => {:sales_id => session[:user_id]})
To combine the queries above, is it the right way to do the following?
Log.joins([:customer, {:customer => :projects}]).where('customers.sales_id = :id OR projects.sales_id = :id', id: session[:user_id])
Chapter 11.2.4 in http://guides.rubyonrails.org/v3.2.13/active_record_querying.html talks about an interesting query case. We haven't tested the query above yet. We would like to know if the union query above is indeed correct.
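Rails doesn't support UNION natively. In your case you don't need a union; a left outer join is enough. An inner join on projects would drop the logs of customers that have no projects at all, even when the customer's own sales_id matches. On Rails 3.2 the relation method for SELECT DISTINCT is uniq (distinct was only added in Rails 4):
Log.joins('left outer JOIN `customers` ON `customers`.`id` = `logs`.`customer_id`
left outer JOIN `projects` ON `projects`.`customer_id` = `customers`.`id`').where('customers.sales_id = :id OR projects.sales_id = :id', id: session[:user_id]).uniq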
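The difference between the inner-join and the left-outer-join version can be seen on a tiny data set. The sketch below runs the same WHERE clause through both join kinds using Python's sqlite3 module. The rows and the sales id 7 are invented for illustration, and SQLite stands in for whatever database the app really uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, sales_id INTEGER);
    CREATE TABLE projects (id INTEGER PRIMARY KEY, customer_id INTEGER, sales_id INTEGER);
    CREATE TABLE logs (id INTEGER PRIMARY KEY, customer_id INTEGER);
    -- customer 1: owned by sales 7, has no projects
    -- customer 2: owned by sales 8, has two projects run by sales 7
    -- customer 3: unrelated to sales 7
    INSERT INTO customers VALUES (1, 7), (2, 8), (3, 9);
    INSERT INTO projects VALUES (1, 2, 7), (2, 2, 7), (3, 3, 9);
    INSERT INTO logs VALUES (100, 1), (200, 2), (300, 3);
""")

def log_ids(join_kind, distinct):
    """Return log ids visible to sales id 7 for the given join kind."""
    sql = f"""
        SELECT {'DISTINCT' if distinct else ''} logs.id FROM logs
        {join_kind} JOIN customers ON customers.id = logs.customer_id
        {join_kind} JOIN projects ON projects.customer_id = customers.id
        WHERE customers.sales_id = :id OR projects.sales_id = :id
        ORDER BY logs.id
    """
    return [row[0] for row in conn.execute(sql, {"id": 7})]

inner = log_ids("INNER", distinct=True)             # misses log 100
outer = log_ids("LEFT OUTER", distinct=True)        # both logs
outer_dupes = log_ids("LEFT OUTER", distinct=False) # log 200 once per project
print(inner, outer, outer_dupes)
```

Log 100 belongs to a customer without projects, so only the outer join keeps it. The undeduplicated result shows why the DISTINCT is needed.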

Is it possible to use NH QueryOver to fetch joined entities in one query?

SQL query is:
select B.* from A inner join B on A.b_id = B.id where A.x in (1,2,3)
A <-> B relation is many-to-one
I need to filter by A but fetch related B.
UPDATE:
I tried this NH QueryOver
Session.QueryOver<A>().Where(a => a.x.IsIn(array)).JoinQueryOver(a => a.B).Select(a => a.B).List<B>()
but it results in an N+1 sequence of queries: the first one fetches the IDs of the related Bs, and the others fetch the related Bs one by one by ID (analyzed via NHProf). I want it to fetch the list of Bs in one go.
UPDATE 2:
For now I worked around this with a subquery:
Session.QueryOver(() => b).WithSubquery.WhereExists(QueryOver.Of<A>().Where(a => a.x.IsIn(array)).And(a => a.b_id == b.id).Select(a => a.id)).List<B>()
but I still hope to see an example of QueryOver without subquery as I tend to think subquery is less efficient.
This works (at least in my test application):
var list = session.QueryOver<A>()
    .Where(a => a.X.IsIn(array))
    .Fetch(x => x.B).Eager
    .List<A>()
    .Select(x => x.B);
Note that the .Select() statement is a normal LINQ statement, not part of NHibernate.
Generated SQL:
SELECT
this_.Id as Id0_1_,
this_.B as B0_1_,
this_.X as X0_1_,
b2_.Id as Id1_0_,
b2_.SomeValue as SomeValue1_0_
FROM A this_ left outer join B b2_ on this_.B=b2_.Id
WHERE this_.X in (?, ?, ?)
It's not optimal if A is a very large class, of course.
An NHibernate-only solution with a subquery:
var candidates = QueryOver.Of<A>()
    .Where(a => a.X.IsIn(array))
    .Select(x => x.B.Id);
var list = session.QueryOver<B>()
    .WithSubquery.WhereProperty(x => x.Id).In(candidates)
    .List();
I'll try to find the reason why the most obvious solution (just adding Fetch().Eager) doesn't work as expected. Stay tuned!
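For comparison, here is a rough sketch of the two SQL shapes behind these approaches, run on SQLite through Python. It does not use NHibernate at all. The column names b_id and some_value and all the rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE B (id INTEGER PRIMARY KEY, some_value TEXT);
    CREATE TABLE A (id INTEGER PRIMARY KEY, x INTEGER, b_id INTEGER REFERENCES B(id));
    INSERT INTO B VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO A VALUES (1, 1, 1), (2, 2, 1), (3, 3, 2), (4, 9, 3);
""")

# Join shape from the question: one result row per matching A.
join_rows = conn.execute("""
    SELECT B.* FROM A INNER JOIN B ON A.b_id = B.id
    WHERE A.x IN (1, 2, 3) ORDER BY B.id
""").fetchall()

# Subquery shape: each B appears at most once.
subquery_rows = conn.execute("""
    SELECT * FROM B
    WHERE id IN (SELECT b_id FROM A WHERE x IN (1, 2, 3))
    ORDER BY id
""").fetchall()

print(join_rows)      # B 1 appears twice
print(subquery_rows)  # B 1 appears once
```

Both statements are a single round trip. They differ in whether a B shared by several As is repeated, which matters when the result is materialized as a list of entities.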

SQL problems when migrating from MySQL to PostgreSQL

I have a Ruby on Rails 2.3.x application that I'm trying to migrate from my own VPS to Heroku, including porting from SQLite (development) and MySQL (production) to Postgres.
This is a typical Rails call I'm using:
spots = Spot.paginate(:all, :include => [:thing, :user, :store, {:thing => :tags}, {:thing => :brand}], :group => :thing_id, :order => order, :conditions => conditions, :page => page, :per_page => per_page)
Question 1: I get a lot of errors like PG::Error: ERROR: column "spots.id" must appear in the GROUP BY clause or be used in an aggregate function. SQLite and MySQL were evidently more forgiving here. Of course I can easily fix these by adding the specified fields to my :group parameter, but I feel I'm messing up my code. Is there a better way?
Question 2: If I throw in all the GROUP BY columns that Postgres is missing I end up with the following statement (only :group has changed):
spots = Spot.paginate(:all, :include => [:thing, :user, :store, {:thing => :tags}, {:thing => :brand}], :group => 'thing_id,things.id,users.id,spots.id', :order => order, :conditions => conditions, :page => page, :per_page => per_page)
This in turn produces the following SQL code:
SELECT * FROM (SELECT DISTINCT ON ("spots".id) "spots".id, spots.created_at AS alias_0 FROM "spots"
LEFT OUTER JOIN "things" ON "things".id = "spots".thing_id
WHERE (spots.recommended_to_user_id = 1 OR spots.user_id IN (1) OR things.is_featured = 't')
GROUP BY thing_id,things.id,users.id,spots.id) AS id_list
ORDER BY id_list.alias_0 DESC LIMIT 16 OFFSET 0;
...which produces the error PG::Error: ERROR: missing FROM-clause entry for table "users". How can I solve this?
Question 1:
...Is there a better way?
Yes. Since PostgreSQL 9.1 the primary key of a table logically covers all columns of a table in the GROUP BY clause. I quote the release notes for version 9.1:
Allow non-GROUP BY columns in the query target list when the primary
key is specified in the GROUP BY clause (Peter Eisentraut)
Question 2:
The following statement ... produces the error
PG::Error: ERROR: missing FROM-clause entry for table "users"
How can I solve this?
First (as always!), I formatted your query to make it easier to understand. The culprit is users.id in the GROUP BY clause, because no users table appears in the FROM list:
SELECT *
FROM  (
   SELECT DISTINCT ON (spots.id)
          spots.id, spots.created_at AS alias_0
   FROM   spots
   LEFT   JOIN things ON things.id = spots.thing_id
   WHERE (spots.recommended_to_user_id = 1 OR
          spots.user_id IN (1) OR
          things.is_featured = 't')
   GROUP  BY thing_id, things.id, users.id, spots.id
   ) id_list
ORDER  BY id_list.alias_0 DESC
LIMIT  16
OFFSET 0;
It's all obvious now, right?
Well, not all of it. There is a lot more. DISTINCT ON and GROUP BY in the same query for one, which has its uses, but not here. Radically simplify to:
SELECT s.id, s.created_at AS alias_0
FROM   spots s
WHERE  s.recommended_to_user_id = 1 OR
       s.user_id = 1 OR
       EXISTS (
          SELECT 1 FROM things t
          WHERE  t.id = s.thing_id
          AND    t.is_featured = 't')
ORDER  BY s.created_at DESC
LIMIT  16;
The EXISTS semi-join cannot multiply rows, so the need for a later GROUP BY never arises. This should be much faster (besides being correct) - if my assumptions about the missing table definitions hold.
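A small sanity check of the simplified query follows, as a sketch using Python's sqlite3 module instead of PostgreSQL. It assumes that things.id is a primary key and that is_featured holds 't' or 'f'. The rows are invented, and the column types are guesses because the real table definitions are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE things (id INTEGER PRIMARY KEY, is_featured TEXT);
    CREATE TABLE spots (
        id INTEGER PRIMARY KEY, thing_id INTEGER, user_id INTEGER,
        recommended_to_user_id INTEGER, created_at TEXT);
    INSERT INTO things VALUES (1, 't'), (2, 'f');
    INSERT INTO spots VALUES
        (1, 1, 5, NULL, '2012-01-01'),   -- featured thing
        (2, 2, 1, NULL, '2012-01-02'),   -- owned by user 1
        (3, 2, 5, 1,    '2012-01-03'),   -- recommended to user 1
        (4, 2, 5, NULL, '2012-01-04');   -- matches nothing
""")

exists_sql = """
    SELECT s.id, s.created_at AS alias_0
    FROM spots s
    WHERE s.recommended_to_user_id = 1
       OR s.user_id = 1
       OR EXISTS (SELECT 1 FROM things t
                  WHERE t.id = s.thing_id AND t.is_featured = 't')
    ORDER BY s.created_at DESC
    LIMIT 16
"""
join_sql = """
    SELECT s.id, s.created_at AS alias_0
    FROM spots s
    LEFT JOIN things t ON t.id = s.thing_id
    WHERE s.recommended_to_user_id = 1 OR s.user_id = 1 OR t.is_featured = 't'
    ORDER BY s.created_at DESC
    LIMIT 16
"""
exists_rows = conn.execute(exists_sql).fetchall()
join_rows = conn.execute(join_sql).fetchall()
print(exists_rows)
```

With a unique things.id both forms return the same spots, each exactly once, and neither needs DISTINCT ON or GROUP BY.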
Going the "pure SQL" route opened up a can of worms for me, so I tried keeping the will_paginate gem and tweaking the Spot.paginate parameters instead. The :joins parameter turned out to be very helpful.
This is currently working for me:
spots = Spot.paginate(:all, :include => [:thing, {:thing => :tags}, {:thing => :brand}], :joins => [:user, :store, :thing], :group => 'thing_id,things.id,users.id,spots.id', :order => order, :conditions => conditions, :page => page, :per_page => per_page)

Rails 3.0 One-One Association Using associated model in WHERE clause

When I do:
conditions = {:first_name => 'Chris'}
Patient.joins(:user).find(:all, :conditions => conditions)
It produces the following SQL, which fails because first_name is not in the patients table:
SELECT "patients".* FROM "patients" INNER JOIN "users" ON "users"."id" = "patients"."user_id" WHERE "patients"."first_name" = 'Chris'
I need to be able to query the User model's fields also and get back Patient objects. Is this possible?
Try this:
conditions = ['users.first_name = ?', 'Chris']
Patient.joins(:user).find(:all, :conditions => conditions)
Try changing your conditions hash to:
conditions = {'users.first_name' => 'Chris'}
I've used this style in Rails 2.3, and it worked great for me. Cheers!
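Both suggestions work for the same reason: the WHERE clause ends up qualifying first_name with users instead of patients. A minimal sketch of that difference at the SQL level, run on SQLite from Python with made-up rows (the Rails layer is left out entirely):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE patients (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users VALUES (1, 'Chris'), (2, 'Dana');
    INSERT INTO patients VALUES (10, 1), (11, 2);
""")

base = ("SELECT patients.* FROM patients "
        "INNER JOIN users ON users.id = patients.user_id ")

# Qualifying the column with the wrong table is rejected by the database.
try:
    conn.execute(base + "WHERE patients.first_name = ?", ("Chris",))
    failed = False
except sqlite3.OperationalError as exc:
    failed = True
    print("rejected:", exc)

# Qualifying it with users filters on the joined table,
# while the select list still returns only patients columns.
rows = conn.execute(base + "WHERE users.first_name = ?", ("Chris",)).fetchall()
print(rows)  # [(10, 1)]
```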

Pull back rows from multiple tables with a sub-select?

I have a script which generates queries in the following fashion (based on user input):
SELECT * FROM articles
WHERE (articles.skeywords_auto ilike '%pm2%')
AND spubid IN (
SELECT people.spubid FROM people
WHERE (people.slast ilike 'chow')
GROUP BY people.spubid)
LIMIT 1;
The resulting data set:
Array ( [0] =>
Array (
[spubid] => A00603
[bactive] => t
[bbatch_import] => t
[bincomplete] => t
[scitation_vis] => I,X
[dentered] => 2009-07-24 17:07:27.241975
[sentered_by] => pubs_batchadd.php
[drev] => 2009-07-24 17:07:27.241975
[srev_by] => pubs_batchadd.php
[bpeer_reviewed] => t
[sarticle] => Errata: PM2.5 and PM10 concentrations from the Qalabotjha low-smoke fuels macro-scale experiment in South Africa (vol 69, pg 1, 2001)
[spublication] => Environmental Monitoring and Assessment
[ipublisher] =>
[svolume] => 71
[sissue] =>
[spage_start] => 207
[spage_end] => 210
[bon_cover] => f
[scover_location] =>
[scover_vis] => I,X
[sabstract] =>
[sabstract_vis] => I,X
[sarticle_url] =>
[sdoi] =>
[sfile_location] =>
[sfile_name] =>
[sfile_vis] => I
[sscience_codes] =>
[skeywords_manual] =>
[skeywords_auto] => 1,5,69,2001,africa,assessment,concentrations,environmental,errata,experiment,fuels,low-smoke,macro-scale,monitoring,pg,pm10,pm2,qalabotjha,south,vol
[saward_number] =>
[snotes] =>
)
The problem is that I also need all the columns from the 'people' table (as referenced in the sub select) to come back as part of the data set. I haven't (obviously) done much with sub selects in the past so this approach is very new to me. How do I pull back all the matching rows/columns from the articles table AS WELL as the rows/column from the people table?
Are you familiar with joins? Using ANSI syntax:
SELECT DISTINCT *
FROM ARTICLES t
JOIN PEOPLE p ON p.spubid = t.spubid AND p.slast ILIKE 'chow'
WHERE t.skeywords_auto ILIKE '%pm2%'
LIMIT 1;
The DISTINCT saves you from having to define a GROUP BY for every column returned from both tables. I included it because you had the GROUP BY on your subquery; I don't know if it was actually necessary.
Could you not use a join instead of a sub-select in this case?
SELECT a.*, p.*
FROM articles as a
INNER JOIN people as p ON a.spubid = p.spubid
WHERE a.skeywords_auto ilike '%pm2%'
AND p.slast ilike 'chow'
LIMIT 1;
Let's start from the beginning.
You shouldn't need a group by. Use distinct instead (you aren't doing any aggregating in the inner query).
To see the contents of the inner table, you have to join it; its columns are not exposed unless the table appears in the FROM clause. A left outer join from the people table to the articles table should match the same rows as the IN query:
SELECT *
FROM people
LEFT OUTER JOIN articles ON articles.spubid = people.spubid
WHERE people.slast ilike 'chow'
AND articles.skeywords_auto ilike '%pm2%'
LIMIT 1
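To see what the join variants return, here is a small sketch on SQLite via Python. SQLite has no ILIKE, so LIKE (case-insensitive for ASCII by default) stands in for it. The tables are cut down to a few columns and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (spubid TEXT, sarticle TEXT, skeywords_auto TEXT);
    CREATE TABLE people (spubid TEXT, slast TEXT);
    INSERT INTO articles VALUES
        ('A00603', 'Errata: PM2.5 and PM10', 'errata,pm10,pm2'),
        ('A00700', 'Unrelated paper', 'ozone');
    INSERT INTO people VALUES
        ('A00603', 'Chow'), ('A00700', 'Chow'), ('A00603', 'Watson');
""")

cur = conn.execute("""
    SELECT a.*, p.*
    FROM articles AS a
    INNER JOIN people AS p ON a.spubid = p.spubid
    WHERE a.skeywords_auto LIKE '%pm2%' AND p.slast LIKE 'chow'
""")
columns = [d[0] for d in cur.description]  # columns of both tables
inner_rows = cur.fetchall()

# Same filters, written as a left outer join starting from people.
left_rows = conn.execute("""
    SELECT a.*, p.*
    FROM people AS p
    LEFT OUTER JOIN articles AS a ON a.spubid = p.spubid
    WHERE p.slast LIKE 'chow' AND a.skeywords_auto LIKE '%pm2%'
""").fetchall()

print(columns)
print(inner_rows)
```

Because the WHERE clause filters on an articles column, the left outer join discards its NULL-padded rows and returns the same rows as the inner join here. Both tables have a spubid column, so a result keyed by column name can hold only one of them unless the columns are aliased.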