How do I access the joined columns when using custom Arel joins?

I have a simple database with the following schema:
Book has many Tags through Taggings
Book has many Users through ReadingStatuses
What I want to do is to list all of the books, their tags, and a reading status of the currently logged in user with each book. I've managed to write this using Arel (with the arel-helpers gem), but I don't know how to access the results in each book entry while iterating over the books array.
Here's the query
join_params = Book.arel_table.join(ReadingStatus.arel_table, Arel::OuterJoin)
.on(Book[:id].eq(ReadingStatus[:book_id])
.and(ReadingStatus[:user_id].eq(User.first.id)))
.join_sources
books = Book.all.includes(:tags).joins(join_params)
and the respective SQL it generates
SELECT "books".* FROM "books"
LEFT OUTER JOIN "reading_statuses"
ON "books"."id" = "reading_statuses"."book_id"
AND "reading_statuses"."user_id" = 'XXX'
There's nothing really to be done with the tags, since includes makes book.tags work automatically. What I don't know is how to access the ReadingStatus that is joined to each Book when iterating over the books result.

Try using includes instead of joins. includes does eager loading, but if you don't mind that, it might make your query look a lot simpler.
You will also not have to explicitly mention the left outer join.
See if that helps:
Pulling multiple levels of data efficiently in Rails
Rails: How to fetch records that are 2 'has_many' levels deep?
Eager loading
Make sure you include the call to references
"ReadingStatus[:user_id].eq(User.first.id)" can be shifted into the where clause

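For what it's worth, here is a minimal sketch of the SQL behaviour behind this thread, written in Python with the standard sqlite3 module rather than Rails. The table layout, the `status` column and the sample rows are assumptions made up for illustration. It shows two things: the joined table's columns only come back if they are named in the SELECT list (the generated `SELECT "books".*` leaves them out), and moving the user filter from the ON clause into WHERE drops books that have no status for that user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE reading_statuses (
        id INTEGER PRIMARY KEY, book_id INTEGER, user_id INTEGER, status TEXT);
    INSERT INTO books VALUES (1, 'Dune'), (2, 'Emma');
    INSERT INTO reading_statuses VALUES (1, 1, 7, 'reading'), (2, 2, 8, 'finished');
""")

def books_with_status(user_id):
    # User filter lives in the ON clause: every book is kept,
    # and status is NULL (None) when this user has no status for it.
    return conn.execute("""
        SELECT books.id, books.title, reading_statuses.status
        FROM books
        LEFT OUTER JOIN reading_statuses
          ON books.id = reading_statuses.book_id
         AND reading_statuses.user_id = ?
        ORDER BY books.id
    """, (user_id,)).fetchall()

def books_with_status_in_where(user_id):
    # Same join, but the user filter is in WHERE: rows whose joined
    # side is NULL fail the filter, so books without a status vanish.
    return conn.execute("""
        SELECT books.id, books.title, reading_statuses.status
        FROM books
        LEFT OUTER JOIN reading_statuses
          ON books.id = reading_statuses.book_id
        WHERE reading_statuses.user_id = ?
        ORDER BY books.id
    """, (user_id,)).fetchall()

with_on = books_with_status(7)
with_where = books_with_status_in_where(7)
```

In ActiveRecord terms, the same idea would be adding the joined column to the select list, for example `select("books.*, reading_statuses.status AS reading_status")`, assuming a `status` column exists.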
Related

Automatically connect SQL tables based on keys

Is there a method to automatically join tables that have a primary-to-foreign key relationship, rather than designating the join on those values?
The out and out answer is "no" - no RDBMS I know of will allow you to get away with not specifying columns in an ON clause intended to join two tables in a non-cartesian fashion, but it might not matter...
...because typically multi tier applications these days are built with data access libraries that DO take into account the relationships defined in a database. Picking on something like entity framework, if your database exists already, then you can scaffold a context in EF from it, and it will make a set of objects that obey the relationships in the frontend code side of things
Technically, you'll never write an ON clause yourself, because if you say something to EF like:
context.Customers.First(c => c.id == 1) //this finds a customer
.Orders //this gets all the customer's orders
.Where(o => o.date > DateTime.UtcNow.AddMonths(-1)); //this filters the orders
You've got all the orders raised by customer id 1 in the last month, without writing a single ON clause yourself. EF has written it behind the scenes, but in the spirit of your question, where tables are related by a declared relationship, we've used a framework that uses that relationship to relate the data for whatever purpose the frontend puts it to. All you have to do is use a data access library that does this, if you have an aversion to writing ON clauses yourself :)
It's a virtual certainty that there will be some similar ORM/mapping/data access library for your front end language of choice - I just picked on EF in C# because it's what I know. If you're scouting out what's out there, google for {language of choice} ORM (if you're using an OO language). You mentioned python; SQLAlchemy seems to be a popular one (but note, SO answers are not for recommending particular software).
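To make the "the library writes the ON clause for you" idea concrete, here is a rough sketch in Python using the standard sqlite3 module. It is not how EF or any particular ORM is implemented; it only illustrates reading the declared foreign key from the database's metadata and generating the join from it. The `customers`/`orders` tables and the `join_sql` helper are hypothetical, and the sketch assumes the child table has exactly one foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        total REAL);
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 5.0);
""")

def join_sql(child):
    # PRAGMA foreign_key_list returns one row per foreign key column:
    # (id, seq, parent_table, from_column, to_column, on_update, on_delete, match)
    fk = conn.execute(f"PRAGMA foreign_key_list({child})").fetchone()
    parent, from_col, to_col = fk[2], fk[3], fk[4]
    # Build the ON clause from the metadata instead of hand-writing it.
    return (f"SELECT * FROM {child} JOIN {parent} "
            f"ON {child}.{from_col} = {parent}.{to_col}")

sql = join_sql("orders")
rows = conn.execute(sql).fetchall()
```

The ON clause still exists in the SQL that reaches the database; it has just been derived from the relationship rather than typed by hand.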
If you mean can you write a JOIN at query time that doesn't need an ON clause, then no.
There is no way to do this in SQL Server.
I am not sure if you are aware of dbForge; it may help. It recognises joinable tables automatically in the following cases:
The database contains information that specifies that the tables are related.
If two columns, one in each table, have the same name and data type.
Forge Studio detects that a search condition (e.g. the WHERE clause) is actually a join condition.

How to prevent rails `has_many` relation joining two huge tables

I am using Ruby on Rails 3.1.10 in developing a web application.
Objective is to find all users that a user is following.
Let there be two models User and Following
In User model:
has_many :following_users, :through => :followings
When calling user.following_users, Rails by default generates a query that INNER JOINs the users and followings tables.
When the users table has over 50,000 records and the followings table has over 10,000,000 records, the generated inner join is resource-intensive.
Any thoughts on how to optimize the performance by avoiding inner joining two big tables?
To avoid a single query with inner join, you can do 2 select queries by using the following method
# User.rb
# assuming that Following has a followed_id column for user that is being followed
def following_users_nojoin
@following_users_nojoin ||= User.where("id IN (?)", followings.map(&:followed_id))
end
This does not perform a join; it makes two SQL queries instead. The first gets all the followings that belong to the user (unless they are already in the cache) and the second finds all the followed users. A user_id index on followings, as suggested in the comment, would speed up the first query.
The above method would be faster than a single join query if the followings of a user have already been retrieved.
Read this for details on whether it is faster to make multiple select queries over a single query with join. The best way to find out which one is faster is to benchmark both methods on your production database.
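For illustration, the two-query approach looks roughly like this at the SQL level. This is a sketch in Python with the standard sqlite3 module, not Rails; the `followed_id` column follows the assumption made in the answer above, and the sample data and index name are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE followings (
        id INTEGER PRIMARY KEY, user_id INTEGER, followed_id INTEGER);
    -- index that serves the first query (followings of one user)
    CREATE INDEX idx_followings_user_id ON followings (user_id);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO followings VALUES (1, 1, 2), (2, 1, 3), (3, 2, 1);
""")

def following_users_nojoin(user_id):
    # Query 1: ids of the users this user follows.
    followed_ids = [row[0] for row in conn.execute(
        "SELECT followed_id FROM followings WHERE user_id = ?", (user_id,))]
    if not followed_ids:
        return []
    # Query 2: fetch those users by primary key, no join involved.
    placeholders = ",".join("?" * len(followed_ids))
    return conn.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders}) ORDER BY id",
        followed_ids).fetchall()

followed = following_users_nojoin(1)
```

One caveat: databases limit how many bound parameters a single statement may carry, so a user following a very large number of accounts would need the id list split into batches.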

Hibernate Criteria queries and how to efficiently join data?

"THIS IS MY SQL THAT I WANT TO CONVERT TO CRTIERIA:
select be.* from BlogEntry be join Blog b on be.blog=b.id join Follower f on b.id=f.blogId where be.publishStatus='published' and be.secured=false and f.user=? union select be1.* from BlogEntry be1 join SecureUser s on be1.id=s.blogEntryId join User u on s.userProfile=u.userProfile and u.id=? order by publishDate desc";
hello, folks. i have been trying to use HQL and native SQL to execute the above query and i have been frustrated at every turn, for the most part because doing UNION is super awkward in Hibernate. even if you try a SQLQuery you still have the whole mess of establishing your entity relationships by being FORCED to include every single attribute of every subclass referenced in the SQL. this is proving to be a total pain to get past.
SO, i am moving on to a possible criteria query solution, but i think i need some help. the query below is totally fine in MySQL workbench and fast as lightning. the hump i am trying to get over with the criteria query is that some of my entity relationships are defined by foreign key references in the tables and some are not. when they ARE, i can, of course, do something like this (which evaluates part of my query, before the UNION):
ExtendedDetachedCriteria entryDetachedCriteria = extendedDetachedCriteria.forClass(BlogEntry.class);
entryDetachedCriteria.createAlias("blogEntry","blogEntry");
entryDetachedCriteria.createAlias("blogEntry.blog", "blog");
etc, etc...
HOWEVER, when i am joining data in a different way, like in this portion of the SQL:
select be1.* from BlogEntry be1 join SecureUser s on be1.id=s.blogEntryId (no actual foreign key relationship defined in the tables, SecureUser entities are just stamped with the relevant BlogEntry ID when they are created)
how should i write the criteria queries differently from the way demonstrated above?
i realize that questions like these are a total pain to get your head around if you are not already knee deep in trying to solve - please excuse the convoluted-ness of the question i am asking. i would deeply appreciate any guidance someone could offer, even if it's "get your hibernate act together, ya doofus!". sort of stuck at the moment.
createAlias creates an inner join using an association between entities. So,
criteria.createAlias("blogEntry","blogEntry");
is equivalent to the following HQL:
inner join rootEntity.blogEntry as blogEntry.
The root entity is BlogEntry. And I guess you don't have a blogEntry field in the BlogEntry entity. So this line doesn't make sense.
If you don't have any association between two entities, you can't make a join. You're reduced to making an inner join in the form of an equality between two fields in the where clause:
select be1 from BlogEntry be1, SecureUser s
where be1.id = s.blogEntryId
But since Criteria only allows selecting from one root entity and a series of joined associations, it's impossible to do this using Criteria.
Your best bets are:
to do it in SQL
to do it using 2 separate HQL queries, and join the results using Java.
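The second option amounts to emulating UNION plus ORDER BY in application code. A minimal sketch of that merge step, in Python rather than Java, with entries represented as plain dicts; the `merge_entries` helper, the field names and the sample data are hypothetical.

```python
from datetime import date

def merge_entries(followed_entries, secured_entries):
    """Combine two query results like SQL UNION ... ORDER BY publishDate DESC."""
    # UNION removes duplicates; here an entry is identified by its id.
    by_id = {entry["id"]: entry for entry in followed_entries}
    for entry in secured_entries:
        by_id.setdefault(entry["id"], entry)
    # Newest first, matching "order by publishDate desc".
    return sorted(by_id.values(), key=lambda e: e["publish_date"], reverse=True)

followed_entries = [
    {"id": 1, "publish_date": date(2012, 1, 5)},
    {"id": 2, "publish_date": date(2012, 3, 1)},
]
secured_entries = [
    {"id": 2, "publish_date": date(2012, 3, 1)},
    {"id": 3, "publish_date": date(2012, 2, 10)},
]
merged_ids = [e["id"] for e in merge_entries(followed_entries, secured_entries)]
```

The trade-off is that sorting and de-duplication happen in memory, so paging the combined result is harder than with a single SQL query.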

Is eager loading same as join fetch?

Is eager fetch same as join fetch?
I mean whether eagerly fetching a has-many relation fires 2 queries or a single join query?
How does Rails Active Record implement a join fetch of associations when it doesn't know the table's metadata (I mean the columns in the table) beforehand? Say, for example, I have
people - id, name
things - id, person_id, name
person has one-to-many relation with the things. So how does it generate the query with all the column aliases even though it cannot know it when i do a join fetch on people?
An answer hasn't been accepted so I will try to answer your questions as I understand them:
"how does it know all the fields available in a table?"
It does a SQL query for every class that inherits from ActiveRecord::Base. If the class is 'Dog', it will do a query to find the column names of the table 'dogs'. In production mode it should only do this query once per run of the server -- in development mode it does it a lot. The query will differ depending on the database you use, and it is usually an expensive query.
"Say if i have a same name for column in a table and in an associated table how does it resolve this?"
If you are doing a join, it generates SQL using the table names as prefixes to avoid ambiguities. In fact, if you are doing a join in Rails and want to add a condition (using custom SQL) for name, but both the main table and join table have a name column, you need to specify the table name in your SQL. (e.g. Human.joins(:pets).where("humans.name = 'John'"))
"I mean whether eagerly fetching a has-many relation fires 2 queries or a single join query?"
Different Rails versions are different. I think that early versions did a single join query at all times. Later versions would sometimes do multiple queries and sometimes a single join query, based on the realization that a single join query isn't always as performant as multiple queries. I'm not sure of the exact logic that it uses to decide. Recently, in Rails 3, I am seeing multiple queries happening in my current codebase -- but maybe it sometimes does a join as well, I'm not sure.
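As an illustration of the metadata lookup described above, here is a small sketch in Python with the standard sqlite3 module. It asks the database for a table's column names and then builds a select list in which every column gets a unique alias, so that same-named columns in two tables stay distinguishable. The `t0_r0` alias scheme and both helper functions are illustrative assumptions, not a claim about what any given Rails version emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE things (id INTEGER PRIMARY KEY, person_id INTEGER, name TEXT);
""")

def column_names(table):
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

def aliased_select_list(tables):
    # Prefix each column with its table and give it a unique alias.
    parts = []
    for t_index, table in enumerate(tables):
        for c_index, column in enumerate(column_names(table)):
            parts.append(f"{table}.{column} AS t{t_index}_r{c_index}")
    return ", ".join(parts)

select_list = aliased_select_list(["people", "things"])
```

With the aliases in place, `people.name` and `things.name` can both appear in one joined result row without colliding.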
It knows the columns through a type of reflection. Ruby is very flexible and allows you to build functionality that will be used/defined during runtime and doesn't need to be stated ahead of time. It learns the associated "person_id" column by interpreting the "belongs_to :person" and knowing that "person_id" is the field that would be associated and the table would be called "people".
If you do Person.includes(:things) then it will generate 2 queries: one that gets the people and a second that gets the things related to those people.
http://guides.rubyonrails.org/active_record_querying.html
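The two-query form of eager loading can be sketched like this, in Python with the standard sqlite3 module instead of Rails. The people/things tables come from the question; the sample rows and the `people_with_things` helper are made up for illustration.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE things (id INTEGER PRIMARY KEY, person_id INTEGER, name TEXT);
    INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO things VALUES (1, 1, 'hat'), (2, 1, 'pen');
""")

def people_with_things():
    # Query 1: load the parents.
    people = conn.execute("SELECT id, name FROM people ORDER BY id").fetchall()
    ids = [person_id for person_id, _ in people]
    things_by_person = defaultdict(list)
    if ids:
        # Query 2: load all children of those parents in one go.
        placeholders = ",".join("?" * len(ids))
        query = (f"SELECT person_id, name FROM things "
                 f"WHERE person_id IN ({placeholders}) ORDER BY id")
        for person_id, thing_name in conn.execute(query, ids):
            things_by_person[person_id].append(thing_name)
    # Attach children to parents in memory; no per-person query is issued.
    return [(name, things_by_person[person_id]) for person_id, name in people]

loaded = people_with_things()
```

However many people are loaded, only two statements hit the database, which is the point of eager loading compared with one query per person.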

SQL Modeling / Query Question

I currently have this database structure:
One entry can have multiple items of the type "file", "text" and "url".
Everyone of these items has exactly one corresponding item in either the texts, urls or files table - where data is stored.
I need a query to efficiently select an entry with all its corresponding items and their data.
So my first approach was something like
SELECT * FROM entries LEFT JOIN entries_items LEFT JOIN texts LEFT JOIN urls LEFT JOIN files
and then loop through it and do the post processing in my application.
But the thing is that it's very unlikely that multiple items of different types exist. It's even a rare case that more than one item exists per entry. And in most cases it will be a file. But I need it anyway...
So, to avoid scanning all 3 tables for every item, I thought I could do something like a case/switch and scan the corresponding table based on the value of "type" in entries_items.
But I couldn't get it working.
I also thought about doing the case/switch logic in the application, but then I would have multiple queries, which would probably be slower as the MySQL server will be external.
I can also change the structure if you have a better approach!
I also thought about having all the fields of "texts", "urls" and "files" inside the table entries_items, as it's only a 1:1 relation, and just leaving everything that is not needed null.
What would be the pros/cons of that? I think it needs more storage space and I can't do my constraints as I have them now. Everything also needs to be nullable...
Well I am open to all sorts of ideas. The application is not written yet, so I can basically change whatever I like.
You have three different entity types (URL, TEXT, FILE) being linked to the primary ENTRIES table via the intermediary table ENTRIES_ITEMS, and you are violating normal form with this "conditional join" approach. Given your structure, it is impossible to declare a foreign key constraint on ENTRIES_ITEMS.id because the id column could reference the URLS, the TEXTS, or the FILES table. To normalize the ENTRIES_ITEMS table you would have to add three separate fields, urlid, textid, and fileid and allow them to be nullable, and then you could join each of the three entities tables to the ENTRIES table via your linking table. The approach you are taking is very commonly found in legacy databases that were not SQL92-compliant, where the values were grabbed from the entities tables programmatically/procedurally rather than declaratively using SQL selects.
I would first consider adding a column to your "entries_items" table that contains an XML representation of texts, urls, and files. I can't speak for MySQL, but SQL Server has fantastic facilities for handling XML. I bet MySQL does too.
If not a state-of-the-art technique like that, then I would consider going retro and just having one items table with many nulls, as you already considered.
This may get you started, but will not resolve the hierarchical structure (parent_id) of entries and entries_items.
select *
from entries as e
join entries_items as i on i.entry_id = e.id
left join texts as t on t.item_id = i.id and i.type = 'text'
left join urls as u on u.item_id = i.id and i.type = 'url'
left join files as f on f.file_id = i.id and i.type = 'file'
;
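The query above can be tried out against a throwaway schema. This is a sketch in Python with the standard sqlite3 module; the join columns are copied from the query, while the payload columns (`body`, `href`, `path`), the sample rows and the COALESCE in the select list are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries (id INTEGER PRIMARY KEY);
    CREATE TABLE entries_items (id INTEGER PRIMARY KEY, entry_id INTEGER, type TEXT);
    CREATE TABLE texts (item_id INTEGER, body TEXT);
    CREATE TABLE urls  (item_id INTEGER, href TEXT);
    CREATE TABLE files (file_id INTEGER, path TEXT);
    INSERT INTO entries VALUES (1);
    INSERT INTO entries_items VALUES (1, 1, 'file'), (2, 1, 'url');
    INSERT INTO files VALUES (1, '/tmp/a.bin');
    INSERT INTO urls VALUES (2, 'http://example.com');
""")

rows = conn.execute("""
    SELECT e.id, i.id, i.type,
           -- only the table matching i.type contributes a non-NULL value
           COALESCE(t.body, u.href, f.path) AS payload
    FROM entries AS e
    JOIN entries_items AS i ON i.entry_id = e.id
    LEFT JOIN texts AS t ON t.item_id = i.id AND i.type = 'text'
    LEFT JOIN urls  AS u ON u.item_id = i.id AND i.type = 'url'
    LEFT JOIN files AS f ON f.file_id = i.id AND i.type = 'file'
    ORDER BY i.id
""").fetchall()
```

Each item row comes back once, with the data from whichever detail table matches its type and NULLs from the other two.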
If considering the model cleanup, this may be a starting point.