How would I do this JOIN in Rails? - sql

Here's my SQL statement:
SELECT *
FROM `message_users`
LEFT JOIN `messages` ON message_users.message_id = messages.id
WHERE (message_users.user_id = 1 AND message_users.hidden = 0) AND message_users.last_read_at > messages.updated_at
ORDER BY messages.updated_at DESC LIMIT 0, 20
How would I pull that off with proper Rails joins/includes/whatever?

You don't typically load all the data from multiple tables into one model in Rails. It is more common to replace the joins below with include, which preloads the associated model so that you hit the cache when calling message.message_users. At any rate, this should duplicate what your SQL was doing, as long as there are no column name clashes between messages and message_users.
If you don't need the data from message_users after the query is performed, you can remove the select fragment.
Message.find(:all,
:joins=>:message_users,
:select=>"message_users.*, messages.*",
:conditions=>['message_users.user_id = ? and message_users.hidden = ? and message_users.last_read_at > messages.updated_at', 1,0],
:order=>"messages.updated_at desc",
:limit=>20)
There's a good screencast about the difference between joins and include here

Related

The "where" condition worked not as expected ("or" issue)

I have a problem joining these 4 tables
Model of my database
I want to count the number of reservations sorted in different ways (by user [mrbs_users.id], room [mrbs_room.room_id], area [mrbs_area.area_id]).
However, when I execute this query (for the user with id = 1):
SELECT count(*)
FROM mrbs_users JOIN mrbs_entry ON mrbs_users.name=mrbs_entry.create_by
JOIN mrbs_room ON mrbs_entry.room_id = mrbs_room.id
JOIN mrbs_area ON mrbs_room.area_id = mrbs_area.id
WHERE mrbs_entry.start_time BETWEEN "145811700" and "1463985000"
or
mrbs_entry.end_time BETWEEN "1458120600" and "1463992200" and mrbs_users.id = 1
The result is the total number of reservations of every user, not just the user who has the id = 1.
So if anyone could help me.. Thanks in advance.
Use parentheses in the where clause whenever you have more than one condition. Because AND binds more tightly than OR, your where is parsed as:
WHERE (mrbs_entry.start_time BETWEEN "145811700" and "1463985000" ) or
(mrbs_entry.end_time BETWEEN "1458120600" and "1463992200" and
mrbs_users.id = 1
)
Presumably, you intend:
WHERE (mrbs_entry.start_time BETWEEN 145811700 and 1463985000 or
mrbs_entry.end_time BETWEEN 1458120600 and 1463992200
) and
mrbs_users.id = 1
Also, I removed the quotes around the string constants. It is bad practice to mix data types, and in some databases, the conversion between types can make the query less efficient.
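To see the precedence rule in action, here is a minimal sketch using Python's built-in sqlite3 module. The entry table, its columns, and the small numeric ranges are made up for illustration (they are not the MRBS schema); the only point is that AND is evaluated before OR, so the two WHERE clauses count different rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for mrbs_entry joined to mrbs_users.
conn.execute("CREATE TABLE entry (user_id INTEGER, start_time INTEGER, end_time INTEGER)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?, ?)",
    [
        (1, 100, 200),  # user 1, both times in range
        (2, 100, 200),  # user 2, both times in range
        (2, 150, 900),  # user 2, only start_time in range
        (1, 900, 950),  # user 1, neither time in range
    ],
)

# Without parentheses this is read as: start_range OR (end_range AND user_id = 1).
unparenthesized = conn.execute(
    "SELECT COUNT(*) FROM entry "
    "WHERE start_time BETWEEN 50 AND 300 "
    "OR end_time BETWEEN 50 AND 300 AND user_id = 1"
).fetchone()[0]

# With parentheses the user filter applies to both range checks.
parenthesized = conn.execute(
    "SELECT COUNT(*) FROM entry "
    "WHERE (start_time BETWEEN 50 AND 300 "
    "OR end_time BETWEEN 50 AND 300) AND user_id = 1"
).fetchone()[0]

print(unparenthesized, parenthesized)  # prints: 3 1
```

The first count includes rows from user 2 because the start_time check alone is enough to satisfy the OR, which is the symptom described in the question.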
The problem is caused by an incorrect WHERE condition.
It should be:
WHERE (mrbs_entry.start_time BETWEEN 145811700 AND 1463985000
OR
mrbs_entry.end_time BETWEEN 1458120600 AND 1463992200)
AND mrbs_users.id = 1
Moreover, when you use only INNER JOIN (JOIN), you can move these criteria out of the WHERE clause and into an ON clause. For inner joins the result is the same. The ON conditions are applied as part of the join, which may perform better on some engines, although many optimizers treat both forms identically.
Your query in this case would look like this:
SELECT COUNT(*)
FROM mrbs_users
JOIN mrbs_entry ON mrbs_users.name=mrbs_entry.create_by
JOIN mrbs_room ON mrbs_entry.room_id = mrbs_room.id
AND
(mrbs_entry.start_time BETWEEN 145811700 AND 1463985000
OR mrbs_entry.end_time BETWEEN 1458120600 AND 1463992200
)
AND mrbs_users.id = 1
JOIN mrbs_area ON mrbs_room.area_id = mrbs_area.id

Oracle Sub-select taking a long time

I have a SQL query that comprises two levels of sub-selects. It is taking too much time.
The query goes like:
select * from DALDBO.V_COUNTRY_DERIV_SUMMARY_XREF
where calculation_context_key = 130205268077
and DERIV_POSITION_KEY in
(select ctry_risk_derivs_psn_key
from DALDBO.V_COUNTRY_DERIV_PSN
where calculation_context_key = 130111216755
--and ctry_risk_derivs_psn_key = 76296412
and CREDIT_PRODUCT_TYPE = 'SWP OP'
and CALC_OBLIGOR_COUNTRY_OF_ASSETS in
(select ctry_cd
from DALDBO.V_PSN_COUNTRY
where calculation_context_key = 130134216755
--and ctry_risk_derivs_psn_key = 76296412
)
)
These tables are huge! Are there any optimizations available?
Without knowing anything about your table or view definitions, indexing, etc. I would start by looking at the sub-selects and ensuring that they are performing optimally. I would also want to know how many values are being returned by each sub-select as this can impact performance.
How is calculation_context_key used to retrieve rows from V_COUNTRY_DERIV_PSN and V_PSN_COUNTRY? Is it an optimal execution plan?
How are DERIV_POSITION_KEY and CALC_OBLIGOR_COUNTRY_OF_ASSETS used in V_COUNTRY_DERIV_SUMMARY_XREF to retrieve rows? Again, look at the explain plan.
First of all, can you write this query using inner joins (and not sub-selects)?
select A.*
from DALDBO.V_COUNTRY_DERIV_SUMMARY_XREF a,
DALDBO.V_COUNTRY_DERIV_PSN b,
DALDBO.V_PSN_COUNTRY c
where a.calculation_context_key = 130205268077
and a.DERIV_POSITION_KEY = b.ctry_risk_derivs_psn_key
and b.calculation_context_key = 130111216755
--and b.ctry_risk_derivs_psn_key = 76296412
and b.CREDIT_PRODUCT_TYPE = 'SWP OP'
and b.CALC_OBLIGOR_COUNTRY_OF_ASSETS = c.ctry_cd
and c.calculation_context_key = 130134216755
--and c.ctry_risk_derivs_psn_key = 76296412
Second, best practice says that when you don't select any data from the tables in the sub-select, you are better off using EXISTS instead of IN. Newer versions of Oracle do that automatically and may actually rewrite the whole thing as a join.
Last, without any knowledge of your data or of what you are trying to do, I would suggest using views as little as you can. If you can query the underlying tables directly, that would be best, and you will probably see an immediate performance improvement.
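One thing to check before swapping IN for a plain join: the two are only interchangeable when the joined side has at most one matching row per key. The sketch below uses Python's built-in sqlite3 module with two tiny made-up tables (summary and psn are hypothetical stand-ins for the Oracle views, not their real definitions) to show that IN and EXISTS return the same rows, while a join repeats a row when the key is duplicated on the other side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE summary (deriv_position_key INTEGER);
    CREATE TABLE psn (psn_key INTEGER, product_type TEXT);
    INSERT INTO summary VALUES (1), (2), (3);
    -- psn_key 1 appears twice on purpose
    INSERT INTO psn VALUES (1, 'SWP OP'), (1, 'SWP OP'), (2, 'OTHER');
""")

# Semi-join written with IN.
in_rows = conn.execute(
    "SELECT s.deriv_position_key FROM summary s "
    "WHERE s.deriv_position_key IN "
    "(SELECT p.psn_key FROM psn p WHERE p.product_type = 'SWP OP')"
).fetchall()

# The same semi-join written with a correlated EXISTS.
exists_rows = conn.execute(
    "SELECT s.deriv_position_key FROM summary s "
    "WHERE EXISTS (SELECT 1 FROM psn p "
    "WHERE p.psn_key = s.deriv_position_key AND p.product_type = 'SWP OP')"
).fetchall()

# A plain inner join produces one output row per matching psn row.
join_rows = conn.execute(
    "SELECT s.deriv_position_key FROM summary s "
    "JOIN psn p ON p.psn_key = s.deriv_position_key "
    "WHERE p.product_type = 'SWP OP'"
).fetchall()

print(in_rows, exists_rows, join_rows)  # prints: [(1,)] [(1,)] [(1,), (1,)]
```

If the real views can contain several rows per key, the join version would need DISTINCT (or should stay as EXISTS) to return the same result as the original query.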

Applying two separate filters on a Rails database query

I am trying to apply two filters to a database query in Rails 3. The first filter shows only media of type images. The second filter shows the highest saluted stories. On their own the filters work ok, but when I try to combine both filters, I get errors.
There are 3 tables involved. Stories, memories, and salutes. The salutes table keeps track of how many times someone 'salutes' a memory. Each story is composed of multiple memories. A story's total salutes is the sum of the salutes of that story's memories. I want to retrieve records of image-only stories in the order of highest to lowest salutes.
models/story.rb
def self.where_contains_image()
joins(
'INNER JOIN memories AS wci_memories ON wci_memories.story_id = stories.id'
)
.where(
'wci_memories.media_type_cd = ?', Memory.image
).uniq
end
controllers/stories_controller.rb
if params[:filter_content] == 'image'
stories = stories.where_contains_image
end
if (params[:filter_trends] == 'most_saluted')
stories = stories.order("(SELECT COUNT(1) FROM salutes
LEFT JOIN memories AS ms_memories ON salutes.content_id = ms_memories.id
LEFT JOIN stories AS ms_stories ON ms_stories.id = ms_memories.story_id
WHERE ms_stories.id = stories.id AND salutes.content_type = 'Memory')
DESC");
end
On its own, when the 'most_saluted' param is set, the query works as expected. When both the 'most_saluted' param and the 'image' param are set, I get an error:
for SELECT DISTINCT, ORDER BY expressions must appear in select list
I understand what the error is, but I cannot figure out how to rewrite the queries so that it can return only images in the order of most saluted.
When I run this SQL query on the database, it returns the records I'm looking for. But I cannot figure out how to make rails return the same records. Furthermore, this query combines the two filters (only images and highest salutes). I want to keep them separate so that I can apply one filter individually, or both together.
SELECT DISTINCT stories.*, (SELECT COUNT(1) FROM salutes
LEFT JOIN memories AS ms_memories ON salutes.content_id = ms_memories.id
LEFT JOIN stories AS ms_stories ON ms_stories.id = ms_memories.story_id
WHERE ms_stories.id = stories.id AND salutes.content_type = 'Memory')
AS total_salutes FROM stories INNER JOIN memories AS wci_memories
ON wci_memories.story_id = stories.id WHERE wci_memories.media_type_cd = 0
ORDER BY total_salutes DESC
Any thoughts on how I can resolve this?
You can use scopes to achieve this. ActiveRecord scopes are a cleaner, more modular way to chain conditions.
Read here about scopes
HTH

Bad performance of SQL query due to ORDER BY clause

I have a query joining 4 tables with a lot of conditions in the WHERE clause. The query also includes an ORDER BY clause on a numeric column. It takes 6 seconds to return, which is too long, and I need to speed it up. Surprisingly, I found that if I remove the ORDER BY clause it takes 2 seconds. Why does the ORDER BY make such a massive difference, and how can I optimize it? I am using SQL Server 2005. Many thanks.
I cannot confirm that the ORDER BY makes a big difference since I am clearing the execution plan cache. However, can you shed some light on how to speed this up a little? The query is as follows (for simplicity there is "SELECT *", but I am only selecting the columns I need).
SELECT *
FROM View_Product_Joined j
INNER JOIN [dbo].[OPR_PriceLookup] pl on pl.siteID = NodeSiteID and pl.skuid = j.skuid
LEFT JOIN [dbo].[OPR_InventoryRules] irp on irp.ID = pl.SkuID and irp.InventoryRulesType = 'Product'
LEFT JOIN [dbo].[OPR_InventoryRules] irs on irs.ID = pl.siteID and irs.InventoryRulesType = 'Store'
WHERE (((((SiteName = N'EcommerceSite') AND (Published = 1)) AND (DocumentCulture = N'en-GB')) AND (NodeAliasPath LIKE N'/Products/Cats/Computers/Computer-servers/%')) AND ((NodeSKUID IS NOT NULL) AND (SKUEnabled = 1) AND pl.PriceLookupID in (select TOP 1 PriceLookupID from OPR_PriceLookup pl2 where pl.skuid = pl2.skuid and (pl2.RoleID = -1 or pl2.RoleId = 13) order by pl2.RoleID desc)))
ORDER BY NodeOrder ASC
Why does the ORDER BY make such a massive difference, and how can I optimize it?
The ORDER BY needs to sort the resultset which may take long if it's big.
To optimize it, you may need to index the tables properly.
The index access path, however, has its drawbacks so it can even take longer.
If you have something other than equijoins in your query, or the ranged predicates (like <, > or BETWEEN, or GROUP BY clause), then the index used for ORDER BY may prevent the other indexes from being used.
If you post the query, I'll probably be able to tell you how to optimize it.
Update:
Rewrite the query:
SELECT *
FROM View_Product_Joined j
LEFT JOIN
[dbo].[OPR_InventoryRules] irp
ON irp.ID = j.skuid
AND irp.InventoryRulesType = 'Product'
LEFT JOIN
[dbo].[OPR_InventoryRules] irs
ON irs.ID = j.NodeSiteID
AND irs.InventoryRulesType = 'Store'
CROSS APPLY
(
SELECT TOP 1 *
FROM OPR_PriceLookup pl
WHERE pl.siteID = j.NodeSiteID
AND pl.skuid = j.skuid
AND pl.RoleID IN (-1, 13)
ORDER BY
pl.RoleID desc
) pl
WHERE SiteName = N'EcommerceSite'
AND Published = 1
AND DocumentCulture = N'en-GB'
AND NodeAliasPath LIKE N'/Products/Cats/Computers/Computer-servers/%'
AND NodeSKUID IS NOT NULL
AND SKUEnabled = 1
ORDER BY
NodeOrder ASC
The relation View_Product_Joined, as the name suggests, is probably a view.
Could you please post its definition?
If it is indexable, you may benefit from creating an index on View_Product_Joined (SiteName, Published, DocumentCulture, SKUEnabled, NodeOrder).
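As a rough illustration of why an index on the ORDER BY column can remove the sort step, here is a small sketch using Python's built-in sqlite3 module. SQLite stands in for SQL Server purely because it is easy to run; the product table and index name are hypothetical, and SQL Server's plans (and the requirements for indexing a view) are different, so treat this as a demonstration of the general idea only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; node_order plays the role of NodeOrder.
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, node_order INTEGER)")


def plan(sql):
    """Return the detail column of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT node_order FROM product ORDER BY node_order"

# Without an index, SQLite has to sort the rows in a temporary B-tree.
plan_without_index = plan(query)

# With an index on the sort column, rows can be read in index order.
conn.execute("CREATE INDEX idx_product_node_order ON product (node_order)")
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

The first plan mentions a temporary B-tree for the ORDER BY, and the second reads from idx_product_node_order instead. The exact plan wording varies between SQLite versions.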

Problem with adding custom sql to finder condition

I am trying to add the following custom SQL to a finder condition and there is something not quite right. I am not an SQL expert, but I worked this out with a friend who is (though they are not familiar with Ruby on Rails, ActiveRecord, or finders).
status_search = "select p.*
from policies p
where exists
(select 0 from status_changes sc
where sc.policy_id = p.id
and sc.status_id = '"+search[:status_id].to_s+"'
and sc.created_at between "+status_date_start.to_s+" and "+status_date_end.to_s+")
or exists
(select 0 from status_changes sc
where sc.created_at =
(select max(sc2.created_at)
from status_changes sc2
where sc2.policy_id = p.id
and sc2.created_at < "+status_date_start.to_s+")
and sc.status_id = '"+search[:status_id].to_s+"'
and sc.policy_id = p.id)" unless search[:status_id].blank?
My find statement:
Policy.find(:all,:include=>[{:client=>[:agent,:source_id,:source_code]},{:status_changes=>:status}],
:conditions=>[status_search])
and I am getting this error message in my log:
ActiveRecord::StatementInvalid (Mysql::Error: Operand should contain 1 column(s): SELECT DISTINCT `policies`.id FROM `policies` LEFT OUTER JOIN `clients` ON `clients`.id = `policies`.client_id WHERE ((((policies.created_at BETWEEN '2009-01-01' AND '2009-03-10' OR policies.created_at = '2009-01-01' OR policies.created_at = '2009-03-10')))) AND (select p.*
from policies p
where exists
(select 0 from status_changes sc
where sc.policy_id = p.id
and sc.status_id = '2'
and sc.created_at between 2009-03-10 and 2009-03-10)
or exists
(select 0 from status_changes sc
where sc.created_at =
(select max(sc2.created_at)
from status_changes sc2
where sc2.policy_id = p.id
and sc2.created_at < 2009-03-10)
and sc.status_id = '2'
and sc.policy_id = p.id)) ORDER BY clients.created_at DESC LIMIT 0, 25):
what is the major malfunction here - why is it complaining about the columns?
The conditions modifier is expecting a condition (e.g. a boolean expression that could go in a where clause) and you are passing it an entire query (a select statement).
It looks as if you are trying to do too much in one go here, and should break it down into smaller steps. A few suggestions:
use the query with find_by_sql and don't mess with the conditions.
use the rails finders and filter the records in the rails code
Also, note that constructing a query this way isn't secure if the values like status_date_start can come from users. Look up "sql injection attacks" to see what the problem is, and read the rails documentation & examples for find_by_sql to see how to avoid them.
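To make the injection risk concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Rails. The table is a cut-down, made-up version of status_changes and the malicious value is invented. The bound-parameter form is the same idea as the ? placeholders that Rails accepts in a conditions array.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE status_changes (policy_id INTEGER, status_id INTEGER, created_at TEXT);
    INSERT INTO status_changes VALUES (1, 2, '2009-03-05'), (2, 3, '2009-03-06');
""")

# A value that a user could submit in place of a status id.
malicious = "2' OR '1'='1"

# Unsafe: the value is pasted into the SQL text and changes the query's logic.
unsafe_sql = (
    "SELECT policy_id FROM status_changes WHERE status_id = '" + malicious + "'"
)
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the value is bound as data, so it can only ever be compared, never executed.
safe_sql = (
    "SELECT policy_id FROM status_changes "
    "WHERE status_id = ? AND created_at BETWEEN ? AND ?"
)
safe_rows = conn.execute(safe_sql, (malicious, "2009-03-01", "2009-03-10")).fetchall()
legit_rows = conn.execute(safe_sql, (2, "2009-03-01", "2009-03-10")).fetchall()

print(len(unsafe_rows), safe_rows, legit_rows)  # prints: 2 [] [(1,)]
```

The concatenated query returns every row, while the bound version returns nothing for the malicious value and the expected row for a real status id. Binding also takes care of quoting the date values.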
OK, I've managed to retool this so it is friendlier to a conditions modifier, and I think it is doing the SQL query correctly. However, it returns policies whose current status (policy.status_change.last.status) is always the same status used in the query, which is not correct.
Here is my updated condition string:
status_search = "status_changes.created_at between ? and ? and status_changes.status_id = ?) or
(status_changes.created_at = (SELECT MAX(sc2.created_at) FROM status_changes sc2
WHERE sc2.policy_id = policies.id and sc2.created_at < ?) and status_changes.status_id = ?"
Is there something obvious here that prevents it from returning all of the remaining associated status changes once it finds the one in the query?
Here is the updated find:
Policy.find(:all,:include=>[{:client=>[:agent,:source_id,:source_code]},:status_changes],
:conditions=>[status_search,status_date_start,status_date_end,search[:status_id].to_s,status_date_start,search[:status_id].to_s])