Using raw sql queries in Rails 3 application? - sql

I am working on migrating a legacy database into my Rails application (3.2.3). The original database comes with quite a few long SQL queries for reports. For now, what I would like to do is use the SQL queries in the Rails application and then, one by one (when time allows), swap them for 'proper' Rails queries.
I have a clinical model and the controller has the following code:
@clinical_income_by_year = Clinical.find_all_by_sql(SELECT date_format(c.transactiondate,'%Y') as Year,
date_format(c.transactiondate,'%b') as Month,
sum(c.LineBalance) as "Income"
FROM clinical c
WHERE c.Payments = 0 AND c.LineBalance <> 0
AND c.analysiscode <> 213
GROUP BY c.MonthYear;)
However, when I run that code I get a few errors to do with the formatting.
Started GET "/clinicals" for 127.0.0.1 at 2012-04-29 18:00:45 +0100
SyntaxError (/Users/dannymcclelland/Projects/premvet/app/controllers/clinicals_controller.rb:6: syntax error, unexpected tIDENTIFIER, expecting ')'
...rmat(c.transactiondate,'%Y') as Year,
... ^
/Users/dannymcclelland/Projects/premvet/app/controllers/clinicals_controller.rb:7: syntax error, unexpected tIDENTIFIER, expecting keyword_end
...rmat(c.transactiondate,'%b') as Month,
... ^
/Users/dannymcclelland/Projects/premvet/app/controllers/clinicals_controller.rb:8: syntax error, unexpected tIDENTIFIER, expecting keyword_end
... sum(c.LineBalance) as "Income"
... ^
/Users/dannymcclelland/Projects/premvet/app/controllers/clinicals_controller.rb:10: syntax error, unexpected tCONSTANT, expecting keyword_end
... WHERE c.Payments = 0 AND c.LineBalance <> 0
... ^
/Users/dannymcclelland/Projects/premvet/app/controllers/clinicals_controller.rb:10: syntax error, unexpected '>'
...yments = 0 AND c.LineBalance <> 0
... ^
/Users/dannymcclelland/Projects/premvet/app/controllers/clinicals_controller.rb:11: syntax error, unexpected '>'
... AND c.analysiscode <> 213
... ^
Is there something I should be doing to the sql query before importing it into the controller? Although it's possible there is something wrong with the query (It was written quite some time ago), it does work as expected when run directly within the database. It returns an array like this:
----------------------------
| Year | Month    | Income |
----------------------------
| 2012 | January  | 20,000 |
| 2012 | February | 20,000 |
| 2012 | March    | 20,000 |
| 2012 | April    | 20,000 |
----------------------------
etc..
Any help, advice or general pointers would be appreciated!
I'm reading through http://guides.rubyonrails.org/active_record_querying.html trying to convert the sql query to a correct Rails query.
So far I have matched the second to last line:
AND c.analysiscode <> 213
with
@clinical_income_by_year = Clinical.where("AnalysisCode != 213")
baby steps!
UPDATE
I've got the filtering sorted now, thanks to the Rails guide site but I'm stuck on the grouping and sum part of the sql query. I have the following so far:
@clinical_income_by_year = Clinical.where("AnalysisCode != 213 AND Payments != 0 AND LineBalance != 0").page(params[:page]).per_page(15)
I'm struggling to build in the following two lines of the sql query:
sum(c.LineBalance) as "Income"
and
GROUP BY c.MonthYear;)
My view code looks like this:
<% @clinical_income_by_year.each do |clinical| %>
<tr>
<td><%= clinical.TransactionDate.strftime("%Y") %></td>
<td><%= clinical.TransactionDate.strftime("%B") %></td>
<td><%= Clinical.sum(:LineBalance) %></td>
</tr>
<% end %>
</table>
<%= will_paginate @clinical_income_by_year %>

The Ruby parser doesn't understand SQL, so you need to pass the query as a string:
@clinical_income_by_year = Clinical.find_by_sql(%q{ ... })
I'd recommend using %q (or %Q if you need interpolation) so that you don't have to worry so much about embedded quotes. You should also move the query into a class method on the model, so the controller isn't concerned with things that aren't its business. That also gives you easy access to connection.quote and friends, so you can use string interpolation safely:
find_by_sql(%Q{
select ...
from ...
where x = #{connection.quote(some_string)}
})
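To make the difference between %q and %Q concrete, here is a minimal plain-Ruby sketch that only builds the strings and does not touch a database. The naive_quote helper and the owner column are hypothetical stand-ins for illustration; in a real model you would call connection.quote instead.

```ruby
# %q{} acts like a single-quoted string: no interpolation, and the embedded
# ' and " characters in the SQL need no escaping.
report_sql = %q{
  SELECT date_format(c.transactiondate, '%Y') AS Year,
         sum(c.LineBalance) AS "Income"
  FROM clinical c
  WHERE c.Payments = 0
}

# Hypothetical stand-in for connection.quote, for illustration only:
# wraps the value in single quotes and doubles any embedded single quote.
def naive_quote(value)
  "'" + value.to_s.gsub("'", "''") + "'"
end

# %Q{} acts like a double-quoted string, so interpolation is performed.
# The owner column is an assumption made up for this example.
owner = "O'Brien"
lookup_sql = %Q{SELECT * FROM clinical WHERE owner = #{naive_quote(owner)}}

puts report_sql
puts lookup_sql
```

The first string keeps its quotes exactly as typed; the second has the quoted value spliced in.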
Also, the semicolon in your SQL:
GROUP BY c.MonthYear;})
isn't necessary. Some databases will let it through but you should get rid of it anyway.
Depending on your database, the identifiers (table names, column names, ...) should be case insensitive (unless some hateful person quoted them when they were created) so you might be able to use lower case column names to make things fit into Rails better.
Also note that some databases won't like that GROUP BY as you have columns in your SELECT that aren't aggregated or grouped so there is ambiguity about which c.transactiondate to use for each group.
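The ambiguity is easier to see outside the database. This is a plain-Ruby sketch with made-up rows (the GroupedRow struct and its values are assumptions, not your data): the sum per group is well defined, but each group contains several transaction dates, so there is no single date to report.

```ruby
require 'date'

# Hypothetical rows standing in for the clinical table.
GroupedRow = Struct.new(:monthyear, :transactiondate, :linebalance)

rows = [
  GroupedRow.new('2012-01', Date.new(2012, 1, 5), 120),
  GroupedRow.new('2012-01', Date.new(2012, 1, 20), 80),
  GroupedRow.new('2012-02', Date.new(2012, 2, 3), 50)
]

# Emulate GROUP BY monthyear: one summary hash per group.
summary = rows.group_by(&:monthyear).map do |monthyear, group|
  {
    monthyear: monthyear,
    income: group.map(&:linebalance).inject(0, :+), # aggregate: one value per group
    dates: group.map(&:transactiondate).uniq        # not aggregated: several candidates
  }
end

summary.each { |s| puts "#{s[:monthyear]}: income=#{s[:income]}, candidate dates=#{s[:dates].length}" }
```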
A more "Railsy" version of your query would look something like this:
@c = Clinical.select(%q{date_format(transactiondate, '%Y') as year, date_format(transactiondate, '%b') as month, sum(LineBalance) as income})
.where(:payments => 0)
.where('linebalance <> ?', 0)
.where('analysiscode <> ?', 213)
.group(:monthyear)
Then you could do things like this:
@c.each do |c|
puts c.year
puts c.month
puts c.income
end
to access the results. You could also simplify a little bit by pushing the date mangling into Ruby:
@c = Clinical.select(%q{transactiondate, sum(linebalance) as income})
.where(:payments => 0)
.where('linebalance <> ?', 0)
.where('analysiscode <> ?', 213)
.group(:monthyear)
Then pull apart c.transactiondate in Ruby rather than calling c.year and c.month.
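As a rough sketch of that last step, assuming transactiondate comes back as a Date (the IncomeRow struct below is a hypothetical stand-in for the records the query returns):

```ruby
require 'date'

# Hypothetical stand-in for the records returned by the query.
IncomeRow = Struct.new(:transactiondate, :income)

rows = [
  IncomeRow.new(Date.new(2012, 1, 31), 20_000),
  IncomeRow.new(Date.new(2012, 2, 29), 20_000)
]

# Do the date formatting in Ruby instead of with date_format in SQL.
report = rows.map do |c|
  [c.transactiondate.strftime('%Y'), c.transactiondate.strftime('%b'), c.income]
end

report.each { |year, month, income| puts "#{year} #{month} #{income}" }
```

Use '%B' instead of '%b' if you want the full month name.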

Related

Multiple conditions during iteration, rails

So I have
<% @invites.where(accept: 1).where(user_id: @user_id).each do |invite| %>
The idea is to only display the user in question, and where the accept value is equal to 1.
This causes the SQL query to be:
SELECT `invites`.* FROM `invites` WHERE `invites`.`accept` = 0 AND `invites`.`user_id` IS NULL
How do I solve this?

Rails: Optimize querying maximum values from associated table

I need to show a list of partners and the maximum value from the reservation_limit column from Klass table.
Partner has_many :klasses
Klass belongs_to :partner
# Partner controller
def index
@partners = Partner.includes(:klasses)
end
# view
<% @partners.each do |partner| %>
Up to <%= partner.klasses.maximum("reservation_limit") %> visits per month
<% end %>
Unfortunately the query below runs for every single Partner.
SELECT MAX("klasses"."reservation_limit") FROM "klasses" WHERE "klasses"."partner_id" = $1 [["partner_id", 1]]
If there are 40 partners then the query will run 40 times. How do I optimize this?
edit: Looks like there's a limit method in rails so I'm changing the limit in question to reservation_limit to prevent confusion.
You can use two forms of SQL to retrieve this information efficiently. I'm assuming here that you want a result for a partner even when there is no klass record for it.
The first is:
select partners.*,
max(klasses.limit) as max_klasses_limit
from partners
left join klasses on klasses.partner_id = partners.id
group by partners.id
Some RDBMSs require that you use "group by partners.*", though, which is potentially expensive in terms of the required sort and the possibility of it spilling to disk.
On the other hand you can add a clause such as:
having("max(klasses.limit) > ?", 3)
... to efficiently filter the partners by their value of maximum klass.limit
The other is:
select partners.*,
(Select max(klasses.limit)
from klasses
where klasses.partner_id = partners.id) as max_klasses_limit
from partners
The second one does not rely on a GROUP BY. Some RDBMSs may transform it internally into the first form; others may run it less efficiently, executing the subquery once per row in the partners table (which would still be much faster than the raw Rails approach of submitting a separate query per row).
The Rails ActiveRecord forms of these would be:
Partner.joins("left join klasses on klasses.partner_id = partners.id").
select("partners.*, max(klasses.limit) as max_klasses_limit").
group(:id)
... and ...
Partner.select("partners.*, (select max(klasses.limit)
from klasses
where klasses.partner_id = partners.id) as max_klasses_limit")
Which of these is actually the most efficient is probably going to depend on the RDBMS and even the RDBMS version.
If you don't need a result when there is no klass for the partner, or there is always guaranteed to be one, then:
Partner.joins(:klasses).
select("partners.*, max(klasses.limit) as max_klasses_limit").
group(:id)
Either way, you can then reference
partner.max_klasses_limit
Your initial query brings all the information you need. You only need to work with it as you would work with a regular array of objects.
Change
Up to <%= partner.klasses.maximum("reservation_limit") %> visits per month
to
Up to <%= partner.klasses.empty? ? 0 : partner.klasses.max_by { |k| k.reservation_limit }.reservation_limit %> visits per month
What maximum("reservation_limit") does is trigger an Active Record query, SELECT MAX.... But you don't need this, as you already have all the information you need to compute the maximum from your array.
Note
Using .count on an Active Record result will trigger an extra SELECT COUNT... query!
Using .length will not.
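Here is a minimal plain-Ruby sketch of the in-memory approach, using hypothetical structs in place of the already-loaded Partner and Klass records. It also shows a slightly shorter way to handle the empty case.

```ruby
# Hypothetical stand-ins for records already loaded via includes(:klasses).
KlassStub = Struct.new(:reservation_limit)
PartnerStub = Struct.new(:name, :klasses)

# Compute the maximum from the loaded array; no query is involved.
# [].max is nil, so fall back to 0 for partners without klasses.
def max_reservation_limit(partner)
  partner.klasses.map(&:reservation_limit).max || 0
end

busy  = PartnerStub.new('Busy',  [KlassStub.new(3), KlassStub.new(8), KlassStub.new(5)])
empty = PartnerStub.new('Empty', [])

puts max_reservation_limit(busy)
puts max_reservation_limit(empty)
```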
It generally helps if you start writing the query in pure SQL and then extract it into ActiveRecord or Arel code.
ActiveRecord is powerful, but it tends to force you to write highly inefficient queries as soon as you derail from the standard CRUD operations.
Here's your query
Partner
.select('partners.*, (SELECT MAX(klasses.reservation_limit) FROM klasses WHERE klasses.partner_id = partners.id) AS maximum_limit')
.joins(:klasses).group('partners.id')
It is a single query, with a subquery. However the subquery is optimized to run only once as it can be parsed ahead and it doesn't run N+1 times.
The code above fetches all the partners, joins them with the klasses records and thanks to the join it can compute the aggregate maximum. Since the join effectively creates a cartesian product of the records, you then need to group by the partners.id (which in fact is required in any case by the MAX aggregate function).
The key here is the AS maximum_limit, which assigns a new attribute to the returned Partner instances, holding the value of the maximum.
partners = Partner.select ...
partners.each do |partner|
puts partner.maximum_limit
end
This returns the maximum limits in one select for an array of partner_ids:
partner_ids = @partners.map{|p| p.id}
data = Klass.select('MAX("limit") as limit', 'partner_id').where(partner_id: partner_ids).group('partner_id')
@limits = data.index_by{|d| d.partner_id}
You can now integrate it into your view:
<% @partners.each do |partner| %>
Up to <%= @limits[partner.id].limit %> visits per month
<% end %>
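The lookup step can be sketched in plain Ruby. The LimitRow struct and its values are hypothetical stand-ins for the grouped query result. Note that group_by maps each key to an array of rows, so with one row per partner a hash keyed directly by partner_id is easier to read from, and fetch gives a default for partners with no klasses.

```ruby
# Hypothetical stand-in for the rows returned by the grouped query.
LimitRow = Struct.new(:partner_id, :limit)

data = [LimitRow.new(1, 10), LimitRow.new(2, 4)]

# group_by yields arrays of rows per key, even when each key has one row.
grouped = data.group_by(&:partner_id)

# A flat partner_id => limit hash is simpler to use in a view.
limits = data.each_with_object({}) { |row, hash| hash[row.partner_id] = row.limit }

puts grouped[1].first.limit
puts limits[1]
puts limits.fetch(3, 0) # partner 3 has no klasses, so use a default
```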

Get all records with child records OR field length > 250

I have a Comment model which has-many attachments. What I want to return, is all of the comments which either have one or more attachment records, OR whose comment is longer than 250 characters.
Is there any way I can do this without writing it entirely in pure SQL? I'm struggling to build up a WHERE clause in just the rails method. It's not quite as simple as I'd hoped :(
Ideally I want this to be a scope but whatever will work is fine
You could try:
Comment.includes(:attachments).where('attachments.comment_id IS NOT NULL OR LEN(comments.content) > 250')
The WHERE clause should follow the pattern of the following pseudo-code:
WHERE Length(Comment_field) > 250
OR EXISTS (SELECT 1 FROM attachments WHERE attachments.comment_id = comments.id)
Jump into irb or rails c (the console) and try this from the command line until it works, then plug it in.
c = YourCommentModel.where('attachments > ?', 1)
len250 = c = YourCommentModel.where('attachments.length> ?', 250)
The first one is meant to give comments with more than 1 attachment; the second, comments longer than 250 characters.

Complex SQL Query in Rails - User.where(hash)

My starting point is basically Ryan Bates' Railscast.
I have a User model that I need to run some queries on. The model has a couple of hourly rate attributes, as follows:
#User migration
...
t.decimal :hour_price_high
t.decimal :hour_price_low
...
I have the query working in the User.where(Array) format where Array is formatted
["users.hour_price_high <= ? OR users.hour_price_low >= ? OR users.hour_price_low <= ? AND users.hour_price_high >= ?", hour_price_high, hour_price_low, hour_price_low, hour_price_high]
#This is simply a search of two ranges. Search for $40/h - $60/h.
#It will return any User who charges an overlapping price range, such as one who charges $45/h - $65/h, etc.
I simply wish to convert this into Ruby syntax in the where statement.
My problem is how to represent the OR.
#This isn't the full query shown above..
User.where(:hour_price_high => hour_price_low..hour_price_high, :hour_price_low => hour_price_low..hour_price_high)
Produces this SQL:
=> "SELECT `users`.* FROM `users` WHERE (`users`.`hour_price_high` BETWEEN 45 AND 60) AND (`users`.`hour_price_low` BETWEEN 45 AND 60)"
Which, of course, is wrong because of the AND. I need it to be:
=> "SELECT `users`.* FROM `users` WHERE (`users`.`hour_price_high` BETWEEN 45 AND 60) OR (`users`.`hour_price_low` BETWEEN 45 AND 60)"
How can I make this be an OR statement without busting out the old truth table from my sophomore E.E. classes?
Thanks!
When you chain several where methods or have a few arguments in one where method, then you always get AND between them, so when you want to have OR you need to use this syntax:
User.where("users.hour_price_high <= ? OR users.hour_price_low >= ? OR users.hour_price_low <= ? AND users.hour_price_high >= ?", hour_price_high, hour_price_low, hour_price_low, hour_price_high)
Note: please watch http://railscasts.com/episodes/202-active-record-queries-in-rails-3 in order to get more information about active record queries.
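If the goal really is "find users whose price range overlaps the searched range", one common formulation needs only a single AND: two ranges overlap when each one starts before the other ends. The sketch below shows that logic in plain Ruby; the method name and the OVERLAP_CONDITION constant are hypothetical, and it is an illustration of the idea, not a drop-in replacement for the query above.

```ruby
# Two closed ranges [low_a, high_a] and [low_b, high_b] overlap
# when each one starts no later than the other one ends.
def ranges_overlap?(low_a, high_a, low_b, high_b)
  low_a <= high_b && low_b <= high_a
end

# The same test as a SQL fragment for where(); bind the searched
# high price first and the searched low price second.
OVERLAP_CONDITION = "users.hour_price_low <= ? AND users.hour_price_high >= ?"

puts ranges_overlap?(45, 65, 40, 60) # user charges $45-$65, search is $40-$60
puts ranges_overlap?(70, 80, 40, 60) # user charges $70-$80, search is $40-$60
```

With that condition the call would look like User.where(OVERLAP_CONDITION, hour_price_high, hour_price_low).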

rails and sql where IS NOT NULL not working properly with join. Rails 2.3.5 and Ruby 1.8.7 - mysql 5

I have 3 tables & models:
brands
brand_data_records
and
brand_data_records_brands - the join table
In Rails I want all brand_data_records for a given date range for a given brand where a given attribute is not null in the db.
So I have:
BrandDataRecord.find(:all, :select => column_match, :joins => :brands, :conditions => ["brand_data_records_brands.brand_id = ? and date_retrieved >= ? AND date_retrieved <= ? and ? IS NOT NULL",brand.id,start_date,end_date,column_match])
This generates this sql:
SELECT sentiment FROM `brand_data_records` INNER JOIN `brand_data_records_brands` ON `brand_data_records_brands`.brand_data_record_id = `brand_data_records`.id INNER JOIN `brands` ON `brands`.id = `brand_data_records_brands`.brand_id WHERE (brand_data_records_brands.brand_id = 330516084 and date_retrieved >= '2011-05-02' AND date_retrieved <= '2011-06-01' and 'sentiment' IS NOT NULL)
This generally works, but it gives back a bunch of extra records that have a null value. I think it's something to do with the joins; if I remove them and use SQL only, it works fine, but I'm not sure how to fix it in Rails (or even in SQL, for that matter).
You probably mean to reference the column:
`sentiment` IS NOT NULL
What you're doing inadvertently is asserting that the string 'sentiment' is not null, which of course it never will be. Passing in :sentiment or 'sentiment'.to_sym in your conditions should fix this, as symbols get escaped with backquotes on conversion.
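The difference between the two fragments can be shown in plain Ruby, without a database. The ALLOWED_COLUMNS list and the not_null_condition helper are hypothetical; the point is only that a column name has to go into the SQL as an identifier (checked against a known list), not as a bound value.

```ruby
# Hypothetical whitelist of column names that may be spliced into SQL.
ALLOWED_COLUMNS = %w[sentiment].freeze

# Build the condition with the column as a backquoted identifier.
# Raises if the name is not on the whitelist, since identifiers
# cannot be passed safely through a ? placeholder.
def not_null_condition(column)
  name = column.to_s
  raise ArgumentError, "unknown column: #{name}" unless ALLOWED_COLUMNS.include?(name)
  "`#{name}` IS NOT NULL"
end

as_bound_value = "'sentiment' IS NOT NULL" # what the ? placeholder produced: a string literal
as_identifier  = not_null_condition(:sentiment)

puts as_bound_value
puts as_identifier
```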