Active Record - How to perform a nested select on a second table? - sql

I need to list all customers along with their latest order date (plus pagination).
How can I write the following SQL query using Active Record?
select *,
(
select max(created_at)
from orders
where orders.customer_id = customers.id
) as latest_order_date
from customers
limit 25 offset 0
I tried this but it complains missing FROM-clause entry for table "customers":
Customer
.select('*')
.select(
Order
.where('customer_id = customers.id')
.maximum(:created_at)
).page(params[:page])
# Generates this (clearly only the Order query):
SELECT MAX("orders"."created_at")
FROM "orders"
WHERE (customer_id = customers.id)
EDIT: it would be good to keep AR's parameterization and kaminari's pagination goodness.

You haven't given us any information about the relationship between these two tables, so I will assume Customer has_many Orders.
While ActiveRecord doesn't support what you are trying to do, it is built on top of Arel, which does.
Every Rails model has a method named arel_table that will return its corresponding Arel::Table. You might want a helper library to make this cleaner because the default way is a little cumbersome. I will use the plain Arel syntax to maximize compatibility.
ActiveRecord understands Arel objects and can accept them alongside its own syntax.
orders = Order.arel_table
customers = Customer.arel_table
Customer.joins(:orders).group(:id).select([
customers[Arel.star],
orders[:created_at].maximum.as('latest_order_date')
])
Which produces
SELECT "customers".*, MAX("orders"."created_at") AS "latest_order_date"
FROM "customers"
INNER JOIN "orders" ON "orders"."customer_id" = "customers"."id"
GROUP BY "customers"."id"
This is the customary way of doing this, but if you still want to do it as a subquery, you can do this
Customer.select([
customers[Arel.star],
orders.project(orders[:created_at].maximum)
.where(orders[:customer_id].eq(customers[:id]))
.as('latest_order_date')
])
Which gives us
SELECT "customers".*, (
SELECT MAX("orders"."created_at")
FROM "orders"
WHERE "orders"."customer_id" = "customers"."id" ) "latest_order_date"
FROM "customers"

The most Active Record-ish way I've come up with so far is:
Customer
.page(params[:page])
.select('*')
.select(<<-SQL.squish)
(
SELECT MAX(created_at) AS latest_order_date
FROM orders
WHERE orders.customer_id = customers.id
)
SQL
I still wish I could make the string part more Active Record-ish.
The <<-SQL is just heredoc.

Here is the same answer #adam was giving, but not using AREL and just straight ActiveRecord. Not sure it's really much better than #João Marcelo Souza
Customer.select("customers.*, max(orders.created_at)").joins(:orders).group("customers.id").page(params[:page])
(The group by avoids list all the customers columns by using this feature of Postgres 9.1 and higher.)
The OP doesn't say, but the query doesn't handle the case where the customer has no orders. This version does that:
Customer.select("customers.*, coalesce(max(orders.created_at),0)").joins("left outer join orders on orders.customer_id=customers.id").group("customers.id").page(params[:page])

Related

How to join products and their characteristics

How to join products and their characteristics
I have two tables.
Product (id, title, price, created_at, updated_at etc)
and
ProductCharacteristic(id, product_id, sold_quantity, date, craated_at, updated_at etc).
I should show products table (header is product.id, product.title, product.price, sold_quantity) for some period of time and ordered by any fields from header.
And I can't write query
Now I have following query
> current_project.products.includes(:product_characteristics).group('products.id').pluck(:title, 'SUM(product_characteristics.sold_quantity) AS sold_quantity')
(45.4ms) SELECT "products"."title", SUM(product_characteristics.sold_quantity) AS sold_quantity FROM "products" LEFT OUTER JOIN "product_characteristics" ON "product_characteristics"."product_id" = "products"."id" WHERE "products"."project_id" = $1 GROUP BY products.id [["project_id", 20]]
Please help me to write query through orm(to add where with dates and ordering) or write raw sql query.
I used pluck. It returns array of arrays (not array of hashes). It's no so good of course.
product_characteristics.date field is unique by scope product_id. But please give me two examples (with this condition and without it to satisfy my curiosity).
And I use postgresql and rails 4.2.x
P.S. By the way the ProductCharacteristic table will have a lot of records(mote than one million). Should I use postgresql table partitioning. Can it improve performance?
Thank you.
You can use select instead of count in that case, and the property will be accessible as product.sold_quantity
The query becomes
products = current_project.products.joins(:product_characteristics).group('products.id').select(:title, 'SUM(product_characteristics.sold_quantity) AS sold_quantity')
products.first.sold_quantity # => works
To order, you can just add an order clause
products = products.order(id: :asc)
or
products = products.order(id: :desc)
for instance
And for the where
products = products.where("created_at > ?", 2.days.ago)
for instance.
You can chain sql clauses after the first line, it does not matter cause the query will only be launched when you actually use the retrieved set.
And so you can also do stuff like
if params[:foo]
products = products.order(:id)
end

Writing a query that can be used to find which orders shipped to a different country from the customer

So I am currently in a Database 2 class at my University. We are using the Northwinds db. I haven't used SQL in a few years so I am a little rusty.
I changed a few of the pieces in the Orders table so that instead of 'Germany' it was 'Tahiti'. Now I need to write a query to find which orders shipped where.
I know that I will need to use an Join but I am not exactly sure how. I have gone to W3Schools and looked at the Joins SQL page but still haven't found the correct answer I am looking for.
This is what I have currently (which I am also not sure if it is correct):
SELECT Customers.Country
FROM Customer
WHERE Customer.Country = 'Germany'
INNER JOIN
SELECT Orders.ShipCountry
FROM Orders
WHERE Orders.ShipCountry = 'Tahiti'
So if anyone could give me help I would really appreciate it.
EDIT
So this is the actual question I was given which I think is also kind of poorly worded.
"Suspicious e-commerce transactions include orders placed by a customer in one country that are shipped to another country. In fact there are no such orders in the Northwind db, so create a few by modifying some of the "Germany" shipcountry entries in the orders table to "Tahiti". Then write a query which finds orders shipped to a different country from the customer. Hint: in order to do this, you will need to join the Customers and Orders table."
Is this what you are looking for:
SELECT *
FROM Customer AS C
INNER JOIN
Orders AS O
ON C.CustomerID = O.CustomerID
WHERE C.Country = 'Germany' AND O.ShipCountry = 'Tahiti';
The above query is based on the Schema as defined in CodePlex
i hope that this can help you
SELECT Orders.*
FROM Orders -- table name, also can use alias
inner join Customer -- table name, also can use alias
on Orders.ShipCountry = Customer.Country -- you must declare what is the field to use by join
where Customer.Country in ('Germany','Tahiti')

Activerecord query returning doubles while using uniq

I am running the following query with the goal of returning a unique set of customer objects:
Customer.joins(:projects).select('customers.*, projects.finish_date').where("projects.closed = false").uniq
However, this code will generate duplicates if a customer has more than one project active (e.g. closed = true). If I remove the projects.finish_date from the select clause this query works as intended. However, I need this to be in there to be able to sort on that column.
How can I make this query return a unique set of customers?
How can I make this query return a unique set of customers?
This doesn't completely make sense, and probably isn't what you want.
The problem is that you're joining against the projects table, at which point there may be several rows for the same customer with different project finish_dates. These rows are unique and will be returned as multiple unique Customer objects, each with different a finish_date.
If you only want one of these, how is Rails to determine which one? Wouldn't it be a problem if you only had one customer object with one finish_date returned if there are really 10 projects for that customer, each with a different finish_date?
Instead, you probably want something like this:
customers = Customer.joins(:projects).select('customers.*, projects.finish_date').where("projects.closed = false").uniq
customers.group_by(&:id)
This groups all of your same customers together.
OR, you might want:
projects = Project.where(closed: false).includes(:user)
users = projects.map(&:user).uniq
In either case, you're producing a unique set of users from the superset of all user-project joins.
RE Your comments:
If you want to get a list of customers with their most recent associated project, you could use a sub query in your where:
select customers.*, projects.finish_date from customers
inner join projects on projects.customer_id = customers.id
where projects.id = (
select id from projects
where customer.id = project.customer_id
and closed = false
order by finish_date desc
limit 1
)
You can express this using ActiveRecord by embedding the sub-query in a where:
Customer.joins(:projects)
.select('customers.*, projects.finish_date as finish_date')
.where('select id from projects where customer.id = project.customer_id and closed = false order by finish_date desc limit 1')
I have no idea how this will perform for you, but I suspect poorly.
I would always stick to a simple includes and in-Ruby filter before attempting to optimize with SQL.

Why is my WHERE clause not return the CustomerIDs that are also in the Orders table?

My SQL statements are not returning any results. I am using the table that are on the www.w3schools.com web site. I want to have all of the customer IDS in my Customers table match all of the CustomerIDS in the Orders table. The SQL statement works that it goes through the table and checks every CustomerID to every Orders.CustomersID and when it finds a match doesn't it return that record.
Question: Why does the SQL not return the row when both customerIDS are equal because there are values that will return true?
SQL Statement:
SELECT * FROM Customers WHERE Customers.CustomersID = Orders.CustomersID;
One last thought: What is the best free SQL database that can be downloaded without a lot of hassle for home use?
Try this:
SELECT Customers.*
FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID;
If you just want customers that have orders then use:
SELECT *
FROM Customers
WHERE CustomerID IN
(SELECT CustomerID FROM Orders)
If you use a JOIN you may get multiple records per customer (if the customer has multiple orders). You can solve that with DISTINCT but IMHO a IN clause is cleaner (and likely faster)
You need a query like this: SELECT *
FROM Customers c
INNER JOIN Orders o on c.CustomerID = o.CustomerID;
w3schools is a very bad resource.
Selecting the best database is a bad question for stackoverflow. But the answer is PostgreSQL. (And lots of other DBMSs would fit the bill for you as well.)

Select based on the number of appearances of an id in another table

I have a table B with cids and cities. I also have a table C that has these cids with extra information. I want to list all the cids in table C that are associated with ALL appearances of a given city in Table B.
My current solution relies on counting the number of times the given city appears in Table B and selecting only the cids that appear that many times. I don't know all the SQL syntax yet, but is there a way to select for this kind of pattern?
My current solution:
SELECT Agents.aid
FROM Agents, Customers, Orders
WHERE (Customers.city='Duluth')
AND (Agents.aid = Orders.aid)
AND (Customers.cid = Orders.cid)
GROUP BY Agents.aid
HAVING count(Agents.aid) > 1
It only works because I know right now with the HAVING statement.
Thanks for the help. I wasn't sure how to google this problem, since it's pretty specific.
EDIT: I'm pinpointing my problem a bit. I need to know how to determine if EVERY row in a table has a certain value for a field. Declaring a variable and counting the rows in a sub-selection and filtering out my results by IDs that appear that many times works, but It's really ugly.
There HAS to be a way to do this without explicitly count()ing rows. I hope.
Not an answer to your question, but a general improvement.
I'd recommend using JOIN syntax to join your tables together.
This would change your query to be:
SELECT Agents.aid
FROM Agents
INNER JOIN Orders
ON Agents.aid = Orders.aid
INNER JOIN Customers
ON Customers.cid = Orders.cid
WHERE Customers.city='Duluth'
GROUP BY Agents.aid
HAVING count(Agents.aid) > 1
What variant of SQL are you using?
To start with, you can (and should) use JOIN instead of doing it in the WHERE clause, e.g.,
select Agents.aid
from Agents
join Orders on Agents.aid = Orders.aid
join Customers on Customers.cid = Orders.cid
where Customers.city = 'Duluth'
group by Agents.aid
having count(Agents.aid) > 1
After that, I'm afraid I might be a little lost. Using the table names in your example query, what (in English, not pseudocode) are you trying to retrieve? For example, I think your sample query is retrieving the PK for all Agents that have been involved in at least 2 Orders involving Customers in Duluth.
Also, some table definitions for Agents, Orders, and Customers might help (then again, they might be irrelevant).
I'm not sure if I understood you problem, but I think the following query is what you want:
SELECT *
FROM customers b
INNER JOIN orders c USING (cid)
WHERE b.city = 'Duluth'
AND NOT EXISTS (SELECT 1
FROM customers b2
WHERE b2.city = b.city
AND b2.cid <> cid);
Probably you will need some indexes on these columns.