I have the following tables:
Orders_All
Account
Orders_All contains every order line (many thousands of records); the two relevant columns are order_date and order_account_id.
This needs to join to Account so that other queries can be run as well, but I want a report that shows the account_id and the last order date, with only one record per account.
How can I create a query to achieve this?
SELECT account_id, MAX(order_date) as last_order_date FROM Orders_All INNER JOIN Account ON order_account_id = account_id GROUP BY account_id
That will give you the account ID and the maximum (furthest in the future) date. The GROUP BY is what limits it - it's the maximum date "for each" account_id.
If an account has no orders and you still want that account to show, with a NULL in the date column, use a RIGHT OUTER JOIN instead of INNER JOIN there.
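To see the difference between the two joins, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration, and because older SQLite versions lack RIGHT JOIN, the outer variant is written as an equivalent LEFT OUTER JOIN starting from Account.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (account_id INTEGER PRIMARY KEY);
    CREATE TABLE Orders_All (order_account_id INTEGER, order_date TEXT);
    INSERT INTO Account VALUES (1), (2), (3);
    INSERT INTO Orders_All VALUES
        (1, '2020-01-05'), (1, '2020-03-10'), (2, '2020-02-01');
""")

# INNER JOIN: one row per account that has at least one order.
inner_rows = conn.execute("""
    SELECT account_id, MAX(order_date) AS last_order_date
    FROM Orders_All
    INNER JOIN Account ON order_account_id = account_id
    GROUP BY account_id
    ORDER BY account_id
""").fetchall()

# Outer join from Account: accounts without orders are kept, with NULL as the date.
outer_rows = conn.execute("""
    SELECT account_id, MAX(order_date) AS last_order_date
    FROM Account
    LEFT OUTER JOIN Orders_All ON order_account_id = account_id
    GROUP BY account_id
    ORDER BY account_id
""").fetchall()

print(inner_rows)
print(outer_rows)
```

Account 3 has no orders in this sample, so it only shows up in the outer join result, with None (NULL) as its last order date.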
Try this:
Select t2.account_id, Max(t1.order_date) as last_order_date
from Orders_All t1, Account t2
where t1.order_account_id = t2.account_id
group by t2.account_id
Substitute the columns you want to show for *, or leave it as is and try the next code.
select distinct * from Account A
join Orders_All O on O.order_account_id = A.account_id
--so far we have joined both tables, so now we need the last order date
where O.order_date in (select MAX(order_date) from Orders_All)
--in the where clause I'm getting the max order date from the whole Orders_All table
But what if I want the last order date by customer?
Well, then you could write this (not the best, but it works):
select *
from Account A
join (select order_account_id as Acc_id, MAX(order_date) as last_order_date
from Orders_All
group by order_account_id)
as OD on A.account_id = OD.Acc_id
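The two queries in this answer behave quite differently, which a small run makes visible. This is only a sketch with invented sample rows, using Python's sqlite3 module: the first query filters on the single latest date in the whole table, while the second takes the latest date per account.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (account_id INTEGER PRIMARY KEY);
    CREATE TABLE Orders_All (order_account_id INTEGER, order_date TEXT);
    INSERT INTO Account VALUES (1), (2);
    INSERT INTO Orders_All VALUES
        (1, '2020-01-05'), (1, '2020-03-10'), (2, '2020-02-01');
""")

# Filter on the global maximum: only accounts that ordered on the latest date overall.
global_rows = conn.execute("""
    SELECT A.account_id, O.order_date
    FROM Account A
    JOIN Orders_All O ON O.order_account_id = A.account_id
    WHERE O.order_date IN (SELECT MAX(order_date) FROM Orders_All)
    ORDER BY A.account_id
""").fetchall()

# Derived table grouped by account: the latest order date for each account.
per_account_rows = conn.execute("""
    SELECT A.account_id, OD.last_order_date
    FROM Account A
    JOIN (SELECT order_account_id AS Acc_id, MAX(order_date) AS last_order_date
          FROM Orders_All
          GROUP BY order_account_id) AS OD
      ON A.account_id = OD.Acc_id
    ORDER BY A.account_id
""").fetchall()

print(global_rows)
print(per_account_rows)
```

With this data the first query returns only account 1, because account 2 never ordered on the overall latest date.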
I am working on an SQL query to define customer types. The goal is to differentiate the old active customers from the churned customers (churn = customers that stopped using your company's product or service during a certain time frame).
To do that, I came up with this query, which works perfectly:
WITH customers AS (
SELECT
DATE(ord.delivery_date) AS date,
ord.customer_id
FROM table_template AS ord
WHERE cancel_date IS NULL
AND order_type_id IN (1,3)
GROUP BY DATE(ord.delivery_date), ord.customer_id, ord.delivery_date),
days AS (SELECT DISTINCT date FROM customers),
recap AS (
SELECT * FROM (
SELECT
a1.date,
a2.customer_id,
MAX(a2.date) AS last_order,
DATE_DIFF(a1.date, MAX(a2.date), day) AS days_since_last,
MIN(a2.date) AS first_order,
DATE_DIFF(a1.date, MIN(a2.date), day) AS days_since_first
FROM days AS a1
CROSS JOIN customers AS a2 WHERE a2.date <= a1.date
GROUP BY a1.date, customer_id)
)
SELECT * FROM recap
The result of the query :
The only issue with this query is that the calculation is too heavy (it uses a lot of CPU seconds). I think this is due to the CROSS JOIN.
I need some help finding another way to get the same result without a CROSS JOIN. Do you think that is possible?
As you mentioned, the query was taking a long time to load because of an internet issue. I will also try to explain INNER JOIN further with the sample query below:
SELECT distinct a1.id,a1.date
FROM `table1` AS a1
INNER JOIN `table2` AS a2
ON a2.date <= a1.date
The INNER JOIN returns every combination of rows from the two tables that satisfies the join condition. In this sample query, a row from table1 appears in the result only if its date is greater than or equal to at least one date in table2 (a2.date <= a1.date).
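Here is a small runnable sketch of that sample query, using Python's sqlite3 module. The table contents are invented for illustration; only the table and column names come from the sample query.

```python
import sqlite3

# Hypothetical rows for table1 and table2 (ISO date strings compare correctly as text).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, date TEXT);
    CREATE TABLE table2 (id INTEGER, date TEXT);
    INSERT INTO table1 VALUES (1, '2021-01-01'), (2, '2021-01-10'), (3, '2021-01-20');
    INSERT INTO table2 VALUES (10, '2021-01-05'), (11, '2021-01-15');
""")

# Without DISTINCT, each table1 row is repeated once per matching table2 row.
all_matches = conn.execute("""
    SELECT a1.id, a1.date
    FROM table1 AS a1
    INNER JOIN table2 AS a2 ON a2.date <= a1.date
    ORDER BY a1.id
""").fetchall()

# With DISTINCT, each qualifying table1 row appears once.
distinct_rows = conn.execute("""
    SELECT DISTINCT a1.id, a1.date
    FROM table1 AS a1
    INNER JOIN table2 AS a2 ON a2.date <= a1.date
    ORDER BY a1.id
""").fetchall()

print(all_matches)
print(distinct_rows)
```

Row 1 of table1 is dropped because no table2 date is on or before 2021-01-01, and row 3 matches both table2 rows, which is why DISTINCT is needed to list it once.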
I'm trying to find the number of customers who have ordered more than one product, with the same subscription.
I've first selected the count of the id_customer from customer. Then joined on subscription and order (on the correct keys). This was done so that I have all the data available to me from all 3 tables. Then grouped by the id_customer to get just the unique customers. And lastly filtered to have a fk_product (products a customer has) greater than 1.
SELECT COUNT(t1.id_customer)
FROM customer t1
INNER JOIN subscription t2 ON t1.id_customer = t2.fk_customer
INNER JOIN order t3 ON t2.id_subscription = t3.fk_subscription
GROUP BY t1.id_customer
HAVING COUNT(t3.fk_product) > 1
I'd like to better understand whether this is the correct syntax to obtain the data I'm looking for. Since I have t2.id_subscription and t3.fk_subscription linked, wouldn't this be correct? I'm still getting the wrong output. I'm thinking it's perhaps the way I have my scopes, or some subtle aspect of SQL that I'm not using or understanding.
Thank you for your help!!
Use two levels of aggregation. Your data model is a bit hard to follow, but I think:
SELECT COUNT(DISTINCT so.fk_customer)
FROM (SELECT s.fk_customer, s.id_subscription
FROM subscription s
JOIN order o
ON s.id_subscription = o.fk_subscription
GROUP BY s.fk_customer, s.id_subscription
HAVING MIN(o.fk_product) <> MAX(o.fk_product)
) so
select count(distinct s.id_customer)
from (
SELECT t1.id_customer
FROM customer t1
INNER JOIN subscription t2 ON t1.id_customer = t2.fk_customer
INNER JOIN order t3 ON t2.id_subscription = t3.fk_subscription
GROUP BY t1.id_customer, t3.fk_subscription
HAVING COUNT(1) > 1
) s
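To check the two-level aggregation idea on something concrete, here is a sketch in Python with sqlite3. The sample rows are invented, the table name order is quoted because it is a reserved word, and COUNT(DISTINCT fk_product) > 1 is used as an assumed stand-in for the MIN/MAX comparison above.

```python
import sqlite3

# Hypothetical data: customer 1 has two products on one subscription,
# customer 2 has the same product twice, customer 3 has one product on each of two subscriptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id_customer INTEGER PRIMARY KEY);
    CREATE TABLE subscription (id_subscription INTEGER PRIMARY KEY, fk_customer INTEGER);
    CREATE TABLE "order" (fk_subscription INTEGER, fk_product TEXT);
    INSERT INTO customer VALUES (1), (2), (3);
    INSERT INTO subscription VALUES (10, 1), (20, 2), (30, 3), (31, 3);
    INSERT INTO "order" VALUES
        (10, 'A'), (10, 'B'),
        (20, 'A'), (20, 'A'),
        (30, 'A'), (31, 'B');
""")

# The query from the question: one count per customer, and all three customers pass the filter.
naive_rows = conn.execute("""
    SELECT COUNT(t1.id_customer)
    FROM customer t1
    INNER JOIN subscription t2 ON t1.id_customer = t2.fk_customer
    INNER JOIN "order" t3 ON t2.id_subscription = t3.fk_subscription
    GROUP BY t1.id_customer
    HAVING COUNT(t3.fk_product) > 1
    ORDER BY t1.id_customer
""").fetchall()

# Two levels: first find (customer, subscription) pairs with more than one
# distinct product, then count the distinct customers among them.
(customer_count,) = conn.execute("""
    SELECT COUNT(DISTINCT so.fk_customer)
    FROM (SELECT s.fk_customer, s.id_subscription
          FROM subscription s
          JOIN "order" o ON s.id_subscription = o.fk_subscription
          GROUP BY s.fk_customer, s.id_subscription
          HAVING COUNT(DISTINCT o.fk_product) > 1) so
""").fetchone()

print(naive_rows)
print(customer_count)
```

The original query returns one row per customer rather than a single total, and it also counts customers whose products are spread over different subscriptions. Grouping by subscription first avoids both problems.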
I have a query that includes a subquery that references one of the tables that my query joins on, but I also need to do an evaluation on the field returned from the subquery in my WHERE clause.
Here's the current query (rough example) -
SELECT t1.first_name, t1.last_name,
(SELECT created_at FROM customer_order_status_history WHERE order_id=t2.order_id AND order_status=t2.order_status ORDER BY created_at DESC LIMIT 1) AS order_date
FROM customers AS t1
INNER JOIN customer_orders as t2 on t2.customer_id=t1.customer_id
My subquery currently returns the latest date from the customer_order_status_history table, but I also want to evaluate that value in the WHERE clause, so that a row is returned only if the most recent created_at date is greater than a specific date condition (i.e. system date - 5 days). In a way this is a conditional join between the customer_orders and customer_order_status_history tables, where the final result should only be returned if the most recent record in customer_order_status_history (sorted by created_at in descending order) is later than system date - 5 days.
Apologies in advance for the bad explanation but hopefully it is clear what I am trying to achieve here. Also I did not come up with this database schema and given the project constraints, I can not alter the schema.
Thanks!
Use a lateral join:
SELECT c.first_name, c.last_name, cosh.created_at
FROM customers c INNER JOIN
customer_orders co
ON co.customer_id = c.customer_id CROSS JOIN LATERAL
(SELECT cosh.*
FROM customer_order_status_history cosh
WHERE cosh.order_id = co.order_id AND
cosh.order_status = co.order_status AND
cosh.created_at > now() - INTERVAL '5 DAY'
ORDER BY cosh.created_at DESC
LIMIT 1
) cosh
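SQLite has no LATERAL, so the sketch below is not the query above but an assumed equivalent for trying the logic out locally: it groups the history rows per order and keeps an order only when its latest created_at is after the cutoff. The sample rows and the fixed cutoff date (standing in for now() minus 5 days) are made up.

```python
import sqlite3

# Hypothetical schema and rows matching the column names used in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, first_name TEXT, last_name TEXT);
    CREATE TABLE customer_orders (order_id INTEGER, customer_id INTEGER, order_status TEXT);
    CREATE TABLE customer_order_status_history
        (order_id INTEGER, order_status TEXT, created_at TEXT);
    INSERT INTO customers VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO customer_orders VALUES (100, 1, 'shipped'), (200, 2, 'pending');
    INSERT INTO customer_order_status_history VALUES
        (100, 'shipped', '2024-01-01'),
        (100, 'shipped', '2024-01-09'),
        (200, 'pending', '2024-01-02');
""")

# Fixed cutoff so the example is deterministic; in production this would be
# computed from the current date.
cutoff = '2024-01-05'

recent_rows = conn.execute("""
    SELECT c.first_name, c.last_name, MAX(h.created_at) AS order_date
    FROM customers c
    JOIN customer_orders co ON co.customer_id = c.customer_id
    JOIN customer_order_status_history h
      ON h.order_id = co.order_id AND h.order_status = co.order_status
    GROUP BY c.customer_id, co.order_id, c.first_name, c.last_name
    HAVING MAX(h.created_at) > ?
""", (cutoff,)).fetchall()

print(recent_rows)
```

Bob's only history row is older than the cutoff, so his order is filtered out, while Ann's order is returned with its most recent status date.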
I have joined two tables to pull the data I need. I'm having trouble only displaying the most current record from one table. What I'm trying to do is look for the last updated value. I have tried to incorporate max() and row_num but have not had any success.
Here is what I currently have:
select distinct t1.CaId,t1.Enrolled,t1.Plan,t2.Category,t2.updateddate
from table.one(nolock) t1
inner join table.two(nolock) t2 on t1.CaId=t2.CaID
where t1.coverageyear=2016
and right(t1.Plan,2)<>left(t2.Category,2)
order by 5 desc
You can join your main query with a subquery that just grabs the last update date for each ID, like this:
select all_rec.CaId, all_rec.Enrolled, all_rec.[Plan], all_rec.Category, all_rec.updateddate
from
(select distinct t1.CaId,t1.Enrolled,t1.[Plan],t2.Category,t2.updateddate
from [table].[one](nolock) t1
inner join [table].[two](nolock) t2 on t1.CaId=t2.CaID
where t1.coverageyear=2016
and right(t1.[Plan],2)<>left(t2.Category,2)
) as all_rec
inner join
(SELECT max(updateddate) AS LAST_DATE, CaId
FROM [table].[two](nolock)
GROUP BY CaId)
AS GRAB_DATE
on (all_rec.CaId = GRAB_DATE.CaId)
and (all_rec.updateddate = GRAB_DATE.LAST_DATE)
order by 5 desc
I added brackets around your usages of table and Plan because those are SQL reserved words.
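Since the question mentions row_num, here is a sketch of the other common way to get the latest row per ID, using ROW_NUMBER() instead of the join on MAX shown above. It runs on Python's sqlite3 module (window functions need SQLite 3.25 or newer); the table name two and its rows are simplified, made-up stand-ins.

```python
import sqlite3

# Simplified, hypothetical version of the second table from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE two (CaId INTEGER, Category TEXT, updateddate TEXT);
    INSERT INTO two VALUES
        (1, 'AA', '2016-01-01'),
        (1, 'BB', '2016-06-01'),
        (2, 'CC', '2016-03-01');
""")

# Number the rows of each CaId from newest to oldest, then keep only row 1.
latest_rows = conn.execute("""
    SELECT CaId, Category, updateddate
    FROM (SELECT CaId, Category, updateddate,
                 ROW_NUMBER() OVER (PARTITION BY CaId
                                    ORDER BY updateddate DESC) AS rn
          FROM two) AS ranked
    WHERE rn = 1
    ORDER BY CaId
""").fetchall()

print(latest_rows)
```

Unlike the join on MAX, this keeps exactly one row per CaId even when two rows share the same latest updateddate.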
If you are just trying to see the last updated value, you can simply add this to your query:
order by t2.updateddate desc
This sorts the results so that the most recent record appears first.
I have two tables
Bills: id amount reference
Transactions: id reference amount
The following SQL query
SELECT
*,
(SELECT SUM(amount)
FROM transactions
WHERE transactions.reference = bills.reference) AS paid
FROM bills
GROUP BY id HAVING paid<amount
was meant to select some rows from table Bills, adding a column paid with the sum of the amounts of the related transactions.
However, it only works when there is at least one transaction for each bill. Otherwise, no line for a transaction-less bill is returned.
Probably, that's because I should have done an inner join!
So I try the following:
SELECT
*,
(SELECT SUM(transactions.amount)
FROM transactions
INNER JOIN bills ON transactions.reference = bills.reference) AS paid
FROM bills
GROUP BY id
HAVING paid < amount
However, this returns the same value of paid for all rows! What am I doing wrong ?
Use a left join instead of a subquery:
select b.id, b.amount, sum(t.amount) as paid
from bills b
left join transactions t on t.reference = b.reference
group by b.id, b.amount
having sum(t.amount) < b.amount
Edit:
To compare the sum of transactions to the amount, handle the null value that you get when there are no transactions:
having isnull(sum(t.amount), 0) < b.amount
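A quick sketch of why the NULL handling matters, run with Python's sqlite3 module on made-up rows. It uses COALESCE, the standard SQL function, in place of isnull, which is specific to some databases.

```python
import sqlite3

# Hypothetical data: bill 1 is fully paid, bill 2 is partly paid, bill 3 has no transactions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bills (id INTEGER, amount INTEGER, reference TEXT);
    CREATE TABLE transactions (id INTEGER, reference TEXT, amount INTEGER);
    INSERT INTO bills VALUES (1, 100, 'A'), (2, 50, 'B'), (3, 80, 'C');
    INSERT INTO transactions VALUES (1, 'A', 40), (2, 'A', 60), (3, 'B', 20);
""")

# SUM over no rows is NULL, and NULL < amount is not true, so bill 3 disappears.
without_coalesce = conn.execute("""
    SELECT b.id, b.amount, SUM(t.amount) AS paid
    FROM bills b
    LEFT JOIN transactions t ON t.reference = b.reference
    GROUP BY b.id, b.amount
    HAVING SUM(t.amount) < b.amount
    ORDER BY b.id
""").fetchall()

# Treating a missing sum as 0 keeps the bill that has no transactions.
with_coalesce = conn.execute("""
    SELECT b.id, b.amount, COALESCE(SUM(t.amount), 0) AS paid
    FROM bills b
    LEFT JOIN transactions t ON t.reference = b.reference
    GROUP BY b.id, b.amount
    HAVING COALESCE(SUM(t.amount), 0) < b.amount
    ORDER BY b.id
""").fetchall()

print(without_coalesce)
print(with_coalesce)
```

The LEFT JOIN keeps bill 3 in the joined rows either way; it is the comparison against NULL in HAVING that drops it unless the sum is replaced by 0.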
You need a RIGHT JOIN to include all bill rows.
EDIT
So the final query will be
SELECT
*,
(SELECT COALESCE(SUM(transactions.amount), 0)
FROM transactions
WHERE transactions.reference = bills.reference) AS paid
FROM bills
HAVING paid < amount
I know this thread is old, but I came here today because I was encountering the same problem.
Please see another post with same question:
Sum on a left join SQL
As the answer says, use GROUP BY on the left table. This way you get all the records out from left table, and it sums the corresponding rows from right table.
Try to use this:
SELECT
*,
SUM(transactions.amount)
FROM
bills
RIGHT JOIN
transactions
ON
bills.reference = transactions.reference
WHERE
transactions.amount > 0
GROUP BY
bills.id