SQL Server: making this ranking query efficient

I have the following query to pull a customer's most recent purchase. I tried to use a subselect for performance reasons, but I ran into a wall and kept getting back ALL the customers' orders. I just need the most recent for each individual customer.
SELECT *
FROM (SELECT od.*, ord.OrderName, ord.OrderDate, RowN =
Row_Number()
OVER (PARTITION BY ord.CustomerOrderGUID ORDER BY ord.OrderDate DESC)
FROM #OrderData od
JOIN CV3Orders ord ON ord.CustomerOrderGUID = od.CustomerOrderGUID
WHERE ord.ProductName = 'Product 10') rnk
WHERE rnk.RowN = 1

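CustomerOrderGUID appears to identify each order, not each customer. Partitioning by it puts every order in its own partition, so every row gets row number 1 and nothing is filtered out. You need to partition by the column that identifies the customer. My guess at its name: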
SELECT co.*
FROM (SELECT od.*, ord.OrderName, ord.OrderDate,
Row_Number() OVER (PARTITION BY ord.CustomerGUID ORDER BY ord.OrderDate DESC) as seqnum
FROM #OrderData od JOIN
CV3Orders ord
ON ord.CustomerOrderGUID = od.CustomerOrderGUID
WHERE ord.ProductName = 'Product 10'
) co
WHERE seqnum = 1;
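To see why the partition column matters, here is a minimal sketch that runs both variants against a tiny in-memory table. It uses Python's sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or later); the table contents are made up, and CustomerGUID is the same guessed column name as in the answer above.

```python
import sqlite3

# Hypothetical miniature of the orders table; CustomerGUID is a guessed column name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (CustomerOrderGUID TEXT PRIMARY KEY, CustomerGUID TEXT, OrderDate TEXT);
INSERT INTO orders VALUES
  ('o1', 'c1', '2020-01-01'),
  ('o2', 'c1', '2020-03-01'),
  ('o3', 'c2', '2020-02-01');
""")

def latest(partition_col):
    # partition_col is interpolated only because it is a fixed identifier in this sketch
    sql = f"""
        SELECT CustomerOrderGUID FROM (
            SELECT o.*, ROW_NUMBER() OVER (
                       PARTITION BY {partition_col} ORDER BY OrderDate DESC) AS seqnum
            FROM orders o)
        WHERE seqnum = 1
        ORDER BY CustomerOrderGUID"""
    return [row[0] for row in conn.execute(sql)]

by_order = latest("CustomerOrderGUID")  # one partition per order: nothing is filtered
by_customer = latest("CustomerGUID")    # one partition per customer: newest order only

print(by_order)
print(by_customer)
```

Partitioning by the order key returns all three orders, while partitioning by the customer key returns only o2 and o3.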

Related

How do I show the previous 2 orders a customer made?

I am learning SQL Server and I am stuck on a question.
I need to write a query that shows, for each customer, the last order placed and the order before it.
Thank you for your help!
Edit: So far I have this:
SELECT SalesOrderID, CustomerID, per.FirstName, per.LastName, OrderDate as "Latest Order Date"
FROM (
SELECT *,
Row_Number() OVER (PARTITION BY CustomerID ORDER BY OrderDate desc) as 'Rank'
FROM sales.SalesOrderHeader head
) a join Person.Person per
on CustomerID = per.BusinessEntityID
WHERE Rank = 1
As you can see, I am pretty close. I just need to add a column that shows the order before the latest order date.
Sorry, I'm new to the site (long time viewer, first time poster)
ty!
You need to combine ROW_NUMBER and LEAD.
LEAD fits better here because it can share the ROW_NUMBER ordering (OrderDate DESC): the next row in that ordering is the previous order. LAG would need the rows sorted in the opposite direction.
SELECT head.SalesOrderID, CustomerID, per.FirstName, per.LastName, OrderDate as LastOrder, head.PreviousOrder
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderDate DESC) as rnk,
LEAD(OrderDate) OVER (PARTITION BY CustomerID ORDER BY OrderDate DESC) as PreviousOrder
FROM sales.SalesOrderHeader head
) head
JOIN Person.Person per ON head.CustomerID = per.BusinessEntityID
WHERE head.rnk = 1;
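A minimal sketch of the same ROW_NUMBER plus LEAD pattern on made-up data, using Python's sqlite3 module as a stand-in for SQL Server (SQLite 3.25 or later). The table is a cut-down, hypothetical version of SalesOrderHeader, and the join to Person is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SalesOrderHeader (SalesOrderID INTEGER PRIMARY KEY, CustomerID INTEGER, OrderDate TEXT);
INSERT INTO SalesOrderHeader VALUES
  (1, 10, '2021-01-05'),
  (2, 10, '2021-02-10'),
  (3, 10, '2021-03-15'),
  (4, 20, '2021-04-01');
""")

rows = conn.execute("""
    SELECT CustomerID, OrderDate AS LastOrder, PreviousOrder
    FROM (
        SELECT h.*,
               -- newest order per customer gets rnk = 1
               ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderDate DESC) AS rnk,
               -- same ordering, so the "next" row is the order placed just before
               LEAD(OrderDate) OVER (PARTITION BY CustomerID ORDER BY OrderDate DESC) AS PreviousOrder
        FROM SalesOrderHeader h
    )
    WHERE rnk = 1
    ORDER BY CustomerID
""").fetchall()

for row in rows:
    print(row)
```

Customer 20 has only one order, so PreviousOrder comes back as NULL (None in Python) for that row.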

Nested query missing expression

SELECT customer_id, company_code
FROM customer, commercial_cust
WHERE commercial_cust.FK_customer_id = customer.customer_id
(
SELECT payment_method, payment_date
FROM payment, cust_order
WHERE payment_link.FK_order_id = cust_order.order_id
(
SELECT order_id, payment_date, SUM(payment_ammount) payment_ammount
FROM cust_order, payment_link, payment
WHERE cust_order.FK_customer_id = customer.customer_id AND payment_link.FK_payment_id = payment.payment_id
)
GROUP BY payment_ammount.DESC
)
WHERE ROWNUM <=(SELECT COUNT(*) FROM cust_order)/4;
For a database assignment I have been asked to display a list of the 25% most lucrative commercial customers. I have written this out, but I keep getting a missing expression error and I'm not sure where I am meant to put the semicolon (if the semicolon is the problem). I've tried moving it around and removing parts of the script, but nothing seems to work.
Any help would be hugely appreciated. The rest of the code is correct in terms of names etc.
You should have shown what your tables contain, how they are related, and should have given sample data and expected result. So my answer may not match your requirement completely.
Let's simply look at how much a customer ordered: sum the order amount per customer, make sure the customer is a "commercial customer" and divide the results into 4 blocks keeping only the first (i.e. highest ranking) block.
select customer_id, sum_amount
from
(
select
fk_customer_id as customer_id,
sum(order_amount) as sum_amount,
ntile(4) over (order by sum(order_amount) desc) as block
from cust_order
where fk_customer_id in (select fk_customer_id from commercial_cust)
group by fk_customer_id
)
where block = 1
order by sum_amount desc;
If you want to use the payments instead, then do the same but join the payments to the orders and use that amount:
select customer_id, sum_amount
from
(
select
o.fk_customer_id as customer_id,
sum(p.payment_ammount) as sum_amount,
ntile(4) over (order by sum(p.payment_ammount) desc) as block
from cust_order o
join payment_link pl on pl.fk_order_id = o.order_id
join payment p on p.payment_id = pl.fk_payment_id
where o.fk_customer_id in (select fk_customer_id from commercial_cust)
group by o.fk_customer_id
)
where block = 1
order by sum_amount desc;
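The NTILE(4) idea can be tried out on a small made-up data set. This is only a sketch: it uses Python's sqlite3 module (SQLite 3.25 or later) instead of Oracle, the table contents are invented, the commercial_cust filter is omitted, and the aggregation is done in an inner query before NTILE is applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cust_order (order_id INTEGER PRIMARY KEY, fk_customer_id INTEGER, order_amount REAL)"
)
# Eight hypothetical customers: customer n has one order worth n * 100 ...
orders = [(n, n * 100.0) for n in range(1, 9)]
# ... and customer 1 has a second order, so the totals really need SUM.
orders.append((1, 750.0))
conn.executemany(
    "INSERT INTO cust_order (fk_customer_id, order_amount) VALUES (?, ?)", orders
)

top_quarter = conn.execute("""
    SELECT customer_id, sum_amount
    FROM (
        SELECT customer_id, sum_amount,
               NTILE(4) OVER (ORDER BY sum_amount DESC) AS block
        FROM (
            SELECT fk_customer_id AS customer_id, SUM(order_amount) AS sum_amount
            FROM cust_order
            GROUP BY fk_customer_id
        )
    )
    WHERE block = 1
    ORDER BY sum_amount DESC
""").fetchall()

print(top_quarter)
```

With eight customers each block holds two of them, so the first block contains customer 1 (850 in total) and customer 8 (800).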
Try this query instead; your original has a lot of semantic and syntax errors:
SELECT customer_id, company_code, payment_ammount
FROM (SELECT customer.customer_id, commercial_cust.company_code, SUM(payment.payment_ammount) payment_ammount
FROM customer, commercial_cust, cust_order, payment_link, payment
WHERE commercial_cust.FK_customer_id = customer.customer_id
AND cust_order.FK_customer_id = customer.customer_id
AND payment_link.FK_order_id = cust_order.order_id
AND payment_link.FK_payment_id = payment.payment_id
GROUP BY customer.customer_id, commercial_cust.company_code
ORDER BY SUM(payment.payment_ammount) DESC)
WHERE ROWNUM <= (SELECT COUNT(*) FROM commercial_cust)/4;

Summing the most recent rows, grouped by the id

SELECT distinct on (prices.item_id) *
FROM prices
ORDER BY prices.item_id, prices.updated_at DESC
The above query retrieves the most recent price for each item. How would I get the total sum of all the current prices?
Is it possible without using a subselect?
This is trivial using a subquery:
select sum(p.price)
from (select distinct on (p.item_id) p.*
from prices p
order by p.item_id, p.updated_at desc
) p
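Without a subquery, you might be tempted by the following, which returns one (repeated) value per item:
select distinct on (p.item_id) sum(p.price) over ()
from prices p
order by p.item_id, p.updated_at desc
However, window functions are evaluated before DISTINCT ON is applied, so this sums every row in prices, not just the most recent row per item. Adding a limit clause reduces the output to one row but does not change that total. By the way, I would write the subquery version as: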
select sum(p.price)
from (select p.*,
row_number() over (partition by p.item_id order by p.updated_at desc) as seqnum
from prices p
) p
where seqnum = 1
ROW_NUMBER() is standard SQL. The DISTINCT ON clause is specific to Postgres.
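Because ROW_NUMBER() is portable, the last query can be tried outside Postgres. The sketch below runs it with Python's sqlite3 module (SQLite 3.25 or later) on invented data, with prices stored as integers, and compares the result with a plain sum over all rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE prices (item_id INTEGER, price INTEGER, updated_at TEXT);
INSERT INTO prices VALUES
  (1, 10, '2020-01-01'),
  (1, 12, '2020-02-01'),
  (2,  5, '2020-01-15'),
  (2,  6, '2020-02-01'),
  (2,  7, '2020-03-01');
""")

# Sum only the newest price of each item.
total_current = conn.execute("""
    SELECT SUM(price)
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (PARTITION BY item_id ORDER BY updated_at DESC) AS seqnum
        FROM prices p
    )
    WHERE seqnum = 1
""").fetchone()[0]

# For comparison: the sum over every historical price row.
total_all = conn.execute("SELECT SUM(price) FROM prices").fetchone()[0]

print(total_current, total_all)
```

The current prices are 12 and 7, so the filtered total is 19, while the unfiltered total over all five rows is 40.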

SQL question about GROUP BY

I've been using SQL for a few years, and this type of problem comes up here and there, and I haven't found an answer. But perhaps I've been looking in the wrong places - I'm not really sure what to call it.
For the sake of brevity, let's say I have a table with 3 columns: Customer, Order_Amount, Order_Date. Each customer may have multiple orders, with one row for each order with the amount and date.
My Question: Is there a simple way in SQL to get the DATE of the maximum order per customer?
I can get the amount of the maximum order for each customer (and which customer made it) by doing something like:
SELECT Customer, MAX(Order_Amount) FROM orders GROUP BY Customer;
But I also want to get the date of the max order, which I haven't figured out a way to easily get. I would have thought that this would be a common type of question for a database, and would therefore be easy to do in SQL, but I haven't found an easy way to do it yet. Once I add Order_Date to the list of columns to select, I need to add it to the Group By clause, which I don't think will give me what I want.
With a self-join and GROUP BY you can do:
SELECT o1.Customer, o1.Order_Amount, o1.Order_Date
FROM orders o1 JOIN orders o2 ON o1.Customer = o2.Customer
GROUP BY o1.Customer, o1.Order_Amount, o1.Order_Date
HAVING o1.Order_Amount = MAX(o2.Order_Amount);
There's a good article reviewing various approaches.
And in Oracle, DB2, Sybase, and SQL Server 2005+ you would use RANK() OVER:
SELECT * FROM (
SELECT orders.*,
RANK() OVER (PARTITION BY Customer ORDER BY Order_Amount DESC) r
FROM orders) o
WHERE r = 1;
Note: If a customer has more than one order with the maximum Order_Amount (i.e. ties), RANK() returns all such orders; to get only one of them, replace RANK() with ROW_NUMBER().
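The difference between RANK() and ROW_NUMBER() on ties is easy to check on a small example. This is a sketch on made-up rows, run through Python's sqlite3 module (SQLite 3.25 or later). The extra Order_Date tie-breaker in the ROW_NUMBER variant is an addition of this sketch, there to make the chosen row deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (Customer TEXT, Order_Amount INTEGER, Order_Date TEXT);
INSERT INTO orders VALUES
  ('alice', 50, '2020-01-01'),
  ('alice', 90, '2020-02-01'),
  ('alice', 90, '2020-03-01'),  -- ties with the previous row on amount
  ('bob',   30, '2020-01-10');
""")

def top_orders(func, order_by):
    # func and order_by are fixed strings in this sketch, not user input
    sql = f"""
        SELECT Customer, Order_Amount, Order_Date
        FROM (
            SELECT o.*, {func}() OVER (PARTITION BY Customer ORDER BY {order_by}) AS r
            FROM orders o
        )
        WHERE r = 1
        ORDER BY Customer, Order_Date"""
    return conn.execute(sql).fetchall()

with_ties = top_orders("RANK", "Order_Amount DESC")
one_each = top_orders("ROW_NUMBER", "Order_Amount DESC, Order_Date DESC")

print(with_ties)
print(one_each)
```

RANK() keeps both of alice's 90 orders, while ROW_NUMBER() keeps exactly one row per customer (here the later of the two tied orders).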
There's no short-cut... the easiest way is probably to join to a sub-query:
SELECT
*
FROM
orders JOIN
(
SELECT Customer, MAX(Order_Amount) AS Max_Order_Amount
FROM orders
GROUP BY Customer
) maxOrder
ON maxOrder.Customer = orders.Customer
AND maxOrder.Max_Order_Amount = orders.Order_Amount
You will want to join on the same table:
SELECT o.Customer, o.Order_Date, o2.amt
FROM orders o,
( SELECT Customer, MAX(Order_Amount) amt FROM orders GROUP BY Customer ) o2
WHERE o.Customer = o2.Customer
AND o.Order_Amount = o2.amt
;
Another approach for the collection:
WITH tempquery AS
(
SELECT
Customer
,Order_Amount
,Order_Date
,row_number() OVER (PARTITION BY Customer ORDER BY Order_Amount DESC) AS rn
FROM
orders
)
SELECT
Customer
,Order_Amount
,Order_Date
FROM
tempquery
WHERE
rn = 1
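If your database supports CROSS APPLY, you can do this as well. Note that TOP 1 returns a single order per customer, so when several orders tie for the maximum amount, only one of them (chosen arbitrarily) is returned: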
SELECT [....]
FROM Customer c
CROSS APPLY
(SELECT TOP 1 [...]
FROM Orders o
WHERE c.customerID = o.CustomerID
ORDER BY o.Order_Amount DESC) o
See this data.SE query
You could try something like this:
SELECT Customer, MAX(Order_Amount), Order_Date
FROM orders O
WHERE ORDER_AMOUNT = (SELECT MAX(ORDER_AMOUNT) FROM orders WHERE CUSTOMER = O.CUSTOMER)
GROUP BY CUSTOMER, Order_Date
with t as
(
select Customer, Order_Date, Order_Amount,
max(Order_Amount) over (partition by Customer) as max_amount
from orders
)
select * from t where t.Order_Amount = t.max_amount

How to get the max row number per group/partition in SQL Server?

I'm using SQL Server 2005. I have a payments table with payment IDs, user IDs, and timestamps. I want to find the most recent payment for each user, which is easy to search for and find an answer to. What I also want to know, though, is whether the most recent payment is the user's first payment or not.
I have the following which will number each user's payments:
SELECT
p.payment_id,
p.user_id,
ROW_NUMBER() OVER (PARTITION BY p.user_id ORDER BY p.payment_date) AS paymentNumber
FROM
payment p
I'm not making the mental leap that lets me pick the highest paymentNumber per user. If I use the above as a subselect, take MAX(paymentNumber), and group by user_id, I lose the payment_id, which I need. But if I also add payment_id to the GROUP BY clause, I'm back to one row per payment. I'm sure I'm overlooking the obvious. Any help?
Try this:
SELECT a.*, CASE WHEN totalPayments>1 THEN 'NO' ELSE 'YES' END IsFirstPayment
FROM(
SELECT p.payment_id,
p.user_id,
ROW_NUMBER() OVER (PARTITION BY p.user_id ORDER BY p.payment_date DESC) AS paymentNumber,
SUM(1) OVER (PARTITION BY p.user_id) AS totalPayments
FROM payment p
) a
WHERE paymentNumber = 1
Do the same thing again, this time numbering the rows in the opposite order.
SELECT
p.payment_id,
p.user_id,
ROW_NUMBER() OVER (PARTITION BY p.user_id ORDER BY p.payment_date) AS paymentNumber,
ROW_NUMBER() OVER (PARTITION BY p.user_id ORDER BY p.payment_date DESC) AS reversePaymentNumber
FROM
payment p
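Now the most recent payment has reversePaymentNumber = 1, and on that row paymentNumber equals the user's total number of payments, so paymentNumber = 1 there means the most recent payment is also the first.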
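Here is a minimal sketch of the two-row-number idea on invented data, run through Python's sqlite3 module (SQLite 3.25 or later) rather than SQL Server 2005. The outer filter and the is_first flag are additions of this sketch, not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payment (payment_id INTEGER PRIMARY KEY, user_id TEXT, payment_date TEXT);
INSERT INTO payment VALUES
  (1, 'u1', '2020-01-01'),
  (2, 'u1', '2020-02-01'),
  (3, 'u2', '2020-01-15');
""")

latest = conn.execute("""
    SELECT payment_id, user_id, paymentNumber
    FROM (
        SELECT p.payment_id,
               p.user_id,
               ROW_NUMBER() OVER (PARTITION BY p.user_id ORDER BY p.payment_date) AS paymentNumber,
               ROW_NUMBER() OVER (PARTITION BY p.user_id ORDER BY p.payment_date DESC) AS reversePaymentNumber
        FROM payment p
    )
    WHERE reversePaymentNumber = 1  -- keep only each user's most recent payment
    ORDER BY user_id
""").fetchall()

# paymentNumber == 1 on the latest row means the user has made only one payment.
is_first = {user_id: number == 1 for _, user_id, number in latest}

print(latest)
print(is_first)
```

User u1 has two payments, so the latest one (payment 2) is not the first; user u2 has a single payment, which is both latest and first.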
The query provided by OP does most of the work. All we need to do is change the ORDER BY clause provided to ROW_NUMBER() to descending at which point the most recent record will have a value of 1. I'm choosing to use a CTE as a matter of personal preference - a subquery would also be fine.
with cte as (
SELECT
p.payment_id,
p.user_id,
ROW_NUMBER() OVER (
PARTITION BY p.user_id
ORDER BY p.payment_date desc
) AS paymentNumber
FROM
payment p
)
select * from cte where paymentNumber = 1
A less cool way, I suppose:
; with maxp as
(
select
p.user_id,
max(p.payment_date) as MaxPaymentDate
from payment p
group by p.user_id
),
nump as
(
select
p.payment_id,
p.user_id,
p.payment_date,
ROW_NUMBER() OVER (PARTITION BY p.user_id ORDER BY p.payment_date) AS paymentNumber
FROM payment p
),
a as
(
select
nump.payment_id,
nump.user_id,
nump.paymentNumber,
case when maxp.MaxPaymentDate is null then 'Old' else 'New' end as NewState
from nump
left outer join maxp
on nump.user_id=maxp.user_id
and nump.payment_date=maxp.MaxPaymentDate
)
select
*
from a
where NewState='New'
SELECT * FROM (
SELECT ROW_NUMBER() OVER(PARTITION BY OS.ContactId ORDER BY OS.Date ASC) AS FirstRow#,
ROW_NUMBER() OVER(PARTITION BY OS.ContactId ORDER BY OS.Date DESC) AS LastRow#,
OS.Contactid,CONVERT(VARCHAR,OS.Date,106) 'Purchase Month',
OS.ProductId 'MyCII Subscription/Directory', OS.Charges 'Amount(INR)',OS.Date 'RAWDate'
FROM tblOnlineServices OS
WHERE Date IS NOT NULL AND Contactid IN('C000013112','C000010859')
) FirstPurchase
WHERE FirstRow# = 1 OR LastRow# = 1
ORDER BY Contactid, RAWDate
How about this?
SELECT
p.user_id,
MAX(p.payment_date) as lastPayment,
CASE COUNT(p.payment_id) WHEN 1 THEN 1 ELSE 0 END as isFirstPayment
FROM
payment p
GROUP BY
p.user_id