Max function on date giving multiple records - sql

I have got 3 tables: order, customer and invoice. I need to get the latest invoice number for each customer.
I am using the max function on order date and then grouping by customer number and invoice number, where order status was confirmed or shipped.
select max(o.order_date), c.customer_number, i.invoice_number
from orders o , invoices i , customer c
where o.order_oid = i.order_oid
and c.customer_oid = i.customer_oid
and o.status_oid in ( 4,6)
group by c.customer_number, i.invoice_number;
I am getting duplicate records like:
Date cust_num invc#
1/22/2018 479 I128
4/23/2018 479 I287
5/18/2018 479 I433
It should have returned only the last record. What am I doing wrong?

From your description and comments you seem to want:
select max(o.order_date), c.customer_number,
max(i.invoice_number) keep (dense_rank last order by o.order_date) as invoice_number
from orders o , invoices i , customer c
where o.order_oid = i.order_oid
and c.customer_oid = i.customer_oid
and o.status_oid in ( 4,6)
group by c.customer_number;
The group by no longer includes the invoice number; instead the last invoice number, based on date, is found using last.
If the invoice numbers are strictly in date order, and fixed length, you could potentially just do:
select max(o.order_date), c.customer_number, max(i.invoice_number) as invoice_number
but if it's possible to go from, say, invoice I999 to I1000, then it isn't safe to sort those just as strings, since as a string 'I1000' will sort before 'I999'.
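The string-ordering pitfall can be checked in isolation. The following is only a small sketch in Python (not part of the original query); the extra invoice number and the "one letter followed by digits" format are assumptions made for illustration.

```python
# Hypothetical invoice numbers: one letter prefix followed by digits of varying length.
invoices = ["I999", "I1000", "I128"]

# Plain string comparison works character by character, so "I1000" < "I128" < "I999".
as_strings = sorted(invoices)

# Sorting on the numeric part restores the intended order.
as_numbers = sorted(invoices, key=lambda s: int(s[1:]))

print(as_strings)     # ['I1000', 'I128', 'I999']
print(as_numbers)     # ['I128', 'I999', 'I1000']
print(max(invoices))  # I999, although I1000 is the later invoice
```

This is why a plain max(invoice_number) is only safe when the numbers are fixed length.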
Not related, but you might want to consider moving to modern join syntax:
select max(o.order_date), c.customer_number,
max(i.invoice_number) keep (dense_rank last order by o.order_date) as invoice_number
from orders o
join invoices i on i.order_oid = o.order_oid
join customer c on c.customer_oid = i.customer_oid
where o.status_oid in (4, 6)
group by c.customer_number;

You need max(invoice_number) to avoid getting a record per invoice

You can use the row_number() analytic (window) function:
select order_date, customer_number, invoice_number
from
(
select o.order_date, c.customer_number, i.invoice_number,
row_number() over (partition by c.customer_number order by o.order_date desc) as rn
from orders o
join invoices i on o.order_oid = i.order_oid
join customer c on c.customer_oid = i.customer_oid
where o.status_oid in (4,6)
)
where rn = 1;
P.S.: It is also better to give up the old-style comma-separated joins in favour of explicit JOIN syntax, as in the query above.
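To see why grouping by the invoice number returns one row per invoice, here is a rough, self-contained sketch that rebuilds the three sample rows in an in-memory SQLite database through Python's sqlite3 module. SQLite (3.25 or later for window functions) stands in for Oracle here, and the key values and status codes assigned to the rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_oid INTEGER PRIMARY KEY, customer_number INTEGER);
    CREATE TABLE orders   (order_oid INTEGER PRIMARY KEY, order_date TEXT, status_oid INTEGER);
    CREATE TABLE invoices (invoice_number TEXT, order_oid INTEGER, customer_oid INTEGER);

    INSERT INTO customer VALUES (1, 479);
    INSERT INTO orders VALUES (10, '2018-01-22', 4), (11, '2018-04-23', 6), (12, '2018-05-18', 4);
    INSERT INTO invoices VALUES ('I128', 10, 1), ('I287', 11, 1), ('I433', 12, 1);
""")

# Original approach: invoice_number is part of the GROUP BY, so each invoice is its own group.
per_invoice = conn.execute("""
    SELECT MAX(o.order_date), c.customer_number, i.invoice_number
    FROM orders o
    JOIN invoices i ON i.order_oid = o.order_oid
    JOIN customer c ON c.customer_oid = i.customer_oid
    WHERE o.status_oid IN (4, 6)
    GROUP BY c.customer_number, i.invoice_number
    ORDER BY 1
""").fetchall()

# row_number() approach: number each customer's rows by date, newest first, and keep row 1.
latest = conn.execute("""
    SELECT order_date, customer_number, invoice_number
    FROM (
        SELECT o.order_date, c.customer_number, i.invoice_number,
               ROW_NUMBER() OVER (PARTITION BY c.customer_number
                                  ORDER BY o.order_date DESC) AS rn
        FROM orders o
        JOIN invoices i ON i.order_oid = o.order_oid
        JOIN customer c ON c.customer_oid = i.customer_oid
        WHERE o.status_oid IN (4, 6)
    )
    WHERE rn = 1
""").fetchall()

print(len(per_invoice))  # 3
print(latest)            # [('2018-05-18', 479, 'I433')]
```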

Related

SQL Server : select only last record per customer from a join query

Assume I have these 3 tables :
The first two tables define customers of different types, i.e. the second table has other columns that are not included in table 1; I just left them the same to reduce complexity.
The third table defines orders for both types of customers. Each customer has more than one order.
I want to select the last order for every customer, i.e. the order with order_id 4 for customer 1, which was created on 23/12/2016, and the order with order_id 5 for customer 2, which was created on 26/12/2016.
I tried something like this :
select *
from customertype1
left join order on order.customer_id = customertype1.customer_id
order by order_id desc;
But this gives me multiple records for every customer; as stated above, I want only the last order for every customertype1.
If you want the last order for each customer, then you only need the orders table:
select o.*
from (select o.*,
row_number() over (partition by customer_id order by datecreated desc) as seqnum
from orders o
) o
where seqnum = 1;
If you want to include all customers, then you need to combine the two tables. Assuming they are mutually exclusive:
with c as (
select customer_id from customers1 union all
select customer_id from customers2
)
select o.*
from c left join
(select o.*,
row_number() over (partition by customer_id order by datecreated desc) as seqnum
from orders o
) o
on c.customer_id = o.customer_id and seqnum = 1;
A note about your data structure: You should have one table for all customers. You can then define a foreign key constraint between orders and customers. For the additional columns, you can have additional tables for the different types of customers.
Use ROW_NUMBER() and PARTITION BY.
ROW_NUMBER(): assigns a sequence number to each row.
PARTITION BY: groups your data by the given column.
When you use ROW_NUMBER() and PARTITION BY together, PARTITION BY first groups your records and ROW_NUMBER() then numbers the rows within each group, so the sequence starts again from 1 in every group.
Help Link: Example of ROW_NUMBER() and PARTITION BY
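A minimal sketch of how the numbering restarts in each partition, run against an in-memory SQLite database from Python (SQLite 3.25+ is assumed, standing in for SQL Server). Only orders 4 and 5 and their dates come from the question; the remaining rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, datecreated TEXT);
    INSERT INTO orders VALUES
        (1, 1, '2016-12-20'),   -- hypothetical earlier order
        (2, 2, '2016-12-21'),   -- hypothetical earlier order
        (3, 2, '2016-12-22'),   -- hypothetical earlier order
        (4, 1, '2016-12-23'),
        (5, 2, '2016-12-26');
""")

# seqnum restarts at 1 for every customer_id; the newest order gets 1.
numbered = conn.execute("""
    SELECT customer_id, order_id, datecreated,
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY datecreated DESC) AS seqnum
    FROM orders
    ORDER BY customer_id, seqnum
""").fetchall()

for row in numbered:
    print(row)
# (1, 4, '2016-12-23', 1)
# (1, 1, '2016-12-20', 2)
# (2, 5, '2016-12-26', 1)
# (2, 3, '2016-12-22', 2)
# (2, 2, '2016-12-21', 3)

# Keeping only seqnum = 1 leaves the last order per customer.
last_orders = [(cust, order) for cust, order, _, seq in numbered if seq == 1]
print(last_orders)  # [(1, 4), (2, 5)]
```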
This is the general idea. You can work out the details.
with customers as
(select customer_id, customer_name
from table1
union
select customer_id, customer_name
from table2)
, lastOrder as
(select customer_id, max(order_id) maxOrderId
from orders
group by customer_id)
select *
from lastOrder join customers on lastOrder.Customer_id = customers.customer_id
join orders on order_id = maxOrderId

Nested query missing expression

SELECT customer_id, company_code
FROM customer, commercial_cust
WHERE commercial_cust.FK_customer_id = customer.customer_id
(
SELECT payment_method, payment_date
FROM payment, cust_order
WHERE payment_link.FK_order_id = cust_order.order_id
(
SELECT order_id, payment_date, SUM(payment_ammount) payment_ammount
FROM cust_order, payment_link, payment
WHERE cust_order.FK_customer_id = customer.customer_id AND payment_link.FK_payment_id = payment.payment_id
)
GROUP BY payment_ammount.DESC
)
WHERE ROWNUM <=(SELECT COUNT(*) FROM cust_order)/4;
For a database assignment I have been asked to display a list of the 25% most lucrative commercial customers. I have written this out but I keep getting a missing element error and I'm not sure where I am meant to put the semicolon (if it is the semicolon). I've tried moving it around and removing parts of the script but it doesn't seem to be working.
This will be hugely appreciated if somebody can help. The rest of the code is correct in terms of names etc.
You should have shown what your tables contain, how they are related, and should have given sample data and expected result. So my answer may not match your requirement completely.
Let's simply look at how much a customer ordered: sum the order amount per customer, make sure the customer is a "commercial customer" and divide the results into 4 blocks keeping only the first (i.e. highest ranking) block.
select customer_id, sum_amount
from
(
select
fk_customer_id as customer_id,
sum(order_amount) as sum_amount,
ntile(4) over (order by sum(order_amount) desc) as block
from cust_order
where fk_customer_id in (select fk_customer_id from commercial_cust)
group by fk_customer_id
)
where block = 1
order by sum_amount desc;
If you want to use the payments instead, then do the same but join the payments to the orders and use that amount:
select customer_id, sum_amount
from
(
select
o.fk_customer_id as customer_id,
sum(p.payment_ammount) as sum_amount,
ntile(4) over (order by sum(p.payment_ammount) desc) as block
from cust_order o
join payment_link pl on pl.fk_order_id = o.order_id
join payment p on p.payment_id = pl.fk_payment_id
where o.fk_customer_id in (select fk_customer_id from commercial_cust)
group by o.fk_customer_id
)
where block = 1
order by sum_amount desc;
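As a rough check of the ntile(4) idea, the sketch below runs the order-amount variant against an in-memory SQLite database from Python (SQLite 3.25+ assumed, standing in for Oracle). All rows are invented, and the per-customer sum is computed in an inner query first simply to keep the sketch easy to follow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cust_order (order_id INTEGER PRIMARY KEY, fk_customer_id INTEGER, order_amount INTEGER);
    CREATE TABLE commercial_cust (fk_customer_id INTEGER);

    INSERT INTO commercial_cust VALUES (1), (2), (3), (4), (5), (6), (7), (8);
    INSERT INTO cust_order (fk_customer_id, order_amount) VALUES
        (1, 100), (1, 50), (2, 400), (3, 90), (4, 300), (4, 300),
        (5, 10), (6, 250), (7, 75), (8, 500),
        (9, 9999);  -- customer 9 is not commercial and must be filtered out
""")

top_quarter = conn.execute("""
    SELECT customer_id, sum_amount
    FROM (
        -- NTILE(4) splits the eight ranked customers into four blocks of two
        SELECT customer_id, sum_amount,
               NTILE(4) OVER (ORDER BY sum_amount DESC) AS block
        FROM (
            SELECT fk_customer_id AS customer_id, SUM(order_amount) AS sum_amount
            FROM cust_order
            WHERE fk_customer_id IN (SELECT fk_customer_id FROM commercial_cust)
            GROUP BY fk_customer_id
        )
    )
    WHERE block = 1
    ORDER BY sum_amount DESC
""").fetchall()

print(top_quarter)  # [(4, 600), (8, 500)]
```

With eight commercial customers, block 1 holds the two with the highest totals, which is the top 25%.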
Try this query instead; you have a lot of semantic and syntax errors:
SELECT *
FROM (
-- total payments per commercial customer, largest first
SELECT customer_id, company_code, SUM(payment_ammount) total_payment
FROM customer, commercial_cust, cust_order, payment_link, payment
WHERE commercial_cust.FK_customer_id = customer.customer_id
AND payment_link.FK_order_id = cust_order.order_id
AND cust_order.FK_customer_id = customer.customer_id AND payment_link.FK_payment_id = payment.payment_id
GROUP BY customer_id, company_code
ORDER BY total_payment DESC
)
-- ROWNUM is assigned after the inner ORDER BY, so this keeps the top rows
WHERE ROWNUM <=(SELECT COUNT(*) FROM cust_order)/4;

ORACLE SQL Return only duplicated values (not the original)

I have a database with the following info
Customer_id, plan_id, plan_start_dte,
Since some customers switch plans, there are customers with several duplicated customer_ids, but with different plan_start_dte. I'm trying to count how many times a day members switch to the premium plan from any other plan (plan_id = 'premium').
That is, I'm trying to do roughly this: return all rows with duplicate customer_id, except for the original plan (min(plan_start_dte)), where plan_id = 'premium', and group them by plan_start_dte.
I'm able to get all duplicate records with their count:
with plan_counts as (
select c.*, count(*) over (partition by CUSTOMER_ID) ct
from CUSTOMERS c
)
select *
from plan_counts
where ct > 1
The other steps have me stuck. First I tried to select everything except the original plan:
SELECT CUSTOMERS c
where START_DTE not in (
select min(PLAN_START_DTE)
from CUSTOMERS i
where c.CUSTOMER_ID = i.CUSTOMER_ID
)
But this failed. If I can solve this I believe all I have to add is an additional condition where c.PLAN_ID = 'premium' and then group by date and do a count. Anyone have any ideas?
I think you want lag():
select c.*
from (select c.*,
lag(plan_id) over (partition by customer_id order by plan_start_dte) as prev_plan_id
from customers c
) c
where prev_plan_id <> 'premium' and plan_id = 'premium';
I'm not sure what output you want. For the number of times this occurs per day:
select plan_start_dte, count(*)
from (select c.*, lag(plan_id) over (partition by customer_id order by plan_start_dte) as prev_plan_id
from customers c
) c
where prev_plan_id <> 'premium' and plan_id = 'premium'
group by plan_start_dte
order by plan_start_dte;
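A small sketch of the lag() approach on invented data, run through Python's sqlite3 module (SQLite 3.25+ assumed, standing in for Oracle). The column names follow the question; every row is hypothetical. Note that a customer's first plan has no previous row, so prev_plan_id is NULL and the row is excluded by the <> comparison, which is what leaves out the original plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, plan_id TEXT, plan_start_dte TEXT);
    INSERT INTO customers VALUES
        (1, 'basic',   '2020-01-01'), (1, 'premium', '2020-02-01'),
        (2, 'premium', '2020-01-05'), (2, 'basic',   '2020-02-01'), (2, 'premium', '2020-03-01'),
        (3, 'basic',   '2020-01-10'), (3, 'premium', '2020-02-01'),
        (4, 'premium', '2020-02-01');  -- started on premium, never switched
""")

switches_per_day = conn.execute("""
    SELECT plan_start_dte, COUNT(*)
    FROM (
        SELECT c.*,
               LAG(plan_id) OVER (PARTITION BY customer_id ORDER BY plan_start_dte) AS prev_plan_id
        FROM customers c
    )
    WHERE prev_plan_id <> 'premium' AND plan_id = 'premium'
    GROUP BY plan_start_dte
    ORDER BY plan_start_dte
""").fetchall()

print(switches_per_day)  # [('2020-02-01', 2), ('2020-03-01', 1)]
```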

select last order date for each customer id

I have a list of customerids, orderids and order dates that I want to use in another query to determine if the customer has ordered again since this date.
Example Data:
CustomerID OrderID OrderDate
6619 16034 2012-11-15 10:23:02.603
6858 18482 2013-03-25 11:07:14.680
4784 17897 2013-02-20 14:45:43.640
5522 16188 2012-11-22 14:53:49.840
6803 18016 2013-02-28 10:41:16.713
Query:
SELECT dbo.[Order].CustomerID, dbo.[Order].OrderID, dbo.[Order].OrderDate
FROM dbo.[Order] INNER JOIN
dbo.OrderLine ON dbo.[Order].OrderID = dbo.OrderLine.OrderID
WHERE (dbo.OrderLine.ProductID in (42, 44, 45, 46,47,48))
If you need anything else, just ask.
UPDATE::
This query brings back the results as shown above
I need to know whether the customer has ordered again since, for any product ID, after ordering one of the products in the query above.
Mike
If you are only interested in last order date for each customer
select customerid, max(orderdate) from theTable group by customerid;
In MS SQL you can use TOP 1 for this; you also need to order by your order date column in descending order.
See: SQL Server - How to select the most recent record per user?
SELECT dbo.[Order].CustomerID, MAX(dbo.[Order].OrderDate)
FROM dbo.[Order] INNER JOIN
dbo.OrderLine ON dbo.[Order].OrderID = dbo.OrderLine.OrderID
WHERE (dbo.OrderLine.ProductID in (42, 44, 45, 46,47,48))
GROUP BY dbo.[Order].CustomerID
Gets the latest orderdate of a customer.
ROW_NUMBER in a CTE should work:
WITH cte
AS (SELECT customerid,
orderid,
orderdate,
rn = Row_number()
OVER(
partition BY customerid
ORDER BY orderdate DESC)
FROM dbo.tblorder
WHERE orderdate >= @orderDate
AND customerid = @customerID)
SELECT customerid, orderid, orderdate
FROM cte
WHERE rn = 1
(I've omitted the join since no column from the other table was needed; simply add it.)
CustomerID and latest OrderDate for customers that have ordered any product after ordering any of a set of products
I suspect they were promotional products
SELECT [Order].[CustomerID], max([Order].[OrderDate])
FROM [Order]
JOIN [Order] as [OrderBase]
ON [OrderBase].[CustomerID] = [Order].[CustomerID]
AND [OrderBase].[OrderDate] < [Order].[OrderDate]
JOIN [OrderLine]
ON [OrderLine].[OrderID] = [OrderBase].[OrderID]
AND [OrderLine].[ProductID] in (42,44,45,46,47,48)
GROUP BY [Order].[CustomerID]
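The self-join above can be tried on a tiny data set. This is only a sketch using an in-memory SQLite database from Python (standing in for SQL Server); two rows come from the example data, while the later order 19000, its date and the product IDs attached to each order are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE [Order]   (CustomerID INTEGER, OrderID INTEGER PRIMARY KEY, OrderDate TEXT);
    CREATE TABLE OrderLine (OrderID INTEGER, ProductID INTEGER);

    INSERT INTO [Order] VALUES
        (6619, 16034, '2012-11-15 10:23:02.603'),
        (6858, 18482, '2013-03-25 11:07:14.680'),
        (6619, 19000, '2013-04-02 09:00:00.000');  -- hypothetical follow-up order
    INSERT INTO OrderLine VALUES (16034, 42), (18482, 44), (19000, 99);
""")

# base = an earlier order containing one of the listed products;
# o    = any later order by the same customer, whatever the product.
reordered = conn.execute("""
    SELECT o.CustomerID, MAX(o.OrderDate)
    FROM [Order] AS o
    JOIN [Order] AS base
      ON base.CustomerID = o.CustomerID
     AND base.OrderDate < o.OrderDate
    JOIN OrderLine AS ol
      ON ol.OrderID = base.OrderID
     AND ol.ProductID IN (42, 44, 45, 46, 47, 48)
    GROUP BY o.CustomerID
""").fetchall()

print(reordered)  # [(6619, '2013-04-02 09:00:00.000')]
```

Customer 6858 has no order after the qualifying one, so it does not appear in the result.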

SQL question about GROUP BY

I've been using SQL for a few years, and this type of problem comes up here and there, and I haven't found an answer. But perhaps I've been looking in the wrong places - I'm not really sure what to call it.
For the sake of brevity, let's say I have a table with 3 columns: Customer, Order_Amount, Order_Date. Each customer may have multiple orders, with one row for each order with the amount and date.
My Question: Is there a simple way in SQL to get the DATE of the maximum order per customer?
I can get the amount of the maximum order for each customer (and which customer made it) by doing something like:
SELECT Customer, MAX(Order_Amount) FROM orders GROUP BY Customer;
But I also want to get the date of the max order, which I haven't figured out a way to easily get. I would have thought that this would be a common type of question for a database, and would therefore be easy to do in SQL, but I haven't found an easy way to do it yet. Once I add Order_Date to the list of columns to select, I need to add it to the Group By clause, which I don't think will give me what I want.
With a self-join you can do:
SELECT o1.Customer, o1.Order_Amount, o1.Order_Date
FROM orders o1 JOIN orders o2 ON o1.Customer = o2.Customer
GROUP BY o1.Customer, o1.Order_Amount, o1.Order_Date
HAVING o1.Order_Amount = MAX(o2.Order_Amount);
There's a good article reviewing various approaches.
And in Oracle, db2, Sybase, SQL Server 2005+ you would use RANK() OVER.
SELECT * FROM (
SELECT o.*,
RANK() OVER (PARTITION BY Customer ORDER BY Order_Amount DESC) r
FROM orders o) ranked
WHERE r = 1;
Note: If Customer has more than one order with maximum Order_Amount (i.e. ties), using RANK() function would get you all such orders; to get only first one, replace RANK() with ROW_NUMBER().
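The difference between the two functions on ties can be seen with a few invented rows. The sketch below uses an in-memory SQLite database from Python (SQLite 3.25+ assumed); the helper name top_orders is made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (Customer TEXT, Order_Amount INTEGER, Order_Date TEXT);
    INSERT INTO orders VALUES
        ('A', 100, '2020-01-01'),
        ('A', 100, '2020-02-01'),   -- ties with the row above
        ('A',  50, '2020-03-01'),
        ('B',  70, '2020-01-15');
""")

def top_orders(ranking_function):
    """Return each customer's top-ranked order(s) using RANK or ROW_NUMBER."""
    sql = f"""
        SELECT Customer, Order_Amount, Order_Date
        FROM (
            SELECT o.*,
                   {ranking_function}() OVER (PARTITION BY Customer
                                              ORDER BY Order_Amount DESC) AS r
            FROM orders o
        )
        WHERE r = 1
        ORDER BY Customer, Order_Date
    """
    return conn.execute(sql).fetchall()

with_rank = top_orders("RANK")              # both tied rows for A, plus B
with_row_number = top_orders("ROW_NUMBER")  # exactly one row per customer

print(len(with_rank))        # 3
print(len(with_row_number))  # 2
```

Which of the two tied rows ROW_NUMBER() keeps is not defined unless a tie-breaker such as Order_Date is added to the ORDER BY.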
There's no short-cut... the easiest way is probably to join to a sub-query:
SELECT
*
FROM
orders JOIN
(
SELECT Customer, MAX(Order_Amount) AS Max_Order_Amount
FROM orders
GROUP BY Customer
) maxOrder
ON maxOrder.Customer = orders.Customer
AND maxOrder.Max_Order_Amount = orders.Order_Amount
you will want to join on the same table...
SELECT o.Customer, o.order_date, o2.amt
FROM orders o,
( SELECT Customer, MAX(Order_Amount) amt FROM orders GROUP BY Customer ) o2
WHERE o.customer = o2.customer
AND o.order_amount = o2.amt
;
Another approach for the collection:
WITH tempquery AS
(
SELECT
Customer
,Order_Amount
,Order_Date
,row_number() OVER (PARTITION BY Customer ORDER BY Order_Amount DESC) AS rn
FROM
orders
)
SELECT
Customer
,Order_Amount
,Order_Date
FROM
tempquery
WHERE
rn = 1
If your DB Supports CROSS APPLY you can do this as well, but it doesn't handle ties correctly
SELECT [....]
FROM Customer c
CROSS APPLY
(SELECT TOP 1 [...]
FROM Orders o
WHERE c.customerID = o.CustomerID
ORDER BY o.Order_Amount DESC) o
See this data.SE query
You could try something like this:
SELECT Customer, MAX(Order_Amount), Order_Date
FROM orders O
WHERE ORDER_AMOUNT = (SELECT MAX(ORDER_AMOUNT) FROM orders WHERE CUSTOMER = O.CUSTOMER)
GROUP BY CUSTOMER, Order_Date
with t as
(
select Customer, Order_Date, Order_Amount,
max(Order_Amount) over (partition by Customer) as max_amount
from orders
)
select * from t where t.Order_Amount=max_amount