SQL - Remove duplicates to show the latest date record

I have a view from which I ultimately want to return one row per customer.
Currently it's a SELECT as follows:
SELECT
Customerid,
MAX(purchasedate) AS purchasedate,
paymenttype,
delivery,
amount,
discountrate
FROM
Customer
GROUP BY
Customerid,
paymenttype,
delivery,
amount,
discountrate
I was hoping MAX(purchasedate) would work, but it breaks once the other columns are in the GROUP BY: discountrate is sometimes set and sometimes NULL, and paymenttype can also differ between a customer's purchases, so each distinct combination produces its own row. Is there any way to show just the last purchase each customer made?

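Since SQL Server 2008 R2 supports window functions, you can number each customer's rows from newest to oldest with ROW_NUMBER() and keep only the first: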
SELECT Customerid,
purchasedate,
paymenttype,
delivery,
amount,
discountrate
FROM
(
SELECT Customerid,
purchasedate,
paymenttype,
delivery,
amount,
discountrate,
ROW_NUMBER() OVER (Partition By CustomerID
ORDER BY purchasedate DESC) rn
FROM Customer
) derivedTable
WHERE derivedTable.rn = 1
Or, using a common table expression:
WITH derivedTable
AS
(
SELECT Customerid,
purchasedate,
paymenttype,
delivery,
amount,
discountrate,
ROW_NUMBER() OVER (Partition By CustomerID
ORDER BY purchasedate DESC) rn
FROM Customer
)
SELECT Customerid,
purchasedate,
paymenttype,
delivery,
amount,
discountrate
FROM derivedTable
WHERE derivedTable.rn = 1
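Or join to a subquery, which also works in most other DBMSs. Note that this version returns more than one row for a customer whose latest purchasedate is shared by several purchases: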
SELECT a.*
FROM Customer a
INNER JOIN
(
SELECT CustomerID, MAX(purchasedate) maxDate
FROM Customer
GROUP BY CustomerID
) b ON a.CustomerID = b.CustomerID AND
a.purchasedate = b.maxDate
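
To see how the ROW_NUMBER() and join-to-MAX() variants differ when two purchases share the latest date, here is a minimal sketch that runs both against an in-memory SQLite database from Python. The sample rows and the extra `amount DESC` tie-breaker are assumptions made for illustration, not part of the original question, and SQLite 3.25 or newer is assumed for window-function support.

```python
import sqlite3

# In-memory database with made-up sample data (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (
        Customerid INTEGER, purchasedate TEXT, paymenttype TEXT,
        delivery TEXT, amount REAL, discountrate REAL
    );
    INSERT INTO Customer VALUES
        (1, '2013-01-05', 'card', 'post',    10.0, NULL),
        (1, '2013-02-10', 'cash', 'courier', 25.0, 0.1),
        (2, '2013-03-01', 'card', 'post',    40.0, NULL),
        (2, '2013-03-01', 'cash', 'post',    15.0, 0.2);
""")

# ROW_NUMBER(): exactly one row per customer; the second ORDER BY key
# (amount DESC) is a hypothetical tie-breaker for equal dates.
row_number_sql = """
    SELECT Customerid, purchasedate, amount
    FROM (
        SELECT Customerid, purchasedate, amount,
               ROW_NUMBER() OVER (PARTITION BY Customerid
                                  ORDER BY purchasedate DESC, amount DESC) AS rn
        FROM Customer
    ) AS derivedTable
    WHERE rn = 1
    ORDER BY Customerid
"""

# Join to MAX(purchasedate): keeps every row that carries the latest date.
join_sql = """
    SELECT a.Customerid, a.purchasedate, a.amount
    FROM Customer a
    INNER JOIN (
        SELECT Customerid, MAX(purchasedate) AS maxDate
        FROM Customer
        GROUP BY Customerid
    ) b ON a.Customerid = b.Customerid AND a.purchasedate = b.maxDate
    ORDER BY a.Customerid, a.amount DESC
"""

row_number_rows = conn.execute(row_number_sql).fetchall()
join_rows = conn.execute(join_sql).fetchall()
print(row_number_rows)  # one row per customer
print(join_rows)        # customer 2 appears twice because of the tied date
```

If ties on purchasedate are possible in your data, the ROW_NUMBER() form with an explicit tie-breaker is the one that guarantees a single row per customer.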

Related

Combining multiple queries

I want a table with all customers and their last charge transaction date and their last invoice date. I have the first two, but don't know how to add the last invoice date to the table. Here's what I have so far:
WITH
--Last customer transaction
cust_trans AS (
SELECT customer_id, created
FROM charges a
WHERE created = (
SELECT MAX(created) AS last_trans
FROM charges b
WHERE a.customer_id = b.customer_id)),
--All customers
all_cust AS (
SELECT customers.id AS customer, customers.email, CAST(customers.created AS DATE) AS join_date, ((1.0 * customers.account_balance)/100) AS balance
FROM customers),
--Last customer invoice
cust_inv AS (
SELECT customer_id, date
FROM invoices a
WHERE date = (
SELECT MAX(date) AS last_inv
FROM invoices b
WHERE a.customer_id = b.customer_id))
SELECT * FROM cust_trans
RIGHT JOIN all_cust ON all_cust.customer = cust_trans.customer_id
ORDER BY join_date;
This should get what you need. Notice that each subquery is left-joined to the customers table, so you always start with the customer, and if a corresponding row exists in the max charge date or max invoice date subquery, it is pulled in. You may want to wrap the max dates in COALESCE() to avoid showing NULLs, such as
COALESCE(maxCharges.LastChargeDate, '') AS LastChargeDate
but that's your call. The fallback value must be type-compatible with the date column; some databases reject an empty string there, in which case cast the date to text first.
SELECT
c.id AS customer,
c.email,
CAST(c.created AS DATE) AS join_date,
((1.0 * c.account_balance) / 100) AS balance,
maxCharges.LastChargeDate,
maxInvoices.LastInvoiceDate
FROM
customers c
LEFT JOIN
(SELECT
customer_id,
MAX(created) LastChargeDate
FROM
charges
GROUP BY
customer_id) maxCharges ON c.id = maxCharges.customer_id
LEFT JOIN
(SELECT
customer_id,
MAX(date) LastInvoiceDate
FROM
invoices
GROUP BY
customer_id) maxInvoices ON c.id = maxInvoices.customer_id
ORDER BY
c.created
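
As a quick check of the "always start with the customer" behaviour, the sketch below runs a trimmed version of the query on SQLite from Python. The three customers and their rows are invented for illustration; customer 2 has no invoices and customer 3 has neither charges nor invoices.

```python
import sqlite3

# Made-up sample data (assumption): only customer 1 has both charges and invoices.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, email TEXT);
    CREATE TABLE charges  (customer_id INTEGER, created TEXT);
    CREATE TABLE invoices (customer_id INTEGER, date TEXT);
    INSERT INTO customers VALUES
        (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
    INSERT INTO charges VALUES
        (1, '2020-01-01'), (1, '2020-02-01'), (2, '2020-03-01');
    INSERT INTO invoices VALUES
        (1, '2020-01-15'), (1, '2020-02-15');
""")

# Each subquery collapses its table to one row per customer before the join,
# so the LEFT JOINs can never multiply customer rows.
sql = """
    SELECT c.id, maxCharges.LastChargeDate, maxInvoices.LastInvoiceDate
    FROM customers c
    LEFT JOIN (SELECT customer_id, MAX(created) AS LastChargeDate
               FROM charges GROUP BY customer_id) maxCharges
           ON c.id = maxCharges.customer_id
    LEFT JOIN (SELECT customer_id, MAX(date) AS LastInvoiceDate
               FROM invoices GROUP BY customer_id) maxInvoices
           ON c.id = maxInvoices.customer_id
    ORDER BY c.id
"""
last_dates = conn.execute(sql).fetchall()
for row in last_dates:
    print(row)  # missing dates come back as None (SQL NULL)
```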

Psql : Get Min, Max and Count records for each partner's invoices, and last payment

I have a table invoice like this :
id, partner_id, number, invoice_date
And a Payment Table like this:
id, payment_date, partner_id
I want to get min and max for both number & invoice_date, and count invoices, and last payment for each partner, something like this :
partner_id, min number, min date, max number, max date, count, last_pay
1, INV-2017-003, 02-01-2017, INV-2020-010, 01-01-2020, 142, 02-12-2019
5, INV-2019-124, 05-03-2019, INV-2020-005, 01-01-2020, 150, 01-01-2020
....
You can join the three tables, including partners, group by the partners' id column and apply the related aggregations:
select pr.id, min(invoice_date), max(invoice_date), count(*), max(payment_date) as last_pay
from partners pr
left join invoices i on i.partner_id = pr.id
left join payments p on p.partner_id = pr.id
group by pr.id
Update: You can use the min() over (), max() over () and row_number() window functions to get the codes that correspond to the max and min dates:
select *
from
(
select pr.id,
min(invoice_date) over (partition by pr.id order by invoice_date) as min_invoice_date,
max(invoice_date) over (partition by pr.id order by invoice_date desc) as max_invoice_date,
max(code) over (partition by pr.id order by invoice_date desc) as max_code,
min(code) over (partition by pr.id order by invoice_date) as min_code,
count(*) over (partition by pr.id) as cnt,
max(payment_date) over (partition by pr.id) as last_pay,
row_number() over (partition by pr.id order by invoice_date desc) as rn
from partners pr
left join invoices i on i.partner_id = pr.id
left join payments p on p.partner_id = pr.id
) q
where rn = 1
Why isn't this simple aggregation?
select i.partner_id,
min(i.number) as min_number,
min(i.invoice_date) as min_invoice_date,
max(i.number) as max_number,
max(i.invoice_date) as max_invoice_date,
count(distinct i.id) as num_invoices,
max(p.payment_date) as max_payment_date
from invoices i left join
payments p
on p.partner_id = i.partner_id
group by i.partner_id;
If you want the number on the earliest invoice (and min() doesn't work), then you can do this with a "first" aggregation function. Unfortunately, Postgres doesn't directly support one. But it does through array functions:
select i.partner_id,
(array_agg(i.number order by i.invoice_date asc))[1] as min_number,
min(i.invoice_date) as min_invoice_date,
(array_agg(i.number order by i.invoice_date desc))[1] as max_number,
max(i.invoice_date) as max_invoice_date,
count(distinct i.id) as num_invoices,
max(p.payment_date) as max_payment_date
from invoices i left join
payments p
on p.partner_id = i.partner_id
group by i.partner_id;
This is similar to @BarbarosÖzhan's answer, but calculates the min/max before the join (if you have multiple rows per partner in both invoices and payments, the COUNT will be wrong otherwise). Additionally, there's only a single PARTITION/ORDER, which should result in a more efficient plan.
SELECT i.*, p.last_pay
FROM
( -- 1st row has all the min values = filtered using row_number
SELECT
partner_id
,number AS min_code
,invoice_date AS min_invoice_date
-- value from row with max date
,Last_Value(number)
Over (PARTITION BY partner_id
ORDER BY invoice_date
ROWS BETWEEN Unbounded Preceding AND Unbounded Following) AS max_code
,Last_Value(invoice_date)
Over (PARTITION BY partner_id
ORDER BY invoice_date
ROWS BETWEEN Unbounded Preceding AND Unbounded Following) AS max_invoice_date
,Count(*)
Over (PARTITION BY partner_id) AS Cnt
,Row_Number()
Over (PARTITION BY partner_id ORDER BY invoice_date) AS rn
FROM invoices
) AS i
LEFT JOIN
( -- max payment date per partner
SELECT partner_id, Max(payment_date) AS last_pay
FROM payments
GROUP BY partner_id
) AS p
ON p.partner_id = i.partner_id
WHERE i.rn = 1
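
The point about the COUNT being wrong can be reproduced with a small sketch on SQLite from Python. The table layout follows the question (invoices and payments both keyed by partner_id); the rows themselves are invented. Joining the raw tables pairs every invoice with every payment of the same partner, while aggregating each table first keeps one row per partner on both sides.

```python
import sqlite3

# One partner with 3 invoices and 2 payments (made-up data for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (id INTEGER, partner_id INTEGER, number TEXT, invoice_date TEXT);
    CREATE TABLE payments (id INTEGER, payment_date TEXT, partner_id INTEGER);
    INSERT INTO invoices VALUES
        (1, 1, 'INV-2017-003', '2017-01-02'),
        (2, 1, 'INV-2019-050', '2019-06-01'),
        (3, 1, 'INV-2020-010', '2020-01-01');
    INSERT INTO payments VALUES
        (1, '2019-07-01', 1),
        (2, '2019-12-02', 1);
""")

# Join first, aggregate afterwards: 3 invoices x 2 payments = 6 joined rows.
joined_first = conn.execute("""
    SELECT i.partner_id, COUNT(*) AS cnt, MAX(p.payment_date) AS last_pay
    FROM invoices i
    LEFT JOIN payments p ON p.partner_id = i.partner_id
    GROUP BY i.partner_id
""").fetchall()

# Aggregate each table first, then join the one-row-per-partner results.
aggregated_first = conn.execute("""
    SELECT i.partner_id, i.cnt, p.last_pay
    FROM (SELECT partner_id, COUNT(*) AS cnt
          FROM invoices GROUP BY partner_id) AS i
    LEFT JOIN (SELECT partner_id, MAX(payment_date) AS last_pay
               FROM payments GROUP BY partner_id) AS p
           ON p.partner_id = i.partner_id
""").fetchall()

print(joined_first)      # inflated invoice count
print(aggregated_first)  # true invoice count
```

MAX(payment_date) is unaffected by the fan-out, which is why only the count looks wrong in the join-first version.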

Calculate top two performing product categories from Sales data

I am trying to build a KPI of top 2 performing product categories for each customer.
I have sales data with the following relevant columns:
customerid, product, product_category, order_qty, product_amt , order_date
I am using legacy SQL syntax in BQ.
This is a possible solution...
SELECT
customerid,
product_category,
order_qty
FROM (
SELECT
customerid,
product_category,
SUM(order_qty) AS order_qty,
ROW_NUMBER() OVER(PARTITION BY customerid ORDER BY order_qty DESC) AS rn
FROM
[project:dataset.table]
GROUP BY
1, 2
)
WHERE
rn <= 2
ORDER BY
1, 3 DESC
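
The same top-2 idea can be tried out locally with a sketch on SQLite from Python. This is not BigQuery legacy SQL: the `sales` table name and its rows are assumptions for illustration, and the aggregation is done in its own subquery before ranking so the window function only ever sees one row per customer and category.

```python
import sqlite3

# Made-up sales rows (assumption); customer 1 buys in three categories.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (customerid INTEGER, product_category TEXT, order_qty INTEGER);
    INSERT INTO sales VALUES
        (1, 'books', 5), (1, 'books', 3), (1, 'toys', 4),
        (1, 'games', 1), (2, 'toys', 2);
""")

sql = """
    SELECT customerid, product_category, total_qty
    FROM (
        -- rank each customer's categories by total quantity
        SELECT customerid, product_category, total_qty,
               ROW_NUMBER() OVER (PARTITION BY customerid
                                  ORDER BY total_qty DESC) AS rn
        FROM (
            -- one row per customer and category
            SELECT customerid, product_category, SUM(order_qty) AS total_qty
            FROM sales
            GROUP BY customerid, product_category
        ) AS per_category
    ) AS ranked
    WHERE rn <= 2
    ORDER BY customerid, total_qty DESC
"""
top_two = conn.execute(sql).fetchall()
for row in top_two:
    print(row)
```

Ranking by SUM(product_amt) instead of quantity would only require changing the aggregated column.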

How to get current and last order from Northwind db using Correlated queries

Using the Northwind database on SQL Server, I am trying to retrieve each customer's last two order dates and calculate the time between the two orders.
So something like:
select c.CompanyName, o.OrderDate, o2.OrderDate,
DateDiff(d, o.OrderDate, o2.OrderDate) as TimeElapsed
Unfortunately, I'm not sure how to construct it from there.
I have something like this, but it's still wrong:
select c.CompanyName, o.OrderDate, o2.OrderDate,
DateDiff(d, o.OrderDate, o2.OrderDate) as TimeElapsed
from Orders o
INNER JOIN Customers ON c.CustomerID = o.CustomerID
INNER JOIN (
select OrderID, OrderDate
FROM Orders
order by OrderDate
OFFSET 1 ROWS
FETCH NEXT 1 ROW ONLY
) as o2 ON o.OrderID = o2.OrderID;
Can anyone assist?
Thank you
Northwind has been obsolete for years; even AdventureWorks has been replaced. The following uses AdventureWorks, but you should be able to translate it to your schema easily. The first query just previews the numbered rows, the next two show two different approaches, and the last two SELECT statements are used to verify the results. Notice that customer 30099 has only one order.
set nocount on;
with cte as (select SalesOrderID, OrderDate, CustomerID, row_number () over (partition by CustomerID order by OrderDate desc) as rn
from Sales.SalesOrderHeader)
select top 10 * from cte
where rn <= 2
order by CustomerID, rn;
with cte as (select SalesOrderID, OrderDate, CustomerID, row_number () over (partition by CustomerID order by OrderDate desc) as rn
from Sales.SalesOrderHeader)
select cte.CustomerID, min(cte.OrderDate) as mindate, max(cte.OrderDate) as maxdate,
case when min(cte.OrderDate) = max(cte.OrderDate) then cast(null as int)
else datediff(day, min(cte.OrderDate), max(cte.OrderDate)) end as dif
from cte
where rn <= 2
group by cte.CustomerID
order by CustomerID;
with cte as (select SalesOrderID, OrderDate, CustomerID, row_number () over (partition by CustomerID order by OrderDate desc) as rn
from Sales.SalesOrderHeader)
select cte.CustomerID, minr.OrderDate as mindate, cte.OrderDate as maxdate,
datediff(day, minr.OrderDate, cte.OrderDate) as dif
from cte left join cte as minr on cte.CustomerID = minr.CustomerID and minr.rn = 2
where cte.rn = 1
order by cte.CustomerID;
select top 2 CustomerID, OrderDate from Sales.SalesOrderHeader where CustomerID = 30118 order by OrderDate desc;
select top 2 CustomerID, OrderDate from Sales.SalesOrderHeader where CustomerID = 30099 order by OrderDate desc;
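
The self-join approach can be checked without an AdventureWorks installation using a sketch on SQLite from Python. The `orders` table, customer ids and dates are invented for illustration, and since SQLite has no DATEDIFF, the day difference is computed with its `julianday()` function instead.

```python
import sqlite3

# Made-up orders (assumption): customer 1 has three orders, customer 2 only one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (CustomerID INTEGER, OrderDate TEXT);
    INSERT INTO orders VALUES
        (1, '2014-01-10'), (1, '2014-03-01'), (1, '2013-11-20'),
        (2, '2014-02-05');
""")

sql = """
    WITH cte AS (
        SELECT CustomerID, OrderDate,
               ROW_NUMBER() OVER (PARTITION BY CustomerID
                                  ORDER BY OrderDate DESC) AS rn
        FROM orders
    )
    SELECT cur.CustomerID,
           prev.OrderDate AS previous_order,
           cur.OrderDate  AS last_order,
           -- julianday() stands in for SQL Server's DATEDIFF(day, ...)
           CAST(julianday(cur.OrderDate) - julianday(prev.OrderDate) AS INTEGER) AS dif
    FROM cte AS cur
    LEFT JOIN cte AS prev
           ON prev.CustomerID = cur.CustomerID AND prev.rn = 2
    WHERE cur.rn = 1
    ORDER BY cur.CustomerID
"""
last_two = conn.execute(sql).fetchall()
for row in last_two:
    print(row)  # a customer with a single order gets None for previous_order and dif
```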

How to sort by Total Sum calculated field in a Tablix

I have a Report in Microsoft Visual Studio 2010 that has a tablix. I have a list of Customers Sales grouped by Month. I would like to add a grand total of all the Months for each customer. I would then like to sort by descending amount of the grand total. I have added the grand total, but I can't figure out how to sort on it. Any suggestions?
Here is the initial dataset query:
SELECT
Customer, CustomerName, FiscalMonthNum, FiscalYear, SalesDlr
FROM
CustomerSalesDollars
WHERE
FiscalYear IN ('2013')
ORDER BY
SalesDlr DESC
One option is to compute each customer's yearly total in the dataset query and sort on it:
with CSD as (
select Customer, CustomerName, FiscalMonthNum, FiscalYear, SalesDlr
from CustomerSalesDollars
WHERE FiscalYear in ('2013')
), YearlyTotals as (
select FiscalYear, Customer, CustomerName, SUM(SalesDlr) as YearlyTotal
from CSD
group by FiscalYear, Customer, CustomerName
)
select * from YearlyTotals
order by YearlyTotal desc
If you still want all the monthly breakdowns:
with CSD as (
select Customer, CustomerName, FiscalMonthNum, FiscalYear, SalesDlr
from CustomerSalesDollars
WHERE FiscalYear in ('2013')
), YearlyTotals as (
select FiscalYear, Customer, CustomerName, SUM(SalesDlr) as YearlyTotal
from CSD
group by FiscalYear, Customer, CustomerName
)
select CSD.*, YT.YearlyTotal from YearlyTotals YT
join CSD on CSD.FiscalYear = YT.FiscalYear
and CSD.Customer = YT.Customer
and CSD.CustomerName = YT.CustomerName
order by YearlyTotal desc, CSD.SalesDlr desc
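
As an alternative to joining the monthly rows back to the totals, a windowed SUM() can attach the yearly total to every monthly row in a single pass. The sketch below is not the answer's CTE query: it runs on SQLite from Python, leaves out the CustomerName column for brevity, and uses invented rows.

```python
import sqlite3

# Made-up monthly sales (assumption); the 2012 row must be filtered out.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CustomerSalesDollars (
        Customer TEXT, FiscalMonthNum INTEGER, FiscalYear TEXT, SalesDlr REAL
    );
    INSERT INTO CustomerSalesDollars VALUES
        ('A', 1, '2013', 100), ('A', 2, '2013', 50),
        ('B', 1, '2013', 120), ('B', 2, '2013', 200),
        ('B', 1, '2012', 999);
""")

# The window is evaluated after the WHERE clause, so only 2013 rows are summed.
sql = """
    SELECT Customer, FiscalMonthNum, SalesDlr,
           SUM(SalesDlr) OVER (PARTITION BY FiscalYear, Customer) AS YearlyTotal
    FROM CustomerSalesDollars
    WHERE FiscalYear IN ('2013')
    ORDER BY YearlyTotal DESC, SalesDlr DESC
"""
monthly_with_total = conn.execute(sql).fetchall()
for row in monthly_with_total:
    print(row)
```

With YearlyTotal available as a dataset field, the tablix group can then be sorted on that field in descending order.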