How do you find the average datediff by customer? - sql

I have the following data:
CustomerID OrderDate
1 2011-02-16
1 2011-04-20
2 2011-04-25
2 2011-10-24
2 2011-11-14
How do I find the average DATEDIFF for each customer? The results I want would be the CustomerID and the average difference of dates between their orders. I really appreciate the help. This has had me stuck for months. Thank you in advance.
Additional notes:
I cannot use the lag function because of the server I am using.

In SQL Server 2012 you can partition the LAG by customer, changing Binaya Regmi's query to:
with cte as (
SELECT CustomerID, OrderDate,
LAG(OrderDate) OVER (PARTITION BY CustomerID
ORDER BY CustomerID, OrderDate) AS PrevDate
FROM T)
Select customerid, avg(datediff(d, prevdate, orderdate)) average
From cte
Group By customerid
A non-optimized query that uses mostly standard SQL (as the requester has not stated the RDBMS) is:
SELECT customerid, avg(datediff(d,prevdate, OrderDate)) average
FROM (SELECT ext.customerid, ext.OrderDate, max(prevdate) prevdate
FROM orders ext
INNER JOIN (SELECT customerid, orderdate prevdate
FROM orders) sub
ON ext.customerid = sub.customerid
AND ext.OrderDate > sub.prevdate
GROUP BY ext.customerid, orderdate) a
GROUP BY customerid
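The self-join approach above can be checked against the sample data from the question. The following is only a sketch: it assumes SQLite (through Python's sqlite3 module) as a stand-in engine, a table named orders, and julianday() subtraction in place of DATEDIFF, none of which come from the original posts.

```python
import sqlite3

# Sample rows copied from the question.
SAMPLE_ORDERS = [
    (1, "2011-02-16"),
    (1, "2011-04-20"),
    (2, "2011-04-25"),
    (2, "2011-10-24"),
    (2, "2011-11-14"),
]

# Same shape as the self-join query: for each order, find the latest
# earlier order of the same customer, then average the gaps in days.
# julianday() is SQLite's substitute for DATEDIFF (an assumption here).
AVG_GAP_SQL = """
SELECT customerid,
       AVG(julianday(orderdate) - julianday(prevdate)) AS average
FROM (SELECT ext.customerid AS customerid,
             ext.orderdate AS orderdate,
             MAX(sub.prevdate) AS prevdate
      FROM orders ext
      INNER JOIN (SELECT customerid, orderdate AS prevdate
                  FROM orders) sub
              ON ext.customerid = sub.customerid
             AND ext.orderdate > sub.prevdate
      GROUP BY ext.customerid, ext.orderdate) a
GROUP BY customerid
ORDER BY customerid
"""


def average_gap_days(rows):
    """Return (customerid, average gap in days) for each customer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customerid INTEGER, orderdate TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    result = conn.execute(AVG_GAP_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(average_gap_days(SAMPLE_ORDERS))
```

On the sample data this gives 63 days for customer 1 and 101.5 days for customer 2 (gaps of 182 and 21 days). In SQL Server, AVG over the integer result of DATEDIFF is itself an integer, so the second value would come back as 101 unless the difference is cast to a decimal type first.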

Assuming T is your table and you want the average difference between dates in days, the following is the code for SQL Server 2012:
with cte as (
SELECT CustomerID, OrderDate,
LAG(OrderDate) OVER (PARTITION BY CustomerID
ORDER BY CustomerID, OrderDate) AS PrevDate
FROM T)
select customerid, avg(datediff(d, prevdate, orderdate )) as AvgDay
from cte group by customerid;

Related

Find the max of a sum

I need some help in using the sum and max functions in SQL.
I want to display for each year, the month with the highest sales.
I have 2 tables
sales.orderline:
orderno - prodno - quantity - price - linetotal
sales.custorder:
orderno - custno - salesrep - orderdate
This is what I have:
select year(orderdate) as year, month(orderdate) as month, sum(linetotal) as sales
from sales.custorder
inner join sales.orderline on sales.custorder.orderno = sales.orderline.orderno
where year(orderdate) is not null and month(orderdate) is not null
group by month(orderdate), year(orderdate)
My problem is that this shows the total for each month of the year and I don't know how to select only the month with the highest total for each year. My only idea was max(sum()) which doesn't work.
You can use window functions, if your database supports them:
select *
from (
select
year(orderdate) as yr,
month(orderdate) as mn,
sum(linetotal) as sales,
rank() over(partition by year(orderdate) order by sum(linetotal) desc) rn
from sales.custorder
inner join sales.orderline on sales.custorder.orderno = sales.orderline.orderno
where year(orderdate) is not null and month(orderdate) is not null
group by month(orderdate), year(orderdate)
) t
where rn = 1
order by yr
Note that rank() allows top ties, if any.
Unrelated: condition year(orderdate) is not null and month(orderdate) is not null can probably be simplified as orderdate is not null.
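The behaviour of rank() with ties can be seen on a small data set. This is a sketch under assumptions: it uses SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), unqualified table names custorder and orderline, strftime() in place of year() and month(), made-up rows, and a CTE for the monthly totals instead of ranking directly over sum(linetotal).

```python
import sqlite3


def top_months_per_year(orders, lines):
    """Return (year, month, sales) for the best month(s) of each year."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE custorder (orderno INTEGER, orderdate TEXT)")
    conn.execute("CREATE TABLE orderline (orderno INTEGER, linetotal INTEGER)")
    conn.executemany("INSERT INTO custorder VALUES (?, ?)", orders)
    conn.executemany("INSERT INTO orderline VALUES (?, ?)", lines)
    rows = conn.execute("""
        WITH monthly AS (
            -- one row per (year, month) with its total sales
            SELECT strftime('%Y', o.orderdate) AS yr,
                   strftime('%m', o.orderdate) AS mn,
                   SUM(l.linetotal) AS sales
            FROM custorder o
            INNER JOIN orderline l ON o.orderno = l.orderno
            WHERE o.orderdate IS NOT NULL
            GROUP BY yr, mn
        )
        SELECT yr, mn, sales
        FROM (SELECT yr, mn, sales,
                     RANK() OVER (PARTITION BY yr ORDER BY sales DESC) AS rn
              FROM monthly) t
        WHERE rn = 1
        ORDER BY yr, mn
    """).fetchall()
    conn.close()
    return rows


# Hypothetical data: 2020 has a clear winner, 2021 has a tie.
ORDERS = [(1, "2020-01-10"), (2, "2020-01-20"), (3, "2020-02-05"),
          (4, "2021-03-01"), (5, "2021-04-01"), (6, None)]
LINES = [(1, 100), (2, 50), (3, 120), (4, 70), (5, 70), (6, 999)]

if __name__ == "__main__":
    for row in top_months_per_year(ORDERS, LINES):
        print(row)
```

With this data, 2020 yields January only, while 2021 yields both March and April because rank() assigns 1 to every tied row. The order with a NULL date is dropped by the single orderdate IS NOT NULL condition.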
You can use row_number(). If two months can have the same sales in a year and you want to return both, use dense_rank() instead.
select
year,
month,
sales
from
(
select
year(orderdate) as year,
month(orderdate) as month,
sum(linetotal) as sales,
row_number() over (partition by year(orderdate) order by sum(linetotal) desc) as rnk
from sales.custorder sc
inner join sales.orderline so
on sc.orderno = so.orderno
where year(orderdate) is not null
and month(orderdate) is not null
group by
month(orderdate),
year(orderdate)
) val
where rnk = 1
order by
year,
month

How to summarize information over the dynamic period in sql?

I have a table with orders and the following fields:
create table orders2 (
orderID int,
customerID int,
date DateTime,
amount int)
engine=Memory;
Each customer can make 0 or many orders each day. I need to create an SQL query that will show for each customer how many orders he/she made during the period of 3 days starting from the day when the customer has made his/her first order.
So, for each customer, the query should detect the date of the first order, then compute the date that is 3 days in the future from the first date, then filter rows to take only orders with dates in the given range, and then perform counting of orders (orderID) in that time period. At the moment, I was able to just detect the date of the first order for each customer.
SELECT
O.customerID,
O.date AS first_day,
COUNT(O.orderID) AS first_day_orders_num,
SUM(O.amount) AS first_day_amount
FROM orders2 AS O
INNER JOIN
(
SELECT
customerID,
MIN(date) AS first_date
FROM orders2
GROUP BY customerID
) AS I ON (O.customerID = I.customerID) AND (O.date = I.first_date)
GROUP BY
O.customerID,
O.date
I don't really understand what result you need. It can probably be solved using arrays.
Here is a solution using vanilla SQL:
select customerID, min(first_date), sum(num_orders_per_day)
from (
select customerID, date, min(date) first_date, count() num_orders_per_day
from orders2
group by customerID, date
having date <= first_date + interval 3 days
)
group by customerID
You can use window functions to get the first order date:
select o.CustomerID, count(*) as num_orders_3_days
from (select o.*, min(date) over (partition by CustomerID) as min_date
from orders2 o
) o
where date < min_date + interval '3 day'
group by CustomerID;
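The window-function idea can be tried on a few rows to see where the 3-day cutoff falls. This is a sketch under assumptions: SQLite (3.25 or later) through Python's sqlite3 module stands in for the original engine, datetime(min_date, '+3 days') replaces the interval arithmetic, dates are stored as 'YYYY-MM-DD HH:MM:SS' text, and the rows are invented.

```python
import sqlite3


def orders_in_first_3_days(rows):
    """Count each customer's orders placed less than 3 days after the first one."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders2 "
        "(orderID INTEGER, customerID INTEGER, date TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO orders2 VALUES (?, ?, ?, ?)", rows)
    result = conn.execute("""
        SELECT customerID, COUNT(*) AS num_orders_3_days
        FROM (SELECT o.*,
                     -- first order timestamp of the customer, on every row
                     MIN(date) OVER (PARTITION BY customerID) AS min_date
              FROM orders2 o) o
        WHERE date < datetime(min_date, '+3 days')
        GROUP BY customerID
        ORDER BY customerID
    """).fetchall()
    conn.close()
    return result


# Hypothetical rows: customer 1 has one order exactly at the cutoff
# and one well after it; customer 2 has a single order.
ROWS = [
    (1, 1, "2021-01-01 10:00:00", 5),
    (2, 1, "2021-01-02 09:00:00", 7),
    (3, 1, "2021-01-04 09:59:59", 2),
    (4, 1, "2021-01-04 10:00:00", 9),
    (5, 1, "2021-01-10 12:00:00", 1),
    (6, 2, "2021-02-01 00:00:00", 4),
]

if __name__ == "__main__":
    print(orders_in_first_3_days(ROWS))
```

Because the comparison is strict, the order placed exactly 72 hours after the first one is not counted, so customer 1 gets 3 and customer 2 gets 1. If the period should cover whole calendar days instead, the cutoff would need to be computed from the date part of the first order.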
Try this query:
SELECT customerID, orders_count
FROM (
SELECT customerID,
arraySort(x -> x.1, groupArray((date, orderID))) sorted_date_per_order_pairs,
sorted_date_per_order_pairs[1].1 + INTERVAL 3 day AS end_date,
arrayFilter(x -> x.1 < end_date, sorted_date_per_order_pairs) orders_in_period,
length(orders_in_period) orders_count
FROM orders2
GROUP BY customerID);

Issues with grouping this SQL query

So I'm trying to find the month/year that had the highest number of sale transactions.
My query currently is:
SELECT
DATENAME(M, OrderDate) as orderMonth,
year(OrderDate) as orderYear,
count(SalesOrderID) as orderCount
FROM
Sales.SalesOrderHeader soh
GROUP BY OrderDate
HAVING SUM(soh.SalesOrderID) >= ALL (
SELECT SUM(SalesOrderID) FROM Sales.SalesOrderHeader
GROUP BY OrderDate
)
However, if I run everything above the HAVING line, so that it returns all rows instead of just the highest one, it returns several duplicates of the months/years and their orderCounts. For example, June 2011 comes back as about 30 rows, each with an orderCount somewhere between 2 and 11. In total the query returns 1124 rows, where it should return only 38, since the sales range from 2011 to 2014 and there are 38 months in that range.
I'm pretty sure I need to specify a monthly group and should change my GROUP BYs to something like:
GROUP BY DATENAME(month, soh.OrderDate), DATENAME(YYYY, soh.OrderDate)
but then I get the error "Each GROUP BY expression must contain at least one column that is not an outer reference".
Your problem is that you are aggregating by OrderDate rather than by the month and year. So, your version of the query should look like:
SELECT DATENAME(MONTH, OrderDate) as orderMonth,
YEAR(OrderDate) as orderYear,
COUNT(*) as orderCount
FROM Sales.SalesOrderHeader soh
GROUP BY DATENAME(MONTH, OrderDate), YEAR(OrderDate)
HAVING COUNT(*) >= ALL (SELECT COUNT(*)
FROM Sales.SalesOrderHeader soh2
GROUP BY DATENAME(MONTH, OrderDate), YEAR(OrderDate)
);
However, no one would really write the query like that. It is simpler and more performant to use TOP and ORDER BY. The equivalent of your query is:
SELECT TOP (1) WITH TIES DATENAME(MONTH, OrderDate) as orderMonth,
YEAR(OrderDate) as orderYear,
COUNT(*) as orderCount
FROM Sales.SalesOrderHeader soh
GROUP BY DATENAME(MONTH, OrderDate), YEAR(OrderDate)
ORDER BY orderCount DESC;
Both these return all months with the maximum value -- if there are duplicates. If you want to guarantee only one row in the result set, use SELECT TOP (1) rather than SELECT TOP (1) WITH TIES.
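The effect of grouping by the date versus grouping by month and year, and the tie behaviour, can be sketched on a handful of rows. Assumptions: SQLite through Python's sqlite3 module, an unqualified SalesOrderHeader table, strftime() in place of DATENAME and YEAR (so months come back as numbers), invented dates, and a comparison against the maximum count in place of TOP (1) WITH TIES, which SQLite does not have.

```python
import sqlite3


def _load(dates):
    """Create an in-memory SalesOrderHeader table with one row per date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE SalesOrderHeader "
        "(SalesOrderID INTEGER PRIMARY KEY, OrderDate TEXT)"
    )
    conn.executemany(
        "INSERT INTO SalesOrderHeader (OrderDate) VALUES (?)",
        [(d,) for d in dates],
    )
    return conn


def group_counts(dates):
    """Return (groups when grouping by date, groups when grouping by month/year)."""
    conn = _load(dates)
    per_date = conn.execute(
        "SELECT COUNT(*) FROM "
        "(SELECT 1 FROM SalesOrderHeader GROUP BY OrderDate)"
    ).fetchone()[0]
    per_month = conn.execute(
        "SELECT COUNT(*) FROM "
        "(SELECT 1 FROM SalesOrderHeader "
        " GROUP BY strftime('%Y', OrderDate), strftime('%m', OrderDate))"
    ).fetchone()[0]
    conn.close()
    return per_date, per_month


def busiest_months(dates):
    """Return every (year, month, count) that reaches the maximum count."""
    conn = _load(dates)
    rows = conn.execute("""
        WITH monthly AS (
            SELECT strftime('%Y', OrderDate) AS orderYear,
                   strftime('%m', OrderDate) AS orderMonth,
                   COUNT(*) AS orderCount
            FROM SalesOrderHeader
            GROUP BY orderYear, orderMonth
        )
        SELECT orderYear, orderMonth, orderCount
        FROM monthly
        WHERE orderCount = (SELECT MAX(orderCount) FROM monthly)
        ORDER BY orderYear, orderMonth
    """).fetchall()
    conn.close()
    return rows


# Hypothetical order dates: June 2011 and June 2012 tie with 3 orders each.
DATES = ["2011-06-01", "2011-06-01", "2011-06-15", "2011-07-03",
         "2012-06-02", "2012-06-09", "2012-06-30"]

if __name__ == "__main__":
    print(group_counts(DATES))
    print(busiest_months(DATES))
```

Grouping by the date produces 6 groups here, while grouping by month and year produces 3, which is the same effect as the 1124 versus 38 rows described in the question. Both June 2011 and June 2012 are returned because they tie at 3 orders.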
Not sure which sql syntax you are using, but you could just sort by transactions and select the highest record?
SELECT top 1
DATENAME(M, OrderDate) as orderMonth,
year(OrderDate) as orderYear,
count(SalesOrderID) as orderCount
FROM
Sales.SalesOrderHeader soh
Group by DATENAME(M, OrderDate), year(OrderDate)
order by orderCount desc

Query on MAX on date column, and COUNT of another column

I performed the following query with cte's, but I was wondering if there was a simpler way of writing the code, maybe with subqueries? I'm retrieving everything from one table SALES, but I'm using 3 columns: AgentID, SaleDate, and OrderID.
WITH RECENT_SALE AS (
SELECT AGENTID,
SALEDATE,
ROW_NUMBER() OVER (PARTITION BY AGENTID ORDER BY SALEDATE DESC) AS RN
FROM SALES
)
,
COUNT_SALE AS (
SELECT AGENTID,
COUNT(ORDERID) AS COUNTORDERS
FROM SALES
GROUP BY AGENTID
)
SELECT RECENT_SALE.AGENTID,
SALEDATE,
COUNTORDERS
FROM RECENT_SALE
INNER JOIN COUNT_SALE ON RECENT_SALE.AGENTID = COUNT_SALE.AGENTID
WHERE RN = 1;
It looks to me like you're just trying to get the total number of sales per agent as well as the date of his or her most recent sale? If I understand your structure correctly (and I may not), then it seems pretty straightforward. I'm guessing orderid is the primary key of SALES?
SELECT agentid, MAX(saledate) AS saledate -- Most recent sale date
, COUNT(orderid) AS countsales -- total sales
FROM sales
GROUP BY agentid;
There does not seem to be any need for CTEs or subqueries here.
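The single GROUP BY can be tried on a small table to confirm it returns one row per agent. This is a sketch that assumes SQLite through Python's sqlite3 module, dates stored as ISO text, and made-up rows.

```python
import sqlite3


def agent_summary(rows):
    """Return (agentid, most recent sale date, number of sales) per agent."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales "
        "(orderid INTEGER PRIMARY KEY, agentid INTEGER, saledate TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    result = conn.execute("""
        SELECT agentid,
               MAX(saledate) AS saledate,    -- most recent sale date
               COUNT(orderid) AS countsales  -- total sales
        FROM sales
        GROUP BY agentid
        ORDER BY agentid
    """).fetchall()
    conn.close()
    return result


# Hypothetical rows: agent 1 has three sales, agent 2 has one.
SALES = [
    (1, 1, "2021-01-05"),
    (2, 1, "2021-03-05"),
    (3, 1, "2021-02-11"),
    (4, 2, "2021-01-20"),
]

if __name__ == "__main__":
    print(agent_summary(SALES))
```

Agent 1 comes back with 2021-03-05 and a count of 3, agent 2 with 2021-01-20 and a count of 1. MAX works on the text column here only because ISO dates sort chronologically as strings.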
Try this:
SELECT
saledate,
AGENTID,
count(orderid) over(partition by AGENTID order by saledate)
FROM SALES
group by
saledate,
AGENTID

How do you group the SUM of rows by year in SQL

SELECT YEAR(OrderDate) 'Year', SUM(TotalDue)
FROM Sales
GROUP BY OrderDate
Order BY OrderDate
How do I add each year together as ONE row?
I wrote the query above, but the result still has the TotalDue by Year as individual rows.
You have a problem in the GROUP BY clause because it groups by OrderDate rather than by the year expression you select:
SELECT YEAR(OrderDate) theYear, SUM(TotalDue) TotalDue
FROM Sales
GROUP BY theYear --Here you can either order by TotalDue or by theYear,
-- otherwise you will get the errors you mentioned
You should GROUP BY Year, not OrderDate:
SELECT YEAR(OrderDate), SUM(TotalDue)
FROM Sales
GROUP BY YEAR(OrderDate)
Order BY YEAR(OrderDate)
You should group by the year of the order date and not the OrderDate itself.
SELECT YEAR(OrderDate) AS `Year`, SUM(TotalDue)
FROM Sales
GROUP BY YEAR(OrderDate)
Order BY Year
SELECT YEAR(OrderDate) 'Year', SUM(TotalDue)
FROM Sales
GROUP BY YEAR(OrderDate)
You need to re-group results
SELECT y, Sum(s) from
(
SELECT YEAR(OrderDate) as y , SUM(TotalDue) as s
FROM Sales
GROUP BY OrderDate
) t
group by y
Order BY y
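The difference between grouping by OrderDate and grouping by its year can be seen on three rows. This is a sketch under assumptions: SQLite through Python's sqlite3 module, strftime('%Y', ...) in place of YEAR(), and invented amounts.

```python
import sqlite3

# Grouping by the full date keeps one row per distinct date.
BY_DATE_SQL = """
SELECT strftime('%Y', OrderDate) AS yr, SUM(TotalDue)
FROM Sales
GROUP BY OrderDate
ORDER BY OrderDate
"""

# Grouping by the year expression collapses each year into one row.
BY_YEAR_SQL = """
SELECT strftime('%Y', OrderDate) AS yr, SUM(TotalDue)
FROM Sales
GROUP BY strftime('%Y', OrderDate)
ORDER BY yr
"""


def run(sql, rows):
    """Load rows into an in-memory Sales table and run one of the queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Sales (OrderDate TEXT, TotalDue INTEGER)")
    conn.executemany("INSERT INTO Sales VALUES (?, ?)", rows)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


# Hypothetical rows: two sales in 2019, one in 2020.
SALES_ROWS = [("2019-01-15", 100), ("2019-06-30", 50), ("2020-02-01", 70)]

if __name__ == "__main__":
    print(run(BY_DATE_SQL, SALES_ROWS))
    print(run(BY_YEAR_SQL, SALES_ROWS))
```

The first query still returns two separate 2019 rows (100 and 50), which is the behaviour described in the question. The second returns a single 2019 row with 150 and a 2020 row with 70.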