How to sum over selected dates from table - sql

Consider the following query, which returns each customerid and the days on which that customer bought a particular product. Each customer will clearly have different dates on which he/she bought the item. What I want is the total of all purchases made on the days when the customer bought that product.
I have the following query.
Select customerid, eventdate
into #days
from table1
where product='chocolate'
Now I want to sum all purchases made on just those days on which the customer bought 'chocolate'.
So I have
select customerid, sum(purchases) purchases
into #pur
from table1 a
where eventdate in (select eventdate from #days where customerid=a.customerid)
group by customerid
But the above was taking too long to run, so I cancelled it.
Please assist with a better query.

I'm not sure this will work, but try it out. If it doesn't, please provide sample data and expected output so that we can test against it. (customerid exists in both tables, so it has to be qualified with an alias.)
select a.customerid, sum(a.purchases) purchases
into #pur
from table1 a
inner join #days d
ON d.customerid = a.customerid
AND d.eventdate = a.eventdate
group by a.customerid
Or try this query:
select a.customerid, sum(a.purchases) purchases
into #pur
from table1 a
inner join #days d
ON d.customerid = a.customerid
WHERE d.eventdate = a.eventdate
group by a.customerid
Hope this helps

The sum of all purchases by customer, counting only the dates on which he bought chocolate.
SELECT customerid, sum(purchases) purchases
FROM table1 a
WHERE eventdate IN
(SELECT eventdate
FROM table1
WHERE product = 'chocolate'
AND customerid = a.customerid)
GROUP BY customerid;
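One thing to watch when replacing the IN with a join: if a customer has more than one chocolate row on the same day, a plain join against the day list counts that day's purchases once per chocolate row. Below is a minimal sketch of the join with a DISTINCT day list. It runs against an in-memory SQLite database through Python's sqlite3 module purely so it can be executed anywhere; the column types, the sample rows, and the function name are assumptions, not the asker's schema.

```python
import sqlite3

def purchases_on_chocolate_days(rows):
    """rows: iterable of (customerid, eventdate, product, purchases)."""
    con = sqlite3.connect(":memory:")
    # Assumed schema: only the four columns used in the thread.
    con.execute(
        "CREATE TABLE table1 (customerid INTEGER, eventdate INTEGER,"
        " product TEXT, purchases INTEGER)"
    )
    con.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT a.customerid, SUM(a.purchases) AS purchases
        FROM table1 AS a
        JOIN (SELECT DISTINCT customerid, eventdate  -- one row per customer/day
              FROM table1
              WHERE product = 'chocolate') AS d
          ON d.customerid = a.customerid
         AND d.eventdate = a.eventdate
        GROUP BY a.customerid
        ORDER BY a.customerid
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Hypothetical sample data.
sample = [
    (1, 20140401, "chocolate", 5),
    (1, 20140401, "chocolate", 5),  # second chocolate row on the same day
    (1, 20140401, "milk", 2),
    (1, 20140402, "milk", 3),       # no chocolate that day, so excluded
    (2, 20140402, "chocolate", 4),
    (2, 20140402, "bread", 1),
    (2, 20140403, "bread", 9),      # excluded
]
print(purchases_on_chocolate_days(sample))  # [(1, 12), (2, 5)]
```

Without the DISTINCT, customer 1 would come out as 24 instead of 12, because the 2014-04-01 rows would each match two day rows.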

This gives the sum of purchases for each customer and for each day on which chocolate was purchased.
select customerid, eventdate ,sum(purchases) purchases
into #pur
from table1
where product='chocolate'
group by customerid,eventdate
If you want the total purchases of chocolate per customer, do this:
select customerid,sum(purchases) purchases
into #pur
from table1
where product='chocolate'
group by customerid
As per your clarification (the subquery is correlated on customerid so that only that customer's own chocolate days count):
select customerid, sum(purchases) purchases
into #pur
from table1 a
where eventdate in (select eventdate from table1 where product='chocolate' and customerid=a.customerid)
group by customerid
I suggest adding an index on the eventdate column to improve query performance.

After thinking it through carefully, the following worked for me and is faster.
--drop table #days
select customerid, eventdate
into #days
from table1
with(nolock, index(ix_eventdate))
WHERE EVENTDATE between 20140401 and 20140430
and product='chocolate'
--drop table #pur
select customerid, eventdate, purchases
into #pur
from table1
with(nolock, index(ix_eventdate))
where eventdate between 20140401 and 20140430
--drop table #first
select a.*, b.purchases
into #first from #days a
left join #pur b
on a.customerid=b.customerid
and a.EventDate =b.EventDate
--select * from #first
--drop table #purdays
select customerid, sum(purchases) revenue into #purdays from #first
group by customerid
order by customerid
select * from #purdays

Related

Combining multiple queries

I want a table with all customers and their last charge transaction date and their last invoice date. I have the first two, but don't know how to add the last invoice date to the table. Here's what I have so far:
WITH
--Last customer transaction
cust_trans AS (
SELECT customer_id, created
FROM charges a
WHERE created = (
SELECT MAX(created) AS last_trans
FROM charges b
WHERE a.customer_id = b.customer_id)),
--All customers
all_cust AS (
SELECT customers.id AS customer, customers.email, CAST(customers.created AS DATE) AS join_date, ((1.0 * customers.account_balance)/100) AS balance
FROM customers),
--Last customer invoice
cust_inv AS (
SELECT customer_id, date
FROM invoices a
WHERE date = (
SELECT MAX(date) AS last_inv
FROM invoices b
WHERE a.customer_id = b.customer_id))
SELECT * FROM cust_trans
RIGHT JOIN all_cust ON all_cust.customer = cust_trans.customer_id
ORDER BY join_date;
This should get what you need. Notice each individual subquery is left-joined to the customer table, so you always START with the customer, and IF there is a corresponding record in each subquery for max charge date or max invoice date, it will be pulled in. Now, you may want to apply a COALESCE() for the max dates to prevent showing nulls, such as
COALESCE(maxCharges.LastChargeDate, '') AS LastChargeDate
but your call.
SELECT
c.id AS customer,
c.email,
CAST(c.created AS DATE) AS join_date,
((1.0 * c.account_balance) / 100) AS balance,
maxCharges.LastChargeDate,
maxInvoices.LastInvoiceDate
FROM
customers c
LEFT JOIN
(SELECT
customer_id,
MAX(created) LastChargeDate
FROM
charges
GROUP BY
customer_id) maxCharges ON c.id = maxCharges.customer_id
LEFT JOIN
(SELECT
customer_id,
MAX(date) LastInvoiceDate
FROM
invoices
GROUP BY
customer_id) maxInvoices ON c.id = maxInvoices.customer_id
ORDER BY
c.created
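To see why the LEFT JOINs matter, here is a small runnable sketch of the same shape. It uses an in-memory SQLite database through Python's sqlite3 module, with a trimmed-down schema and made-up rows (all assumptions, not the asker's data). A customer with no charges or no invoices still appears, with NULL (None in Python) in the missing column.

```python
import sqlite3

def last_activity(customers, charges, invoices):
    con = sqlite3.connect(":memory:")
    # Assumed, simplified schema: only the columns needed for the join.
    con.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT);
        CREATE TABLE charges (customer_id INTEGER, created TEXT);
        CREATE TABLE invoices (customer_id INTEGER, "date" TEXT);
    """)
    con.executemany("INSERT INTO customers VALUES (?, ?)", customers)
    con.executemany("INSERT INTO charges VALUES (?, ?)", charges)
    con.executemany("INSERT INTO invoices VALUES (?, ?)", invoices)
    query = """
        SELECT c.id, c.email, mc.last_charge, mi.last_invoice
        FROM customers AS c
        LEFT JOIN (SELECT customer_id, MAX(created) AS last_charge
                   FROM charges
                   GROUP BY customer_id) AS mc ON mc.customer_id = c.id
        LEFT JOIN (SELECT customer_id, MAX("date") AS last_invoice
                   FROM invoices
                   GROUP BY customer_id) AS mi ON mi.customer_id = c.id
        ORDER BY c.id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Hypothetical sample rows.
customers = [(1, "a@example.com"), (2, "b@example.com"), (3, "c@example.com")]
charges = [(1, "2020-01-05"), (1, "2020-03-01"), (2, "2020-02-10")]
invoices = [(1, "2020-03-02"), (3, "2020-01-15"), (3, "2020-01-20")]

for row in last_activity(customers, charges, invoices):
    print(row)
# (1, 'a@example.com', '2020-03-01', '2020-03-02')
# (2, 'b@example.com', '2020-02-10', None)
# (3, 'c@example.com', None, '2020-01-20')
```

Because each subquery is aggregated to one row per customer before joining, the two joins cannot multiply each other's rows.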

Tweaking a Query - looking for duplicates within a certain day range

I posted a question similar to this, and got an answer, but the answer isn't configurable - my fault I should have been more clear, so I'll try again.
I have a table where TABLENAME has the following information - OrderDate, OrderNumber, CustomerID, ProductSKU, ProductName exist. This table has lines for invoices. So an order will have a data line for every item in the order.
I want to know which customers have ordered the same item more than once, where the order is within 90 days of any other order of that same product by that customer, after a specific date. The same product within the same order number does not count. The catch is that I want "more than once" to be configurable, so that I can adjust it if I need to see 3 or more, or 4 or more, AND I want to see the counts. Here's the query I have so far, which I think gives me the items and the counts, but not the 90-day part:
EDITED: I don't think the former version gave me the right counts
SELECT customerid, productsku, productname, count(distinct ordernumber) FROM tablename
WHERE orderdate >'2017-11-01'
GROUP BY customerid, productsku, productname
HAVING COUNT(distinct ordernumber) > 2
Try this. It goes back 90 days:
declare @date date = '2017-11-01'
SELECT customerid, productsku, productname, count(distinct ordernumber) FROM tablename
WHERE orderdate >= dateadd(DD,-90,@date) and orderdate <= @date
GROUP BY customerid, productsku, productname
HAVING COUNT(distinct ordernumber) > 1
Yes, that is what I was doing in the first query. This might be a clumsy way of doing it, but without seeing any data it was hard to tell. This query also gives you the order dates. Hope it helps.
WITH DupsWithin90Days (customerid,productsku,productname,orderdate,num)
as
(
select customerid,productsku,productname,orderdate ,count(*) num from (
SELECT X.customerid, X.productsku, X.productname,X.ORDERDATE,ROW_NUMBER() OVER (partition by x.customerid,x.orderdate order by x.orderdate) rownum
FROM
(
SELECT T1.customerid, T1.productsku, T1.productname,T1.ORDERDATE
FROM TABLENAME1 T1
) X
JOIN
(
SELECT T2.customerid, T2.productsku, T2.productname,T2.ORDERDATE
FROM
TABLENAME1 T2
) Y
ON X.customerid = Y.customerid AND X.orderdate >= dateadd(DD,-90,Y.orderdate)
) dup
where rownum > 1
group by customerid,productsku,productname,orderdate
)
select customerid,productsku,productname,orderdate
from DupsWithin90Days
order by customerid ,orderdate desc
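For the configurable part of the question, here is one possible reading sketched as a runnable example: for every order, count the distinct orders of the same product by the same customer that fall in the following N days, then keep the customer/product pairs whose best window reaches the threshold. This is only an interpretation of "within 90 days", not the asker's confirmed requirement. It runs on SQLite through Python's sqlite3 (julianday is SQLite-specific; on SQL Server DATEDIFF would play that role), and the reduced column list, sample rows, and function name are assumptions.

```python
import sqlite3

QUERY = """
WITH o AS (
    -- one row per order and product, so repeated invoice lines count once
    SELECT DISTINCT customerid, productsku, ordernumber, orderdate
    FROM tablename
    WHERE orderdate > :start
),
windows AS (
    -- for each anchor order, count orders in the next :days days (itself included)
    SELECT a.customerid, a.productsku, a.ordernumber,
           COUNT(DISTINCT b.ordernumber) AS n
    FROM o AS a
    JOIN o AS b
      ON b.customerid = a.customerid
     AND b.productsku = a.productsku
     AND julianday(b.orderdate) - julianday(a.orderdate) BETWEEN 0 AND :days
    GROUP BY a.customerid, a.productsku, a.ordernumber
)
SELECT customerid, productsku, MAX(n) AS orders_in_window
FROM windows
GROUP BY customerid, productsku
HAVING MAX(n) >= :min_orders
ORDER BY customerid, productsku
"""

def repeat_buyers(rows, start, days=90, min_orders=2):
    """rows: iterable of (orderdate 'YYYY-MM-DD', ordernumber, customerid, productsku)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tablename (orderdate TEXT, ordernumber INTEGER,"
        " customerid INTEGER, productsku TEXT)"
    )
    con.executemany("INSERT INTO tablename VALUES (?, ?, ?, ?)", rows)
    params = {"start": start, "days": days, "min_orders": min_orders}
    result = con.execute(QUERY, params).fetchall()
    con.close()
    return result

# Hypothetical sample data.
orders = [
    ("2017-11-05", 100, 1, "A"),
    ("2017-11-05", 100, 1, "A"),  # same product twice in one order: counts once
    ("2017-12-20", 101, 1, "A"),
    ("2018-01-30", 102, 1, "A"),  # 86 days after order 100
    ("2017-11-10", 200, 2, "A"),
    ("2018-03-15", 201, 2, "A"),  # 125 days later: outside the window
    ("2017-10-01", 300, 3, "B"),  # before the start date
    ("2017-11-15", 301, 3, "B"),
]
print(repeat_buyers(orders, "2017-11-01", days=90, min_orders=2))  # [(1, 'A', 3)]
```

Raising min_orders to 3 still returns customer 1, and 4 returns nothing, so the threshold and the window length are both plain parameters.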

Summing a column over a date range in a CTE?

I'm trying to sum a certain column over a certain date range. The kicker is that I want this to be a CTE, because I'll have to use it multiple times as part of a larger query. Since it's a CTE, it has to have the date column as well as the sum and ID columns, meaning I have to group by date AND ID. That will cause my results to be grouped by ID and date, giving me not a single sum over the date range, but a bunch of sums, one for each day.
To make it simple, say we have:
create table orders (
id int primary key,
itemID int foreign key references items.id,
datePlaced datetime,
salesRep int foreign key references salesReps.id,
price int,
amountShipped int);
Now, we want to get the total money a given sales rep made during a fiscal year, broken down by item. That is, ignoring the fiscal year bit:
select itemName, sum(price) as totalSales, sum(totalShipped) as totalShipped
from orders
join items on items.id = orders.itemID
where orders.salesRep = '1234'
group by itemName
Simple enough. But when you add anything else, even the price, the query spits out way more rows than you wanted.
select itemName, price, sum(price) as totalSales, sum(totalShipped) as totalShipped
from orders
join items on items.id = orders.itemID
where orders.salesRep = '1234'
group by itemName, price
Now, each group is (name, price) instead of just (name). This is kind of pseudocode, but in my database, just this change causes my result set to jump from 13 to 32 rows. Add to that the date range, and you really have a problem:
select itemName, price, sum(price) as totalSales, sum(totalShipped) as totalShipped
from orders
join items on items.id = orders.itemID
where orders.salesRep = '1234'
and orderDate between 150101 and 151231
group by itemName, price
This is identical to the last example. The trouble is making it a CTE:
with totals as (
select itemName, price, sum(price) as totalSales, sum(totalShipped) as totalShipped, orderDate as startDate, orderDate as endDate
from orders
join items on items.id = orders.itemID
where orders.salesRep = '1234'
and orderDate between startDate and endDate
group by itemName, price, startDate, endDate
)
select totals_2015.itemName as itemName_2015, totals_2015.price as price_2015, ...
totals_2016.itemName as itemName_2016, ...
from (
select * from totals
where startDate = 150101 and endDate = 151231
) totals_2015
join (
select *
from totals
where startDate = 160101 and endDate = 160412
) totals_2016
on totals_2015.itemName = totals_2016.itemName
Now the grouping in the CTE is way off, more than adding the price made it. I've thought about breaking the price query into its own subquery inside the CTE, but I can't escape needing to group by the dates in order to get the date range. Can anyone see a way around this? I hope I've made things clear enough. This is running against an IBM iSeries machine. Thank you!
Depending on what you are looking for, this might be a better approach:
select 'by sales rep' breakdown
, salesRep
, '' year
, sum(price * amountShipped) amount
from etc
group by salesRep
union
select 'by sales rep and year' breakdown
, salesRep
, convert(char(4),orderDate, 120) year
, sum(price * amountShipped) amount
from etc
group by salesRep, convert(char(4),orderDate, 120)
etc
When possible, group by the id columns or foreign keys. Where those columns are already indexed (primary keys are; foreign keys are on some databases but not all), you will usually get faster results.
with cte as (
select id,rep, sum(sales) sls, count(distinct itemid) did, count(*) cnt from somewhere
where date between x and y
group by id,rep
) select * from cte order by rep
or more fancy
with cte as (
select id,rep, sum(sales) sls, count(distinct itemid) did, count(*) cnt from somewhere
where date between x and y
group by id,rep
) select * from cte join reps on cte.rep = reps.rep order by sls desc
I eventually found a solution, and it doesn't need a CTE at all. I wanted the CTE to avoid code duplication, but this works almost as well. Here's a thread explaining summing conditionally that does exactly what I was looking for.
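Since the linked thread is not reproduced here, this is a minimal sketch of what summing conditionally usually looks like: the date ranges move out of the WHERE/GROUP BY and into CASE expressions inside SUM, so the result stays at one row per item. It is run through Python's sqlite3 so it can be tried anywhere; the flattened table (itemName stored directly on orders), the integer yymmdd dates, and the sample rows are assumptions for illustration.

```python
import sqlite3

def totals_by_item(rows, sales_rep):
    """rows: iterable of (itemName, salesRep, datePlaced as yymmdd int, price)."""
    con = sqlite3.connect(":memory:")
    # Assumed, denormalized schema to keep the sketch short.
    con.execute(
        "CREATE TABLE orders (itemName TEXT, salesRep INTEGER,"
        " datePlaced INTEGER, price INTEGER)"
    )
    con.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT itemName,
               SUM(CASE WHEN datePlaced BETWEEN 150101 AND 151231
                        THEN price ELSE 0 END) AS sales_2015,
               SUM(CASE WHEN datePlaced BETWEEN 160101 AND 160412
                        THEN price ELSE 0 END) AS sales_2016
        FROM orders
        WHERE salesRep = ?
        GROUP BY itemName          -- dates are not in the GROUP BY
        ORDER BY itemName
    """
    result = con.execute(query, (sales_rep,)).fetchall()
    con.close()
    return result

# Hypothetical sample data.
rows = [
    ("widget", 1234, 150310, 10),
    ("widget", 1234, 150920, 15),
    ("widget", 1234, 160105, 20),
    ("gadget", 1234, 151101, 7),
    ("gadget", 1234, 160501, 99),    # outside both ranges
    ("widget", 9999, 150101, 1000),  # different sales rep
]
print(totals_by_item(rows, 1234))  # [('gadget', 7, 0), ('widget', 25, 20)]
```

Each period becomes a column instead of a group, which also removes the need to join the 2015 and 2016 result sets back together.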

How to do a group by without having to pass all the columns from the select?

I have the following select, whose goal is to select all customers who had no sales since the day X, and also bringing the date of the last sale and the number of the sale:
select s.customerId, s.saleId, max (s.date) from sales s
group by s.customerId, s.saleId
having max(s.date) <= '05-16-2013'
This way it brings me the following:
19 | 300 | 26/09/2005
19 | 356 | 29/09/2005
27 | 842 | 10/05/2012
In other words, the first 2 lines are from the same customer (id 19). I wish to get only one record for each customer: the record with the max date, which in this case is the second record in this list.
By that logic, I should take off s.saleId from the "group by" clause, but if I do, of course, I get the error:
Invalid expression in the select list (not contained in either an
aggregate function or the GROUP BY clause)
I'm using Firebird 1.5
How can I do this?
GROUP BY summarizes data by aggregating a group of rows, returning one row per group. You're using the aggregate function max(), which will return the maximum value from one column for a group of rows.
Let's look at some data. I renamed the column you called "date".
create table sales (
customerId integer not null,
saleId integer not null,
saledate date not null
);
insert into sales values
(1, 10, '2013-05-13'),
(1, 11, '2013-05-14'),
(1, 12, '2013-05-14'),
(1, 13, '2013-05-17'),
(2, 20, '2013-05-11'),
(2, 21, '2013-05-16'),
(2, 31, '2013-05-17'),
(2, 32, '2013-03-01'),
(3, 33, '2013-05-14'),
(3, 35, '2013-05-14');
You said
In another words, the first 2 lines are from the same customer(id 19), i wish he'd get only one record for each client, which would be the record with the max date, in the case, the second record from this list.
select s.customerId, max (s.saledate)
from sales s
where s.saledate <= '2013-05-16'
group by s.customerId
order by customerId;
customerId max
--
1 2013-05-14
2 2013-05-16
3 2013-05-14
What does that table mean? It means that the latest date on or before May 16 on which customer "1" bought something was May 14; the latest date on or before May 16 on which customer "2" bought something was May 16. If you use this derived table in joins, it will return predictable results with consistent meaning.
Now let's look at a slightly different query. MySQL permits this syntax, and returns the result set below.
select s.customerId, s.saleId, max(s.saledate) max_sale
from sales s
where s.saledate <= '2013-05-16'
group by s.customerId
order by customerId;
customerId saleId max_sale
--
1 10 2013-05-14
2 20 2013-05-16
3 33 2013-05-14
The sale with ID "10" didn't happen on May 14; it happened on May 13. This query has produced a falsehood. Joining this derived table with the table of sales transactions will compound the error.
That's why Firebird correctly raises an error. The solution is to drop saleId from the SELECT clause.
Now, having said all that, you can find the customers who have had no sales since May 16 like this.
select distinct customerId from sales
where customerID not in
(select customerId
from sales
where saledate >= '2013-05-16')
And you can get the right customerId and the "right" saleId like this. (I say "right" saleId, because there could be more than one on the day in question. I just chose the max.)
select sales.customerId, sales.saledate, max(saleId)
from sales
inner join (select customerId, max(saledate) max_date
from sales
where saledate < '2013-05-16'
group by customerId) max_dates
on sales.customerId = max_dates.customerId
and sales.saledate = max_dates.max_date
inner join (select distinct customerId
from sales
where customerID not in
(select customerId
from sales
where saledate >= '2013-05-16')) no_sales
on sales.customerId = no_sales.customerId
group by sales.customerId, sales.saledate
Personally, I find common table expressions make it easier for me to read SQL statements like that without getting lost in the SELECTs.
with no_sales as (
select distinct customerId
from sales
where customerID not in
(select customerId
from sales
where saledate >= '2013-05-16')
),
max_dates as (
select customerId, max(saledate) max_date
from sales
where saledate < '2013-05-16'
group by customerId
)
select sales.customerId, sales.saledate, max(saleId)
from sales
inner join max_dates
on sales.customerId = max_dates.customerId
and sales.saledate = max_dates.max_date
inner join no_sales
on sales.customerId = no_sales.customerId
group by sales.customerId, sales.saledate
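The same result can be sketched a little more compactly: if a customer's overall latest sale is before the cutoff, then by definition there is no sale on or after it, so the NOT IN part folds into a filter on the max date. The sketch below runs the sample data from this answer on SQLite through Python's sqlite3; it is an illustration under that assumption and has not been tried on Firebird 1.5.

```python
import sqlite3

def last_sale_before(sales, cutoff):
    """Customers with no sale on or after cutoff, with their last sale date and saleId."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE sales (customerId INTEGER NOT NULL,"
        " saleId INTEGER NOT NULL, saledate TEXT NOT NULL)"
    )
    con.executemany("INSERT INTO sales VALUES (?, ?, ?)", sales)
    query = """
        SELECT s.customerId, s.saledate, MAX(s.saleId) AS saleId
        FROM sales AS s
        JOIN (SELECT customerId, MAX(saledate) AS max_date
              FROM sales
              GROUP BY customerId) AS m
          ON m.customerId = s.customerId
         AND m.max_date = s.saledate
        WHERE m.max_date < :cutoff      -- latest sale is before the cutoff
        GROUP BY s.customerId, s.saledate
        ORDER BY s.customerId
    """
    result = con.execute(query, {"cutoff": cutoff}).fetchall()
    con.close()
    return result

# Sample rows from the answer above.
sales = [
    (1, 10, "2013-05-13"), (1, 11, "2013-05-14"), (1, 12, "2013-05-14"),
    (1, 13, "2013-05-17"), (2, 20, "2013-05-11"), (2, 21, "2013-05-16"),
    (2, 31, "2013-05-17"), (2, 32, "2013-03-01"), (3, 33, "2013-05-14"),
    (3, 35, "2013-05-14"),
]
print(last_sale_before(sales, "2013-05-16"))  # [(3, '2013-05-14', 35)]
```

Only customer 3 qualifies, since customers 1 and 2 both have sales on or after May 16. Customer 3 has two sales on the last day, and MAX(saleId) picks 35.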
Then you can use the following query.
EDIT: changes made after a comment by likeitlikeit, so that only one row per customerID is returned even when a customer has multiple saleIDs matching the condition.
select x.customerID, max(x.saleID), max(x.x_date) from (
select s.customerId, s.saleId, max (s.date) x_date from sales s
group by s.customerId, s.saleId
having max(s.date) <= '05-16-2013'
and max(s.date) = ( select max(s1.date)
from sales s1
where s1.customerId = s.customerId))x
group by x.customerID
You can try taking MAX(s.saleId) in the select list and removing s.saleId from the GROUP BY clause.
A subquery should do the job. I can't test it right now, but it seems OK:
SELECT s.customerId, s.saleId, subq.maxdate
FROM sales AS s
INNER JOIN (SELECT customerId, MAX(date) AS maxdate
FROM sales
GROUP BY customerId
HAVING MAX(date) <= '05-16-2013'
) AS subq
ON s.customerId = subq.customerId AND s.date = subq.maxdate

MySQL: Returning multiple columns from an in-line subquery

I'm creating an SQL statement that will return a month by month summary on sales.
The summary will list some simple columns for the date, total number of sales and the total value of sales.
However, in addition to these columns, i'd like to include 3 more that will list the months best customer by amount spent. For these columns, I need some kind of inline subquery that can return their ID, Name and the Amount they spent.
My current effort uses an inline SELECT statement, however, from my knowledge on how to implement these, you can only return one column and row per in-line statement.
To get around this in my scenario, I can of course create 3 separate in-line statements. However, besides seeming impractical, this increases the query time more than necessary.
SELECT
DATE_FORMAT(OrderDate,'%M %Y') AS OrderMonth,
COUNT(OrderID) AS TotalOrders,
SUM(OrderTotal) AS TotalAmount,
(SELECT SUM(OrderTotal) FROM Orders WHERE DATE_FORMAT(OrderDate,'%M %Y') = OrderMonth GROUP BY OrderCustomerFK ORDER BY SUM(OrderTotal) DESC LIMIT 1) AS TotalCustomerAmount,
(SELECT OrderCustomerFK FROM Orders WHERE DATE_FORMAT(OrderDate,'%M %Y') = OrderMonth GROUP BY OrderCustomerFK ORDER BY SUM(OrderTotal) DESC LIMIT 1) AS CustomerID,
(SELECT CustomerName FROM Orders INNER JOIN Customers ON OrderCustomerFK = CustomerID WHERE DATE_FORMAT(OrderDate,'%M %Y') = OrderMonth GROUP BY OrderCustomerFK ORDER BY SUM(OrderTotal) DESC LIMIT 1) AS CustomerName
FROM Orders
GROUP BY DATE_FORMAT(OrderDate,'%m%y')
ORDER BY DATE_FORMAT(OrderDate,'%y%m') DESC
How can i better structure this query?
FULL ANSWER
After some tweaking of Dave Barker's solution, I have a final version for anyone in the future looking for help.
The solution by Dave Barker worked perfectly for the customer details. However, it made the simpler Total Sales and Total Sale Amount columns return some crazy figures.
SELECT
Y.OrderMonth, Y.TotalOrders, Y.TotalAmount,
Z.OrdCustFK, Z.CustCompany, Z.CustOrdTotal, Z.CustSalesTotal
FROM
(SELECT
OrdDate,
DATE_FORMAT(OrdDate,'%M %Y') AS OrderMonth,
COUNT(OrderID) AS TotalOrders,
SUM(OrdGrandTotal) AS TotalAmount
FROM Orders
WHERE OrdConfirmed = 1
GROUP BY DATE_FORMAT(OrdDate,'%m%y')
ORDER BY DATE_FORMAT(OrdDate,'%Y%m') DESC)
Y INNER JOIN
(SELECT
DATE_FORMAT(OrdDate,'%M %Y') AS CustMonth,
OrdCustFK,
CustCompany,
COUNT(OrderID) AS CustOrdTotal,
SUM(OrdGrandTotal) AS CustSalesTotal
FROM Orders INNER JOIN CustomerDetails ON OrdCustFK = CustomerID
WHERE OrdConfirmed = 1
GROUP BY DATE_FORMAT(OrdDate,'%m%y'), OrdCustFK
ORDER BY SUM(OrdGrandTotal) DESC)
Z ON Z.CustMonth = Y.OrderMonth
GROUP BY DATE_FORMAT(OrdDate,'%Y%m')
ORDER BY DATE_FORMAT(OrdDate,'%Y%m') DESC
Move the inline SQL into an inner join query. So you'd have something like...
SELECT DATE_FORMAT(OrderDate,'%M %Y') AS OrderMonth, COUNT(OrderID) AS TotalOrders, SUM(OrderTotal) AS TotalAmount, Z.OrderCustomerFK, Z.CustomerName, z.OrderTotal as CustomerTotal
FROM Orders
INNER JOIN (SELECT DATE_FORMAT(OrderDate,'%M %Y') as Mon, OrderCustomerFK, CustomerName, SUM(OrderTotal) as OrderTotal
FROM Orders
GROUP BY DATE_FORMAT(OrderDate,'%M %Y'), OrderCustomerFK, CustomerName ORDER BY SUM(OrderTotal) DESC LIMIT 1) Z
ON Z.Mon = DATE_FORMAT(OrderDate,'%M %Y')
GROUP BY DATE_FORMAT(OrderDate,'%m%y'), Z.OrderCustomerFK, Z.CustomerName
ORDER BY DATE_FORMAT(OrderDate,'%y%m') DESC
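Be aware that LIMIT 1 inside a derived table keeps one row overall, not one row per month. One way to get a best customer for every month is to compute each customer's monthly total, take the maximum per month, and join back on it. The sketch below does that on SQLite through Python's sqlite3, so it uses strftime rather than MySQL's DATE_FORMAT; the CTE names, column subset, and sample rows are assumptions, and a tie for the top spot would return more than one row for that month.

```python
import sqlite3

QUERY = """
WITH per_month AS (
    SELECT strftime('%Y-%m', OrderDate) AS month,
           COUNT(OrderID) AS total_orders,
           SUM(OrderTotal) AS total_amount
    FROM Orders
    GROUP BY strftime('%Y-%m', OrderDate)
),
per_customer AS (
    SELECT strftime('%Y-%m', OrderDate) AS month,
           OrderCustomerFK AS customer_id,
           SUM(OrderTotal) AS spent
    FROM Orders
    GROUP BY strftime('%Y-%m', OrderDate), OrderCustomerFK
),
best AS (
    SELECT month, MAX(spent) AS top_spent
    FROM per_customer
    GROUP BY month
)
SELECT m.month, m.total_orders, m.total_amount,
       pc.customer_id, c.CustomerName, pc.spent
FROM per_month AS m
JOIN best AS b ON b.month = m.month
JOIN per_customer AS pc ON pc.month = b.month AND pc.spent = b.top_spent
JOIN Customers AS c ON c.CustomerID = pc.customer_id
ORDER BY m.month DESC
"""

def monthly_summary(customers, orders):
    con = sqlite3.connect(":memory:")
    # Assumed, trimmed schema based on the column names in the question.
    con.executescript("""
        CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
        CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderDate TEXT,
                             OrderCustomerFK INTEGER, OrderTotal INTEGER);
    """)
    con.executemany("INSERT INTO Customers VALUES (?, ?)", customers)
    con.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?)", orders)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result

# Hypothetical sample data.
customers = [(1, "Acme"), (2, "Bolt")]
orders = [
    (1, "2023-01-05", 1, 100),
    (2, "2023-01-20", 2, 150),
    (3, "2023-01-25", 1, 80),
    (4, "2023-02-03", 2, 60),
    (5, "2023-02-14", 1, 40),
]
for row in monthly_summary(customers, orders):
    print(row)
# ('2023-02', 2, 100, 2, 'Bolt', 60)
# ('2023-01', 3, 330, 1, 'Acme', 180)
```

The monthly totals are aggregated separately from the per-customer totals, which keeps the join from inflating TotalOrders and TotalAmount.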
You can also do something like:
SELECT
a.`y`,
( SELECT @c:=NULL ) AS `temp`,
( SELECT @d:=NULL ) AS `temp`,
( SELECT
CONCAT(@c:=b.`c`, @d:=b.`d`)
FROM `b`
ORDER BY b.`uid`
LIMIT 1 ) AS `temp`,
@c as c,
@d as d
FROM `a`
Give this a shot:
SELECT CONCAT(o.order_month, ' ', o.order_year),
o.total_orders,
o.total_amount,
x.sum_order_total,
x.ordercustomerfk,
x.customername
FROM (SELECT MONTH(t.orderdate) AS order_month,
YEAR(t.orderdate) AS order_year,
COUNT(t.orderid) AS total_orders,
SUM(t.ordertotal) AS total_amount
FROM ORDERS t
GROUP BY MONTH(t.orderdate), YEAR(t.orderdate)) o
JOIN (SELECT MONTH(t.orderdate) AS order_month,
YEAR(t.orderdate) AS order_year,
SUM(t.ordertotal) AS sum_order_total,
t.ordercustomerfk,
c.customername
FROM ORDERS t
JOIN CUSTOMERS c ON c.customerid = t.ordercustomerfk
GROUP BY t.ordercustomerfk, c.customername, MONTH(t.orderdate), YEAR(t.orderdate)) x ON x.order_month = o.order_month
AND x.order_year = o.order_year
ORDER BY o.order_year DESC, o.order_month DESC