SQL: How to return revenue for a specific year

I would like to show the revenue for a specific year for all customers, regardless of whether they have revenue data for that year. (Where a customer has no data for the year, a filler like 'no data' would work.)
Sample Data looks like:
Table 1
Customer | Price | Quantity | Order Date
xxx | 12 | 5 | 1990/03/25
yyy | 15 | 7 | 1991/05/35
xxx | 34 | 2 | 1990/08/21
Desired Output would look a little something like this:
Customer | Revenue (for 1990)
xxx | 128
yyy | no data
Getting the total revenue for each customer would be:
SELECT Customer,
SUM(quantity*price) AS Revenue
FROM Table1
GROUP BY Customer
But how would I go about listing it for a specific year for all customers (including customers that don't have data for that year)?

We can use a CTE or a sub-query to create a list of all customers and another to get all years, then cross join them and left join the result onto revenue.
This gives a row for each customer for each year. If you add a where clause on y, you will only get the year requested.
CREATE TABLE revenue(
Customer varchar(10),
Price int,
Quantity int,
OrderDate date);
insert into revenue values
('xxx', 12,5,'2021-03-25'),
('yyy', 15,7,'2021-05-15'),
('xxx', 34,2,'2022-08-21');
with cust as
(select distinct customer c from revenue),
years as
(select distinct year(OrderDate) y from revenue)
select
y "year",
c customer ,
sum(price*quantity) revenue
from years
cross join cust
left join revenue r
on cust.c = r.customer and years.y = year(OrderDate)
group by
c,y,
year(OrderDate)
order by y,c
year | customer | revenue
---: | :------- | ------:
2021 | xxx | 60
2021 | yyy | 105
2022 | xxx | 68
2022 | yyy | null

You can do the sum in a grouped subquery and left join it to your customers table, i.e.:
select customers.Name, totals.Revenue
from Customers
left join
( select customerId, sum(quantity*price) as revenue
from myTable
where year(orderDate) = 1990
group by customerId) totals on customers.CustomerId = totals.customerId;
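A minimal sketch of the left-join idea, run against SQLite through Python's sqlite3 module rather than the engines used in the answers above. The table name revenue, the snake_case column names, the customer list derived with DISTINCT (the question has no separate customers table) and the date 1991-05-15 for yyy are assumptions for illustration. SQLite has no year() function, so strftime('%Y', ...) stands in for it.

```python
import sqlite3


def revenue_for_year(conn, year):
    """Return one (customer, revenue) row per customer; 'no data' when the year has no orders."""
    sql = """
        SELECT c.customer,
               COALESCE(CAST(t.revenue AS TEXT), 'no data') AS revenue
        FROM (SELECT DISTINCT customer FROM revenue) AS c
        LEFT JOIN (
            -- Filter on the year inside the subquery so the outer
            -- left join still keeps customers without matching rows.
            SELECT customer, SUM(quantity * price) AS revenue
            FROM revenue
            WHERE CAST(strftime('%Y', order_date) AS INTEGER) = ?
            GROUP BY customer
        ) AS t ON t.customer = c.customer
        ORDER BY c.customer
    """
    return conn.execute(sql, (year,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE revenue (customer TEXT, price INTEGER, quantity INTEGER, order_date TEXT)"
)
conn.executemany(
    "INSERT INTO revenue VALUES (?, ?, ?, ?)",
    [
        ("xxx", 12, 5, "1990-03-25"),
        ("yyy", 15, 7, "1991-05-15"),  # assumed date, see lead-in
        ("xxx", 34, 2, "1990-08-21"),
    ],
)
rows_1990 = revenue_for_year(conn, 1990)
```

The year filter sits inside the subquery on purpose: putting it in the outer WHERE clause would discard the customers that have no rows for that year.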

Related

Cumulative Sum Query in SQL table with distinct elements

I have a table like this, with columns for the date of sale, the insurance salesman's name and the sale amount:
Date of Sale | Salesman Name | Sale Amount
2021-03-01 | Jack | 40
2021-03-02 | Mark | 60
2021-03-03 | Sam | 30
2021-03-03 | Mark | 70
2021-03-02 | Sam | 100
I want to group by the date of sale. The next column should display the cumulative count of sellers who have made a sale up to that date, but the same seller shouldn't be counted again.
For example,
The following table is incorrect,
Date of Sale | Count(Salesman Name) | Sum(Sale Amount)
2021-03-01 | 1 | 40
2021-03-02 | 3 | 200
2021-03-03 | 5 | 300
The following table is correct,
Date of Sale | Count(Salesman Name) | Sum(Sale Amount)
2021-03-01 | 1 | 40
2021-03-02 | 3 | 200
2021-03-03 | 3 | 300
I am not sure how to frame the SQL query, because there are two conditions involved here: a cumulative count, and ignoring duplicates. I think the OVER clause with UNBOUNDED PRECEDING may be of some use here? I would appreciate your help.
Edit - I have added the Sale Amount as a column. I also need the cumulative sum of the sale amount. In this case, all sale amounts should be included, unlike the salesman names, where only unique names are counted.
One approach uses a self join and aggregation:
WITH cte AS (
SELECT t1.SaleDate,
COUNT(CASE WHEN t2.Salesman IS NULL THEN 1 END) AS cnt,
SUM(t1.SaleAmount) AS amt
FROM yourTable t1
LEFT JOIN yourTable t2
ON t2.Salesman = t1.Salesman AND
t2.SaleDate < t1.SaleDate
GROUP BY t1.SaleDate
)
SELECT
SaleDate,
SUM(cnt) OVER (ORDER BY SaleDate) AS NumSalesman,
SUM(amt) OVER (ORDER BY SaleDate) AS TotalAmount
FROM cte
ORDER BY SaleDate;
The logic in the CTE is that we try to find, for each salesman, an earlier record for the same salesman. If we can't find such a record, then we assume the record in question is the first appearance. Then we aggregate by date to get the counts per day, and finally take a rolling sum of counts in the outer query.
The best way to do this uses window functions to determine the first time a sales person appears. Then, you just want cumulative sums:
select saledate,
sum(sum(case when seqnum = 1 then 1 else 0 end)) over (order by saledate) as num_salespersons,
sum(sum(sales)) over (order by saledate) as running_sales
from (select t.*,
row_number() over (partition by salesperson order by saledate) as seqnum
from t
) t
group by saledate
order by saledate;
Note that, in addition to being more concise, this should perform much better than a solution that uses a self-join.
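A small runnable sketch of the window-function approach, using SQLite (3.25 or later, for window functions) through Python's sqlite3 module. The table and column names (sales, sale_date, salesman, amount) are assumptions, and the per-day aggregation is split into its own CTE instead of nesting an aggregate inside the window function, purely for readability.

```python
import sqlite3

RUNNING_TOTALS = """
    WITH ranked AS (
        -- seqnum = 1 marks the first sale ever made by each salesman
        SELECT sale_date, salesman, amount,
               ROW_NUMBER() OVER (PARTITION BY salesman ORDER BY sale_date) AS seqnum
        FROM sales
    ),
    daily AS (
        -- per day: how many salesmen appear for the first time, and the day's sales
        SELECT sale_date,
               SUM(CASE WHEN seqnum = 1 THEN 1 ELSE 0 END) AS new_salesmen,
               SUM(amount) AS day_amount
        FROM ranked
        GROUP BY sale_date
    )
    SELECT sale_date,
           SUM(new_salesmen) OVER (ORDER BY sale_date) AS num_salesmen,
           SUM(day_amount) OVER (ORDER BY sale_date) AS running_amount
    FROM daily
    ORDER BY sale_date
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, salesman TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2021-03-01", "Jack", 40),
        ("2021-03-02", "Mark", 60),
        ("2021-03-03", "Sam", 30),
        ("2021-03-03", "Mark", 70),
        ("2021-03-02", "Sam", 100),
    ],
)
running = conn.execute(RUNNING_TOTALS).fetchall()
```

With the question's sample rows, running holds the cumulative distinct salesman count and cumulative amount per date, matching the table the asker marked as correct.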

Is there a way to select sum on one column based on other DISTINCT column, while grouping by third column(date) only

I have three columns
year | money | id
2020 100 01
2020 100 01
2019 50 02
2018 50 03
2020 40 04
results should be
Year | Money | total people
2020 | 240 | 4
Since the first two rows have the same id, they should only be counted once. I tried it as below:
select year, sum(money), Count( Distinct id) from table
group by year
But the result shows 4 people, which is correct, with the wrong sum, because it adds up all of the money including the duplicate row.
You can aggregate and then aggregate again:
select max(year), sum(money), count(*)
from (select distinct year, money, id
from t
) t;
You can use SUM() and COUNT(DISTINCT x).
For example:
select
year,
sum(money) as money,
(select count(distinct id) from t) as total_people
from t
where year = 2020
group by year;
Result:
YEAR MONEY TOTAL_PEOPLE
----- ------ ------------
2020 240 4
Not the most performant, but if you wish to avoid a derived table, you can do
select distinct
max(year) over (),
sum(money) over (),
count(*) over ()
from t
group by year, money, id;
If you want this grouped by year instead, you can define the partitions in the over clause.
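As a runnable illustration of the "aggregate, then aggregate again" idea, here is a sketch against SQLite via Python's sqlite3 module. It uses the derived-table form for both the overall total and a per-year breakdown, not the window-function form; the table name t follows the answers, and storing id as text is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (year INTEGER, money INTEGER, id TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (2020, 100, "01"),
        (2020, 100, "01"),  # exact duplicate of the previous row
        (2019, 50, "02"),
        (2018, 50, "03"),
        (2020, 40, "04"),
    ],
)

# Step 1 (inner query): drop exact duplicate rows.
# Step 2 (outer query): aggregate what is left.
overall = conn.execute(
    """
    SELECT MAX(year), SUM(money), COUNT(*)
    FROM (SELECT DISTINCT year, money, id FROM t) AS d
    """
).fetchone()

# Same deduplication, but with one result row per year.
per_year = conn.execute(
    """
    SELECT year, SUM(money), COUNT(*)
    FROM (SELECT DISTINCT year, money, id FROM t) AS d
    GROUP BY year
    ORDER BY year
    """
).fetchall()
```

Note that this only removes rows that are identical in all three columns; two rows with the same id but different money values would both be kept.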

Displaying results for fixed values in SQL

I am having difficulty in solving the below problem:
I have a table which contains the shopid, date, hour, category and sales amount.
shopid date hour category amount
------------------------------------
1 date1 7 food 10
1 date1 8 food 15
1 date1 10 misc. 5
2 date1 7 food 6
...................................
I am trying to calculate the total sales amount in each hour by food category and display like the following:
shopid category hour amount
------------------------------------
1 food 6 0
1 food 7 5
1 food 8 20
2 food 9 40
...................................
The shops' opening hours are 6 am to 10 pm, so in any given hour there may or may not be sales. I was able to perform the hourly summation, but I am unable to display a row with zero for the hours that have no sales (e.g. 6 am or any other time within the opening hours) for each sale category.
Use a left join against a list of hours:
select t.shopid, t.category, g.hour, sum(t.amount)
from generate_series(6,22) as g(hour)
left join the_table t on t.hour = g.hour
group by t.shopid, t.category, g.hour
order by t.shopid, t.category, g.hour;
You wrote: "I am trying to calculate the total sales amount in each hour by food category."
This makes sense, but then it doesn't make sense to include the shopid in the results.
To do this, you need to generate the rows -- which are all hours and food categories. Then bring in the actual results using left join:
select c.category, g.hour, coalesce(sum(s.amount), 0)
from generate_series(6, 22) g(hour) cross join
(select distinct category from sales) c left join
sales s
on s.hour = g.hour and s.category = c.category
group by c.category, g.hour
order by c.category, g.hour;
If you want results by shop/category/hour, then you can use the same idea:
select sh.shopid, c.category, g.hour,
coalesce(sum(s.amount), 0)
from generate_series(6, 22) g(hour) cross join
(select distinct category from sales) c cross join
(select distinct shopid from sales) sh left join
sales s
on s.shopid = sh.shopid and
s.hour = g.hour and
s.category = c.category
group by sh.shopid, c.category, g.hour
order by sh.shopid, c.category, g.hour;
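The "generate the rows first, then left join the data" pattern can be tried out with the sketch below. It runs on SQLite through Python's sqlite3 module, where generate_series is not available by default, so a recursive CTE (an assumption swapped in here) produces the hours 6 to 22. The table name sales follows the second answer; the sample rows are the four from the question.

```python
import sqlite3

HOURLY_GRID = """
    WITH RECURSIVE hours(hour) AS (
        -- stand-in for generate_series(6, 22)
        SELECT 6
        UNION ALL
        SELECT hour + 1 FROM hours WHERE hour < 22
    )
    SELECT sh.shopid, c.category, h.hour,
           COALESCE(SUM(s.amount), 0) AS amount
    FROM hours AS h
    CROSS JOIN (SELECT DISTINCT category FROM sales) AS c
    CROSS JOIN (SELECT DISTINCT shopid FROM sales) AS sh
    LEFT JOIN sales AS s
           ON s.shopid = sh.shopid
          AND s.hour = h.hour
          AND s.category = c.category
    GROUP BY sh.shopid, c.category, h.hour
    ORDER BY sh.shopid, c.category, h.hour
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (shopid INTEGER, date TEXT, hour INTEGER, category TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [
        (1, "date1", 7, "food", 10),
        (1, "date1", 8, "food", 15),
        (1, "date1", 10, "misc.", 5),
        (2, "date1", 7, "food", 6),
    ],
)
grid = conn.execute(HOURLY_GRID).fetchall()
# Index by (shopid, category, hour) for easy lookups.
amount_by_key = {(shop, cat, hour): amount for shop, cat, hour, amount in grid}
```

Every shop, category and hour combination gets a row, and COALESCE turns the NULL sum of an empty hour into 0.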

Firebird Query- Return first row each group

In a Firebird database with a table "Sales", I need to select the first sale of each customer. Below is a sample that shows the table and the desired result of the query.
---------------------------------------
SALES
---------------------------------------
ID CUSTOMERID DTHRSALE
1 25 01/04/16 09:32
2 30 02/04/16 11:22
3 25 05/04/16 08:10
4 31 07/03/16 10:22
5 22 01/02/16 12:30
6 22 10/01/16 08:45
Result: only first sale, based on sale date.
ID CUSTOMERID DTHRSALE
1 25 01/04/16 09:32
2 30 02/04/16 11:22
4 31 07/03/16 10:22
6 22 10/01/16 08:45
I've already tried the code from "Select first row in each GROUP BY group?", but it did not work.
In Firebird 2.5 you can do this with the following query; this is a minor modification of the second part of the accepted answer of the question you linked to tailored to your schema and requirements:
select x.id,
x.customerid,
x.dthrsale
from sales x
join (select customerid,
min(dthrsale) as first_sale
from sales
group by customerid) p on p.customerid = x.customerid
and p.first_sale = x.dthrsale
order by x.id
The order by is not necessary, I just added it to make it give the order as shown in your question.
With Firebird 3 you can use the window function ROW_NUMBER which is also described in the linked answer. The linked answer incorrectly said the first solution would work on Firebird 2.1 and higher. I have now edited it.
Search for the sales with no earlier sales:
SELECT S1.*
FROM SALES S1
LEFT JOIN SALES S2 ON S2.CUSTOMERID = S1.CUSTOMERID AND S2.DTHRSALE < S1.DTHRSALE
WHERE S2.ID IS NULL
Define an index over (customerid, dthrsale) to make it fast.
In Firebird 3, get the first row for each customer by the minimum sales_date:
SELECT id, customer_id, total, sales_date
FROM (
SELECT id, customer_id, total, sales_date
, row_number() OVER(PARTITION BY customer_id ORDER BY sales_date ASC ) AS rn
FROM SALES
) sub
WHERE rn = 1;
If you want to get other related columns, this is where your self-answer fails:
select customer_id , min(sales_date)
, id, total --what about other columns
from SALES
group by customer_id
It is as simple as:
select CUSTOMERID, min(DTHRSALE) from SALES group by CUSTOMERID
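To compare two of the approaches above side by side, here is a sketch that runs both the join-on-minimum query and the ROW_NUMBER query on the question's sample rows. It uses SQLite (3.25 or later) through Python's sqlite3 module instead of Firebird, and it assumes the sample timestamps are in dd/mm/yy order, rewritten here as ISO strings so they sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, customerid INTEGER, dthrsale TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (1, 25, "2016-04-01 09:32"),
        (2, 30, "2016-04-02 11:22"),
        (3, 25, "2016-04-05 08:10"),
        (4, 31, "2016-03-07 10:22"),
        (5, 22, "2016-02-01 12:30"),
        (6, 22, "2016-01-10 08:45"),
    ],
)

# Approach 1: join each sale to its customer's earliest timestamp.
MIN_JOIN = """
    SELECT x.id, x.customerid, x.dthrsale
    FROM sales AS x
    JOIN (SELECT customerid, MIN(dthrsale) AS first_sale
          FROM sales
          GROUP BY customerid) AS p
      ON p.customerid = x.customerid
     AND p.first_sale = x.dthrsale
    ORDER BY x.id
"""

# Approach 2: number each customer's sales by date and keep number 1.
ROW_NUMBER = """
    SELECT id, customerid, dthrsale
    FROM (SELECT id, customerid, dthrsale,
                 ROW_NUMBER() OVER (PARTITION BY customerid ORDER BY dthrsale) AS rn
          FROM sales) AS sub
    WHERE rn = 1
    ORDER BY id
"""

first_by_join = conn.execute(MIN_JOIN).fetchall()
first_by_row_number = conn.execute(ROW_NUMBER).fetchall()
```

The two queries only differ when a customer has several sales with the same earliest timestamp: the join returns all of them, while ROW_NUMBER keeps exactly one.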

Build customers report of last years with T-SQL

I've 3 tables (simplified):
-----------Orders--------------------
Id | Total_Price | Customer_Id | Date
--------Order Details---------------------
Id | Order_Id | Product Name | Qty | Value
----Customers------
Id | Name | Address
I get the total order value per customer with this query:
SELECT C.ID, C.NAME , SUM(O.TOTAL_PRICE)
FROM CUSTOMERS C
JOIN ORDERS O ON O.CUSTOMER_ID = C.ID
GROUP BY C.ID, C.NAME
Now, I want to build a report with total order value filtered by a range of dates:
SELECT C.ID, C.NAME , SUM(O.TOTAL_PRICE)
FROM CUSTOMERS C
JOIN ORDERS O ON O.CUSTOMER_ID = C.ID
WHERE O.DATE BETWEEN @value1 AND @value2
GROUP BY C.ID, C.NAME
This works OK, but I want to select the total order value for each of the last 3 years, grouped by customer. These are the results that I want:
1Year | 2Year | 3Year | Customer_Name
-------------------------------------------------
XXX | YYY | ZZZZ | Customer1
XYX | YYZ | ZZTZ | Customer2
....
I've this cardinality:
Customer table with 22.000 rows
Orders table with 87.000 rows
Orders details with 600.000
Is it possible without a temp table, table variable or stored procedure, and without a long execution time?
In my report I also want to calculate the total Qty of a product over the last 3 years, grouped by customer, but that is the next step.
Any ideas?
Thanks
You can use CASE expressions inside the aggregates to get the result you want. Since there is some ambiguity in your post about how the year ranges are defined, I've left out the calculations for the year start and end dates and just used variables. Revise to suit your needs.
SELECT C.ID
,C.NAME
,SUM(CASE
WHEN O.DATE BETWEEN @year1start
AND @year1end
THEN O.TOTAL_PRICE
ELSE 0
END) Year1
,SUM(CASE
WHEN O.DATE BETWEEN @year2start
AND @year2end
THEN O.TOTAL_PRICE
ELSE 0
END) Year2
,SUM(CASE
WHEN O.DATE BETWEEN @year3start
AND @year3end
THEN O.TOTAL_PRICE
ELSE 0
END) Year3
FROM CUSTOMERS C
INNER JOIN ORDERS O ON O.CUSTOMER_ID = C.ID
GROUP BY C.ID
,C.NAME
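The conditional-aggregation query can be exercised with a sketch like the one below, which runs on SQLite through Python's sqlite3 module instead of SQL Server. The function name yearly_totals, the column name order_date, the sample rows and the choice to treat each range as one calendar year are all assumptions for illustration; the year values are passed as bound parameters in place of the T-SQL variables.

```python
import sqlite3


def yearly_totals(conn, years):
    """Return (id, name, year1, year2, year3) totals for three calendar years."""
    sql = """
        SELECT c.id, c.name,
               SUM(CASE WHEN CAST(strftime('%Y', o.order_date) AS INTEGER) = ?
                        THEN o.total_price ELSE 0 END) AS year1,
               SUM(CASE WHEN CAST(strftime('%Y', o.order_date) AS INTEGER) = ?
                        THEN o.total_price ELSE 0 END) AS year2,
               SUM(CASE WHEN CAST(strftime('%Y', o.order_date) AS INTEGER) = ?
                        THEN o.total_price ELSE 0 END) AS year3
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.id
    """
    return conn.execute(sql, tuple(years)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER, total_price INTEGER, customer_id INTEGER, order_date TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Customer1"), (2, "Customer2")]
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 100, 1, "2013-02-01"),
        (2, 50, 1, "2014-06-10"),
        (3, 25, 1, "2014-07-01"),
        (4, 70, 2, "2015-01-15"),
    ],
)
totals = yearly_totals(conn, (2013, 2014, 2015))
```

Because the join is an inner join, as in the answer, customers without any orders do not appear at all; a left join from customers would keep them with zero totals.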
Another option is to use the PIVOT operator. I assume each of your date ranges equals one calendar year (e.g. 2013, 2014 and so on).
If the years are fixed, PIVOT isn't a very elegant option (the full sqlfiddle example also has a possible solution for your additional question):
select
c.Id, c.Name, c.Address, CostByYear.[2013], CostByYear.[2014], CostByYear.[2015]
from Customers c
left join (
select
pt.Customer_Id, isnull(pt.[2013], 0) as [2013],
isnull(pt.[2014], 0) as [2014], isnull(pt.[2015], 0) as [2015]
from (
select
o.Customer_Id, year(o.Date) [Year], sum(o.Total_Price) [TotalCost]
from Orders o
group by
o.Customer_Id, year(o.Date)
) src
pivot (
sum(TotalCost) for [Year] in ([2013], [2014], [2015])
) pt
) CostByYear on
c.Id = CostByYear.Customer_Id
order by
c.Name
You can also build either approach (mine or the previous answer's) as a dynamically generated query if the year ranges aren't known and fixed in advance.