display only specific rows in a column with a group by - sql

I'm somewhat new to Oracle SQL and can't figure this out. I want to display the rows with the highest value in the third column. Here is the table I'm working with:
theyear  custseg      sales
2010     Corporate    573637.62
2010     Home Office  515314.98
2010     Small Biz    390361.94
2010     Consumer     383825.67
2011     Corporate    731208
2011     Home Office  521274.34
2011     Consumer     390967.03
2011     Small Biz    273264.81
2012     Corporate    823861.38
2012     Consumer     480082.9
2012     Home Office  478106.93
I want the highest value grouped by year. If I group by just the year I get part of the answer, but I can't include/display the customer segment (ugh): it only shows the year and the max sales. When I include the customer segment it gives me the table above, which displays all the sales, and that is not what I'm looking for. I simply want the rows that contain the MAX sales for each year (theyear), along with the customer segment (custseg). For what it's worth, here is the code I used to create the above:
select theyear, custseg, max(totalsales) sales from (
select custseg, extract(year from ordshipdate) theyear, sum(ordsales) TotalSales from customers, orderdet
where customers.custid = orderdet.custid
group by custseg, extract(year from ordshipdate)
order by sum(ordsales) desc)
group by theyear, custseg
order by theyear, max(totalsales) desc;

Assuming all fields are in the customer table as described in the question, the following query would do what you want:
select c.theyear, c.custseg, c.sales
from
customer c inner join
(
select theyear, max(sales) as max_sales_in_year
from customer
group by theyear
) maxvalues
on (
c.theyear = maxvalues.theyear and
c.sales = maxvalues.max_sales_in_year
);
If two segments tie for the maximum in a year, this query returns every tied row; add a tie-breaker if you need exactly one row per year.

I would use ROW_NUMBER():
SELECT theyear, custseg, totalsales FROM
(
select theyear, custseg, totalsales,
ROW_NUMBER() OVER(PARTITION BY theyear ORDER BY totalsales DESC) rn
from
(
select custseg, extract(year from ordshipdate) theyear, sum(ordsales) TotalSales
from customers, orderdet
where customers.custid = orderdet.custid
group by custseg, extract(year from ordshipdate)
) a
) b
WHERE rn = 1;
BTW, the query above is more readable when written with CTEs:
WITH a AS(
select custseg, extract(year from ordshipdate) theyear, sum(ordsales) TotalSales
from customers, orderdet
where customers.custid = orderdet.custid
group by custseg, extract(year from ordshipdate)),
b AS (
select theyear, custseg, totalsales,
ROW_NUMBER() OVER(PARTITION BY theyear ORDER BY totalsales DESC) rn
FROM a)
SELECT theyear, custseg, totalsales
FROM b
WHERE rn = 1;
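To try the ROW_NUMBER() approach without an Oracle instance, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database via Python's sqlite3 module. The table name yearly_sales and the function name are made up for illustration, and SQLite (3.25 or later, for window functions) is only standing in for Oracle here.

```python
import sqlite3

# Sample rows copied from the question's table (theyear, custseg, sales).
ROWS = [
    (2010, "Corporate", 573637.62),
    (2010, "Home Office", 515314.98),
    (2010, "Small Biz", 390361.94),
    (2010, "Consumer", 383825.67),
    (2011, "Corporate", 731208.0),
    (2011, "Home Office", 521274.34),
    (2011, "Consumer", 390967.03),
    (2011, "Small Biz", 273264.81),
    (2012, "Corporate", 823861.38),
    (2012, "Consumer", 480082.9),
    (2012, "Home Office", 478106.93),
]


def top_segment_per_year(rows):
    """Return the highest-sales row for each year using ROW_NUMBER()."""
    conn = sqlite3.connect(":memory:")
    # yearly_sales is a hypothetical table holding the already-aggregated data.
    conn.execute(
        "CREATE TABLE yearly_sales (theyear INTEGER, custseg TEXT, sales REAL)"
    )
    conn.executemany("INSERT INTO yearly_sales VALUES (?, ?, ?)", rows)
    query = """
        SELECT theyear, custseg, sales
        FROM (
            SELECT theyear, custseg, sales,
                   ROW_NUMBER() OVER (PARTITION BY theyear
                                      ORDER BY sales DESC) AS rn
            FROM yearly_sales
        )
        WHERE rn = 1
        ORDER BY theyear
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


top_rows = top_segment_per_year(ROWS)
for row in top_rows:
    print(row)
```

With this data, Corporate is the top segment in all three years. Unlike the join on max(sales), ROW_NUMBER() keeps exactly one row per year even when two segments tie.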

Related

SQL get top 3 values / bottom 3 values with group by and sum

I am working on a restaurant management system. There I have two tables
order_details(orderId,dishId,createdAt)
dishes(id,name,imageUrl)
My customer wants a report of the top 3 selling items and the 3 least selling items for each month.
For the moment I have something like this:
SELECT
*
FROM
(SELECT
SUM(qty) AS qty,
order_details.dishId,
MONTHNAME(order_details.createdAt) AS mon,
dishes.name,
dishes.imageUrl
FROM
rms.order_details
INNER JOIN dishes ON order_details.dishId = dishes.id
GROUP BY order_details.dishId , MONTHNAME(order_details.createdAt)) t
ORDER BY t.qty
This gives me the quantity sold for every dish, ordered by qty.
I have to manually pick out the 3 records I need and reject the rest. There should be a SQL way of doing this. How do I do this in SQL?
You would use row_number() for this purpose. You don't specify the database you are using, so I am guessing at the appropriate date functions. I also assume that you mean a month within a year, so you need to take the year into account as well:
SELECT ym.*
FROM (SELECT YEAR(od.CreatedAt) as yyyy,
MONTH(od.createdAt) as mm,
SUM(qty) AS qty,
od.dishId, d.name, d.imageUrl,
ROW_NUMBER() OVER (PARTITION BY YEAR(od.CreatedAt), MONTH(od.createdAt) ORDER BY SUM(qty) DESC) as seqnum_desc,
ROW_NUMBER() OVER (PARTITION BY YEAR(od.CreatedAt), MONTH(od.createdAt) ORDER BY SUM(qty) ASC) as seqnum_asc
FROM rms.order_details od INNER JOIN
dishes d
ON od.dishId = d.id
GROUP BY YEAR(od.CreatedAt), MONTH(od.CreatedAt), od.dishId
) ym
WHERE seqnum_asc <= 3 OR
seqnum_desc <= 3;
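Using the above info, I used a combination of GROUP BY, ORDER BY and LIMIT,
as shown below. I hope this is what you are looking for. Note that LIMIT 3 applies to the whole result set, so this returns the 3 best-selling dish/month combinations overall rather than the top 3 within each month.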
SELECT
t.qty,
t.dishId,
t.month,
d.name,
d.imageUrl
from
(
SELECT
od.dishId,
count(od.dishId) AS 'qty',
date_format(od.createdAt,'%Y-%m') as 'month'
FROM
rms.order_details od
group by date_format(od.createdAt,'%Y-%m'),od.dishId
order by qty desc
limit 3) t
join rms.dishes d on (t.dishId = d.id)
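The per-month ranking idea can be checked on a small data set. The sketch below uses SQLite through Python's sqlite3 module rather than MySQL, so strftime replaces YEAR()/MONTH(); the sample rows, the qty column type, the function name, and the dishId tie-breaker in the ORDER BY are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical order lines: (dishId, qty, createdAt).
ORDER_LINES = [
    (1, 6, "2023-01-05"),
    (1, 4, "2023-01-20"),
    (2, 7, "2023-01-10"),
    (3, 3, "2023-01-11"),
    (4, 1, "2023-01-12"),
    (1, 2, "2023-02-01"),
    (2, 9, "2023-02-02"),
    (3, 5, "2023-02-03"),
]


def top_and_bottom_per_month(lines, n):
    """Return the n best and n worst selling dishes for every month."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE order_details (dishId INTEGER, qty INTEGER, createdAt TEXT)"
    )
    conn.executemany("INSERT INTO order_details VALUES (?, ?, ?)", lines)
    query = """
        WITH monthly AS (
            -- one row per (month, dish) with the total quantity sold
            SELECT strftime('%Y-%m', createdAt) AS ym,
                   dishId,
                   SUM(qty) AS total_qty
            FROM order_details
            GROUP BY strftime('%Y-%m', createdAt), dishId
        ),
        ranked AS (
            -- rank each dish within its month in both directions;
            -- dishId is only a tie-breaker to keep the result deterministic
            SELECT ym, dishId, total_qty,
                   ROW_NUMBER() OVER (PARTITION BY ym
                                      ORDER BY total_qty DESC, dishId) AS seqnum_desc,
                   ROW_NUMBER() OVER (PARTITION BY ym
                                      ORDER BY total_qty ASC, dishId) AS seqnum_asc
            FROM monthly
        )
        SELECT ym, dishId, total_qty
        FROM ranked
        WHERE seqnum_desc <= ? OR seqnum_asc <= ?
        ORDER BY ym, total_qty DESC
    """
    result = conn.execute(query, (n, n)).fetchall()
    conn.close()
    return result


# n=1 keeps the demo readable with so few dishes; the report would use n=3.
best_and_worst = top_and_bottom_per_month(ORDER_LINES, 1)
for row in best_and_worst:
    print(row)
```

For January this keeps dish 1 (10 sold) and dish 4 (1 sold); for February it keeps dish 2 (9 sold) and dish 1 (2 sold).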

T-SQL query to summarize sales for ALL YEARS: with total per month per year, and cumulative monthly amounts

I need to create a Sales Report that shows all years sales per month, and cumulative sales.
The database table is simple:
Transactions
(
ID INT,
TransactionDate DATETIME,
SalesAmount MONEY
)
I want the results to look similar to ExcelSheet below (I am showing only 2017/2018 amounts, but actual query needs to return results for all available years according to TransactionDate)
This is aggregation and a cumulative sum:
select year(TransactionDate), month(TransactionDate),
sum(SalesAmount),
sum(sum(SalesAmount)) over (partition by year(TransactionDate) order by min(TransactionDate))
from Transactions
group by year(TransactionDate), month(TransactionDate)
order by year(TransactionDate), month(TransactionDate);
Try it:
With Q
as
(
Select DatePart(yyyy,TransactionDate) 'Year',DatePart(m,TransactionDate) 'Month', sum(SalesAmount) 'Sales'
From Transactions
Group by DatePart(yyyy,TransactionDate),DatePart(m,TransactionDate)
)
Select q.Year,q.Month,q.sales,( Select sum(q1.Sales)
From Q q1
Where q1.Year=q.Year
And q1.Month <= q.Month
) 'Cumulative Sale'
From Q q
Order by q.Year,q.Month
Try this:
with cte as
(
select year(TransactionDate) as Year, month(TransactionDate) as Month, sum(SalesAmount) as SalesAmount
from Transactions
group by year(TransactionDate), month(TransactionDate)
)
select a.Year, a.Month, a.SalesAmount, sum(b.SalesAmount) as cumulativeSalesAmount
from cte a inner join cte b on a.Year = b.Year and a.Month >= b.Month
group by a.Year, a.Month, a.SalesAmount
order by 1, 2
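The monthly total plus running total can be reproduced on toy data. This is a minimal sketch using SQLite via Python's sqlite3 module instead of SQL Server, so strftime stands in for year()/month(); it also computes the monthly totals in a CTE first and then applies the window function, rather than nesting sum(sum(...)). The sample amounts are invented and stored as integers to keep the arithmetic exact.

```python
import sqlite3

# Hypothetical transactions: (TransactionDate, SalesAmount).
TRANSACTIONS = [
    ("2017-01-10", 100),
    ("2017-01-20", 50),
    ("2017-02-05", 200),
    ("2018-01-03", 70),
    ("2018-03-09", 30),
]


def monthly_and_cumulative(transactions):
    """Return (year, month, monthly sales, cumulative sales within the year)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Transactions ("
        "ID INTEGER PRIMARY KEY, TransactionDate TEXT, SalesAmount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO Transactions (TransactionDate, SalesAmount) VALUES (?, ?)",
        transactions,
    )
    query = """
        WITH monthly AS (
            -- aggregation step: one row per year and month
            SELECT CAST(strftime('%Y', TransactionDate) AS INTEGER) AS yr,
                   CAST(strftime('%m', TransactionDate) AS INTEGER) AS mon,
                   SUM(SalesAmount) AS sales
            FROM Transactions
            GROUP BY strftime('%Y', TransactionDate),
                     strftime('%m', TransactionDate)
        )
        -- cumulative step: running total that restarts every year
        SELECT yr, mon, sales,
               SUM(sales) OVER (PARTITION BY yr ORDER BY mon) AS cumulative
        FROM monthly
        ORDER BY yr, mon
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


report = monthly_and_cumulative(TRANSACTIONS)
for row in report:
    print(row)
```

The running total restarts in 2018 because the window is partitioned by year. Months with no transactions (February 2018 here) simply do not appear in the output.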

how to filter data in sql based on percentile

I have 2 tables. The first contains customer information such as id, age, and name. The second contains the customer id, the product they purchased, and the purchase_date (dates range from 2016 to 2018).
Table 1
-------
customer_id
customer_age
customer_name
Table2
------
customer_id
product
purchase_date
My desired result is a table containing the customer_name and product of customers who made a purchase in 2017 and are older than 75% of the customers who made a purchase in 2016.
Depending on your flavor of SQL, you can get quartiles using the more general ntile analytical function. This basically adds a new column to your query.
SELECT MIN(customer_age) as min_age FROM (
SELECT customer_id, customer_age, ntile(4) OVER(ORDER BY customer_age) AS q4 FROM table1
WHERE customer_id IN (
SELECT customer_id FROM table2 WHERE purchase_date >= '2016-01-01' AND purchase_date < '2017-01-01')
) q
WHERE q4=4
This returns the lowest age of the 4th-quartile customers, which can be used in a subquery against the customers who made purchases in 2017.
The argument to ntile is how many buckets you want to divide into. In this case 75%+ equals 4th quartile, so 4 buckets is OK. The OVER() clause specifies what you want to sort by (customer_age in our case), and also lets us partition (group) the data if we want to, say, create multiple rankings for different years or countries.
Age is a horrible field to include in a database. Every day it changes. You should have date-of-birth or something similar.
To get the 75% oldest value in 2016, there are several possibilities. I usually go for row_number() and count(*):
select min(customer_age)
from (select c.*,
row_number() over (order by customer_age) as seqnum,
count(*) over () as cnt
from customers c
where exists (select 1
from customer_products cp
where cp.customer_id = c.customer_id and
cp.purchase_date >= '2016-01-01' and
cp.purchase_date < '2017-01-01'
)
) c
where seqnum >= 0.75 * cnt;
Then, to use this for a query for 2017:
with a2016 as (
select min(customer_age) as customer_age
from (select c.*,
row_number() over (order by customer_age) as seqnum,
count(*) over () as cnt
from customers c
where exists (select 1
from customer_products cp
where cp.customer_id = c.customer_id and
cp.purchase_date >= '2016-01-01' and
cp.purchase_date < '2017-01-01'
)
) c
where seqnum >= 0.75 * cnt
)
select c.*, cp.product_id
from customers c join
customer_products cp
on cp.customer_id = c.customer_id and
cp.purchase_date >= '2017-01-01' and
cp.purchase_date < '2018-01-01' join
a2016 a
on c.customer_age >= a.customer_age;
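Here is a small runnable sketch of the row_number()/count(*) approach, using SQLite via Python's sqlite3 module. The table name purchases, the product column, and all sample rows are assumptions for illustration; date filtering relies on ISO-formatted text dates comparing correctly as strings.

```python
import sqlite3

# Hypothetical data: (customer_id, customer_age, customer_name).
CUSTOMERS = [
    (1, 20, "Ann"),
    (2, 30, "Bob"),
    (3, 40, "Cho"),
    (4, 50, "Dee"),
    (5, 65, "Eve"),
]
# Hypothetical data: (customer_id, product, purchase_date).
PURCHASES = [
    (1, "book", "2016-03-01"),
    (2, "mug", "2016-06-15"),
    (3, "desk", "2016-09-09"),
    (4, "chair", "2016-12-31"),
    (1, "pen", "2017-02-02"),
    (3, "tea", "2017-05-05"),
    (5, "lamp", "2017-08-08"),
    (4, "sofa", "2018-01-01"),
]


def older_2017_buyers(customers, purchases):
    """2017 buyers at least as old as the 75% mark of 2016 buyers' ages."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers "
        "(customer_id INTEGER, customer_age INTEGER, customer_name TEXT)"
    )
    conn.execute(
        "CREATE TABLE purchases "
        "(customer_id INTEGER, product TEXT, purchase_date TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", customers)
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", purchases)
    query = """
        WITH buyers_2016 AS (
            -- number the 2016 buyers by age and attach the total count
            SELECT c.customer_age,
                   ROW_NUMBER() OVER (ORDER BY c.customer_age) AS seqnum,
                   COUNT(*) OVER () AS cnt
            FROM customers c
            WHERE EXISTS (SELECT 1
                          FROM purchases p
                          WHERE p.customer_id = c.customer_id
                            AND p.purchase_date >= '2016-01-01'
                            AND p.purchase_date < '2017-01-01')
        ),
        threshold AS (
            -- youngest age at or above the 75% position
            SELECT MIN(customer_age) AS min_age
            FROM buyers_2016
            WHERE seqnum >= 0.75 * cnt
        )
        SELECT c.customer_name, p.product
        FROM customers c
        JOIN purchases p
          ON p.customer_id = c.customer_id
         AND p.purchase_date >= '2017-01-01'
         AND p.purchase_date < '2018-01-01'
        JOIN threshold t
          ON c.customer_age >= t.min_age
        ORDER BY c.customer_name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


matches = older_2017_buyers(CUSTOMERS, PURCHASES)
for row in matches:
    print(row)
```

With four 2016 buyers aged 20, 30, 40 and 50, the threshold is 40, so Cho and Eve qualify while Ann's 2017 purchase is filtered out. As in the answer, the comparison is >=; switch to > if "older than" should be strict.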

SQL YTD and last year YTD on complete data

I need to calculate YTD and last year YTD on a table [SQL Server 2012]. Below is the query I tried. It's getting doubled and tripled for some cases.
SELECT SUM(A.RevisionNumber)YTD,SUM(P.RevisionNumber)LY_YTD,B.OrderDateM,B.OrderDateY
FROM
(select MONTH(OrderDate)OrderDateM,YEAR(OrderDate)OrderDateY from sales.SalesOrderHeader B
group by MONTH(OrderDate),YEAR(OrderDate))B
LEFT JOIN
(select SUM(RevisionNumber)RevisionNumber,MONTH(OrderDate)OrderDateM,YEAR(OrderDate)OrderDateY
from sales.SalesOrderHeader
group by MONTH(OrderDate),YEAR(OrderDate))A
ON A.OrderDateM<=B.OrderDateM AND A.OrderDateY=B.OrderDateY
LEFT JOIN
(select SUM(RevisionNumber)RevisionNumber,MONTH(OrderDate)OrderDateM,YEAR(OrderDate)OrderDateY
from sales.SalesOrderHeader
group by MONTH(OrderDate),YEAR(OrderDate))P
ON P.OrderDateM<=B.OrderDateM AND P.OrderDateY=B.OrderDateY-1
GROUP BY B.OrderDateM,B.OrderDateY
ORDER BY B.OrderDateY,B.OrderDateM
You can use window functions as below:
;With cte as (
Select Sum(RevisionNumber) As SM_RevisionNumber, Month(OrderDate) as OrderM,
Year(OrderDate) as OrderY
From Sales.SalesOrderHeader
Group by Month(OrderDate), Year(OrderDate)
), cte2 as (
Select YTD = Sum(SM_RevisionNumber) over (partition by OrderY order by OrderM),
OrderM, OrderY, RowN = Row_Number() over(order by OrderY, OrderM)
from cte
)
Select YTD, LY_YTD = lag(YTD, 12, null) over(Order by RowN), OrderM, ORderY
from cte2
But this solution assumes there is at least one entry for every month of every year; otherwise lag(YTD, 12) no longer points at the same month of the previous year.
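If months can be missing, one alternative (not the lag-based technique above) is to look up last year's total with a correlated subquery instead of counting back 12 rows. The sketch below is only an illustration under stated assumptions: it runs on SQLite via Python's sqlite3 module rather than SQL Server, and the orders table, its columns, and the sample rows are invented.

```python
import sqlite3

# Hypothetical orders: (order_date, amount). February 2012 is deliberately missing.
ORDERS = [
    ("2011-01-15", 10),
    ("2011-02-10", 12),
    ("2011-02-25", 8),
    ("2011-03-05", 5),
    ("2012-01-20", 7),
    ("2012-03-02", 3),
]


def ytd_with_last_year(orders):
    """Return (year, month, YTD, last year's YTD through the same month)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_date TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    query = """
        WITH monthly AS (
            SELECT CAST(strftime('%Y', order_date) AS INTEGER) AS yr,
                   CAST(strftime('%m', order_date) AS INTEGER) AS mon,
                   SUM(amount) AS total
            FROM orders
            GROUP BY strftime('%Y', order_date), strftime('%m', order_date)
        )
        SELECT m.yr, m.mon,
               -- running total within the current year
               SUM(m.total) OVER (PARTITION BY m.yr ORDER BY m.mon) AS ytd,
               -- previous year's total up to the same month; NULL if no data
               (SELECT SUM(p.total)
                FROM monthly p
                WHERE p.yr = m.yr - 1 AND p.mon <= m.mon) AS ly_ytd
        FROM monthly m
        ORDER BY m.yr, m.mon
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


ytd_rows = ytd_with_last_year(ORDERS)
for row in ytd_rows:
    print(row)
```

Here March 2012 still picks up 35 as last year's YTD even though February 2012 has no rows, which a fixed 12-row lag would not do. A month with no orders at all still produces no output row; a calendar table would be needed for that.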

How to return the most ordered item for each month

I am trying to return the most ordered product per month for the year 2007. I would like to see the name of the product, how many of them were ordered that month, and the month. I am using the AdventureWorks2012 database. I have tried a few different ways, but each time multiple product orders are returned for the same month instead of the one product that had the highest order quantity that month. Sorry if this is not clear. I am trying to test myself, so I make up my own questions and try to answer them. If anyone knows a site that has questions and answers like this so I can verify, that would be super helpful! Thanks for any help. Here is the farthest I have been able to get with the query.
WITH Ord2007Sum
AS (SELECT sum(od.orderqty) AS sorder,
od.productid,
oh.orderdate,
od.SalesOrderID
FROM Sales.SalesOrderDetail AS od
INNER JOIN
sales.SalesOrderHeader AS oh
ON od.SalesOrderID = oh.SalesOrderID
WHERE year(oh.OrderDate) = 2007
GROUP BY ProductID, oh.OrderDate, od.SalesOrderID)
SELECT max(sorder),
s.productid,
month(h.orderdate) AS morder --, s.salesorderid
FROM Ord2007Sum AS s
INNER JOIN
sales.SalesOrderheader AS h
ON s.OrderDate = h.OrderDate
GROUP BY s.ProductID, month(h.orderdate)
ORDER BY morder;
Make a CTE that groups our products by month and creates a sum
;WITH OrderRows AS
(
SELECT
od.ProductId,
MONTH(oh.OrderDate) SalesMonth,
SUM(od.orderqty) OVER (PARTITION BY od.ProductId, MONTH(oh.OrderDate) ORDER BY oh.OrderDate) ProdMonthSum
FROM Sales.SalesOrderDetail AS od
INNER JOIN Sales.SalesOrderHeader AS oh
ON od.SalesOrderID = oh.SalesOrderID
WHERE year(oh.OrderDate) = 2007
),
Make a simple numbers table to break out each month of the year
Months AS
(
SELECT 1 AS MonthNum UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8
UNION SELECT 9 UNION SELECT 10 UNION SELECT 11 UNION SELECT 12
)
We query our months table against the data and select the top product for each month based on the sum
SELECT
m.MonthNum,
d.ProductID,
d.ProdMonthSum
FROM Months m
OUTER APPLY
(
SELECT TOP 1 r.ProductID, r.ProdMonthSum
FROM OrderRows r
WHERE r.SalesMonth = m.MonthNum
ORDER BY ProdMonthSum DESC
) d
Your GROUP BY should not include oh.OrderDate and od.SalesOrderID, because that aggregates the data at the wrong level. You want the ProductID with the highest quantity sold per month, so the GROUP BY columns become ProductID, datepart(mm,oh.OrderDate). As Andrew suggested, the Row_Number function is useful here: it numbers the rows within each month in descending order of the summed quantity and restarts at 1 for each month. Finally, the outer query limits the results to the first row for each month, which is the one with the highest quantity.
WITH Ord2007Sum
AS(
SELECT sum(od.orderqty) AS sorder,
od.productid,
datepart(mm,oh.OrderDate) AS 'Month',
row_number() over (partition by datepart(mm,oh.OrderDate)
Order by sum(od.orderqty) desc) row
FROM Sales.SalesOrderDetail AS od
INNER JOIN
sales.SalesOrderHeader AS oh
ON od.SalesOrderID = oh.SalesOrderID
WHERE datepart(yyyy,oh.OrderDate) = 2007
GROUP BY ProductID, datepart(mm,oh.OrderDate)
)
SELECT productid,
sorder,
[month]
FROM Ord2007Sum
WHERE row =1