SQL How to take the minimum for multiple fields? - sql

Consider the following data set that records the product sold, year, and revenue from that particular product in thousands of dollars. This data table (YEARLY_PRODUCT_REVENUE) is stored in SQL and has many more rows.
Year | Product | Revenue
2000 Table 100
2000 Chair 200
2000 Bed 150
2010 Table 120
2010 Chair 190
2010 Bed 390
Using SQL, for every year I would like to find the product that has the maximum revenue.
That is, I would like my output to be the following:
Year | Product | Revenue
2000 Chair 200
2010 Bed 390
My attempt so far has been this:
SELECT year, product, MIN(revenue)
FROM YEARLY_PRODUCT_REVENUE
GROUP BY product, year;
But when I do this, I get multiple rows per year, one for each distinct product. For instance, I'm getting the output below, which is wrong. I'm not entirely sure what the mistake is. Any help would be much appreciated!
Year | Product | Revenue
2000 Table 100
2000 Bed 150
2010 Table 120
2010 Chair 190

You don't mention the database so I'll assume it's PostgreSQL. You can do:
select distinct on (year) * from t order by year, revenue desc

You want filtering rather than aggregation. We can use window functions (which most databases support) to rank yearly product sales, and then retain only the top selling product per year.
select *
from (
select r.*, rank() over(partition by year order by revenue desc) rn
from yearly_product_revenue r
) r
where rn = 1;
Here is a shorter solution if your database supports the standard WITH TIES clause:
select *
from yearly_product_revenue r
order by rank() over(partition by year order by revenue desc)
fetch first row with ties
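To try the window-function approach on the sample data from the question, here is a minimal sketch. It assumes SQLite 3.25 or later (for window functions) driven from Python's sqlite3 module, purely so the example is self-contained; neither is mentioned in the question.

```python
import sqlite3

# In-memory SQLite database (an assumption made only for this demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yearly_product_revenue (year INTEGER, product TEXT, revenue INTEGER)"
)
conn.executemany(
    "INSERT INTO yearly_product_revenue VALUES (?, ?, ?)",
    [
        (2000, "Table", 100), (2000, "Chair", 200), (2000, "Bed", 150),
        (2010, "Table", 120), (2010, "Chair", 190), (2010, "Bed", 390),
    ],
)

# Rank products within each year by revenue, then keep only rank 1.
top_per_year = conn.execute(
    """
    SELECT year, product, revenue
    FROM (
        SELECT r.*, RANK() OVER (PARTITION BY year ORDER BY revenue DESC) AS rn
        FROM yearly_product_revenue r
    )
    WHERE rn = 1
    ORDER BY year
    """
).fetchall()

print(top_per_year)  # [(2000, 'Chair', 200), (2010, 'Bed', 390)]
```

Because rank() is used, two products tied for the top revenue in a year would both be returned; row_number() would keep exactly one row per year instead.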

Related

Running Total by Year in SQL

I have a table broken out into a series of numbers by year, and need to build a running total column that restarts at the beginning of each year.
The desired outcome is below
Amount | Year | Running Total
-----------------------------
1 2000 1
5 2000 6
10 2000 16
5 2001 5
10 2001 15
3 2001 18
I can do an ORDER BY to get a standard running total, but can't figure out how to base it just on the year such that it does the running total for each unique year.
SQL tables represent unordered sets. You need a column to specify the ordering. Once you have this, it is a simple cumulative sum:
select amount, year, sum(amount) over (partition by year order by <ordering column>)
from t;
Without a column that specifies ordering, "cumulative sum" does not make sense on a table in SQL.
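As an illustration of the cumulative sum, here is a small sketch on the data from the question. The `id` column is hypothetical: it stands in for the ordering column the question's table does not show. SQLite (3.25 or later) via Python's sqlite3 module is likewise only an assumption to make the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "id" is a hypothetical ordering column added for this demo.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, amount INTEGER, year INTEGER)")
conn.executemany(
    "INSERT INTO t (id, amount, year) VALUES (?, ?, ?)",
    [(1, 1, 2000), (2, 5, 2000), (3, 10, 2000),
     (4, 5, 2001), (5, 10, 2001), (6, 3, 2001)],
)

# PARTITION BY year restarts the sum for each year;
# ORDER BY id defines the order in which amounts accumulate.
running = conn.execute(
    """
    SELECT amount, year,
           SUM(amount) OVER (PARTITION BY year ORDER BY id) AS running_total
    FROM t
    ORDER BY year, id
    """
).fetchall()

for row in running:
    print(row)
# (1, 2000, 1)
# (5, 2000, 6)
# (10, 2000, 16)
# (5, 2001, 5)
# (10, 2001, 15)
# (3, 2001, 18)
```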

Oracle sql: Order by with GROUP BY ROLLUP

I've looked everywhere for an answer, but nothing seems to match my problem. Using rollup with this query:
select year, month, count (sale_id) from sales
group by rollup (year, month);
gives a result like:
YEAR MONTH TOTAL
2015 1 200
2015 2 415
2015 null 615
2016 1 444
2016 2 423
2016 null 867
null null 1482
I would like to sort by total desc, but with the year that has the biggest total on top (important: together with all the records that belong to that year), followed by the records for the other years. So I would like it to look like:
YEAR MONTH TOTAL
null null 1482
2016 null 867
2016 1 444
2016 2 423
2015 null 615
2015 2 415
2015 1 200
Or something like that. The main purpose is not to split up the records that belong to one year while sorting by total. Can somebody help me with that?
Try using the window function max in the order by clause to get the maximum total for each year:
select year, month, count(sale_id) total
from sales
group by rollup(year, month)
order by max(total) over (partition by year) desc, total desc;
Hmmm. I think this does what you want:
select year, month, count(sale_id) as cnt
from sales
group by rollup (year, month)
order by sum(count(sale_id)) over (partition by year) desc, year;
Actually, I've never used window functions in an order by with a rollup query. I wouldn't be surprised if a subquery were necessary.
I think you need to use GROUPING SETS and GROUP_ID. These will help you identify a NULL caused by a subtotal. Take a look at the doc: https://docs.oracle.com/cd/B19306_01/server.102/b14223/aggreg.htm
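To check the ordering logic of the max(total) over (partition by year) suggestion in isolation, here is a hedged sketch. SQLite has no ROLLUP, so the rollup output from the question is loaded into a hypothetical table `rollup_result` and only the ORDER BY is exercised; it says nothing about whether Oracle accepts the alias inside the window function without a subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the rollup output shown in the question.
conn.execute("CREATE TABLE rollup_result (year INTEGER, month INTEGER, total INTEGER)")
conn.executemany(
    "INSERT INTO rollup_result VALUES (?, ?, ?)",
    [
        (2015, 1, 200), (2015, 2, 415), (2015, None, 615),
        (2016, 1, 444), (2016, 2, 423), (2016, None, 867),
        (None, None, 1482),
    ],
)

# The largest total in each year partition is that year's subtotal,
# so sorting on it first keeps all rows of a year together.
ordered = conn.execute(
    """
    SELECT year, month, total
    FROM rollup_result
    ORDER BY MAX(total) OVER (PARTITION BY year) DESC, total DESC
    """
).fetchall()

for row in ordered:
    print(row)
```

On this data the grand total comes first, then the 2016 rows, then the 2015 rows, each block sorted by total descending, which matches the order requested in the question.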

Row number in query result

I have a query to get firms ordered by their sales last year.
select
Name,
Sale
from Sales
order by
Sale DESC
and I get
Firm 2 | 200 000
Firm 1 | 190 000
Firm 3 | 100 000
I would also like to get the index of each row in the result. For Firm 2 I would like to get 0 (or 1), for Firm 1 I would like 1 (or 2), and so on. Is this possible? Or is it at least possible to create some sort of autoincrement column? I can even use a stored procedure if needed.
Firebird 3.0 supports row_number() which is the better way to do this.
However for Firebird 2.5, you can get what you want with a correlated subquery:
select s.Name, s.Sale,
(select count(*) from Sales s2 where s2.sale >= s.sale) as seqnum
from Sales s
order by s.Sale DESC;
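For comparison, here is a small sketch that runs both variants on the question's data: row_number() and the correlated subquery. It uses SQLite through Python's sqlite3 module rather than Firebird, which is an assumption made only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (Name TEXT, Sale INTEGER)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?)",
    [("Firm 1", 190000), ("Firm 2", 200000), ("Firm 3", 100000)],
)

# Variant 1: window function, numbering rows by descending sale.
numbered = conn.execute(
    """
    SELECT Name, Sale, ROW_NUMBER() OVER (ORDER BY Sale DESC) AS seqnum
    FROM Sales
    ORDER BY Sale DESC
    """
).fetchall()

# Variant 2: correlated subquery counting rows with an equal or higher sale.
counted = conn.execute(
    """
    SELECT s.Name, s.Sale,
           (SELECT COUNT(*) FROM Sales s2 WHERE s2.Sale >= s.Sale) AS seqnum
    FROM Sales s
    ORDER BY s.Sale DESC
    """
).fetchall()

print(numbered)
print(counted)
```

Both return the same numbering here because the sales values are distinct. If two firms had the same Sale, the subquery would give both the same (larger) number, while row_number() would still number them consecutively.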

Deciling by partitions in Teradata SQL

I have a table in Teradata which contains Sales Information per store pertaining to each region.
StoreID RegionID Sales
1 A 200
2 A 150
3 A 210
4 B 400
5 B 420
How can I find out the stores in top 2 deciles by sales for each region?
There's the QUANTILE function, but this is old, deprecated syntax. The top 2 deciles are the top 20 percent, and you can simply use PERCENT_RANK for this:
QUALIFY
PERCENT_RANK()
OVER (PARTITION BY RegionID
ORDER BY Sales DESC) <= 0.2
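Here is a sketch of the same PERCENT_RANK filter on the sample rows. It runs on SQLite (3.25 or later) via Python's sqlite3 module, which is an assumption for the demo; SQLite has no QUALIFY, so the condition moves to an outer query. The table name `store_sales` is hypothetical, since the question does not name the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "store_sales" is a hypothetical name for the table in the question.
conn.execute("CREATE TABLE store_sales (StoreID INTEGER, RegionID TEXT, Sales INTEGER)")
conn.executemany(
    "INSERT INTO store_sales VALUES (?, ?, ?)",
    [(1, "A", 200), (2, "A", 150), (3, "A", 210), (4, "B", 400), (5, "B", 420)],
)

# PERCENT_RANK is 0 for the best store in a region and 1 for the worst;
# keeping values <= 0.2 retains the top 20 percent per region.
top_stores = conn.execute(
    """
    SELECT StoreID, RegionID, Sales
    FROM (
        SELECT s.*,
               PERCENT_RANK() OVER (PARTITION BY RegionID ORDER BY Sales DESC) AS pr
        FROM store_sales s
    )
    WHERE pr <= 0.2
    ORDER BY RegionID, Sales DESC
    """
).fetchall()

print(top_stores)  # [(3, 'A', 210), (5, 'B', 420)]
```

With so few stores per region, only the top store has a percent rank at or below 0.2 (the second-best store in region A is already at 0.5).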

SQL - select top xx% rows

I have a table, sales, which is ordered by descending TotalSales
user_id | TotalSales
----------------------
4 10
2 1.5
5 0.99
3 0.5
1 0.33
What I would like to do is find the percentage of the sum of all sales that the xx% most important sales represent.
For example if I wanted to do it for top 40% sales, here I would get (10+1.5)/(10+1.5+0.99+0.5+0.33)= 86%
But right now I haven't been able to select "top xx% rows".
Edit: DB management system can be MySQL or Vertica or Hive
select sum(TotalSales) as s from sales where TotalSales >= x;
select sum(TotalSales) as b from sales;
Your result is s/b, where x is the TotalSales cutoff that corresponds to the percentage you choose each time.
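If window functions are available (an assumption; for example, MySQL only has them from 8.0 onward), the top xx% of rows can be selected by row position instead of by a hand-picked cutoff. The sketch below is a different technique from the answer above, shown on SQLite via Python's sqlite3 module for the sake of a runnable example; the parameter `pct` is a name introduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (user_id INTEGER, TotalSales REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(4, 10), (2, 1.5), (5, 0.99), (3, 0.5), (1, 0.33)],
)

pct = 40  # hypothetical parameter: the "xx" in "top xx% rows"

# Number rows by descending TotalSales, count all rows, and sum only
# those whose position falls within the top pct percent of the rows.
(share,) = conn.execute(
    """
    SELECT SUM(CASE WHEN rn * 100 <= ? * n THEN TotalSales ELSE 0 END)
           / SUM(TotalSales)
    FROM (
        SELECT TotalSales,
               ROW_NUMBER() OVER (ORDER BY TotalSales DESC) AS rn,
               COUNT(*) OVER () AS n
        FROM sales
    )
    """,
    (pct,),
).fetchone()

print(f"{share:.0%}")
```

For the five rows in the question and pct = 40, the top two rows (10 and 1.5) are kept, so the share is (10+1.5)/(10+1.5+0.99+0.5+0.33), the figure worked out in the question.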