Fetch row with max occurrence in Oracle - sql

I have a table like:
SALES
PROD_CODE SALE_ID
321 30
123 67
321 46
321 82
123 48
321 91
For the code:
SELECT PROD_CODE, COUNT(SALE_ID) AS TOTAL_SALES
FROM SALES
GROUP BY PROD_CODE
ORDER BY COUNT(SALE_ID) DESC;
The output is:
PROD_CODE TOTAL_SALES
321 4
123 2
But I expect only the prod_code with the maximum number of sales as the output, like:
PROD_CODE
321
I tried this code:
SELECT PROD_CODE
FROM (SELECT MAX(COUNT(SALE_ID)) FROM SALES
GROUP BY SALE_ID);
The code isn't working!

In Oracle 12c+, you can do:
select s.prod_code
from sales s
group by s.prod_code
order by count(*) desc
fetch first 1 row only;
In earlier versions, either:
select s.*
from (select s.prod_code
      from sales s
      group by s.prod_code
      order by count(*) desc
     ) s
where rownum = 1;
Or:
select max(prod_code) keep (dense_rank first order by cnt desc)
from (select s.prod_code, count(*) as cnt
      from sales s
      group by s.prod_code
     ) s;
The first two versions fetch the entire row. You can limit it to one or more columns if that is all you want.
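If several products tie for the highest count, the queries above return just one of them. A minimal sketch, assuming Oracle 12c or later and the SALES table from the question, that keeps every tied product by using WITH TIES in place of ONLY:

```sql
-- Assumes Oracle 12c+ and the SALES(PROD_CODE, SALE_ID) table from the question.
-- WITH TIES returns every row that shares the top count, not just one of them.
select s.prod_code
from sales s
group by s.prod_code
order by count(*) desc
fetch first 1 row with ties;
```

With the sample data there is no tie, so this still returns only 321.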

You could use the STATS_MODE function to fetch the value that occurs most often.
Here is the detailed documentation for this function: https://docs.oracle.com/database/121/SQLRF/functions188.htm#SQLRF06320
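For the SALES table in the question, a minimal sketch might look like the following. STATS_MODE is an aggregate that returns the most frequent value of its argument; when several values tie, Oracle returns only one of them.

```sql
-- Assumes the SALES(PROD_CODE, SALE_ID) table from the question.
-- STATS_MODE returns the most frequently occurring PROD_CODE (321 for the sample data).
select stats_mode(prod_code) as prod_code
from sales;
```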

Related

selecting the first and last 10 rows of a sql query

Hi, I have a SQLite database like this:
price category_id product_id
100000 89 1
2000 88 2
50000 89 3
I want to extract the top and bottom 5 product IDs for each category (the highest- and lowest-priced products of each category).
I have written this:
SELECT *
FROM sql_data_users.products
GROUP BY product_id,category_id
ORDER BY price ASC
LIMIT 10
but it gives me 10 rows in total instead of 10 per category (10*len(category_id)).
To complete the solution, I also thought of adding a second query with the sort order reversed and then combining the two queries. Is that possible, and how?
You would use window functions:
select t.*
from (select t.*,
             row_number() over (partition by category_id order by price asc) as seqnum_asc,
             row_number() over (partition by category_id order by price desc) as seqnum_desc
      from t
     ) t
where seqnum_asc <= 5 or seqnum_desc <= 5
order by category_id, price desc;
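Window functions need SQLite 3.25 or later. On older versions, one possible fallback counts, for each row, how many rows in the same category are cheaper or more expensive. This is only a sketch: it assumes the table is named products and has the columns shown in the question.

```sql
-- Assumption: table products(price, category_id, product_id) as in the question.
-- A row is kept when fewer than 5 rows in its category are cheaper (bottom 5)
-- or fewer than 5 rows in its category are more expensive (top 5).
SELECT p.*
FROM products AS p
WHERE (SELECT COUNT(*)
       FROM products AS q
       WHERE q.category_id = p.category_id
         AND q.price < p.price) < 5
   OR (SELECT COUNT(*)
       FROM products AS q
       WHERE q.category_id = p.category_id
         AND q.price > p.price) < 5
ORDER BY p.category_id, p.price DESC;
```

Unlike row_number(), this treats equal prices as ties, so a category can return more than 5 rows on either end when prices repeat.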

Limit top pair of columns

The data I am working with looks like this:
category_id subcategory_id date quantities
123 45 2020-02-01 500
123 45 2020-02-13 400
456 35 2020-05-09 350
456 35 2020-05-15 250
456 35 2020-06-18 200
.
.
.
n such rows
Quantities are sorted in descending order.
I want to get the data (as seen above) for the first (top) 10 unique pairs of (category_id, subcategory_id). Just as we use LIMIT 10 to get the first 10 records, I want to limit by the top 10 unique pairs of (category_id, subcategory_id) and get all the data as seen above.
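The query below is for BigQuery Standard SQL. It keeps the 10 rows with the largest quantities for each (category_id, subcategory_id) pair: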
#standardSQL
SELECT * EXCEPT(rn) FROM (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY category_id, subcategory_id ORDER BY quantities DESC) rn
FROM `project.dataset.table`
)
WHERE rn <= 10
Another, more BigQuery-ish alternative is below:
#standardSQL
SELECT TopN.* FROM (
SELECT ARRAY_AGG(t ORDER BY quantities DESC LIMIT 10) topN
FROM `project.dataset.table` t
GROUP BY category_id, subcategory_id
) t, t.topN
If you want 10 rows, each with different category_id/subcategory_id pairs, then you can use:
select t.* except (seqnum)
from (select t.*,
row_number() over (partition by category_id, subcategory_id order by quantities desc) as seqnum
from t
) t
where seqnum = 1
order by quantities desc
limit 10;
This gets the first row (by quantities) for each id pair and then limits to the 10 largest values.
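If the goal is instead every row belonging to the 10 pairs with the largest quantities, one possible sketch picks those pairs first and joins back to the table. It assumes "top" means ranked by the largest quantities value within each pair, which the question does not state explicitly; the table name is the same placeholder used above.

```sql
#standardSQL
-- Assumption: a pair's rank is decided by its largest quantities value.
WITH top_pairs AS (
  SELECT category_id, subcategory_id, MAX(quantities) AS max_quantities
  FROM `project.dataset.table`
  GROUP BY category_id, subcategory_id
  ORDER BY max_quantities DESC
  LIMIT 10
)
-- Return every row that belongs to one of the selected pairs.
SELECT t.*
FROM `project.dataset.table` AS t
JOIN top_pairs AS p
  ON t.category_id = p.category_id
 AND t.subcategory_id = p.subcategory_id
ORDER BY p.max_quantities DESC, t.category_id, t.subcategory_id, t.quantities DESC;
```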

SQL sum grouped by field with all rows

I have this table:
id sale_id price
-------------------
1 1 100
2 1 200
3 2 50
4 3 50
I want this result:
id sale_id price sum(price by sale_id)
------------------------------------------
1 1 100 300
2 1 200 300
3 2 50 50
4 3 50 50
I tried this:
SELECT id, sale_id, price,
(SELECT sum(price) FROM sale_lines GROUP BY sale_id)
FROM sale_lines
But I get an error saying that the subquery returns a different number of rows.
How can I do it?
I want all the rows of sale_lines table selecting all fields and adding the sum(price) grouped by sale_id.
You can use a window function:
SELECT id, sale_id, price, sum(price) OVER (PARTITION BY sale_id) AS total
FROM sale_lines;
If you want a subquery, then you need to correlate it:
SELECT sl.id, sl.sale_id, sl.price,
(SELECT sum(sll.price)
FROM sale_lines sll
WHERE sl.sale_id = sll.sale_id
)
FROM sale_lines sl;
Don't use GROUP BY in the subquery; make it a correlated subquery:
SELECT sl1.id, sl1.sale_id, sl1.price,
(SELECT sum(sl2.price) FROM sale_lines sl2 WHERE sl2.sale_id = sl1.sale_id) AS total
FROM sale_lines sl1;
In addition to the other approaches, you can use CROSS APPLY (SQL Server) to get the sum:
SELECT ot.id, ot.sale_id, ot.price, s.Price_Sum
FROM YourTable AS ot
CROSS APPLY
    (SELECT SUM(it.price) AS Price_Sum
     FROM YourTable AS it
     WHERE it.sale_id = ot.sale_id) AS s;
You can also join to a derived table that computes the sum per sale_id:
SELECT t1.*,
       t2.total_price
FROM `sale_lines` AS t1
JOIN (SELECT sale_id,
             SUM(price) AS total_price
      FROM `sale_lines`
      GROUP BY sale_id) AS t2
  ON t1.sale_id = t2.sale_id;

How to build SQL to capture most unique value?

I am trying to build a query result with SQL. Here is my table:
CUST_ID ORDER_ID STORE_FREQUENCY
---------- ----------- ---------------
100 20122 500
100 20100 500
100 20100 737
200 20119 287
300 20130 434
300 20150 434
300 20130 434
300 20120 120
The expected output is:
CUST_ID UNIQUE_ORDERS TOP_STORE
--------- ----------------- ---------
100 2 737
200 1 287
300 3 434
The requirement for the output is:
TOP_STORE = Per CUST_ID, sort the STORE_FREQUENCY column by DESC and get the greatest store frequency
UNIQUE_ORDERS = Per CUST_ID, the number of unique ORDER_IDs in the column
I have started this SELECT statement, but I am having difficulty completing it so that it includes the 2 columns correctly:
Select cust_id, Count(order_id) as unique_orders
From ORDERS_TABLE
Group By Order_ID
Can you help me complete the 2 columns?
Use aggregate functions such as COUNT(DISTINCT ...) and MAX():
SELECT CUST_ID,
       COUNT(DISTINCT ORDER_ID) AS UNIQUE_ORDERS,
       MAX(STORE_FREQUENCY) AS TOP_STORE
FROM TableName
GROUP BY CUST_ID;
It seems to me that the top store should be the store with the greatest number of orders. If so, then CUST_ID 100 should have store 500 as the top store, not 737, because 500 appears on two of that customer's rows and 737 on only one.
This requirement changes the query strategy, because we can no longer do a single simple aggregation over the entire table. One approach is to do a separate calculation to find the top store for each customer, then join that result to a query similar to the other answers:
WITH cte AS (
SELECT CUST_ID, STORE_FREQUENCY, cnt,
ROW_NUMBER() OVER (PARTITION BY CUST_ID ORDER BY cnt DESC) rn
FROM
(
SELECT CUST_ID, STORE_FREQUENCY,
COUNT(*) OVER (PARTITION BY CUST_ID, STORE_FREQUENCY) cnt
FROM yourTable
) t
)
SELECT
t1.CUST_ID,
t1.UNIQUE_ORDERS,
t2.TOP_STORE
FROM
(
SELECT CUST_ID, COUNT(DISTINCT ORDER_ID) AS UNIQUE_ORDERS
FROM yourTable
GROUP BY CUST_ID
) t1
INNER JOIN
(
SELECT CUST_ID, STORE_FREQUENCY AS TOP_STORE
FROM cte
WHERE rn = 1
) t2
ON t1.CUST_ID = t2.CUST_ID;

how to use same column twice with different criteria with one common column in sql

I have a table
ID P_ID Cost
1 101 1000
2 101 1050
3 101 1100
4 102 5000
5 102 2000
6 102 6000
7 103 3000
8 103 5000
9 103 4000
I want to use the 'Cost' column twice to fetch the first and last inserted values of Cost for each P_ID.
I want the output as:
P_ID First_Cost Last_Cost
101 1000 1100
102 5000 6000
103 3000 4000
;WITH t AS
(
SELECT P_ID, Cost,
f = ROW_NUMBER() OVER (PARTITION BY P_ID ORDER BY ID),
l = ROW_NUMBER() OVER (PARTITION BY P_ID ORDER BY ID DESC)
FROM dbo.tablename
)
SELECT t.P_ID, First_Cost = t.Cost, Last_Cost = t2.Cost
FROM t INNER JOIN t AS t2
ON t.P_ID = t2.P_ID
WHERE t.f = 1 AND t2.l = 1;
In SQL Server 2012 and later, you can use FIRST_VALUE():
SELECT DISTINCT
P_ID,
FIRST_VALUE(Cost) OVER (PARTITION BY P_ID ORDER BY ID),
FIRST_VALUE(Cost) OVER (PARTITION BY P_ID ORDER BY ID DESC)
FROM dbo.tablename;
You get a slightly more favorable plan if you remove the DISTINCT and instead use ROW_NUMBER() with the same partitioning to eliminate multiple rows with the same P_ID:
;WITH t AS
(
SELECT
P_ID,
f = FIRST_VALUE(Cost) OVER (PARTITION BY P_ID ORDER BY ID),
l = FIRST_VALUE(Cost) OVER (PARTITION BY P_ID ORDER BY ID DESC),
r = ROW_NUMBER() OVER (PARTITION BY P_ID ORDER BY ID)
FROM dbo.tablename
)
SELECT P_ID, f, l FROM t WHERE r = 1;
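Why not LAST_VALUE(), you ask? Well, it doesn't work like you might expect: with an ORDER BY and no explicit frame, the window defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so LAST_VALUE() returns the value from the current row (or its last peer) rather than from the last row of the partition. For more details, see the comments under the documentation.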
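If you do want to use LAST_VALUE(), a minimal sketch is to spell out a frame that covers the whole partition. This assumes SQL Server 2012 or later and the same dbo.tablename used above; the column aliases are illustrative.

```sql
-- Assumes SQL Server 2012+ and the dbo.tablename(ID, P_ID, Cost) table used above.
-- The explicit frame makes LAST_VALUE() look at the whole partition,
-- not just the rows up to the current one.
SELECT DISTINCT
    P_ID,
    First_Cost = FIRST_VALUE(Cost) OVER (PARTITION BY P_ID ORDER BY ID),
    Last_Cost  = LAST_VALUE(Cost)  OVER (PARTITION BY P_ID ORDER BY ID
                     ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
FROM dbo.tablename;
```

For the sample data this gives 1000/1100 for P_ID 101, 5000/6000 for 102, and 3000/4000 for 103, matching the output requested in the question.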
SELECT t.P_ID,
SUM(CASE WHEN ID = t.minID THEN Cost ELSE 0 END) as FirstCost,
SUM(CASE WHEN ID = t.maxID THEN Cost ELSE 0 END) as LastCost
FROM myTable
JOIN (
SELECT P_ID, MIN(ID) as minID, MAX(ID) as maxID
FROM myTable
GROUP BY P_ID) t ON myTable.ID IN (t.minID, t.maxID)
GROUP BY t.P_ID
Admittedly, @AaronBertrand's approach is cleaner here. However, this solution will work on older versions of SQL Server (that don't support CTEs or window functions), or on pretty much any other DBMS.
Do you want first and last in terms of minimum and maximum, or do you want the value that was entered first and the value that was entered last? If you want the minimum and maximum, you can use GROUP BY:
SELECT P_ID, MIN(Cost), MAX(Cost) FROM table_name GROUP BY P_ID;
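I believe this also works, without self joins or subqueries. Note that it returns the lowest and highest Cost per P_ID, which matches the first and last inserted values only when Cost increases with ID (for P_ID 102 in the sample data it returns 2000, not 5000, as the first cost):
SELECT DISTINCT
P_ID
,MIN(Cost) OVER (PARTITION BY P_ID) as FirstCost
,MAX(Cost) OVER (PARTITION BY P_ID) as LastCost
FROM table_name;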