Limit top pair of columns - sql

The data I am working with looks like this:
category_id subcategory_id date quantities
123 45 2020-02-01 500
123 45 2020-02-13 400
456 35 2020-05-09 350
456 35 2020-05-15 250
456 35 2020-06-18 200
.
.
.
n such rows
Quantities are sorted in descending order.
I want to get the data (as seen above) for the first (top) 10 unique pairs of (category_id, subcategory_id). Just as we use LIMIT 10 to get the first 10 records, I want to limit by the top 10 unique pairs of (category_id, subcategory_id) and get all the data as seen above.

Below is for BigQuery Standard SQL
#standardSQL
SELECT * EXCEPT(rn) FROM (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY category_id, subcategory_id ORDER BY quantities DESC) rn
FROM `project.dataset.table`
)
WHERE rn <= 10
Another, more BigQuery-ish alternative is below:
#standardSQL
SELECT TopN.* FROM (
SELECT ARRAY_AGG(t ORDER BY quantities DESC LIMIT 10) topN
FROM `project.dataset.table` t
GROUP BY category_id, subcategory_id
) t, t.topN

If you want 10 rows, each with different category_id/subcategory_id pairs, then you can use:
select t.* except (seqnum)
from (select t.*,
row_number() over (partition by category_id, subcategory_id order by quantities desc) as seqnum
from t
) t
where seqnum = 1
order by quantities desc
limit 10;
This gets the first row (by quantities) for each id pair and then limits to the 10 largest values.
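If the goal is instead every row that belongs to the top N pairs (rather than N rows per pair, or one row per pair), one possible sketch is to pick the pairs first and join back to the table. The example below runs the SQL through Python's sqlite3 module only so that it is self-contained. Ranking pairs by their largest quantity, the extra 789/12 row, and N = 2 are all assumptions made for illustration, not something stated in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (category_id INT, subcategory_id INT, date TEXT, quantities INT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (123, 45, "2020-02-01", 500),
        (123, 45, "2020-02-13", 400),
        (456, 35, "2020-05-09", 350),
        (456, 35, "2020-05-15", 250),
        (456, 35, "2020-06-18", 200),
        (789, 12, "2020-07-01", 100),  # hypothetical third pair, not in the question
    ],
)

TOP_PAIRS = 2  # the question uses 10; 2 keeps the sample small

# Inner query: one row per pair, ranked by its largest quantity (an assumption),
# limited to the top N pairs. Outer query: all rows for those pairs.
rows = conn.execute(
    """
    SELECT t.category_id, t.subcategory_id, t.date, t.quantities
    FROM t
    JOIN (
        SELECT category_id, subcategory_id, MAX(quantities) AS max_q
        FROM t
        GROUP BY category_id, subcategory_id
        ORDER BY max_q DESC
        LIMIT ?
    ) AS top_pairs
      ON top_pairs.category_id = t.category_id
     AND top_pairs.subcategory_id = t.subcategory_id
    ORDER BY top_pairs.max_q DESC, t.quantities DESC
    """,
    (TOP_PAIRS,),
).fetchall()

for row in rows:
    print(row)
```

With this sample data the 789/12 pair is dropped and all five rows of the other two pairs are returned.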

Related

How to Rank By Partition with island and gap issue

Is it possible to rank item by partition without use CTE method
Expected Table
item value ID
A 10 1
A 20 1
B 30 2
B 40 2
C 50 3
C 60 3
A 70 4
A 80 4
Giving an ID to each partition would allow an aggregate function to work the way I want:
item MIN MAX ID
A 10 20 1
B 30 40 2
C 50 60 3
A 70 80 4
SQL Version: Microsoft SQL Server 2017
Assuming that the value column provides the intended ordering of the records which we see in your question above, we can try using the difference in row numbers method here. Your problem is a type of gaps and islands problem.
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY value) rn1,
ROW_NUMBER() OVER (PARTITION BY item ORDER BY value) rn2
FROM yourTable
)
SELECT item, MIN(value) AS [MIN], MAX(value) AS [MAX], MIN(ID) AS ID
FROM cte
GROUP BY item, rn1 - rn2
ORDER BY MIN(value);
If you don't want to use a CTE here, for whatever reason, you may simply inline the SQL code in the CTE into the bottom query, as a subquery:
SELECT item, MIN(value) AS [MIN], MAX(value) AS [MAX], MIN(ID) AS ID
FROM
(
SELECT *, ROW_NUMBER() OVER (ORDER BY value) rn1,
ROW_NUMBER() OVER (PARTITION BY item ORDER BY value) rn2
FROM yourTable
) t
GROUP BY item, rn1 - rn2
ORDER BY MIN(value);
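To see why grouping by rn1 - rn2 separates the two runs of A, here is a small runnable sketch of the same difference-in-row-numbers idea. It uses Python's sqlite3 module (SQLite 3.25 or later is assumed for window functions) rather than SQL Server, and it numbers the islands with an outer ROW_NUMBER instead of reading an existing ID column, since the sample table here only has item and value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (item TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [("A", 10), ("A", 20), ("B", 30), ("B", 40),
     ("C", 50), ("C", 60), ("A", 70), ("A", 80)],
)

# rn1 counts rows overall, rn2 counts rows within each item.
# Inside one consecutive run of the same item both advance together,
# so rn1 - rn2 stays constant and identifies the island.
islands = conn.execute(
    """
    SELECT item, mn, mx, ROW_NUMBER() OVER (ORDER BY mn) AS id
    FROM (
        SELECT item, MIN(value) AS mn, MAX(value) AS mx
        FROM (
            SELECT item, value,
                   ROW_NUMBER() OVER (ORDER BY value) AS rn1,
                   ROW_NUMBER() OVER (PARTITION BY item ORDER BY value) AS rn2
            FROM yourTable
        ) AS numbered
        GROUP BY item, rn1 - rn2
    ) AS grouped
    ORDER BY mn
    """
).fetchall()

for row in islands:
    print(row)
```

For the sample rows the differences are 0 for the first A run, 2 for B, and 4 for both C and the second A run; grouping by item as well keeps C and the second A apart.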
You can generate group IDs by analyzing the previous row item value that could be obtained with the LAG function and finally use GROUP BY to get the minimum and maximum value in item groups.
SELECT
item,
MIN(value) AS "min",
MAX(value) AS "max",
group_id + 1 AS id
FROM (
SELECT
*,
SUM(CASE WHEN item = prev_item THEN 0 ELSE 1 END) OVER (ORDER BY value) AS group_id
FROM (
SELECT
*,
LAG(item, 1, item) OVER (ORDER BY value) AS prev_item
FROM t
) items
) groups
GROUP BY item, group_id
Query produces output
item min max id
A 10 20 1
B 30 40 2
C 50 60 3
A 70 80 4

selecting the first and last 10 rows of a sql query

Hi, I have a SQLite database like this:
price category_id product_id
100000 89 1
2000 88 2
50000 89 3
I want to extract the top and last 5 product IDs for each category (the highest- and lowest-priced products of each category).
I have written this:
SELECT *
FROM sql_data_users.products
GROUP BY product_id,category_id
ORDER BY price ASC
LIMIT 10
but it gives me 10 rows instead of 10*len(category_id).
Also, for the solution to be complete, I thought of adding another query, changing the order to DESC, and then uniting the two queries. Is that possible, and how?
You would use window functions:
select t.*
from (select t.*,
row_number() over (partition by category_id order by price asc) as seqnum_asc,
row_number() over (partition by category_id order by price desc) as seqnum_desc
from t
) t
where seqnum_asc <= 5 or seqnum_desc <= 5
order by category_id, price desc;
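Since the question is about SQLite, here is a minimal runnable sketch of the same two-row-number filter using Python's sqlite3 module (SQLite 3.25 or later is assumed for window functions). The 75000 row and N = 1 are assumptions chosen to keep the sample small; with the real data N would be 5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (price INTEGER, category_id INTEGER, product_id INTEGER)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        (100000, 89, 1),
        (2000, 88, 2),
        (50000, 89, 3),
        (75000, 89, 4),  # hypothetical mid-priced product, not in the question
    ],
)

N = 1  # rows to keep from each end of every category

# A row survives if it is among the N cheapest or the N most expensive
# products of its category.
extremes = conn.execute(
    """
    SELECT price, category_id, product_id
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY price ASC)  AS seqnum_asc,
               ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY price DESC) AS seqnum_desc
        FROM products AS p
    ) AS ranked
    WHERE seqnum_asc <= :n OR seqnum_desc <= :n
    ORDER BY category_id, price DESC
    """,
    {"n": N},
).fetchall()

for row in extremes:
    print(row)
```

A category with fewer than 2*N products simply returns all of its rows once, because the OR condition does not duplicate a row that matches both ends.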

How to build SQL to capture most unique value?

I am trying to build a query results with SQL. Here is my table:
CUST_ID ORDER_ID STORE_FREQUENCY
---------- ----------- ---------------
100 20122 500
100 20100 500
100 20100 737
200 20119 287
300 20130 434
300 20150 434
300 20130 434
300 20120 120
The expected output is:
CUST_ID UNIQUE_ORDERS TOP_STORE
--------- ----------------- ---------
100 2 737
200 1 287
300 3 434
The requirement for the output is:
TOP_STORE = Per CUST_ID, sort the STORE_FREQUENCY column by DESC and get the greatest store frequency
UNIQUE_ORDERS = Per CUST_ID, the number of unique ORDER_IDs in the column
I have started this SELECT statement, but having difficulties completing it to include the 2 columns correctly:
Select cust_id, Count(order_id) as unique_orders
From ORDERS_TABLE
Group By Order_ID
Can you help me complete the 2 columns?
Use aggregate functions such as COUNT(DISTINCT ...) and MAX()
SELECT CUST_ID, COUNT(DISTINCT ORDER_ID) AS UNIQUE_ORDERS, MAX(STORE_FREQUENCY) AS TOP_STORE
FROM TableName
GROUP BY CUST_ID
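As a quick check against the sample rows from the question, the same aggregation can be run in an in-memory SQLite database from Python. This is only a sketch: the table name ORDERS_TABLE is taken from the question, and the ORDER BY is added here just to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ORDERS_TABLE (CUST_ID INT, ORDER_ID INT, STORE_FREQUENCY INT)"
)
conn.executemany(
    "INSERT INTO ORDERS_TABLE VALUES (?, ?, ?)",
    [
        (100, 20122, 500), (100, 20100, 500), (100, 20100, 737),
        (200, 20119, 287),
        (300, 20130, 434), (300, 20150, 434), (300, 20130, 434), (300, 20120, 120),
    ],
)

# One row per customer: distinct order count and largest store frequency.
summary = conn.execute(
    """
    SELECT CUST_ID,
           COUNT(DISTINCT ORDER_ID) AS UNIQUE_ORDERS,
           MAX(STORE_FREQUENCY)     AS TOP_STORE
    FROM ORDERS_TABLE
    GROUP BY CUST_ID
    ORDER BY CUST_ID
    """
).fetchall()

for row in summary:
    print(row)
```

Note that the grouping column is CUST_ID, not ORDER_ID as in the attempt from the question.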
It seems to me that the top store should be the store with the greatest number of orders. If so, then CUST_ID 100 should have store 500 as the top store, not 737. In other words, only the TOP_STORE value for CUST_ID 100 would differ from the expected output in the question.
This requirement changes the query strategy, because we no longer can just do a single simple aggregation over the entire table. One approach is to do a separate calculation to find the top store for each customer, then join that result to a query similar to the other answers.
WITH cte AS (
SELECT CUST_ID, STORE_FREQUENCY, cnt,
ROW_NUMBER() OVER (PARTITION BY CUST_ID ORDER BY cnt DESC) rn
FROM
(
SELECT CUST_ID, STORE_FREQUENCY,
COUNT(*) OVER (PARTITION BY CUST_ID, STORE_FREQUENCY) cnt
FROM yourTable
) t
)
SELECT
t1.CUST_ID,
t1.UNIQUE_ORDERS,
t2.TOP_STORE
FROM
(
SELECT CUST_ID, COUNT(DISTINCT ORDER_ID) AS UNIQUE_ORDERS
FROM yourTable
GROUP BY CUST_ID
) t1
INNER JOIN
(
SELECT CUST_ID, STORE_FREQUENCY AS TOP_STORE
FROM cte
WHERE rn = 1
) t2
ON t1.CUST_ID = t2.CUST_ID;
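For anyone who wants to try this interpretation on the question's sample rows, the sketch below runs essentially the same query in an in-memory SQLite database via Python (SQLite 3.25 or later is assumed for window functions). The only additions are the explicit AS keywords and a final ORDER BY for a predictable output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (CUST_ID INT, ORDER_ID INT, STORE_FREQUENCY INT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        (100, 20122, 500), (100, 20100, 500), (100, 20100, 737),
        (200, 20119, 287),
        (300, 20130, 434), (300, 20150, 434), (300, 20130, 434), (300, 20120, 120),
    ],
)

# cte: count rows per (customer, store), then rank stores within each customer.
# Final query: join the distinct order counts to the rank-1 store.
result = conn.execute(
    """
    WITH cte AS (
        SELECT CUST_ID, STORE_FREQUENCY, cnt,
               ROW_NUMBER() OVER (PARTITION BY CUST_ID ORDER BY cnt DESC) AS rn
        FROM (
            SELECT CUST_ID, STORE_FREQUENCY,
                   COUNT(*) OVER (PARTITION BY CUST_ID, STORE_FREQUENCY) AS cnt
            FROM yourTable
        ) AS t
    )
    SELECT t1.CUST_ID, t1.UNIQUE_ORDERS, t2.TOP_STORE
    FROM (
        SELECT CUST_ID, COUNT(DISTINCT ORDER_ID) AS UNIQUE_ORDERS
        FROM yourTable
        GROUP BY CUST_ID
    ) AS t1
    INNER JOIN (
        SELECT CUST_ID, STORE_FREQUENCY AS TOP_STORE
        FROM cte
        WHERE rn = 1
    ) AS t2
      ON t1.CUST_ID = t2.CUST_ID
    ORDER BY t1.CUST_ID
    """
).fetchall()

for row in result:
    print(row)
```

If two stores tie on order count for a customer, ROW_NUMBER picks one of them arbitrarily; adding a tiebreaker to its ORDER BY would make that choice deterministic.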

Fetch row with max occurrence in Oracle

I have a table like:
SALES
PROD_CODE SALE_ID
321 30
123 67
321 46
321 82
123 48
321 91
For the code:
SELECT PROD_CODE, COUNT(SALE_ID) AS TOTAL_SALES
FROM SALES
GROUP BY PROD_CODE
ORDER BY COUNT(SALE_ID) DESC;
The output is:
PROD_CODE TOTAL_SALES
321 4
123 2
But I am expecting only the prod_code with the maximum number of sales as the output, like:
PROD_CODE
321
For the code:
SELECT PROD_CODE
FROM (SELECT MAX(COUNT(SALE_ID)) FROM SALES
GROUP BY SALE_ID);
The code isn't working!
In Oracle 12c+, you can do:
select s.prod_code
from sales s
group by s.prod_code
order by count(*) desc
fetch first 1 row only;
In earlier versions, either:
select s.*
from (select s.prod_code
from sales s
group by s.prod_code
order by count(*) desc
) s
where rownum = 1;
Or:
select max(prod_code) keep (dense_rank first order by cnt desc)
from (select s.prod_code, count(*) as cnt
from sales s
group by s.prod_code
) s
The first two versions fetch the entire row of the aggregated result. You can limit it to one or more columns if that is all you want.
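The group, order by count, keep the first row pattern can be checked on the question's sample data with a small sketch. It uses Python's sqlite3 module, so SQLite's LIMIT 1 stands in for Oracle's FETCH FIRST 1 ROW ONLY; that substitution is an assumption of this example, not Oracle syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (prod_code INT, sale_id INT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(321, 30), (123, 67), (321, 46), (321, 82), (123, 48), (321, 91)],
)

# Group by product, sort groups by their row count, keep only the first group.
top_row = conn.execute(
    """
    SELECT prod_code
    FROM sales
    GROUP BY prod_code
    ORDER BY COUNT(*) DESC
    LIMIT 1
    """
).fetchone()

top_prod_code = top_row[0]
print(top_prod_code)
```

The GROUP BY is what makes ORDER BY COUNT(*) meaningful here: the count is computed per prod_code rather than over the whole table.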
You could use stats_mode function to fetch row/column with maximum occurrence.
Here is detailed doc for this function https://docs.oracle.com/database/121/SQLRF/functions188.htm#SQLRF06320

How to get the max value id from sql table

I need to get a max (Amount) value of each Account from the below table
ID Account Amount
1 rx00 100
2 rx00 200
3 rx00 100
4 vxtt 50
5 vxtt 70
6 vxtt 80
I need a result table as
ID Account Amount
2 rx00 200
6 vxtt 80
Please advise how to get the above result.
You can use ROW_NUMBER for this:
SELECT ID, Account, Amount
FROM (
SELECT ID, Account, Amount,
ROW_NUMBER() OVER (PARTITION BY Account
ORDER BY Amount DESC) AS rn
FROM mytable) AS t
WHERE t.rn = 1
If you have ties, i.e. more than one record sharing the same maximum Amount value, and you want to return all of these records, then use RANK instead of ROW_NUMBER.
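A small sketch of the difference between the two functions when there is a tie, run in an in-memory SQLite database from Python (SQLite 3.25 or later is assumed for window functions). The row with ID 7 is a hypothetical addition that ties with ID 6, and top_per_account is a helper name made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ID INT, Account TEXT, Amount INT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "rx00", 100), (2, "rx00", 200), (3, "rx00", 100),
        (4, "vxtt", 50), (5, "vxtt", 70), (6, "vxtt", 80),
        (7, "vxtt", 80),  # hypothetical row that ties with ID 6
    ],
)


def top_per_account(ranking_fn):
    """Return the rows ranked first per Account using the given window function."""
    # The function name is interpolated into the SQL, so restrict it to known values.
    if ranking_fn not in ("ROW_NUMBER", "RANK"):
        raise ValueError("unsupported ranking function")
    sql = f"""
        SELECT ID, Account, Amount
        FROM (
            SELECT ID, Account, Amount,
                   {ranking_fn}() OVER (PARTITION BY Account
                                        ORDER BY Amount DESC) AS rn
            FROM mytable
        ) AS t
        WHERE rn = 1
        ORDER BY ID
    """
    return conn.execute(sql).fetchall()


with_row_number = top_per_account("ROW_NUMBER")  # exactly one row per account
with_rank = top_per_account("RANK")              # every row tied for the maximum

print(with_row_number)
print(with_rank)
```

With ROW_NUMBER, which of the two tied vxtt rows is returned is not defined unless the ORDER BY includes a tiebreaker such as ID.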
You can use row_number to get the desired result.
DECLARE @table TABLE
(ID int, Account varchar(10),Amount int)
INSERT INTO @table
(ID,Account,Amount)
VALUES
(1,'rx00',100),
(2,'rx00',200),
(3,'rx00',100),
(4,'vxtt',50),
(5,'vxtt',70),
(6,'vxtt',80)
SELECT ID, Account, Amount
FROM (
SELECT ID, Account, Amount,
ROW_NUMBER() OVER (PARTITION BY Account
ORDER BY Amount DESC) AS rnk
FROM @table) AS t
WHERE t.rnk = 1
This will be the output:
ID Account Amount
2 rx00 200
6 vxtt 80
If you don't need the ID column in the result set, you can use the query below.
SELECT Account,max(Amount) AS Amount FROM @table t
GROUP BY Account
This will be the output:
Account Amount
rx00 200
vxtt 80
;With cte
as
(
select id,account,amount
,row_number () over (partition by account order by amount desc) as rn
from
mytable
)
select * from cte where rn=1
In the above code, a row number is assigned to each row within its account partition, with rows ordered by amount descending. Since we can't directly reference ROW_NUMBER (or any other calculated value) in the WHERE clause of the same query, I used a CTE as a virtual table and filtered on rn there.