Hi, I have an SQLite database like this:
price category_id product_id
100000 89 1
2000 88 2
50000 89 3
I want to extract the top 5 and bottom 5 product IDs for each category (the highest-priced and lowest-priced products of each category).
I have written this:
SELECT *
FROM sql_data_users.products
GROUP BY product_id,category_id
ORDER BY price ASC
LIMIT 10
but it gives me 10 rows in total instead of 10 rows per category (10 * the number of categories).
To make the solution complete, I also thought of adding a second query with the order changed to DESC and then combining the two queries. Is that possible, and how?
You would use window functions (here t stands for your products table):
select t.*
from (select t.*,
             row_number() over (partition by category_id order by price asc) as seqnum_asc,
             row_number() over (partition by category_id order by price desc) as seqnum_desc
      from t
     ) t
where seqnum_asc <= 5 or seqnum_desc <= 5
order by category_id, price desc;
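As a rough check of the window-function answer above, here is a minimal sketch that runs it through Python's sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later); the table name products, the two extra sample rows, and the helper name top_and_bottom are made up for illustration.

```python
import sqlite3

# The first three rows come from the question; the last two are hypothetical.
ROWS = [(100000, 89, 1), (2000, 88, 2), (50000, 89, 3), (70000, 89, 4), (300, 88, 5)]

def top_and_bottom(n):
    """Return the n cheapest and n most expensive products of each category."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE products (price INTEGER, category_id INTEGER, product_id INTEGER)"
    )
    con.executemany("INSERT INTO products VALUES (?, ?, ?)", ROWS)
    query = """
        SELECT category_id, product_id, price
        FROM (SELECT p.*,
                     ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY price ASC)  AS seqnum_asc,
                     ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY price DESC) AS seqnum_desc
              FROM products p) AS ranked
        WHERE seqnum_asc <= ? OR seqnum_desc <= ?
        ORDER BY category_id, price DESC
    """
    result = con.execute(query, (n, n)).fetchall()
    con.close()
    return result

if __name__ == "__main__":
    for row in top_and_bottom(1):
        print(row)
```

Because both row numbers are computed in one pass, no second query or UNION is needed; a row that is in both the top n and the bottom n of a small category appears only once.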
Is it possible to rank item by partition without use CTE method
Expected Table
item value ID
A 10 1
A 20 1
B 30 2
B 40 2
C 50 3
C 60 3
A 70 4
A 80 4
The idea is to give each partition an ID so that the aggregate functions work the way I want:
item MIN MAX ID
A 10 20 1
B 30 40 2
C 50 60 3
A 70 80 4
SQL Version: Microsoft SQL Server 2017
Assuming that the value column provides the intended ordering of the records which we see in your question above, we can try using the difference in row numbers method here. Your problem is a type of gaps and islands problem.
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY value) rn1,
ROW_NUMBER() OVER (PARTITION BY item ORDER BY value) rn2
FROM yourTable
)
SELECT item, MIN(value) AS [MIN], MAX(value) AS [MAX], MIN(ID) AS ID
FROM cte
GROUP BY item, rn1 - rn2
ORDER BY MIN(value);
If you don't want to use a CTE here, for whatever reason, you may simply inline the SQL code in the CTE into the bottom query, as a subquery:
SELECT item, MIN(value) AS [MIN], MAX(value) AS [MAX], MIN(ID) AS ID
FROM
(
SELECT *, ROW_NUMBER() OVER (ORDER BY value) rn1,
ROW_NUMBER() OVER (PARTITION BY item ORDER BY value) rn2
FROM yourTable
) t
GROUP BY item, rn1 - rn2
ORDER BY MIN(value);
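To see why the difference in row numbers identifies each island, the sketch below runs the same logic on the question's data using Python's sqlite3 module (SQLite instead of SQL Server, which is an assumption made only so the example is runnable; it needs SQLite 3.25 or later). The helper names run, row_number_differences, and islands are made up for illustration.

```python
import sqlite3

# Sample rows copied from the question: (item, value, ID).
DATA = [("A", 10, 1), ("A", 20, 1), ("B", 30, 2), ("B", 40, 2),
        ("C", 50, 3), ("C", 60, 3), ("A", 70, 4), ("A", 80, 4)]

# Inner query shared by both helpers: the two row numbers from the answer.
NUMBERED = """
    SELECT item, value, ID,
           ROW_NUMBER() OVER (ORDER BY value) AS rn1,
           ROW_NUMBER() OVER (PARTITION BY item ORDER BY value) AS rn2
    FROM yourTable
"""

def run(query):
    """Load DATA into a fresh in-memory table and return the query's rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE yourTable (item TEXT, value INTEGER, ID INTEGER)")
    con.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", DATA)
    out = con.execute(query).fetchall()
    con.close()
    return out

def row_number_differences():
    """Show rn1 - rn2 per row; it is constant within each run of equal items."""
    return run(f"SELECT item, value, rn1 - rn2 FROM ({NUMBERED}) AS n ORDER BY value")

def islands():
    """Collapse each run into one row, as the answer's outer query does."""
    return run(f"""
        SELECT item, MIN(value), MAX(value), MIN(ID)
        FROM ({NUMBERED}) AS n
        GROUP BY item, rn1 - rn2
        ORDER BY MIN(value)
    """)

if __name__ == "__main__":
    print(row_number_differences())
    print(islands())
```

On this data the second run of A and the run of C happen to share the same difference (4), which is why the GROUP BY uses the pair (item, rn1 - rn2) rather than the difference alone.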
You can generate group IDs by comparing each row's item with the previous row's item, which the LAG function provides, and then use GROUP BY to get the minimum and maximum value of each item group.
SELECT
item,
MIN(value) AS "min",
MAX(value) AS "max",
group_id + 1 AS id
FROM (
SELECT
*,
SUM(CASE WHEN item = prev_item THEN 0 ELSE 1 END) OVER (ORDER BY value) AS group_id
FROM (
SELECT
*,
LAG(item, 1, item) OVER (ORDER BY value) AS prev_item
FROM t
) items
) groups
GROUP BY item, group_id
Query produces output
item min max id
A 10 20 1
B 30 40 2
C 50 60 3
A 70 80 4
You can check a working demo here
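The intermediate columns of the LAG approach can be inspected with a small sketch like the one below, run through Python's sqlite3 module (an assumption for runnability; it needs SQLite 3.25 or later). The helper name lag_groups and the subquery alias lagged are made up for illustration.

```python
import sqlite3

# Sample rows copied from the question: (item, value).
DATA = [("A", 10), ("A", 20), ("B", 30), ("B", 40),
        ("C", 50), ("C", 60), ("A", 70), ("A", 80)]

def lag_groups():
    """Return (item, value, prev_item, group_id) for every row, ordered by value."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (item TEXT, value INTEGER)")
    con.executemany("INSERT INTO t VALUES (?, ?)", DATA)
    query = """
        SELECT item, value, prev_item,
               -- running count of rows whose item differs from the previous row
               SUM(CASE WHEN item = prev_item THEN 0 ELSE 1 END)
                   OVER (ORDER BY value) AS group_id
        FROM (SELECT item, value,
                     -- the third argument makes the first row compare equal to itself
                     LAG(item, 1, item) OVER (ORDER BY value) AS prev_item
              FROM t) AS lagged
        ORDER BY value
    """
    out = con.execute(query).fetchall()
    con.close()
    return out

if __name__ == "__main__":
    for row in lag_groups():
        print(row)
```

The running sum starts at 0 because the first row's prev_item defaults to its own item, which is why the answer's outer query reports group_id + 1 as the id.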
Basically, how do I turn
id name quantity
1 Jerry 1
1 Jerry 2
1 Nana 1
2 Max 4
2 Lenny 3
into
id name quantity
1 Jerry 3
2 Max 4
in Hive?
I want to sum the quantities per name and then find the highest total for each unique ID.
You can use window functions with aggregation:
select id, name, quantity
from (select id, name, sum(quantity) as quantity,
row_number() over (partition by id order by sum(quantity) desc) as seqnum
from t
group by id, name
) t
where seqnum = 1;
You can first calculate the sum of quantity per (id, name) group, then rank the groups within each id by descending quantity, and finally keep only the rows with rank = 1. Note that each subquery needs an alias:
select
  id, name, quantity
from (
  select
    *,
    row_number() over (partition by id order by quantity desc) as rn
  from (
    select id, name, sum(quantity) as quantity
    from mytable
    group by id, name
  ) summed
) ranked
where rn = 1;
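The aggregate-then-rank steps can be tried out with a minimal sketch like the following. It uses Python's sqlite3 module rather than Hive (an assumption made only so the example is runnable; it needs SQLite 3.25 or later), and the helper name best_per_id is made up for illustration.

```python
import sqlite3

# Sample rows copied from the question: (id, name, quantity).
DATA = [(1, "Jerry", 1), (1, "Jerry", 2), (1, "Nana", 1), (2, "Max", 4), (2, "Lenny", 3)]

def best_per_id():
    """Sum quantity per (id, name), then keep the largest total for each id."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mytable (id INTEGER, name TEXT, quantity INTEGER)")
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?)", DATA)
    query = """
        SELECT id, name, quantity
        FROM (SELECT summed.*,
                     ROW_NUMBER() OVER (PARTITION BY id ORDER BY quantity DESC) AS rn
              FROM (SELECT id, name, SUM(quantity) AS quantity
                    FROM mytable
                    GROUP BY id, name) AS summed) AS ranked
        WHERE rn = 1
        ORDER BY id
    """
    out = con.execute(query).fetchall()
    con.close()
    return out

if __name__ == "__main__":
    print(best_per_id())
```

If two names tie for the highest total within an id, ROW_NUMBER keeps only one of them, chosen arbitrarily; a query that compares against MAX per id would keep both.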
Try the following:
with cte as
(
  select id, name, sum(quantity) as q
  from table_name
  group by id, name
)
select id, name, q
from cte t1
where t1.q = (select max(q) from cte t2 where t1.id = t2.id)
The data I am working with looks like below-
category_id subcategory_id date quantities
123 45 2020-02-01 500
123 45 2020-02-13 400
456 35 2020-05-09 350
456 35 2020-05-15 250
456 35 2020-06-18 200
.
.
.
n such rows
Quantities are sorted in descending order
I want to get the data (as seen above) for the first (top) 10 unique pairs of (category_id, subcategory_id). Just as we use LIMIT 10 to get the first 10 records, I want to limit by the top 10 unique pairs of (category_id, subcategory_id) and get all the data for them, as seen above.
Below is for BigQuery Standard SQL
#standardSQL
SELECT * EXCEPT(rn) FROM (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY category_id, subcategory_id ORDER BY quantities DESC) rn
FROM `project.dataset.table`
)
WHERE rn <= 10
Another, more BigQuery-ish alternative is below:
#standardSQL
SELECT TopN.* FROM (
SELECT ARRAY_AGG(t ORDER BY quantities DESC LIMIT 10) topN
FROM `project.dataset.table` t
GROUP BY category_id, subcategory_id
) t, t.topN
If you want 10 rows, each with different category_id/subcategory_id pairs, then you can use:
select t.* except (seqnum)
from (select t.*,
row_number() over (partition by category_id, subcategory_id order by quantities desc) as seqnum
from t
) t
where seqnum = 1
order by quantities desc
limit 10;
This gets the first row (by quantities) for each id pair and then limits to the 10 largest values.
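A minimal sketch of this "first row per pair, then limit" idea follows, run through Python's sqlite3 module instead of BigQuery (an assumption for runnability; it needs SQLite 3.25 or later). SQLite has no `select * except`, so the columns are listed explicitly. The third pair in the sample data and the helper name first_rows_of_top_pairs are made up for illustration.

```python
import sqlite3

# The first five rows come from the question; the last row is hypothetical.
DATA = [(123, 45, "2020-02-01", 500), (123, 45, "2020-02-13", 400),
        (456, 35, "2020-05-09", 350), (456, 35, "2020-05-15", 250),
        (456, 35, "2020-06-18", 200), (789, 12, "2020-07-01", 100)]

def first_rows_of_top_pairs(limit):
    """Return the largest-quantity row of each pair, limited to `limit` pairs."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE t (category_id INTEGER, subcategory_id INTEGER,"
        " date TEXT, quantities INTEGER)"
    )
    con.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", DATA)
    query = """
        SELECT category_id, subcategory_id, date, quantities
        FROM (SELECT t.*,
                     ROW_NUMBER() OVER (PARTITION BY category_id, subcategory_id
                                        ORDER BY quantities DESC) AS seqnum
              FROM t) AS ranked
        WHERE seqnum = 1
        ORDER BY quantities DESC
        LIMIT ?
    """
    out = con.execute(query, (limit,)).fetchall()
    con.close()
    return out

if __name__ == "__main__":
    print(first_rows_of_top_pairs(2))
```

Note that this returns one row per pair, not every row belonging to the selected pairs.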
I'm having a problem getting only the top 2 values for each group (the groups are stored in a column).
Example :
ID Group Value
1 A 30
2 A 150
3 A 40
4 A 70
5 B 0
6 B 100
7 B 90
I expect my output to be
ID Group Value
1 A 150
2 A 70
3 B 100
4 B 90
Simply, for each group I want just 2 rows with the highest Value
Most databases support the ANSI standard row_number() function. Because group is a reserved word, the column name has to be quoted (double quotes are the ANSI form). You would use it as:
select "group", value
from (select t.*,
             row_number() over (partition by "group" order by value desc) as seqnum
      from t
     ) t
where seqnum <= 2;
To set the id you can use row_number() in the outer query, ordered the same way as the expected output (group is a reserved word, so it is quoted):
select row_number() over (order by "group", value desc) as id,
       "group", value
from (select t.*,
             row_number() over (partition by "group" order by value desc) as seqnum
      from t
     ) t
where seqnum <= 2;
However, changing the id seems suspicious.
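The renumbering variant can be checked with a short sketch like this one, run through Python's sqlite3 module (an assumption for runnability; it needs SQLite 3.25 or later). The column name group is double-quoted because it is a reserved word, and the helper name top_two_per_group is made up for illustration.

```python
import sqlite3

# Sample rows copied from the question: (ID, Group, Value).
DATA = [(1, "A", 30), (2, "A", 150), (3, "A", 40), (4, "A", 70),
        (5, "B", 0), (6, "B", 100), (7, "B", 90)]

def top_two_per_group():
    """Keep the 2 highest values per group and renumber the surviving rows."""
    con = sqlite3.connect(":memory:")
    con.execute('CREATE TABLE t (id INTEGER, "group" TEXT, value INTEGER)')
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", DATA)
    query = """
        SELECT ROW_NUMBER() OVER (ORDER BY "group", value DESC) AS id,
               "group", value
        FROM (SELECT "group", value,
                     ROW_NUMBER() OVER (PARTITION BY "group" ORDER BY value DESC) AS seqnum
              FROM t) AS ranked
        WHERE seqnum <= 2
        ORDER BY id
    """
    out = con.execute(query).fetchall()
    con.close()
    return out

if __name__ == "__main__":
    for row in top_two_per_group():
        print(row)
```

The inner query deliberately leaves out the original id column, so the id in the result is only the new row number.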
You can use a CTE with the ranking function ROW_NUMBER().
Here is a query to get your result (Group is a reserved word, so it is bracketed):
;WITH cte AS
( SELECT [Group], value,
         ROW_NUMBER() OVER (PARTITION BY [Group] ORDER BY value DESC) AS rn
  FROM test
)
SELECT [Group], value FROM cte
WHERE rn <= 2
ORDER BY [Group], value DESC
I have a query which uses row_number() over partition.
When the result comes out it looks like
Product Row_Number Price
A 1 25
A 2 20
A 3 15
B 1 100
B 2 10
B 3 2
I want to get the result to show over columns like
Product Row1 Row2 Row3 price1 price2 price3
A 1 2 3 25 20 15
B 1 2 3 100 10 2
Should I use something like rank()?
I'm using Teradata
You can add two more window functions to get the 2nd and 3rd highest price, this should run in the same STAT-step as your current ROW_NUMBER, so there's no additional overhead:
select
product,
price as Price1,
min(price)
over (partition by product
order by price desc
rows between 1 following and 1 following) as Price2,
min(price)
over (partition by product
order by price desc
rows between 2 following and 2 following) as Price3
from tab
qualify
row_number()
over (partition by product
order by price desc) = 1
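The same frame trick can be tried outside Teradata with a sketch like the one below, run through Python's sqlite3 module (an assumption for runnability; it needs SQLite 3.25 or later). SQLite has no QUALIFY clause, so the row-number filter is moved into an outer WHERE. The helper name prices_as_columns is made up for illustration.

```python
import sqlite3

# Sample rows copied from the question: (product, price).
DATA = [("A", 25), ("A", 20), ("A", 15), ("B", 100), ("B", 10), ("B", 2)]

def prices_as_columns():
    """Return one row per product with its three highest prices side by side."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tab (product TEXT, price INTEGER)")
    con.executemany("INSERT INTO tab VALUES (?, ?)", DATA)
    query = """
        SELECT product, price1, price2, price3
        FROM (SELECT product,
                     price AS price1,
                     -- a one-row frame: the row right after the current one
                     MIN(price) OVER (PARTITION BY product ORDER BY price DESC
                                      ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS price2,
                     -- a one-row frame: two rows after the current one
                     MIN(price) OVER (PARTITION BY product ORDER BY price DESC
                                      ROWS BETWEEN 2 FOLLOWING AND 2 FOLLOWING) AS price3,
                     ROW_NUMBER() OVER (PARTITION BY product ORDER BY price DESC) AS rn
              FROM tab) AS ranked
        WHERE rn = 1
        ORDER BY product
    """
    out = con.execute(query).fetchall()
    con.close()
    return out

if __name__ == "__main__":
    for row in prices_as_columns():
        print(row)
```

A product with fewer than three prices gets NULL (None in Python) in the missing columns, because the frame is empty there.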
I just specified the sort direction for each sort column, and then it worked fine. No PARTITION BY is used.
SELECT TOP (5) ROW_NUMBER() OVER (ORDER BY SCHEME ASC,APPLICATION_DATE DESC,TRANSACTION_REF_NO ASC,APPLICATION_STATUS DESC)