How to rank/group rows in sets of 3 in SQL

I have a table with 6 records, and need to get the results in the format below, grouping them into 3 rows each.
Input table:
id Value
-------------
1 abcd
2 defgh
3 ijkl
4 mnop
5 qrst
6 uvwx
Output format needed:
Rank id Value
--------------------
1 1 abcd
1 2 defgh
1 3 ijkl
2 4 mnop
2 5 qrst
2 6 uvwx

Here is one method:
select dense_rank() over (order by (id - 1)/3) as grp, id, value
from t;
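This assumes, as in your sample data, that id starts at 1 and increases with no gaps. It also relies on / performing integer division, as it does for integer operands in SQL Server and PostgreSQL; in databases where / returns a decimal (MySQL and Oracle, for example), wrap the expression in floor().
If id does not start at 1 or has gaps, then an alternative is:
select dense_rank() over (order by seqnum/3) as grp, id, value
from (select t.*, row_number() over (order by id) - 1 as seqnum
      from t
     ) t;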
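One way to check the second query is to run it against a throwaway in-memory SQLite database from Python. This is only a sketch under a few assumptions: SQLite (3.25 or later, for window functions) stands in for your actual database, the ids are deliberately given gaps, and the derived table gets an alias (s, an arbitrary name) because several databases require one. SQLite truncates when dividing integers, which is what the grouping relies on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, value TEXT)")
# Sample values from the question, but with gaps in id on purpose
conn.executemany(
    "INSERT INTO t (id, value) VALUES (?, ?)",
    [(2, "abcd"), (5, "defgh"), (9, "ijkl"), (10, "mnop"), (14, "qrst"), (20, "uvwx")],
)

query = """
SELECT dense_rank() OVER (ORDER BY seqnum / 3) AS grp, id, value
FROM (SELECT t.*, row_number() OVER (ORDER BY id) - 1 AS seqnum
      FROM t) AS s
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # (grp, id, value)
```

The first three rows by id come back with grp 1 and the next three with grp 2, regardless of the gaps.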

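You can use NTILE() here.
SELECT NTILE(2) OVER (ORDER BY id) AS grp, id FROM TABLE_NAME
Think of it as buckets: NTILE(2) makes 2 buckets, so with 6 rows the first three rows get the value 1 and the last three get the value 2. Note that the argument is the number of buckets, not the number of rows per bucket, so with more rows the buckets grow instead of a third group appearing.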
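To see how NTILE distributes rows, here is a small sketch that runs it on 6 and then 7 rows. It assumes SQLite 3.25 or later as a stand-in database; the table name t and the helper ntile_buckets are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t (id) VALUES (?)", [(i,) for i in range(1, 8)])

def ntile_buckets(max_id):
    """Return the NTILE(2) bucket of each row with id <= max_id, in id order."""
    sql = "SELECT NTILE(2) OVER (ORDER BY id) FROM t WHERE id <= ? ORDER BY id"
    return [bucket for (bucket,) in conn.execute(sql, (max_id,))]

six = ntile_buckets(6)
seven = ntile_buckets(7)
print(six)    # [1, 1, 1, 2, 2, 2]
print(seven)  # [1, 1, 1, 1, 2, 2, 2]
```

With 7 rows the first bucket simply grows to 4 rows, so NTILE only fits the "3 rows each" requirement if the bucket count is computed from the row count.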

Related

SQL Server - logical question - Get rank from IDs

I'm trying to write a query to solve a logical problem using Redshift POSTGRES 8.
Input column is a bunch of IDs and Order IDs and desired output is basically a rank of the ID as you can see in the screenshot.
(I'm sorry I'm not allowed to embed images into my StackOverflow posts yet)
If you could help me answer this question using SQL, that would be great! Thanks
Data
order id   id     size   desired output
----------------------------------------
1          abcd   2      1
1          abcd   2      1
1          efgh   5      2
1          efgh   5      2
1          efgh   5      2
1          efgh   5      2
2          aa     2      1
2          aa     2      1
2          bb     2      2
2          bb     2      2
SELECT
*,
DENSE_RANK() OVER (PARTITION BY order_item_id ORDER BY id) AS desired_result
FROM
your_table
DENSE_RANK() creates sequences starting from 1 according to the ORDER BY.
Any rows with the same id will get the same value, and where RANK() would skip values after ties, DENSE_RANK() does not.
The PARTITION BY allows new sequences to be created for each different order_item_id.
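As a rough check of the query above, the sketch below loads the sample rows into an in-memory SQLite database (3.25 or later assumed, standing in for Redshift) and runs it. The column name order_item_id is taken from the query and assumed to be the "order id" column of the question; your_table is the placeholder name from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (order_item_id INTEGER, id TEXT, size INTEGER)")
sample = (
    [(1, "abcd", 2)] * 2 + [(1, "efgh", 5)] * 4
    + [(2, "aa", 2)] * 2 + [(2, "bb", 2)] * 2
)
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", sample)

query = """
SELECT order_item_id, id,
       DENSE_RANK() OVER (PARTITION BY order_item_id ORDER BY id) AS desired_result
FROM your_table
ORDER BY order_item_id, id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # (order_item_id, id, desired_result)
```

The rank restarts at 1 for order 2, and repeated ids inside an order share one rank.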

Hive: window function - how to exclude the CURRENT ROW

I wish to calculate the minimum of a value over a partition, but the current row should not be taken into account.
SELECT *,
MIN(val) OVER(PARTITION BY col1)
FROM table
outputs the minimum over all rows in the partition.
The documentation shows ways to use CURRENT ROW, but not how to exclude it while performing the windowing operation.
I am looking for something like this:
SELECT *,
MIN(val) OVER(PARTITION BY col1 ROWS NOT CURRENT ROW)
FROM table
but this does not work.
I can think of a way to do this. The min over a window excluding the current row is always the min over the whole window, except when the current row is itself the min; in that case it is the 2nd smallest value in the window. Example:
Data:
-----------
key | val
-----------
1 8
1 2
1 4
1 6
1 11
2 3
2 5
2 7
2 9
Query:
select key, val, act_min, val_arr
, case when act_min=val then val_arr[1] else act_min
end as min_except_for_c_row
from (
select key, val, act_min, sort_array(val_arr) val_arr
from (
select key, val
, min(val) over (partition by key) act_min
, collect_set(val) over (partition by key) val_arr
from db.table ) A
) B
I left all the columns in for illustration. You can modify the query as needed.
Output:
key  val  act_min  val_arr        min_except_for_c_row
1    8    2        [2,4,6,8,11]   2
1    2    2        [2,4,6,8,11]   4
1    4    2        [2,4,6,8,11]   2
1    6    2        [2,4,6,8,11]   2
1    11   2        [2,4,6,8,11]   2
2    3    3        [3,5,7,9]      5
2    5    3        [3,5,7,9]      3
2    7    3        [3,5,7,9]      3
2    9    3        [3,5,7,9]      3
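The same "min, or second min when the current row is the min" logic can be sketched outside Hive in plain Python to make the edge cases visible. The function name min_excluding_current is made up for this illustration. One difference worth noting: collect_set removes duplicates, so the Hive query returns the second distinct value when the minimum is tied, whereas this sketch keeps duplicates, so a tied minimum is still the minimum of the other rows. A partition with a single row yields None.

```python
from collections import defaultdict

def min_excluding_current(rows):
    """For each (key, val) row, return the min val among the OTHER rows of the same key."""
    by_key = defaultdict(list)
    for key, val in rows:
        by_key[key].append(val)
    # Two smallest values per key; duplicates are kept on purpose
    smallest_two = {key: sorted(vals)[:2] for key, vals in by_key.items()}

    result = []
    for key, val in rows:
        two = smallest_two[key]
        if val != two[0]:
            other_min = two[0]   # current row is not the minimum
        elif len(two) > 1:
            other_min = two[1]   # current row is the minimum: take the runner-up
        else:
            other_min = None     # no other row in this partition
        result.append((key, val, other_min))
    return result

data = [(1, 8), (1, 2), (1, 4), (1, 6), (1, 11), (2, 3), (2, 5), (2, 7), (2, 9)]
result = min_excluding_current(data)
for row in result:
    print(row)  # (key, val, min_except_for_c_row)
```

On the sample data this reproduces the min_except_for_c_row column shown above.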

SQL Server GROUP BY COUNT Consecutive Rows Only

I have a table called DATA on Microsoft SQL Server 2008 R2 with three non-nullable integer fields: ID, Sequence, and Value. Sequence values with the same ID will be consecutive, but can start with any value. I need a query that will return a count of consecutive rows with the same ID and Value.
For example, let's say I have the following data:
ID Sequence Value
-- -------- -----
1 1 1
5 1 100
5 2 200
5 3 200
5 4 100
10 10 10
I want the following result:
ID Start Value Count
-- ----- ----- -----
1 1 1 1
5 1 100 1
5 2 200 2
5 4 100 1
10 10 10 1
I tried
SELECT ID, MIN([Sequence]) AS Start, Value, COUNT(*) AS [Count]
FROM DATA
GROUP BY ID, Value
ORDER BY ID, Start
but that gives
ID Start Value Count
-- ----- ----- -----
1 1 1 1
5 1 100 2
5 2 200 2
10 10 10 1
which groups all rows with the same values, not just consecutive rows.
Any ideas? From what I've seen, I believe I have to left join the table with itself on consecutive rows using ROW_NUMBER(), but I am not sure exactly how to get counts from that.
Thanks in advance.
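You can use Sequence - ROW_NUMBER() OVER (ORDER BY ID, Value, Sequence) AS g to create a group; consecutive rows with the same ID and Value share the same g:
SELECT
    ID,
    MIN(Sequence) AS Start,
    Value,
    COUNT(*) AS [Count]
FROM
(
    SELECT
        ID,
        Sequence,
        Value,
        -- constant within a run of consecutive rows that share ID and Value
        Sequence - ROW_NUMBER() OVER (ORDER BY ID, Value, Sequence) AS g
    FROM
        DATA
) AS s
GROUP BY
    ID, Value, g
ORDER BY
    ID, Start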
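In place of a live demo, here is a sketch that runs the grouping trick on the question's sample rows in an in-memory SQLite database (3.25 or later assumed, standing in for SQL Server). Column names follow the question's table (ID, Sequence, Value); the lowercase table name data and the alias cnt are choices made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (ID INTEGER, Sequence INTEGER, Value INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [(1, 1, 1), (5, 1, 100), (5, 2, 200), (5, 3, 200), (5, 4, 100), (10, 10, 10)],
)

query = """
SELECT ID, MIN(Sequence) AS Start, Value, COUNT(*) AS cnt
FROM (SELECT ID, Sequence, Value,
             -- g stays constant across a run of consecutive Sequence values
             Sequence - ROW_NUMBER() OVER (ORDER BY ID, Value, Sequence) AS g
      FROM data) AS s
GROUP BY ID, Value, g
ORDER BY ID, Start
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # (ID, Start, Value, cnt)
```

The two rows for ID 5 with Value 100 land in different groups because their Sequence values (1 and 4) are not consecutive, which gives the five result rows the question asks for.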

Increment Row Number on Group

I am working on a query for SQL Server 2005 that needs to return data with two 'index' fields. The first index 't_index' should increment every time the 'shade' column changes, whilst the second index increments within the partition of the values in the 'shade' column:
t_index s_index shade
1 1 A
1 2 A
1 3 A
1 4 A
1 5 A
2 1 B
2 2 B
2 3 B
2 4 B
2 5 B
To get the s_index column I am using the following:
Select ROW_NUMBER() OVER(PARTITION BY [shade] ORDER BY [shade]) as s_index
My question is how to get the first index to only increment when the value in the 'shade' column changes?
That can be accomplished with the DENSE_RANK() function:
DENSE_RANK() OVER(Order By [shade]) as t_index
You can try to use DENSE_RANK() for that:
SELECT
shade,
s_index = ROW_NUMBER() OVER(PARTITION BY [shade] ORDER BY [shade]),
t_index = DENSE_RANK() OVER (ORDER BY [shade])
FROM dbo.YourTableNameHere
Gives output:
shade s_index t_index
A 1 1
A 2 1
A 3 1
A 4 1
A 5 1
B 1 2
B 2 2
B 3 2
B 4 2
B 5 2
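One caveat: DENSE_RANK() ranks by value, so if a shade comes back later (A, B, then A again), the second run of A gets the same t_index as the first. If the index should increase on every change, one possible approach is to flag changes with LAG() and take a running sum. The sketch below compares both; it assumes an ordering column (here a hypothetical pos) and a made-up table name swatches, and runs on SQLite 3.25 or later. LAG() is not available in SQL Server 2005, so this illustrates the idea rather than a drop-in query for that version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE swatches (pos INTEGER PRIMARY KEY, shade TEXT)")
conn.executemany(
    "INSERT INTO swatches VALUES (?, ?)",
    [(1, "A"), (2, "A"), (3, "B"), (4, "B"), (5, "A")],
)

query = """
SELECT pos, shade,
       DENSE_RANK() OVER (ORDER BY shade) AS by_value,
       SUM(changed) OVER (ORDER BY pos)   AS by_change
FROM (SELECT pos, shade,
             -- 1 on the first row and whenever shade differs from the previous row
             CASE WHEN shade = LAG(shade) OVER (ORDER BY pos) THEN 0 ELSE 1 END AS changed
      FROM swatches) AS s
ORDER BY pos
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # (pos, shade, by_value, by_change)
```

For data like the question's, where each shade appears in a single block, both columns agree and the plain DENSE_RANK() answer is enough.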

sql assign a category id when using order by

I'm new to SQL.
When I order by SomeData on my table I get:
ID SomeData
6 ABC
3 ABC
12 FG
1 FH
2 GI
4 JU
8 K3
5 K3
11 P7
Great, but what I really want as output is:
ID Category
6 1
3 1
12 2
1 3
2 4
4 5
8 6
5 6
11 7
That is, every time SomeData changes in the sort order, I want to increment Category by one.
I can't see how to do this. Any help would be greatly appreciated. Thanks.
If you are on SQL-Server, you can use the DENSE_RANK() ranking function in combination with OVER:
SELECT ID
, DENSE_RANK() OVER (ORDER BY SomeData)
AS Category
FROM myTable
ORDER BY SomeData
See: SQL-Server: Ranking Functions
SELECT ROW_NUMBER() OVER(ORDER BY SomeData ASC) AS Category, otherfield1..
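To compare the two suggestions, this sketch computes both DENSE_RANK() and ROW_NUMBER() over the question's sample rows. It assumes SQLite 3.25 or later as a stand-in for SQL Server; myTable is the placeholder name from the answer above and RowNum is an alias chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (ID INTEGER PRIMARY KEY, SomeData TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [(6, "ABC"), (3, "ABC"), (12, "FG"), (1, "FH"), (2, "GI"),
     (4, "JU"), (8, "K3"), (5, "K3"), (11, "P7")],
)

query = """
SELECT ID, SomeData,
       DENSE_RANK() OVER (ORDER BY SomeData) AS Category,
       ROW_NUMBER() OVER (ORDER BY SomeData) AS RowNum
FROM myTable
ORDER BY RowNum
"""
rows = conn.execute(query).fetchall()
categories = [r[2] for r in rows]
row_numbers = [r[3] for r in rows]
print(categories)   # [1, 1, 2, 3, 4, 5, 6, 6, 7]
print(row_numbers)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

DENSE_RANK() gives tied SomeData values the same Category, which matches the requested output; ROW_NUMBER() gives every row its own number, so the two ABC rows would be numbered 1 and 2.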