Ranking: How to reset the ROW_NUMBER or RANK to 1 - sql

Using SQL Server 2014:
Consider the following table:
CREATE TABLE #Table (
Id int NOT NULL identity(1,1),
Col_Value varchar(2)
)
INSERT INTO #Table (Col_Value)
VALUES ('A'),('A'),('B'),('B'),('B'),('A'),('A'),('B'),('B'),('B'),('A'),('B'),('B'),('A'),('A'),('B'),('C'),('C'),('A'),('A'),('B'),('B'),('C')
How can I create a query that produces the R column shown in the result below?
+----+------+---+
| ID | Data | R |
+----+------+---+
|  1 | A    | 1 |
|  2 | A    | 2 |
|  3 | B    | 1 |
|  4 | B    | 2 |
|  5 | B    | 3 |
|  6 | A    | 1 |
|  7 | A    | 2 |
|  8 | B    | 1 |
|  9 | B    | 2 |
| 10 | B    | 3 |
| 11 | A    | 1 |
| 12 | B    | 1 |
| 13 | B    | 2 |
| 14 | A    | 1 |
| 15 | A    | 2 |
| 16 | B    | 1 |
| 17 | C    | 1 |
| 18 | C    | 2 |
| 19 | A    | 1 |
| 20 | A    | 2 |
| 21 | B    | 1 |
| 22 | B    | 2 |
| 23 | C    | 1 |
+----+------+---+
In the result table above, whenever the Data value changes from one row to the next, the R value resets to 1.
Update 1
Ben Thul's answer works very well.
I suggest that the post below be updated with a reference to this answer.
T-sql Reset Row number on Field Change

This is known as a "gaps and islands" problem in the literature. First, my proposed solution:
with cte as (
select *, [Id] - row_number() over (partition by [Col_Value] order by [Id]) as [GroupID]
from #table
)
select [Id], [Col_Value], row_number() over (partition by [GroupID], [Col_Value] order by [Id]) as [R]
from cte
order by [Id];
For exposition, note that if I enumerate all of the "A" values using row_number(), the row_number() value for contiguous rows goes up at the same rate as the Id value. In other words, the difference between Id and row_number() is the same for every row in a contiguous group (also known as an "island"). Once we have calculated that group identifier, it's merely a matter of enumerating the members of each group.
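To make the "constant difference" idea concrete, here is a minimal Python sketch of the same logic. It is only an illustration, not T-SQL: the function name reset_numbering and the two counters are my own, and it assumes Id is the 1-based position of each value, as the identity column gives in the question.

```python
from collections import defaultdict

# Sample values from the question; Id is assumed to be the 1-based position.
values = list("AABBBAABBBABBAABCCAABBC")
rows = [(i, v) for i, v in enumerate(values, start=1)]

def reset_numbering(rows):
    """Return (Id, Col_Value, R) where R restarts at 1 when Col_Value changes."""
    seen_per_value = defaultdict(int)   # row_number() partitioned by Col_Value
    seen_per_island = defaultdict(int)  # row_number() partitioned by (GroupID, Col_Value)
    out = []
    for row_id, value in rows:          # rows are assumed to be sorted by Id
        seen_per_value[value] += 1
        group_id = row_id - seen_per_value[value]  # constant within one island
        seen_per_island[(group_id, value)] += 1
        out.append((row_id, value, seen_per_island[(group_id, value)]))
    return out

result = reset_numbering(rows)
print(result[:7])
# [(1, 'A', 1), (2, 'A', 2), (3, 'B', 1), (4, 'B', 2), (5, 'B', 3), (6, 'A', 1), (7, 'A', 2)]
```

The group identifier itself has no meaning; it only has to differ between islands of the same value, which is why it is paired with Col_Value when counting.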

Related

Count without using functions (like count) oracle

I have two tables:
TABLE A :
CREATE TABLE z_ostan ( id NUMBER PRIMARY KEY,
name VARCHAR2(30) NOT NULL CHECK (upper(name)=name)
);
TABLE B:
CREATE TABLE z_shahr ( id NUMBER PRIMARY KEY,
name VARCHAR2(30) NOT NULL CHECK (upper(name)=name),
ref_ostan NUMBER,
CONSTRAINT fk_ref_ostan FOREIGN KEY (ref_ostan) REFERENCES z_ostan(id)
);
How can I find the second and third "id" values from Table A, ranked by how rarely they are used in Table B (least used first), without using predefined functions like "count()"?
This only processes existing references to Table A.
Updated for oracle (used 12c)
Without using any aggregate or window functions:
Sample data for Table: tblb
+----+---------+---------+
| id | name | tbla_id |
+----+---------+---------+
| 1 | TBLB_01 | 1 |
| 2 | TBLB_02 | 1 |
| 3 | TBLB_03 | 1 |
| 4 | TBLB_04 | 1 | 4 rows
| 5 | TBLB_05 | 2 |
| 6 | TBLB_06 | 2 |
| 7 | TBLB_07 | 2 | 3 rows
| 8 | TBLB_08 | 3 |
| 9 | TBLB_09 | 3 |
| 10 | TBLB_10 | 3 |
| 11 | TBLB_11 | 3 |
| 12 | TBLB_12 | 3 |
| 13 | TBLB_13 | 3 | 6 rows
| 14 | TBLB_14 | 4 |
| 15 | TBLB_15 | 4 |
| 16 | TBLB_16 | 4 | 3 rows
| 17 | TBLB_17 | 5 | 1 row
| 18 | TBLB_18 | 6 |
| 19 | TBLB_19 | 6 | 2 rows
| 20 | TBLB_20 | 7 | 1 row
+----+---------+---------+
There are many ways to express this logic.
Step by step with CTE terms. The intent is:
1. For each set of tbla_id rows in tblb, generate a row_number (n) for the rows in each partition. We would normally use window functions for this, but I assume these are not allowed.
2. Use this row_number (n) to determine the count of rows in each tbla_id partition. To find that count per partition, find the last row in each partition (from step 1).
3. Order the results of step 2 by n of these last rows.
4. Choose the 2nd and 3rd row of this result.
Done.
WITH first AS ( -- Find the first row per tbla_id
SELECT t1.*
FROM tblb t1
LEFT JOIN tblb t2
ON t1.id > t2.id
AND t1.tbla_id = t2.tbla_id
WHERE t2.id IS NULL
)
, rnum (id, name, tbla_id, n) AS ( -- Generate a row_number (n) for each tbla_id partition
SELECT f.*, 1 FROM first f UNION ALL
SELECT n.id, n.name, n.tbla_id, c.n+1
FROM rnum c
JOIN tblb n
ON c.tbla_id = n.tbla_id
AND c.id < n.id
LEFT JOIN tblb n2
ON n.tbla_id = n2.tbla_id
AND c.id < n2.id
AND n.id > n2.id
WHERE n2.id IS NULL
)
, last AS ( -- Find the last row in each partition to obtain the count of tbla_id references
SELECT t1.*
FROM rnum t1
LEFT JOIN rnum t2
ON t1.id < t2.id
AND t1.tbla_id = t2.tbla_id
WHERE t2.id IS NULL
)
SELECT * FROM last
ORDER BY n, tbla_id OFFSET 1 ROWS FETCH NEXT 2 ROWS ONLY
;
Final Result, where n is the count of references to tbla:
+------+---------+---------+------+
| id | name | tbla_id | n |
+------+---------+---------+------+
| 20 | TBLB_20 | 7 | 1 |
| 19 | TBLB_19 | 6 | 2 |
+------+---------+---------+------+
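For readers who want to trace the steps outside the database, here is a rough Python sketch of the same idea. It is an illustration only, not Oracle SQL: the function name last_rows_with_n is hypothetical, the name column is left out, and the per-partition count is built by incrementing a running number rather than by calling a counting function.

```python
# (id, tbla_id) pairs taken from the sample tblb data; the name column is omitted.
tblb = [(1, 1), (2, 1), (3, 1), (4, 1), (5, 2), (6, 2), (7, 2),
        (8, 3), (9, 3), (10, 3), (11, 3), (12, 3), (13, 3),
        (14, 4), (15, 4), (16, 4), (17, 5), (18, 6), (19, 6), (20, 7)]

def last_rows_with_n(rows):
    """Number rows per tbla_id in id order and keep the last row of each partition."""
    last = {}  # tbla_id -> (id, n) of the most recent row seen in that partition
    for row_id, tbla_id in sorted(rows):
        previous_n = last[tbla_id][1] if tbla_id in last else 0
        last[tbla_id] = (row_id, previous_n + 1)
    return [(row_id, tbla_id, n) for tbla_id, (row_id, n) in last.items()]

# ORDER BY n, tbla_id
ranked = sorted(last_rows_with_n(tblb), key=lambda r: (r[2], r[1]))
# OFFSET 1 ROWS FETCH NEXT 2 ROWS ONLY
second_and_third = ranked[1:3]
print(second_and_third)  # [(20, 7, 1), (19, 6, 2)]
```

As in the query, the tie between tbla_id 5 and 7 (one reference each) is broken by tbla_id, so a different tie-break would change which rows count as second and third.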
Some intermediate results...
last CTE term result. The 2nd and 3rd rows of this become the final result.
+------+---------+---------+------+
| id | name | tbla_id | n |
+------+---------+---------+------+
| 17 | TBLB_17 | 5 | 1 |
| 20 | TBLB_20 | 7 | 1 |
| 19 | TBLB_19 | 6 | 2 |
| 7 | TBLB_07 | 2 | 3 |
| 16 | TBLB_16 | 4 | 3 |
| 4 | TBLB_04 | 1 | 4 |
| 13 | TBLB_13 | 3 | 6 |
+------+---------+---------+------+
rnum CTE term result. This provides the row_number over tbla_id partitions ordered by id
+------+---------+---------+------+
| id | name | tbla_id | n |
+------+---------+---------+------+
| 1 | TBLB_01 | 1 | 1 |
| 2 | TBLB_02 | 1 | 2 |
| 3 | TBLB_03 | 1 | 3 |
| 4 | TBLB_04 | 1 | 4 |
| 5 | TBLB_05 | 2 | 1 |
| 6 | TBLB_06 | 2 | 2 |
| 7 | TBLB_07 | 2 | 3 |
| 8 | TBLB_08 | 3 | 1 |
| 9 | TBLB_09 | 3 | 2 |
| 10 | TBLB_10 | 3 | 3 |
| 11 | TBLB_11 | 3 | 4 |
| 12 | TBLB_12 | 3 | 5 |
| 13 | TBLB_13 | 3 | 6 |
| 14 | TBLB_14 | 4 | 1 |
| 15 | TBLB_15 | 4 | 2 |
| 16 | TBLB_16 | 4 | 3 |
| 17 | TBLB_17 | 5 | 1 |
| 18 | TBLB_18 | 6 | 1 |
| 19 | TBLB_19 | 6 | 2 |
| 20 | TBLB_20 | 7 | 1 |
+------+---------+---------+------+
There are a few other ways to tackle this problem in just SQL.

Get the Id of the matched data from other table. No duplicates of ID from both tables

Here is my table A.
| Id | GroupId | StoreId | Amount |
| 1 | 20 | 7 | 15000 |
| 2 | 20 | 7 | 1230 |
| 3 | 20 | 7 | 14230 |
| 4 | 20 | 7 | 9540 |
| 5 | 20 | 7 | 24230 |
| 6 | 20 | 7 | 1230 |
| 7 | 20 | 7 | 1230 |
Here is my table B.
| Id | GroupId | StoreId | Credit |
| 12 | 20 | 7 | 1230 |
| 14 | 20 | 7 | 15000 |
| 15 | 20 | 7 | 14230 |
| 16 | 20 | 7 | 1230 |
| 17 | 20 | 7 | 7004 |
| 18 | 20 | 7 | 65523 |
I want to get the result below without duplicating an Id from either table.
I need the Id from tables A and B where Amount = Credit.
| A.ID | B.ID | Amount |
| 1 | 14 | 15000 |
| 2 | 12 | 1230 |
| 3 | 15 | 14230 |
| 4 | null | 9540 |
| 5 | null | 24230 |
| 6 | 16 | 1230 |
| 7 | null | 1230 |
My problem is that when the same Amount appears two or more times in table A, I get a duplicate ID from table B where it should be null. Please help me. Thank you.
I think you want a left join. But this is tricky because you have duplicate amounts, but you only want one to match. The solution is to use row_number():
select . . .
from (select a.*, row_number() over (partition by amount order by id) as seqnum
from a
) a left join
(select b.*, row_number() over (partition by credit order by id) as seqnum
from b
)b
on a.amount = b.credit and a.seqnum = b.seqnum;
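The pairing logic may be easier to see outside SQL. The sketch below is an illustration in Python, not the answer's code: with_seqnum is a hypothetical helper that imitates row_number() over (partition by amount order by id), and the join matches on the pair (amount, seqnum) so that each B row is used at most once.

```python
from collections import defaultdict

# (Id, Amount) rows from table A and (Id, Credit) rows from table B in the question.
table_a = [(1, 15000), (2, 1230), (3, 14230), (4, 9540),
           (5, 24230), (6, 1230), (7, 1230)]
table_b = [(12, 1230), (14, 15000), (15, 14230), (16, 1230),
           (17, 7004), (18, 65523)]

def with_seqnum(rows):
    """Append a per-amount sequence number, ordered by id."""
    counters = defaultdict(int)
    out = []
    for row_id, amount in sorted(rows):
        counters[amount] += 1
        out.append((row_id, amount, counters[amount]))
    return out

# Index B by (credit, seqnum); each key identifies exactly one B row.
b_lookup = {(credit, seq): b_id for b_id, credit, seq in with_seqnum(table_b)}

# Left join: A rows with no partner keep None, which plays the role of NULL.
matched = [(a_id, b_lookup.get((amount, seq)), amount)
           for a_id, amount, seq in with_seqnum(table_a)]
print(matched)
# [(1, 14, 15000), (2, 12, 1230), (3, 15, 14230), (4, None, 9540),
#  (5, None, 24230), (6, 16, 1230), (7, None, 1230)]
```

The third 1230 in table A gets sequence number 3, and table B has no third 1230, so that row stays unmatched instead of reusing Id 12 or 16.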
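Another approach, which I think is simpler and shorter :) Note that it returns the same B.ID for every row in A that shares an Amount, and without an ORDER BY the TOP 1 pick is arbitrary: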
select ID [A.ID],
(select top 1 ID from TABLE_B where Credit = A.Amount) [B.ID],
Amount
from TABLE_A [A]

Limit a sorted number of rows joined

I have two tables, A and B, and a join table M. For each A.id, I want to get the top 2 B.id's, sorted by the value in table M, producing the results below. This is running on an Azure SQL database.
Table A
+-----+
| Id  |
+-----+
| 1   |
| 2   |
| 3   |
| 4   |
+-----+
Table M
+-----+-----+-------+
| AId | BId | Value |
+-----+-----+-------+
| 1   | 3   | 4     |
| 1   | 2   | 3     |
| 3   | 2   | 3     |
| 3   | 5   | 6     |
| 3   | 3   | 4     |
| 4   | 1   | 2     |
| 4   | 2   | 1     |
| 4   | 4   | 3     |
+-----+-----+-------+
Table B
+-----+
| Id  |
+-----+
| 1   |
| 2   |
| 3   |
| 4   |
| 5   |
+-----+
Result
+-----+-----+-------+
| AId | BId | Value |
+-----+-----+-------+
| 1 | 3 | 4 |
| 1 | 2 | 3 |
| 3 | 5 | 6 |
| 3 | 3 | 4 |
| 4 | 1 | 2 |
| 4 | 4 | 3 |
+-----+-----+-------+
I know that I can select all the M.AId rows where they equal 1, sort it, and limit by 2, but I need to do this for every row in Table A. I've made an attempt to use group by, but I wasn't sure how to sort and limit it. I've also tried to search for resources associated with this issue but I couldn't find any resources.
(I also wasn't sure how to word the title for this issue)
You can just use ROW_NUMBER:
SELECT
AId, BId, Value
FROM (
SELECT *,
Rn = ROW_NUMBER() OVER(PARTITION BY AId ORDER BY Value DESC)
FROM M
) t
WHERE Rn <= 2
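As a rough illustration of what the ROW_NUMBER filter does, the Python sketch below applies the same "number within each AId by Value descending, keep the first two" logic to the rows of table M. The function name top_n_per_group and the parameter n are my own, not part of the answer.

```python
# (AId, BId, Value) rows from table M in the question.
table_m = [(1, 3, 4), (1, 2, 3), (3, 2, 3), (3, 5, 6),
           (3, 3, 4), (4, 1, 2), (4, 2, 1), (4, 4, 3)]

def top_n_per_group(rows, n=2):
    """Keep the n rows with the highest Value for each AId."""
    seen = {}  # AId -> how many rows have been numbered so far (the Rn value)
    out = []
    # Sorting by AId, then Value descending, mirrors PARTITION BY / ORDER BY.
    for a_id, b_id, value in sorted(rows, key=lambda r: (r[0], -r[2])):
        seen[a_id] = seen.get(a_id, 0) + 1
        if seen[a_id] <= n:  # WHERE Rn <= 2
            out.append((a_id, b_id, value))
    return out

top_two = top_n_per_group(table_m)
print(top_two)
# [(1, 3, 4), (1, 2, 3), (3, 5, 6), (3, 3, 4), (4, 4, 3), (4, 1, 2)]
```

The rows match the expected result as a set; within AId 4 they come out in descending Value order, whereas the question lists them the other way round.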

group by top two results based on order

I have been trying to get this to work with some row_number, group by, top, sort of things, but I am missing some fundamental concept. I have a table like so:
+-------+-------+-------+
| name | ord | f_id |
+-------+-------+-------+
| a | 1 | 2 |
| b | 5 | 2 |
| c | 6 | 2 |
| d | 2 | 1 |
| e | 4 | 1 |
| a | 2 | 3 |
| c | 50 | 4 |
+-------+-------+-------+
And my desired output would be:
+-------+---------+--------+-------+
| f_id | ord_n | ord | name |
+-------+---------+--------+-------+
| 2 | 1 | 1 | a |
| 2 | 2 | 5 | b |
| 1 | 1 | 2 | d |
| 1 | 2 | 4 | e |
| 3 | 1 | 2 | a |
| 4 | 1 | 50 | c |
+-------+---------+--------+-------+
Where data is ordered by the ord value, and only up to two results per f_id. Should I be working on a Stored Procedure for this or can I just do it with SQL? I have experimented with some select TOP subqueries, but nothing has even come close.
Here are some statements to create the test table:
create table help(name varchar(255),ord tinyint,f_id tinyint);
insert into help values
('a',1,2),
('b',5,2),
('c',6,2),
('d',2,1),
('e',4,1),
('a',2,3),
('c',50,4);
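You may use the RANK or DENSE_RANK functions. Note that filtering on RANK can return more than two rows per f_id when ord values tie; filter on ord_n instead if you need a hard limit of two.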
select A.name, A.ord_n, A.ord , A.f_id from
(
select
RANK() OVER (partition by f_id ORDER BY ord asc) AS "Rank",
ROW_NUMBER() OVER (partition by f_id ORDER BY ord asc) AS "ord_n",
help.*
from help
) A where A.rank <= 2
Sqlfiddle demo
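To show how RANK and ROW_NUMBER behave differently when ord values tie, here is a small Python sketch. It is an illustration only: number_rows is a hypothetical helper, and the tied rows for f_id 9 are invented for the example and do not appear in the question.

```python
# Rows are (name, ord, f_id), copied from the test table in the question.
help_rows = [("a", 1, 2), ("b", 5, 2), ("c", 6, 2), ("d", 2, 1),
             ("e", 4, 1), ("a", 2, 3), ("c", 50, 4)]

def number_rows(rows):
    """Return (f_id, row_number, rank, ord, name), numbered per f_id by ord."""
    out = []
    prev_f_id, prev_ord, row_number, rank = None, None, 0, 0
    for name, ord_value, f_id in sorted(rows, key=lambda r: (r[2], r[1])):
        if f_id != prev_f_id:        # new partition: restart both counters
            row_number, rank, prev_ord = 0, 0, None
        row_number += 1
        if ord_value != prev_ord:    # RANK only advances when ord changes
            rank = row_number
        out.append((f_id, row_number, rank, ord_value, name))
        prev_f_id, prev_ord = f_id, ord_value
    return out

# With the question's data there are no ties, so both filters agree.
top_two = [r for r in number_rows(help_rows) if r[1] <= 2]

# Hypothetical tied rows (not in the question) to show where the two differ.
tied = number_rows([("x", 1, 9), ("y", 1, 9), ("z", 1, 9)])
by_rank = [r for r in tied if r[2] <= 2]        # keeps all three tied rows
by_row_number = [r for r in tied if r[1] <= 2]  # keeps exactly two
print(len(by_rank), len(by_row_number))  # 3 2
```

In SQL, which of the tied rows ROW_NUMBER keeps is not defined unless the ORDER BY includes a tie-breaking column; the sketch simply keeps them in input order.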

Sequential Group By in sql server

For this Table:
+----+--------+-------+
| ID | Status | Value |
+----+--------+-------+
| 1 | 1 | 4 |
| 2 | 1 | 7 |
| 3 | 1 | 9 |
| 4 | 2 | 1 |
| 5 | 2 | 7 |
| 6 | 1 | 8 |
| 7 | 1 | 9 |
| 8 | 2 | 1 |
| 9 | 0 | 4 |
| 10 | 0 | 3 |
| 11 | 0 | 8 |
| 12 | 1 | 9 |
| 13 | 3 | 1 |
+----+--------+-------+
I need to sum sequential groups with the same Status to produce this result.
+--------+------------+
| Status | Sum(Value) |
+--------+------------+
| 1 | 20 |
| 2 | 8 |
| 1 | 17 |
| 2 | 1 |
| 0 | 15 |
| 1 | 9 |
| 3 | 1 |
+--------+------------+
How can I do that in SQL Server?
NB: The values in the ID column are contiguous.
Per the tag I added to your question, this is a gaps and islands problem.
The best performing solution will likely be
WITH T
AS (SELECT *,
ID - ROW_NUMBER() OVER (PARTITION BY [STATUS] ORDER BY [ID]) AS Grp
FROM YourTable)
SELECT [STATUS],
SUM([VALUE]) AS [SUM(VALUE)]
FROM T
GROUP BY [STATUS],
Grp
ORDER BY MIN(ID)
If the ID values were not guaranteed to be contiguous, as stated, then you would need to use
ROW_NUMBER() OVER (ORDER BY [ID]) -
ROW_NUMBER() OVER (PARTITION BY [STATUS] ORDER BY [ID]) AS Grp
in the CTE definition instead.
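The two-ROW_NUMBER variant can be sketched in Python as follows. This is an illustration under assumptions, not the answer's code: the Status and Value columns are the question's, but the ID values here are deliberately given gaps (my own invention) to show that the grouping no longer depends on contiguous IDs, and sum_islands is a hypothetical name.

```python
from collections import defaultdict

# (ID, Status, Value): Status and Value are from the question, while the IDs
# are hypothetical and intentionally non-contiguous.
rows = [(1, 1, 4), (2, 1, 7), (5, 1, 9), (6, 2, 1), (9, 2, 7),
        (10, 1, 8), (11, 1, 9), (15, 2, 1), (16, 0, 4), (20, 0, 3),
        (21, 0, 8), (30, 1, 9), (31, 3, 1)]

def sum_islands(rows):
    """Sum Value over each run of consecutive rows that share a Status."""
    per_status = defaultdict(int)  # ROW_NUMBER() OVER (PARTITION BY Status ORDER BY ID)
    totals = {}                    # (Status, Grp) -> running sum, in first-seen order
    for overall, (_row_id, status, value) in enumerate(sorted(rows), start=1):
        per_status[status] += 1
        grp = overall - per_status[status]  # constant within one island
        totals[(status, grp)] = totals.get((status, grp), 0) + value
    # Dict order is first-seen order, which matches ORDER BY MIN(ID) here.
    return [(status, total) for (status, _grp), total in totals.items()]

island_sums = sum_islands(rows)
print(island_sums)
# [(1, 20), (2, 8), (1, 17), (2, 1), (0, 15), (1, 9), (3, 1)]
```

Subtracting the per-status number from the raw ID would split the first island here, because IDs 1, 2 and 5 minus row numbers 1, 2 and 3 give 0, 0 and 2; using the overall row number avoids that.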
SQL Fiddle