Rank of partitions in T-SQL

Rank of partitions in T-SQL - sql

Given the following table:
CREATE TABLE #values (ID int, TYPE nchar(2), NUMBER int)
INSERT INTO #values values (1, 'A', 0)
INSERT INTO #values values (2, 'A', 0)
INSERT INTO #values values (3, 'B', 1)
INSERT INTO #values values (4, 'A', 1)
INSERT INTO #values values (5, 'B', 2)
SELECT * FROM #values
I would like to generate this table:
Id | T | N | COUNT
------------------
1 | A | 0 | 1000
2 | A | 0 | 1000
3 | B | 1 | 1001
4 | A | 1 | 1002
5 | B | 2 | 1003
How can I do this in T-SQL?
I've been fiddling with ROW_NUMBER() OVER(PARTITION BY) but this does not solve the problem, as it resets the count at each partition, which is not what I would like to do.

I think you're looking for dense_rank:
SELECT
ID,
TYPE,
NUMBER,
DENSE_RANK() over (order by TYPE, Number)
FROM #values
This produces
1 A 0 1
2 A 0 1
4 A 1 2
3 B 1 3
5 B 2 4

Related

SQL group by and sum based on distinct value in other column (sum once if value in other column is duplicated)

I need help with a group-by query. My table looks like this:
CREATE MULTISET TABLE MY_TABLE (PERSON CHAR(1), ITEM CHAR(1), COST INT);
INSERT INTO MY_TABLE VALUES ('A', '1', 5);
INSERT INTO MY_TABLE VALUES ('A', '1', 5);
INSERT INTO MY_TABLE VALUES ('A', '2', 1);
INSERT INTO MY_TABLE VALUES ('B', '3', 0);
INSERT INTO MY_TABLE VALUES ('B', '4', 10);
INSERT INTO MY_TABLE VALUES ('B', '4', 10);
INSERT INTO MY_TABLE VALUES ('C', '5', 1);
INSERT INTO MY_TABLE VALUES ('C', '5', 1);
INSERT INTO MY_TABLE VALUES ('C', '5', 1);
+--------+------+------+
| PERSON | ITEM | COST |
+--------+------+------+
| A | 1 | 5 |
| A | 1 | 5 |
| A | 2 | 1 |
| B | 3 | 0 |
| B | 4 | 10 |
| B | 4 | 10 |
| C | 5 | 1 |
| C | 5 | 1 |
| C | 5 | 1 |
+--------+------+------+
I need to group items and costs by person, but in different ways. For each person, I need the number of unique items they have. Ex: Person A has two distinct items, item 1 and item 2. I can get this with COUNT(DISTINCT ITEM).
Then for each person, I need to sum the cost but only once per distinct item (for duplicate items, the cost is always the same). Ex: Person A has item 1 for $5, item 1 for $5, and item 2 for $1. Since this person has item 1 twice, I count the $5 once, and then add the $1 from item 2 for a total of $6. The output should look like this:
+--------+---------------------+------------------------+
| PERSON | ITEM_DISTINCT_COUNT | COST_DISTINCT_ITEM_SUM |
+--------+---------------------+------------------------+
| A | 2 | 6 |
| B | 2 | 10 |
| C | 1 | 1 |
+--------+---------------------+------------------------+
Is there an easy way to do this that performs good on a lot of rows?
SELECT PERSON
,COUNT(DISTINCT ITEM) ITEM_DISTINCT_COUNT
-- help with COST_DISTINCT_ITEM_SUM
FROM MY_TABLE
GROUP BY PERSON

You can make a subquery which gets the distinct values of item and cost for each person, and then aggregate over that:
SELECT PERSON,
COUNT(ITEM) AS ITEM_DISTINCT_COUNT,
SUM(COST) AS COST_DISTINCT_ITEM_SUM
FROM (
SELECT DISTINCT PERSON, ITEM, COST
FROM MY_TABLE
) M
GROUP BY PERSON
Output:
PERSON ITEM_DISTINCT_COUNT COST_DISTINCT_ITEM_SUM
A 2 6
B 2 10
C 1 1
Demo on dbfiddle

I recommend a two levels of aggregation:
select person, count(*) as num_items, sum(cost)
from (select person, item, avg(cost) as cost
from my_table t
group by person, item
) t
group by person;

Classify records based on matching table

I have two tables: ITEMS and MATCHING_ITEMS, as below:
ITEMS:
|---------------------|------------------|
| ID | Name |
|---------------------|------------------|
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | D |
| 5 | E |
| 6 | F |
| 7 | G |
|---------------------|------------------|
MATCHING_ITEMS:
|---------------------|------------------|
| ID_1 | ID_2 |
|---------------------|------------------|
| 1 | 2 |
| 1 | 3 |
| 2 | 3 |
| 4 | 5 |
| 4 | 6 |
| 5 | 6 |
|---------------------|------------------|
The MATCHING_ITEMS table defines items that match each other, and thus belong to the same group, i.e. items 1,2, and 3 match with each other and thus belong in a group, and the same for items 4,5, and 6. Item 7 does not have a match belong to any group.
I now need to add a 'Group' column on the ITEMS table which contains a unique integer for each group, so it would look as follows:
ITEMS:
|---------------------|------------------|------------------|
| ID | Name | Group |
|---------------------|------------------|------------------|
| 1 | A | 1 |
| 2 | B | 1 |
| 3 | C | 1 |
| 4 | D | 2 |
| 5 | E | 2 |
| 6 | F | 2 |
| 7 | G | NULL |
|---------------------|------------------|------------------|
So far I have been using a stored procedure to do this, looping over each line in the MATCHING_ITEMS table and updating the ITEMS table with a group value. The problem is that I eventually need to do this for a table containing millions of records, and the looping method is far too slow.
Is there a way that I can achieve this without using a loop?

If you have all pairs of matches in the matching table, then you can just use the minimum id to assign the group. For this:
select i.*,
(case when grp_id is not null
then dense_rank() over (order by grp_id)
end) as grouping
from items i left join
(select mi.id_1, least(mi.id1, min(mi.id2)) as grp_id
from matching_items mi
group by mi.id_1
) mi
on i.id = mi.id_1;
Note: This works only if all pairs are in the matching items table. Otherwise, you will need a recursive/hierarchical query to get all the pairs.

You could use min and max at first, then dense_rank to assign group numbers:
select id, name, dense_rank() over (order by mn, mx) grp
from (
select distinct id, name,
min(id_1) over (partition by name) mn,
max(id_2) over (partition by name) mx
from items left join matching_items on id in (id_1, id_2))
order by id
demo

The pairs 2,3 and 5,6 in the Matching_items table seem redundant as they could be derived (if I am reading your question right)
Here is how I did it. I just reused id_1 from your example as the group no:
create table
items (
ID number,
name varchar2 (2)
);
insert into items values (1, 'A');
insert into items values (2, 'B');
insert into items values (3, 'C');
insert into items values (4, 'D');
insert into items values (5, 'E');
insert into items values (6, 'F');
insert into items values (7, 'G');
create table
matching_items (
ID number,
ID_2 number
);
insert into matching_items values (1, 2);
insert into matching_items values (1, 3);
insert into matching_items values (2, 3);
insert into matching_items values (4, 5);
insert into matching_items values (4, 6);
insert into matching_items values (5, 6);
with new_grp as
(
select id, id_2, id as group_no
from matching_items
where id in (select id from items)
and id not in (select id_2 from matching_items)),
assign_grp as
(
select id, group_no
from new_grp
union
select id_2, group_no
from new_grp)
select items.id, name, group_no
from items left outer join assign_grp
on items.id = assign_grp.id;

Running sum for subgroups of records in SQL Server

I need to calculate a running total for groups of data within a table (in SQL Server 2014). Please see the example below. I need to calculate the RunningTotalByID column.
Any thoughts/ideas would be much appreciated.
Thanks!

SQL Server 2014 supports SUM() OVER (PARTITION BY ... ORDER BY ...), which calculates running sum:
DECLARE #T TABLE (ID char(1), Value int);
INSERT INTO #T (ID, Value) VALUES
('A', 1),
('A', 1),
('B', 1),
('B', 1),
('B', 1),
('C', 1),
('C', 1);
SELECT
ID
,Value
,SUM(Value) OVER (PARTITION BY ID ORDER BY Value
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal
FROM #T
ORDER BY ID, RunningTotal;
Result
+----+-------+--------------+
| ID | Value | RunningTotal |
+----+-------+--------------+
| A | 1 | 1 |
| A | 1 | 2 |
| B | 1 | 1 |
| B | 1 | 2 |
| B | 1 | 3 |
| C | 1 | 1 |
| C | 1 | 2 |
+----+-------+--------------+

How to select shipments that have one activity but doesn't have another one

Simplified version of a table
Table ActivityHistory:
ActivityHistoryid(PK) | ShipmentID | ActivityCode | Datetime
1 | 1 | CodeA |
2 | 1 | CodeB |
3 | 1 | CodeC |
4 | 2 | CodeA |
5 | 3 | CodeA |
6 | 3 | CodeB |
7 | 4 | CodeC |
This table contains the list of activities that occurred to given shipments.
Task: I need to select shipments(shipment ids) that has "CodeA" and doesn't have a "CodeC" activity.
In this example, shipment id 2 and 3 will match the criteria.
Table Shipment: (ShipmentID(PK), other shipment related columns)
Thank you.

Try this one -
Query:
DECLARE #temp TABLE
(
ActivityHistoryid INT
, ShipmentID INT
, ActivityCode VARCHAR(20)
)
INSERT INTO #temp (ActivityHistoryid, ShipmentID, ActivityCode)
VALUES
(1, 1, 'CodeA'),
(2, 1, 'CodeB'),
(3, 1, 'CodeC'),
(4, 2, 'CodeA'),
(5, 3, 'CodeA'),
(6, 3, 'CodeB'),
(7, 4, 'CodeC')
SELECT *
FROM #temp t
WHERE ActivityCode = 'CodeA'
AND NOT EXISTS(
SELECT 1
FROM #temp t2
WHERE t2.ActivityCode = 'CodeC'
AND t2.ShipmentID = t.ShipmentID
)
Output:
ActivityHistoryid ShipmentID ActivityCode
----------------- ----------- --------------------
4 2 CodeA
5 3 CodeA

SELECT inherit values from parent in a hierarchy

I'm trying to acheive through T-SQL (in a stored procedure) a way to copy a value from a parent into the child when retrieving rows. Here is some example data:
DROP TABLE TEST_LEVELS
CREATE TABLE TEST_LEVELS(
ID INT NOT NULL
,VALUE INT NULL
,PARENT_ID INT NULL
,LEVEL_NO INT NOT NULL
)
INSERT INTO TEST_LEVELS (ID, VALUE, PARENT_ID, LEVEL_NO) VALUES (1, 10000, NULL, 1)
INSERT INTO TEST_LEVELS (ID, VALUE, PARENT_ID, LEVEL_NO) VALUES (2, NULL, 1, 2)
INSERT INTO TEST_LEVELS (ID, VALUE, PARENT_ID, LEVEL_NO) VALUES (3, NULL, 2, 3)
INSERT INTO TEST_LEVELS (ID, VALUE, PARENT_ID, LEVEL_NO) VALUES (4, 20000, NULL, 1)
INSERT INTO TEST_LEVELS (ID, VALUE, PARENT_ID, LEVEL_NO) VALUES (5, NULL, 4, 2)
INSERT INTO TEST_LEVELS (ID, VALUE, PARENT_ID, LEVEL_NO) VALUES (6, 25000, 5, 3)
INSERT INTO TEST_LEVELS (ID, VALUE, PARENT_ID, LEVEL_NO) VALUES (7, NULL, 6, 4)
Selecting the data as follows:
SELECT ID, VALUE, LEVEL_NO
FROM TEST_LEVELS
results in:
+----+-------+----------+
| ID | VALUE | LEVEL_NO |
+----+-------+----------+
| 1 | 10000 | 1 |
| 2 | NULL | 2 |
| 3 | NULL | 3 |
| 4 | 20000 | 1 |
| 5 | NULL | 2 |
| 6 | 25000 | 3 |
| 7 | NULL | 4 |
+----+-------+----------+
But I need something like this (values are inherited by the parent):
+----+-------+----------+
| ID | VALUE | LEVEL_NO |
+----+-------+----------+
| 1 | 10000 | 1 |
| 2 | 10000 | 2 |
| 3 | 10000 | 3 |
| 4 | 20000 | 1 |
| 5 | 20000 | 2 |
| 6 | 25000 | 3 |
| 7 | 25000 | 4 |
+----+-------+----------+
Can this be achieved without using cursors (it must also run on SQL Server 2005)?

Use:
;with cte
as
(
select t.ID, t.VALUE, t.PARENT_ID, t.LEVEL_NO
from #t t
where t.Value is not null
union all
select t.ID, c.Value, t.PARENT_ID, t.LEVEL_NO
from cte c
join #t t on t.PARENT_ID = c.ID
where t.Value is null
)
select c.ID, c.Value, c.LEVEL_NO
from cte c
order by c.ID
Output:
ID Value LEVEL_NO
----------- ----------- -----------
1 10000 1
2 10000 2
3 10000 3
4 20000 1
5 20000 2
6 25000 3
7 25000 4

Maybe something like this:
;WITH cte_name(ID,VALUE,PARENT_ID,LEVEL_NO)
AS
(
SELECT
tbl.ID,
tbl.VALUE,
tbl.PARENT_ID,
tbl.LEVEL_NO
FROM
TEST_LEVELS AS tbl
WHERE
tbl.PARENT_ID IS NULL
UNION ALL
SELECT
tbl.ID,
ISNULL(tbl.VALUE,cte_name.VALUE),
tbl.PARENT_ID,
tbl.LEVEL_NO
FROM
cte_name
JOIN TEST_LEVELS AS tbl
ON cte_name.ID=tbl.PARENT_ID
)
SELECT
*
FROM
cte_name
ORDER BY
ID

One way to do it:
SELECT T.ID,
case when T.VALUE IS NULL
THEN (SELECT A.VALUE FROM TEST_LEVELS A WHERE A.ID = T.PARENT_ID)
ELSE T.VALUE
END,
T.LEVEL_NO
FROM TEST_LEVELS T

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Rank of partitions in T-SQL - sql

I think you're looking for dense_rank: SELECT ID, TYPE, NUMBER, DENSE_RANK() over (order by TYPE, Number) FROM #values This produces 1 A 0 1 2 A 0 1 4 A 1 2 3 B 1 3 5 B 2 4

Related

SQL group by and sum based on distinct value in other column (sum once if value in other column is duplicated)

Classify records based on matching table

Running sum for subgroups of records in SQL Server

How to select shipments that have one activity but doesn't have another one

SELECT inherit values from parent in a hierarchy

Categories

Resources