I have a set of rankings, ordered by group and ranking:
Group | Rank
------------
A | 1
A | 2
A | 3
A | 4
A | 5
A | 6
B | 1
B | 2
B | 3
B | 4
C | 1
C | 2
C | 3
C | 4
C | 5
D | 1
D | 2
D | 3
D | 4
I want to interleave the groups, ordered by group and rank, n rankings per group at a time (here, n=2):
Group | Rank
------------
A | 1
A | 2
B | 1
B | 2
C | 1
C | 2
D | 1
D | 2
A | 3
A | 4
B | 3
B | 4
C | 3
C | 4
D | 3
D | 4
A | 5
A | 6
C | 5
I have achieved the desired result with loops and table variables (code pasted here because I got a nondescript syntax error in SQL Fiddle):
CREATE TABLE Rankings([Group] NCHAR(1), [Rank] INT)
INSERT Rankings
VALUES
('A',1),
('A',2),
('A',3),
('A',4),
('A',5),
('A',6),
('B',1),
('B',2),
('B',3),
('B',4),
('C',1),
('C',2),
('C',3),
('C',4),
('C',5),
('D',1),
('D',2),
('D',3),
('D',4)
-- input
DECLARE @n INT = 2 --number of group rankings per rotation
-- output
DECLARE @OrderedRankings TABLE([Group] NCHAR(1), Rank INT)
--
-- in-memory rankings.. we will be deleting used rows
DECLARE @RankingsTemp TABLE(GroupIndex INT, [Group] NCHAR(1), Rank INT)
INSERT @RankingsTemp
SELECT
ROW_NUMBER() OVER (PARTITION BY Rank ORDER BY [Group]) - 1 AS GroupIndex,
[Group],
Rank
FROM Rankings
ORDER BY [Group], Rank
-- loop variables
DECLARE @MaxGroupIndex INT = (SELECT MAX(GroupIndex) FROM @RankingsTemp)
DECLARE @RankingCount INT = (SELECT COUNT(*) FROM @RankingsTemp)
DECLARE @i INT
WHILE(@RankingCount > 0)
BEGIN
SET @i = 0;
WHILE(@i <= @MaxGroupIndex)
BEGIN
-- copy up to @n rows for the current group index to the output
INSERT INTO @OrderedRankings
([Group], Rank)
SELECT TOP(@n)
[Group],
Rank
FROM @RankingsTemp
WHERE GroupIndex = @i;
-- then delete those rows from the working set
WITH T AS (
SELECT TOP(@n) *
FROM @RankingsTemp
WHERE GroupIndex = @i
)
DELETE FROM T;
SET @i = @i + 1;
END
SET @RankingCount = (SELECT COUNT(*) FROM @RankingsTemp)
END
SELECT @RankingCount as RankingCount, @MaxGroupIndex as MaxGroupIndex
-- view results
SELECT * FROM @OrderedRankings
How can I achieve the desired ordering with a set-based approach (no loops, no table variables)?
I'm using SQL Server Enterprise 2008 R2.
Edit: To clarify, I need no more than n rows per group to appear contiguously. The goal of this query is to yield an ordering that, when read sequentially, offers an equal representation (n rows at a time) of each group, with respect to rank.
Perhaps something like this (SQL Fiddle):
ORDER BY
CEILING([Rank] * 1.0 / 2), [Group], [Rank]
The working fiddle uses slightly different column names; the 2 here is n.
Updated: I was burned by integer math; it should work now. Multiplying by 1.0 forces the int to decimal, so integer division doesn't drop the remainder that CEILING needs in order to round up correctly.
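To see why that sort key produces the interleaving, here is a minimal Python sketch of the same idea (Python rather than SQL, just so the logic can be checked outside a database). The `interleave` helper and the `sizes` dictionary are made up for illustration; the data mirrors the question's sample.

```python
from math import ceil

def interleave(rows, n):
    """Sort (group, rank) rows by block number ceil(rank / n), then group, then rank."""
    return sorted(rows, key=lambda r: (ceil(r[1] / n), r[0], r[1]))

# Sample data from the question: group -> number of ranked rows.
sizes = {"A": 6, "B": 4, "C": 5, "D": 4}
rows = [(g, r) for g, size in sizes.items() for r in range(1, size + 1)]

ordered = interleave(rows, 2)
for group, rank in ordered:
    print(group, rank)
```

Ranks 1 and 2 fall into block 1, ranks 3 and 4 into block 2, and so on, so each group contributes at most n consecutive rows before the next group takes over.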
Assuming you have a relatively low number of ranks, this would work:
Order by
case when rank <= n then 10
when rank <= 2*n then 20
when rank <= 3*n then 30
when rank <= 4*n then 40
when rank <= 5*n then 50 --more cases here if needed
else 100
end
, group
, rank
SELECT CEILING (RAND(CAST(NEWID() AS varbinary)) *275) AS RandomNumber
This creates random numbers. However, it spits out duplicates
Generate a numbers table with the range of your desire. In my case, I do it via recursive cte. Then order the numbers table using the newid function.
with numbers as (
select 0 as val union all
select val + 1 from numbers where val < 275
)
select ord = row_number() over(order by ap.nid),
val
into #rands
from numbers n
cross apply (select nid = newid()) ap
order by ord
option (maxrecursion 1000);
One run of the code above results in a table of 276 values that begins and ends as follows:
| ord | val |
+-----+-----+
| 1 | 102 |
| 2 | 4 |
| 3 | 127 |
| ... | ... |
| 276 | 194 |
This gives a random ordering of the numbers with no duplicates.
You can select from it a variety of ways, but one way could be:
-- initialize these to begin with
declare @ord int = 1;
declare @val int;
declare @rand int;
-- do this on every incremental need for a random number
select @val = val,
@ord = @ord + 1
from #rands
where ord = @ord;
print @val;
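The same "shuffle once, then read sequentially" idea can be sketched in Python. This is only an illustration of the technique, not a translation of the T-SQL; `make_pool` and the `seed` parameter are hypothetical names added so the run is repeatable.

```python
import random

def make_pool(low, high, seed=None):
    """Return every integer in [low, high] exactly once, in random order."""
    rng = random.Random(seed)
    pool = list(range(low, high + 1))
    rng.shuffle(pool)
    return pool

pool = make_pool(0, 275, seed=42)

# Each "draw" just takes the next value, so no number can repeat.
draws = iter(pool)
first = next(draws)
second = next(draws)
print(first, second)
```

Because the duplicates are ruled out when the pool is built, nothing has to be checked at draw time.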
In the comments to my other answer, you write:
The table I'm working with has an ID , Name , and I want to generate a 3rd column that assigns a unique random number between 1-275 (as there are 275 rows) with no duplicates.
In the future, please include details like this in your original question. With this information, we can help you out better.
First, let's make a sample of your problem. We'll simplify it to just 5 rows, not 275:
create table #data (
id int,
name varchar(10)
);
insert #data values
(101, 'Amanda'),
(102, 'Beatrice'),
(103, 'Courtney'),
(104, 'Denise'),
(105, 'Elvia');
Let's now add the third column you want:
alter table #data add rando int;
Finally, let's update the table by creating a subquery that orders the rows randomly using row_number(), and applying the output to the column we just created:
update reordered
set rando = rowNum
from (
select *,
rowNum = row_number() over(order by newid())
from #data
) reordered;
Here's the result I get, but of course it will be different every time it is run:
select *
from #data
| id | name | rando |
+-----+----------+-------+
| 101 | Amanda | 3 |
| 102 | Beatrice | 1 |
| 103 | Courtney | 4 |
| 104 | Denise | 5 |
| 105 | Elvia | 2 |
Here's a tough one: I have data coming back in a temporary table foo in this form:
id n v
-- - -
1 3 1
1 3 10
1 3 100
1 3 201
1 3 300
2 1 13
2 1 21
2 1 300
4 2 1
4 2 7
4 2 19
4 2 21
4 2 300
8 1 11
Grouping by id, I need to get the row with the nth-lowest value for v based on the value in n. For example, for the group with an ID of 1, I need to get the row which has v equal to 100, since 100 is the third-lowest value for v.
Here's what the final results need to look like:
id n v
-- - -
1 3 100
2 1 13
4 2 7
8 1 11
Some notes about the data:
the number of rows for each ID may vary
n will always be the same for every row with a given ID
n for a given ID will never be greater than the number of rows with that ID
the data will already be sorted by id, then v
Bonus points if you can do it in generic SQL instead of oracle-specific stuff, but that's not a requirement (I suspect that rownum may factor prominently in any solutions). It has in my attempts, but I wind up confusing myself before I get a working solution.
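I would use the row_number function in a CTE to number the rows within each id group and keep only the rows whose row number is at most n. A second CTE then numbers the remaining rows by v descending.
Taking rn = 1 from that gives the maximum v among the first n rows of each group, which is the nth-lowest value.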
CREATE TABLE foo(
id int,
n int,
v int
);
insert into foo values (1,3,1);
insert into foo values (1,3,10);
insert into foo values (1,3,100);
insert into foo values (1,3,201);
insert into foo values (1,3,300);
insert into foo values (2,1,13);
insert into foo values (2,1,21);
insert into foo values (2,1,300);
insert into foo values (4,2,1);
insert into foo values (4,2,7);
insert into foo values (4,2,19);
insert into foo values (4,2,21);
insert into foo values (4,2,300);
insert into foo values (8,1,11);
Query 1:
with cte as(
select id,n,v
from
(
select t.*, row_number() over(partition by id ,n order by v) as rn
from foo t
) t1
where rn <= n
), maxcte as (
select id,n,v, row_number() over(partition by id ,n order by v desc) rn
from cte
)
select id,n,v
from maxcte
where rn = 1
Results:
| ID | N | V |
|----|---|-----|
| 1 | 3 | 100 |
| 2 | 1 | 13 |
| 4 | 2 | 7 |
| 8 | 1 | 11 |
Use a window function: number the rows in each group by v and keep the row whose number equals n.
select * from
(
select t.*, row_number() over(partition by id ,n order by v) as rn
from foo t
) t1
where t1.rn=t1.n
A fixed condition such as t1.rn=3 would only match the first group of the OP's sample output; according to the description, the condition should be t1.rn=t1.n.
https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=65abf8d4101d2d1802c1a05ed82c9064
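As a cross-check of the expected result, here is a small Python sketch of "nth-lowest v per id", using the sample rows from the question. The `nth_lowest` function is a hypothetical helper; it relies on the question's guarantees that n is constant per id and never exceeds the group size.

```python
from collections import defaultdict

def nth_lowest(rows):
    """rows: (id, n, v) tuples. Return one (id, n, v) per id with the nth-lowest v."""
    groups = defaultdict(list)
    for id_, n, v in rows:
        groups[id_].append((n, v))
    result = []
    for id_ in sorted(groups):
        n = groups[id_][0][0]          # n is the same for every row of an id
        values = sorted(v for _, v in groups[id_])
        result.append((id_, n, values[n - 1]))
    return result

foo = [
    (1, 3, 1), (1, 3, 10), (1, 3, 100), (1, 3, 201), (1, 3, 300),
    (2, 1, 13), (2, 1, 21), (2, 1, 300),
    (4, 2, 1), (4, 2, 7), (4, 2, 19), (4, 2, 21), (4, 2, 300),
    (8, 1, 11),
]
picked = nth_lowest(foo)
print(picked)
```

Sorting inside the function plays the role of `order by v` in the window definition, so the result does not depend on the input order.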
If your database is version 12.1 or higher then there is a much simpler solution:
SELECT DISTINCT ID, n, NTH_VALUE(v,n) OVER (PARTITION BY ID) AS v
FROM foo
ORDER BY ID;
| ID | N | V |
|----|---|-----|
| 1 | 3 | 100 |
| 2 | 1 | 13 |
| 4 | 2 | 7 |
| 8 | 1 | 11 |
Depending on your real data, you may have to add an ORDER BY v clause and/or a windowing_clause such as RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING; see the NTH_VALUE documentation.
In one application, I have a table with three fields: Id, Name and Value.
Id | Name | Value
1 | A | 5
2 | B | 9
3 | C | 9
4 | D | 5
5 | E | 6
6 | F | 6
Now, how can I obtain a cross table from the above? I mean something like this:
Value | Count
---- | ----
5 | 2
6 | 2
7 | 0
8 | 0
9 | 2
Can you help, please?
First, you need to create a tally table. There are many methods for that. You will use the tally table to enumerate all the values between the min and max of your source table. Once you have all the numbers between min and max, you will need to LEFT JOIN them to a version of your table where you use COUNT() and GROUP BY to total the number of times each value appears.
In the query below, A is the tally table.
B is your aggregated source table.
DECLARE @MinValue INT
DECLARE @MaxValue INT
SET @MinValue = (SELECT MIN(Value) FROM dbo.MyTable)
SET @MaxValue = (SELECT MAX(Value) FROM dbo.MyTable)
SELECT number as Value, COALESCE(Count,0) AS Count
FROM (
SELECT DISTINCT number
FROM master..spt_values
WHERE number
BETWEEN @MinValue AND @MaxValue
) AS A
LEFT JOIN (
SELECT Value, COUNT(Value) AS Count
FROM dbo.MyTable
GROUP BY Value
) AS B
ON A.number = B.value
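The tally-table idea (every value between min and max on the left, counts joined in from the right, missing counts defaulting to 0) can be sketched in Python as follows. The `value_counts_with_gaps` name is made up for this illustration; the data is the sample from the question.

```python
from collections import Counter

def value_counts_with_gaps(values):
    """Count each value, including values between min and max that never occur."""
    counts = Counter(values)                      # the aggregated source table (B)
    tally = range(min(values), max(values) + 1)   # the tally table (A)
    # Counter returns 0 for missing keys, like COALESCE(Count, 0) after a LEFT JOIN.
    return [(number, counts[number]) for number in tally]

values = [5, 9, 9, 5, 6, 6]
table = value_counts_with_gaps(values)
for value, count in table:
    print(value, count)
```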
There is a table of the following structure:
CREATE TABLE history
(
pk serial NOT NULL,
"from" integer NOT NULL,
"to" integer NOT NULL,
entity_key text NOT NULL,
data text NOT NULL,
CONSTRAINT history_pkey PRIMARY KEY (pk)
);
The pk is the primary key. from and to define a row's position in a sequence, and together the rows define the sequence itself for a given entity identified by entity_key. For example, an entity has one sequence of 2 rows if the first row has from = 1; to = 2 and the second one has from = 2; to = 3. The point is that the to of the previous row matches the from of the next one.
The order that determines the "next"/"previous" row is defined by pk, which grows monotonically (since it's a SERIAL).
The sequence does not have to start with 1, and to - from is not necessarily 1; a row can have from = 1; to = 10. What matters is that the from of the "next" row in the sequence matches the to exactly.
Sample dataset:
pk | from | to | entity_key | data
----+--------+------+--------------+-------
1 | 1 | 2 | 42 | foo
2 | 2 | 3 | 42 | bar
3 | 3 | 4 | 42 | baz
4 | 10 | 11 | 42 | another foo
5 | 11 | 12 | 42 | another baz
6 | 1 | 2 | 111 | one one one
7 | 2 | 3 | 111 | one one one two
8 | 3 | 4 | 111 | one one one three
What I cannot figure out is how to partition by "sequences" here, so that I could apply window functions to the group that represents a single "sequence".
Let's say I want to use the row_number() function and would like to get the following result:
pk | row_number | entity_key
----+-------------+------------
1 | 1 | 42
2 | 2 | 42
3 | 3 | 42
4 | 1 | 42
5 | 2 | 42
6 | 1 | 111
7 | 2 | 111
8 | 3 | 111
For convenience I created an SQLFiddle with initial seed: http://sqlfiddle.com/#!15/e7c1c
PS: This is not a "give me the codez" question; I did my own research and I'm just out of ideas on how to partition.
It's obvious that I need to LEFT JOIN on next.from = curr.to, but it's still not clear how to reset the partition when next.from IS NULL.
PS: There will be a 100-point bounty for the most elegant query that provides the requested result.
PPS: The desired solution should be an SQL query, not pgsql, due to some other limitations that are out of scope of this question.
I don’t know if it counts as “elegant,” but I think this will do what you want:
with Lagged as (
select
pk,
case when lag("to",1) over (order by pk) is distinct from "from" then 1 else 0 end as starts,
entity_key
from history
), LaggedGroups as (
select
pk,
sum(starts) over (order by pk) as groups,
entity_key
from Lagged
)
select
pk,
row_number() over (
partition by groups
order by pk
) as "row_number",
entity_key
from LaggedGroups
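The query follows the usual "flag the starts, take a running sum, number within each group" pattern. A rough Python sketch of those three steps, run on the question's sample data, may make the mechanics easier to follow; `number_within_sequences` is a hypothetical name and rows are simplified to (pk, from, to, entity_key) tuples.

```python
from itertools import accumulate

def number_within_sequences(rows):
    """rows: (pk, from_, to, entity_key). Return (pk, row_number, entity_key) per row."""
    rows = sorted(rows)  # order by pk
    # Step 1: flag a row as a start when the previous row's "to" differs from its "from".
    starts = [
        1 if i == 0 or rows[i - 1][2] != rows[i][1] else 0
        for i in range(len(rows))
    ]
    # Step 2: a running sum of the flags gives each sequence its own group id.
    groups = list(accumulate(starts))
    # Step 3: number the rows within each group, in pk order.
    counters = {}
    result = []
    for row, group in zip(rows, groups):
        counters[group] = counters.get(group, 0) + 1
        result.append((row[0], counters[group], row[3]))
    return result

history = [
    (1, 1, 2, "42"), (2, 2, 3, "42"), (3, 3, 4, "42"),
    (4, 10, 11, "42"), (5, 11, 12, "42"),
    (6, 1, 2, "111"), (7, 2, 3, "111"), (8, 3, 4, "111"),
]
numbered = number_within_sequences(history)
print(numbered)
```

Like the query, this sketch compares each row only with the row immediately before it in pk order and does not look at entity_key when deciding where a sequence starts.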
Just for fun & completeness: a recursive solution to reconstruct the (doubly) linked lists of records. [ this will not be the fastest solution ]
NOTE: I commented out the ascending pk condition(s) since they are not needed for the connection logic.
WITH RECURSIVE zzz AS (
SELECT h0.pk
, h0."to" AS next
, h0.entity_key AS ek
, 1::integer AS rnk
FROM history h0
WHERE NOT EXISTS (
SELECT * FROM history nx
WHERE nx.entity_key = h0.entity_key
AND nx."to" = h0."from"
-- AND nx.pk > h0.pk
)
UNION ALL
SELECT h1.pk
, h1."to" AS next
, h1.entity_key AS ek
, 1+zzz.rnk AS rnk
FROM zzz
JOIN history h1
ON h1.entity_key = zzz.ek
AND h1."from" = zzz.next
-- AND h1.pk > zzz.pk
)
SELECT * FROM zzz
ORDER BY ek,pk
;
You can use generate_series() to generate all the rows between the two values. Then you can use the difference of row numbers on that:
select pk, "from", "to",
row_number() over (partition by entity_key, min(grp) order by pk) as row_number
from (select h.*,
(row_number() over (partition by entity_key order by ind) -
ind) as grp
from (select h.*, generate_series("from", "to" - 1) as ind
from history h
) h
) h
group by pk, "from", "to", entity_key
Because you specify that the difference is between 1 and 10, this might actually not have such bad performance.
Unfortunately, your SQL Fiddle isn't working right now, so I can't test it.
Well, this is not exactly a single SQL query, but:
select a.pk as PK, a.entity_key as ENTITY_KEY, b.pk as BPK, 0 as Seq into #tmp
from history a left join history b on a."to" = b."from" and a.pk = b.pk-1
declare @seq int
select @seq = 1
update #tmp set Seq = case when (BPK is null) then @seq-1 else @seq end,
@seq = case when (BPK is null) then @seq+1 else @seq end
select pk, entity_key, ROW_NUMBER() over (PARTITION by entity_key, seq order by pk asc)
from #tmp order by pk
This is for SQL Server 2008.
Let's say I had the following table:
id | name | points
-------------------
1 | joe | 100
2 | bob | 95
3 | max | 95
4 | leo | 90
Can I produce a reversed rank recordset like so:
id | name | points | rank
--------------------------
4 | leo | 90 | 1
3 | max | 95 | 2.5
2 | bob | 95 | 2.5
1 | joe | 100 | 4
This is a fully working example, with this sample table
create table tpoints (id int, name varchar(10), points int);
insert tpoints values
(1 ,'joe', 100 ),
(2 ,'bob', 95 ),
(3 ,'max', 95 ),
(4 ,'leo', 90 );
The MySQL query
select t.*, sq.`rank`
from
(
select
points,
@rank := case when @g = points then @rank else @rn + (c-1)/2.0 end `rank`,
@g := points,
@rn := @rn + c
from
(select @g:=null, @rn:=1) g,
(select points, count(*) c
from tpoints
group by points
order by points asc) p
) sq inner join tpoints t on t.points = sq.points
order by t.points asc;
It also has the benefit of performing very well compared to performing a correlated cross (self) join.
1x pass through tpoints to aggregate into groups
calculation of rank with ties
1x join to table to put ranks against the records.
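The arithmetic behind the tied ranks can be sketched in Python: a group of c tied rows starting at position p gets the average of positions p through p + c - 1, which is p + (c - 1) / 2. The `average_ranks` function is a hypothetical illustration of that calculation, not a translation of the MySQL query.

```python
from collections import Counter

def average_ranks(points):
    """Map each distinct points value to its ascending rank, averaging ties."""
    counts = Counter(points)           # one pass to aggregate into groups
    ranks = {}
    next_position = 1
    for value in sorted(counts):
        c = counts[value]
        ranks[value] = next_position + (c - 1) / 2
        next_position += c
    return ranks

ranks = average_ranks([100, 95, 95, 90])
for name, pts in [("leo", 90), ("max", 95), ("bob", 95), ("joe", 100)]:
    print(name, pts, ranks[pts])
```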
This won't produce "2.5" as a rank value, but duplicates will share the same number if you use:
SELECT x.id,
x.name,
x.points,
(SELECT COUNT(*)
FROM YOUR_TABLE y
WHERE y.points <= x.points) AS rank
FROM YOUR_TABLE x
ORDER BY x.points