Reorder the rows of a table according to the numbers of similar cells in a specific column using SQL


I have a table like this:
D S
2 1
2 3
4 2
4 3
4 5
6 1
It shows the symptom codes (S) of three diseases (D). I want to rearrange this table (D-S) so that the diseases with more symptoms come first, i.e. order it by decreasing number of symptoms, as below:
D S
4 2
4 3
4 5
2 1
2 3
6 1
Can anyone help me write the SQL for this in SQL Server?
I tried the following, but it doesn't work:
SELECT *
FROM (
select D, Count(S) cnt
from [D-S]
group by D
) Q
order by Q.cnt desc

select
D,
S
from
[D-S]
order by
count(*) over(partition by D) desc,
D,
S;

Two easy ways to approach this:
--==== Sample Data
DECLARE @t TABLE (D INT, S INT);
INSERT @t VALUES(2,1),(2,3),(4,2),(4,3),(4,5),(6,1);
--==== Using Window Function
SELECT t.D, t.S
FROM (SELECT t.*, Cnt = COUNT(*) OVER (PARTITION BY t.D) FROM @t AS t) AS t
ORDER BY t.Cnt DESC, t.D, t.S; -- D and S break ties between equal counts
--==== Using standard GROUP BY
SELECT t.*
FROM @t AS t
JOIN
(
SELECT t2.D, Cnt = COUNT(*)
FROM @t AS t2
GROUP BY t2.D
) AS t2 ON t.D = t2.D
ORDER BY t2.Cnt DESC, t.D, t.S;
Results:
D S
----------- -----------
4 2
4 3
4 5
2 1
2 3
6 1
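
For anyone who wants to try the ordering without a SQL Server instance, here is a minimal sketch that runs the same ORDER BY COUNT(*) OVER (PARTITION BY D) idea through Python's built-in sqlite3 module. It assumes SQLite 3.25 or later (for window functions), and the helper name order_by_group_size is made up for illustration; SQL Server syntax differs in details such as [D-S] versus "D-S".

```python
import sqlite3

def order_by_group_size(rows):
    """Return (D, S) rows with the largest D groups first (hypothetical helper)."""
    con = sqlite3.connect(":memory:")
    # Double quotes are needed because the table name contains a hyphen.
    con.execute('CREATE TABLE "D-S" (D INTEGER, S INTEGER)')
    con.executemany('INSERT INTO "D-S" VALUES (?, ?)', rows)
    result = con.execute(
        '''
        SELECT D, S
        FROM "D-S"
        ORDER BY COUNT(*) OVER (PARTITION BY D) DESC, D, S
        '''
    ).fetchall()
    con.close()
    return result

sample = [(2, 1), (2, 3), (4, 2), (4, 3), (4, 5), (6, 1)]
ordered = order_by_group_size(sample)
```

The extra D, S sort keys only matter when two diseases have the same number of symptoms; without them the relative order of such groups is unspecified.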

Related

Rolling Average in SQL with Partition [duplicate]

create table #t
(
id int,
SomeNumt int
)
insert into #t
select 1,10
union
select 2,12
union
select 3,3
union
select 4,15
union
select 5,23
select * from #t
The select above returns the following:
id SomeNumt
1 10
2 12
3 3
4 15
5 23
How do I get the following:
id srome CumSrome
1 10 10
2 12 22
3 3 25
4 15 40
5 23 63

select t1.id, t1.SomeNumt, SUM(t2.SomeNumt) as sum
from #t t1
inner join #t t2 on t1.id >= t2.id
group by t1.id, t1.SomeNumt
order by t1.id
Output
| ID | SOMENUMT | SUM |
-----------------------
| 1 | 10 | 10 |
| 2 | 12 | 22 |
| 3 | 3 | 25 |
| 4 | 15 | 40 |
| 5 | 23 | 63 |
Edit: this is a generalized solution that will work across most db platforms. When there is a better solution available for your specific platform (e.g., gareth's), use it!

SQL Server 2012 and later permit the following.
SELECT
RowID,
Col1,
SUM(Col1) OVER(ORDER BY RowId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Col2
FROM tablehh
ORDER BY RowId
or
SELECT
GroupID,
RowID,
Col1,
SUM(Col1) OVER(PARTITION BY GroupID ORDER BY RowId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Col2
FROM tablehh
ORDER BY RowId
This is even faster. Partitioned version completes in 34 seconds over 5 million rows for me.
Thanks to Peso, who commented on the SQL Team thread referred to in another answer.

From SQL Server 2012 onwards this is straightforward:
SELECT id, SomeNumt, SUM(SomeNumt) OVER (ORDER BY id) AS CumSrome FROM #t
This works because, when ORDER BY is specified for SUM without an explicit frame, the window frame defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (see "General Remarks" at https://msdn.microsoft.com/en-us/library/ms189461.aspx).
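
The default RANGE frame and an explicit ROWS frame only differ when the ORDER BY column has ties. The sketch below illustrates that with Python's sqlite3 module, assuming SQLite 3.25 or later, whose default frame is also RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. The table t, its columns and the helper running_sums are made up for the example; this is not SQL Server output.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER, amount INTEGER)")
# Two rows share id = 2 on purpose, so the two frame types can differ.
con.executemany("INSERT INTO t VALUES (?, ?)", [(1, 10), (2, 12), (2, 3), (3, 15)])

def running_sums(frame):
    """Return (id, running_sum) pairs for SUM(amount) OVER (ORDER BY id <frame>)."""
    sql = (
        f"SELECT id, SUM(amount) OVER (ORDER BY id {frame}) AS s "
        "FROM t ORDER BY id, s"
    )
    return con.execute(sql).fetchall()

default_frame = running_sums("")  # no explicit frame
range_frame = running_sums("RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW")
rows_frame = running_sums("ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW")
```

With RANGE, both id = 2 rows are peers and get the same total (25). With ROWS, each row gets its own running total, and which of the tied rows comes first is not defined unless the ORDER BY is made unique.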

Let's first create a table with dummy data:
Create Table CUMULATIVESUM (id tinyint , SomeValue tinyint)
Now let's insert some data into the table;
Insert Into CUMULATIVESUM
Select 1, 10 union
Select 2, 2 union
Select 3, 6 union
Select 4, 10
Here I am joining same table (self joining)
Select c1.ID, c1.SomeValue, c2.SomeValue
From CumulativeSum c1, CumulativeSum c2
Where c1.id >= c2.ID
Order By c1.id Asc
Result:
ID SomeValue SomeValue
-------------------------
1 10 10
2 2 10
2 2 2
3 6 10
3 6 2
3 6 6
4 10 10
4 10 2
4 10 6
4 10 10
Now just sum the SomeValue of c2 and we'll get the answer:
Select c1.ID, c1.SomeValue, Sum(c2.SomeValue) CumulativeSumValue
From CumulativeSum c1, CumulativeSum c2
Where c1.id >= c2.ID
Group By c1.ID, c1.SomeValue
Order By c1.id Asc
For SQL Server 2012 and above (much better performance):
Select
c1.ID, c1.SomeValue,
Sum(c1.SomeValue) Over (Order By c1.ID) CumulativeSumValue
From CumulativeSum c1
Order By c1.id Asc
Desired result:
ID SomeValue CumulativeSumValue
---------------------------------
1 10 10
2 2 12
3 6 18
4 10 28
Drop Table CumulativeSum

A CTE version, just for fun (it assumes the ids are consecutive and start at 1):
;
WITH abcd
AS ( SELECT id
,SomeNumt
,SomeNumt AS MySum
FROM #t
WHERE id = 1
UNION ALL
SELECT t.id
,t.SomeNumt
,t.SomeNumt + a.MySum AS MySum
FROM #t AS t
JOIN abcd AS a ON a.id = t.id - 1
)
SELECT * FROM abcd
OPTION ( MAXRECURSION 1000 ) -- limit recursion here, or 0 for no limit.
Returns:
id SomeNumt MySum
----------- ----------- -----------
1 10 10
2 12 22
3 3 25
4 15 40
5 23 63

Late answer, but showing one more possibility...
Cumulative sum generation can also be written with CROSS APPLY.
In the actual query plans for this small sample, it showed a lower relative cost than the INNER JOIN and OVER clause versions:
/* Create table & populate data */
IF OBJECT_ID('tempdb..#TMP') IS NOT NULL
DROP TABLE #TMP
SELECT * INTO #TMP
FROM (
SELECT 1 AS id
UNION
SELECT 2 AS id
UNION
SELECT 3 AS id
UNION
SELECT 4 AS id
UNION
SELECT 5 AS id
) Tab
/* Using CROSS APPLY
Query cost relative to the batch 17%
*/
SELECT T1.id,
T2.CumSum
FROM #TMP T1
CROSS APPLY (
SELECT SUM(T2.id) AS CumSum
FROM #TMP T2
WHERE T1.id >= T2.id
) T2
/* Using INNER JOIN
Query cost relative to the batch 46%
*/
SELECT T1.id,
SUM(T2.id) CumSum
FROM #TMP T1
INNER JOIN #TMP T2
ON T1.id >= T2.id
GROUP BY T1.id
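/* Using OVER clause
Query cost relative to the batch 37%
(measured with PARTITION BY id, which does not accumulate;
ORDER BY id is what produces the running total)
*/
SELECT T1.id,
SUM(T1.id) OVER (ORDER BY T1.id) AS CumSum
FROM #TMP T1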
Output:-
id CumSum
------- -------
1 1
2 3
3 6
4 10
5 15

Select
M.*,
(Select Sum(S.SomeNumt)
From #t S
Where S.id <= M.id) As CumSum
From #t M

You can use this simple query for a running (progressive) total:
select
id
,SomeNumt
,sum(SomeNumt) over(order by id ROWS between UNBOUNDED PRECEDING and CURRENT ROW) as CumSrome
from #t

There is a much faster CTE implementation available in this excellent post:
http://weblogs.sqlteam.com/mladenp/archive/2009/07/28/SQL-Server-2005-Fast-Running-Totals.aspx
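The problem in this thread can be expressed like this (the update needs a MySum column to write to, so this assumes #t has one):
DECLARE @RT INT
SELECT @RT = 0
;
WITH abcd
AS ( SELECT TOP 100 percent
id
,SomeNumt
,MySum
FROM #t -- assumption: #t has an extra MySum INT column
order by id
)
update abcd
set @RT = MySum = @RT + SomeNumt
output inserted.*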

For example, if you have a table T with two columns, ID and Number, and want the cumulative sum:
SELECT ID, Number, SUM(Number) OVER (ORDER BY ID) AS CumSum FROM T

Once the table is created -
select
A.id, A.SomeNumt, SUM(B.SomeNumt) as sum
from #t A, #t B where A.id >= B.id
group by A.id, A.SomeNumt
order by A.id

The SQL solution which combines "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" and "SUM" did exactly what I wanted to achieve.
Thank you so much!
If it can help anyone, here was my case. I wanted to add 1 to a running count in a column whenever the maker is "Some Maker" (example). Otherwise the count is not incremented and the previous result is shown.
So this piece of SQL:
SUM( CASE [rmaker] WHEN 'Some Maker' THEN 1 ELSE 0 END)
OVER
(PARTITION BY UserID ORDER BY UserID,[rrank] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Cumul_CNT
Allowed me to get something like this:
User 1 Rank1 MakerA 0
User 1 Rank2 MakerB 0
User 1 Rank3 Some Maker 1
User 1 Rank4 Some Maker 2
User 1 Rank5 MakerC 2
User 1 Rank6 Some Maker 3
User 2 Rank1 MakerA 0
User 2 Rank2 Some Maker 1
Explanation of the above: the count of "Some Maker" starts at 0, and each time Some Maker is found we add 1. For User 1, when MakerC is found we don't add 1, so the running count stays at 2 until the next Some Maker row.
Partitioning is by user, so when the user changes, the cumulative count goes back to zero.
I am at work and I don't want any credit for this answer; I just want to say thank you and show my example in case someone is in the same situation. I was trying to combine SUM and PARTITION, but the "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" syntax completed the task.
Thanks!
Groaker

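Above (pre-SQL Server 2012) we see examples like this:
SELECT
T1.id, SUM(T2.id) AS CumSum
FROM
#TMP T1
JOIN #TMP T2 ON T2.id <= T1.id
GROUP BY
T1.id
A variant that joins fewer rows (it needs a LEFT JOIN and COALESCE so that the first row, which has no smaller id, is kept):
SELECT
T1.id, COALESCE(SUM(T2.id), 0) + T1.id AS CumSum
FROM
#TMP T1
LEFT JOIN #TMP T2 ON T2.id < T1.id
GROUP BY
T1.id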

Try this
select
t.id,
t.SomeNumt,
sum(t.SomeNumt) Over (Order by t.id asc Rows Between Unbounded Preceding and Current Row) as cum
from
#t t
group by
t.id,
t.SomeNumt
order by
t.id asc;

Try this:
CREATE TABLE #t(
[name] varchar NULL,
[val] [int] NULL,
[ID] [int] NULL
) ON [PRIMARY]
insert into #t (id,name,val) values
(1,'A',10), (2,'B',20), (3,'C',30)
select t1.id, t1.val, SUM(t2.val) as cumSum
from #t t1 inner join #t t2 on t1.id >= t2.id
group by t1.id, t1.val order by t1.id

Without using any type of JOIN, the cumulative salary for a person can be fetched with the following query:
SELECT * , (
SELECT SUM( salary )
FROM `abc` AS table1
WHERE table1.ID <= `abc`.ID
AND table1.name = `abc`.Name
) AS cum
FROM `abc`
ORDER BY Name

sql numbering the partition of Numbers

I have a set of numbers like this
ID
===
1
2
3
1
2
1
1
2
3
4
5
...
I want to select a new column that increases each time the next 1 is reached, like this:
ID number
=== ========
1 1
2 1
3 1
1 2
2 2
1 3
1 4
2 4
3 4
4 4
5 4
Any suggestions?

Assuming that you have a column o which specifies the ordering, you can use a self-join like this:
select d1.o, d1.id, count(*)
from data d1
join data d2 on d1.o >= d2.o and d2.id = 1
group by d1.o, d1.id

You can solve this with CTEs and window functions, as follows:
DECLARE @t TABLE (ID INT);
INSERT INTO @t VALUES (1),(2),(3),(1),(2),(1),(1),(2),(3),(4),(5);
WITH cte AS(
-- ORDER BY (SELECT 1) does not guarantee insertion order; use a real ordering column if one exists
SELECT ID, ROW_NUMBER() OVER (ORDER BY (SELECT 1)) rn
FROM @t
),
cte1 AS(
SELECT ID, rn, ROW_NUMBER() OVER (ORDER BY rn) rn2
FROM cte
WHERE ID = 1
)
SELECT c.ID, MAX(rn2) OVER (ORDER BY c.rn) rn
FROM cte c
LEFT JOIN cte1 c1 ON c1.rn = c.rn
ORDER BY c.rn
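
When an explicit ordering column exists, the same numbering can be sketched as a running count of the rows where ID = 1. The example below uses Python's sqlite3 module (assuming SQLite 3.25 or later); the column o, the table name data and the helper number_groups are assumptions made for illustration, with o filled from the list position.

```python
import sqlite3

def number_groups(ids):
    """Return (id, number) pairs; number goes up by 1 at every id = 1."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE data (o INTEGER PRIMARY KEY, id INTEGER)")
    # o records the position of each value, so the order is explicit.
    con.executemany(
        "INSERT INTO data (o, id) VALUES (?, ?)",
        list(enumerate(ids, start=1)),
    )
    rows = con.execute(
        """
        SELECT id,
               SUM(CASE WHEN id = 1 THEN 1 ELSE 0 END) OVER (ORDER BY o) AS number
        FROM data
        ORDER BY o
        """
    ).fetchall()
    con.close()
    return rows

numbered = number_groups([1, 2, 3, 1, 2, 1, 1, 2, 3, 4, 5])
```

Because o is unique, the default window frame behaves like a row-by-row running total here.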

Smarter GROUP BY

Consider a table like this, which I will call Test:
Id A B C D
1 1 1 8 25
2 1 2 5 35
3 1 3 2 75
4 2 2 2 45
5 3 2 5 26
Now I want the rows with the max 'Id', grouped by 'A':
Id A B C D
3 1 3 2 75
4 2 2 2 45
5 3 2 5 26
--Works, but is not what I want
SELECT MAX(Id), A FROM Test GROUP BY A
--What I want, but it does not work
SELECT MAX(Id), A, B, C, D FROM Test GROUP BY A
--Works, but is not what I want
SELECT MAX(Id), A, B, C, D FROM Test GROUP BY A, B, C, D
--Works, and is what I want
SELECT old.Id, old.A, new.B, new.C, new.D
FROM(
SELECT
MAX(Id) AS Id, A
FROM
Test GROUP BY A
)old
JOIN Test new
ON old.Id = new.Id
Is there a better way to write the last query, without the join?

Most databases support window functions:
select *
from (
select *, row_number() over (partition by a order by id desc) rn
from test
) t
where rn = 1

Most DBMS now support Common Table Expressions (CTE). You can use one.
;with maxa as (
select row_number() over(partition by a order by id desc) rn,
id,a,b,c,d from test
)
select id,a,b,c,d
from maxa
where rn=1
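
As a quick check of the ROW_NUMBER approach against the sample table from the question, here is a small sketch using Python's sqlite3 module (assuming SQLite 3.25 or later, not SQL Server). The helper name latest_per_group is hypothetical.

```python
import sqlite3

def latest_per_group(rows):
    """Return the row with the highest Id for each value of A."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Test (Id INTEGER, A INTEGER, B INTEGER, C INTEGER, D INTEGER)")
    con.executemany("INSERT INTO Test VALUES (?, ?, ?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT Id, A, B, C, D
        FROM (
            SELECT Id, A, B, C, D,
                   ROW_NUMBER() OVER (PARTITION BY A ORDER BY Id DESC) AS rn
            FROM Test
        ) AS ranked
        WHERE rn = 1
        ORDER BY Id
        """
    ).fetchall()
    con.close()
    return result

test_rows = [
    (1, 1, 1, 8, 25),
    (2, 1, 2, 5, 35),
    (3, 1, 3, 2, 75),
    (4, 2, 2, 2, 45),
    (5, 3, 2, 5, 26),
]
latest = latest_per_group(test_rows)
```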

SQL Random N rows for each distinct value in column

I have the following table:
Name Field
A 1
B 1
C 1
D 1
E 1
F 1
G 1
H 2
I 2
J 2
K 3
L 3
M 3
N 3
O 3
P 3
Q 3
R 3
S 3
T 3
I need a SQL query which will generate me a set with 5 random rows for each distinct value on column Field.
For example, results expected:
Name Field
A 1
B 1
D 1
E 1
G 1
J 2
I 2
H 2
M 3
Q 3
T 3
S 3
P 3
Is there an easy way to do this? Or should I split that table into more tables, generate a random sample for each table, and then union them?

You can do this with a CTE using a ROW_NUMBER() whilst PARTITIONing on the Field:
;With Cte As
(
Select Name, Field,
Row_Number() Over (Partition By Field Order By NewId()) RN
From YourTable
)
Select Name, Field
From Cte
Where RN <= 5

You can readily do this with row_number():
select name, field
from (select t.*,
row_number() over (partition by field order by newid()) as seqnum
from t
) t
where seqnum <= 5;

An enhancement to Gordon Linoff's code, which really helped me: use this if you need criteria in your query.
select *
from (select t.*,
row_number() over (partition by region order by newid()) as seqnum
from MyTable t
WHERE t.program = 'ACME'
) t
where seqnum <= 1500;
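
The per-group sampling pattern can be tried out with Python's sqlite3 module, as in the rough sketch below. It assumes SQLite 3.25 or later and uses SQLite's RANDOM() in place of SQL Server's NEWID(); the helper name sample_per_group is made up for the example.

```python
import sqlite3
from collections import Counter

def sample_per_group(rows, n):
    """Return up to n randomly chosen (Name, Field) rows for each Field value."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE YourTable (Name TEXT, Field INTEGER)")
    con.executemany("INSERT INTO YourTable VALUES (?, ?)", rows)
    picked = con.execute(
        """
        SELECT Name, Field
        FROM (
            SELECT Name, Field,
                   ROW_NUMBER() OVER (PARTITION BY Field ORDER BY RANDOM()) AS rn
            FROM YourTable
        ) AS numbered
        WHERE rn <= ?
        ORDER BY Field, Name
        """,
        (n,),
    ).fetchall()
    con.close()
    return picked

rows = (
    [(c, 1) for c in "ABCDEFG"]
    + [(c, 2) for c in "HIJ"]
    + [(c, 3) for c in "KLMNOPQRST"]
)
picked = sample_per_group(rows, 5)
counts = Counter(field for _, field in picked)
```

Groups with fewer than n rows (Field = 2 here) simply return all of their rows.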

left join without duplicate values using MIN()

I have a table_1:
id custno
1 1
2 2
3 3
and a table_2:
id custno qty descr
1 1 10 a
2 1 7 b
3 2 4 c
4 3 7 d
5 1 5 e
6 1 5 f
When I run this query to show the minimum order quantities from every customer:
SELECT DISTINCT table_1.custno,table_2.qty,table_2.descr
FROM table_1
LEFT OUTER JOIN table_2
ON table_1.custno = table_2.custno AND qty = (SELECT MIN(qty) FROM table_2
WHERE table_2.custno = table_1.custno )
Then I get this result:
custno qty descr
1 5 e
1 5 f
2 4 c
3 7 d
Customer 1 appears twice each time with the same minimum qty (& a different description) but I only want to see customer 1 appear once. I don't care if that is the record with 'e' as a description or 'f' as a description.

First of all... I'm not sure why you need to include table_1 in the queries to begin with:
select custno, min(qty) as min_qty
from table_2
group by custno;
But just in case there is other information that you need that wasn't included in the question:
select table_1.custno, ifnull(min(qty),0) as min_qty
from table_1
left outer join table_2
on table_1.custno = table_2.custno
group by table_1.custno;

"Generic" SQL way:
SELECT table_1.custno,table_2.qty,table_2.descr
FROM table_1, table_2
WHERE table_2.id = (SELECT TOP 1 id
FROM table_2
WHERE custno = table_1.custno
ORDER BY qty )
SQL 2008 way (probably faster):
SELECT custno, qty, descr
FROM
(SELECT
custno,
qty,
descr,
ROW_NUMBER() OVER (PARTITION BY custno ORDER BY qty) RowNum
FROM table_2
) A
WHERE RowNum = 1

If you use SQL-Server you could use ROW_NUMBER and a CTE:
WITH CTE AS
(
SELECT table_1.custno,table_2.qty,table_2.descr,
RN = ROW_NUMBER() OVER ( PARTITION BY table_1.custno
Order By table_2.qty ASC)
FROM table_1
LEFT OUTER JOIN table_2
ON table_1.custno = table_2.custno
)
SELECT custno, qty,descr
FROM CTE
WHERE RN = 1
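
To see the one-row-per-customer behaviour on the question's data, here is a minimal sketch using Python's sqlite3 module (assuming SQLite 3.25 or later rather than SQL Server). Adding table_2.id as a second sort key is an extra assumption, made so that the tie between 'e' and 'f' is resolved the same way on every run.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE table_1 (id INTEGER, custno INTEGER);
    CREATE TABLE table_2 (id INTEGER, custno INTEGER, qty INTEGER, descr TEXT);
    INSERT INTO table_1 VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO table_2 VALUES
        (1, 1, 10, 'a'), (2, 1, 7, 'b'), (3, 2, 4, 'c'),
        (4, 3, 7, 'd'), (5, 1, 5, 'e'), (6, 1, 5, 'f');
    """
)

# Rank each customer's orders by quantity, then keep only the first one.
min_orders = con.execute(
    """
    WITH ranked AS (
        SELECT t1.custno, t2.qty, t2.descr,
               ROW_NUMBER() OVER (
                   PARTITION BY t1.custno
                   ORDER BY t2.qty, t2.id
               ) AS rn
        FROM table_1 AS t1
        LEFT JOIN table_2 AS t2 ON t1.custno = t2.custno
    )
    SELECT custno, qty, descr
    FROM ranked
    WHERE rn = 1
    ORDER BY custno
    """
).fetchall()
```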
