How to write a cross tab query with median - sql-server-2005

Here's my table
ST NUM
1 1
1 2
1 2
2 1
2 2
2 2
3 2
3 8
I want a query that returns the median of NUM for each ST:
ST NUM
1 2
2 2
3 5
I already have a median query for the whole table:
SELECT
    CONVERT(DECIMAL(10,2), (
        (CONVERT(DECIMAL(10,2),
            (SELECT MAX(num) FROM
                (SELECT TOP 50 PERCENT num FROM dbo.t ORDER BY num ASC) AS H1)
            +
            (SELECT MIN(num) FROM
                (SELECT TOP 50 PERCENT num FROM dbo.t ORDER BY num DESC) AS H2)
        ))) / 2) AS Median
Any tips for how to do this?

Try this:
With
MedianResult
as
(
    Select
        ST, NUM,
        Row_Number() OVER(Partition by ST Order by NUM) as A,
        Row_Number() OVER(Partition by ST Order by NUM desc) as B
    from YourTableName
)
-- AVG of an INT column is truncated in SQL Server; use Avg(1.0 * NUM) if you need decimals
Select ST, Avg(NUM) as Median
From MedianResult
Where Abs(A-B)<=1
Group by ST
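
As a cross-check of the row-number idea above, here is a minimal Python sketch (not SQL Server code). The function name `median_per_group` and the in-memory row list are assumptions made for illustration. It also assumes that the descending numbering is the exact mirror of the ascending one; with duplicate NUM values, two independent ROW_NUMBER() calls are not guaranteed to break ties that way.

```python
from collections import defaultdict

def median_per_group(rows):
    """Return {st: median of num}, mirroring the two ROW_NUMBER() columns A and B."""
    groups = defaultdict(list)
    for st, num in rows:
        groups[st].append(num)

    medians = {}
    for st, nums in groups.items():
        nums.sort()
        n = len(nums)
        middle = []
        for i, num in enumerate(nums):
            a = i + 1   # ascending row number (column A)
            b = n - i   # descending row number (column B), assumed to mirror A exactly
            if abs(a - b) <= 1:
                middle.append(num)
        # One middle value for an odd count, two for an even count
        medians[st] = sum(middle) / len(middle)
    return medians

# Sample data from the question
rows = [(1, 1), (1, 2), (1, 2), (2, 1), (2, 2), (2, 2), (3, 2), (3, 8)]
print(median_per_group(rows))  # {1: 2.0, 2: 2.0, 3: 5.0}
```

For the sample table this gives the medians 2, 2 and 5 that the question asks for.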

Related

Selecting top most row in Bigquery based on conditions

I have a huge table in which one product ID sometimes has multiple specifications. I want to select the newest one, but unfortunately I don't have any date information. Please consider this example dataset:
Row ID Type Sn Sn_Ind
1 3 SLN SL20 20
2 1 SL SL 0
3 2 SL SL 0
4 1 M SL21 10
5 3 M SL21 10
6 1 SLN SL20 20
I used the query below to group the products and give them row numbers:
with cleanedMasterData as(
SELECT *
FROM (
SELECT *,ROW_NUMBER() OVER(PARTITION BY ID ORDER BY Sn DESC, Sn_Ind DESC) AS rn
FROM `project.dataset.table`
)
-- where rn = 1
)
select * from cleanedMasterData
Please find below the example table after cleaning
Row ID Type Sn Sn_Ind rn
1 1 SL SL 0 1
2 1 M SL21 10 2
3 1 SLN SL20 20 3
4 2 SL SL 0 1
5 3 M SL21 10 1
6 3 SLN SL20 20 2
For IDs 2 and 3, I can simply select the top row with where rn = 1,
but for ID 1 my preferred row would be row 2, because that is the newest.
My question is: how do I prioritise a value in a column so that I get the desired result, like this:
Row ID Type Sn Sn_Ind rn
1 1 M SL21 10 1
2 2 SL SL 0 1
3 3 M SL21 10 1
The values in the Sn column are fixed (for example SL, SL20, SL19, SL21). Could I assign a weight to each of these values, create a temporary column holding that weight, and sort on it?
Thank you for your support in advance!!

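Consider the query below:
SELECT *
FROM `project.dataset.table`
WHERE TRUE
QUALIFY ROW_NUMBER() OVER(PARTITION BY ID ORDER BY IF(Sn = 'SL', 0, 1) DESC, Sn DESC) = 1
If applied to the sample data in your question, the output matches the desired result: one row per ID, with SL21 chosen for IDs 1 and 3 and SL for ID 2.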
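
The ordering used in that window can be sketched in plain Python to see why it picks the expected rows. This is only an illustration: `pick_preferred` and the list of dicts are assumed names, and the comparison relies on plain string ordering of Sn, so it assumes the codes sort correctly as text (for example 'SL21' after 'SL20').

```python
def pick_preferred(rows):
    """Keep one row per ID: any Sn other than 'SL' wins, then the highest Sn."""
    best = {}
    for row in rows:
        # Same ordering as: IF(Sn = 'SL', 0, 1) DESC, Sn DESC
        key = (0 if row["Sn"] == "SL" else 1, row["Sn"])
        if row["ID"] not in best or key > best[row["ID"]][0]:
            best[row["ID"]] = (key, row)
    return [best[id_][1] for id_ in sorted(best)]

# Sample data from the question
rows = [
    {"ID": 3, "Type": "SLN", "Sn": "SL20", "Sn_Ind": 20},
    {"ID": 1, "Type": "SL", "Sn": "SL", "Sn_Ind": 0},
    {"ID": 2, "Type": "SL", "Sn": "SL", "Sn_Ind": 0},
    {"ID": 1, "Type": "M", "Sn": "SL21", "Sn_Ind": 10},
    {"ID": 3, "Type": "M", "Sn": "SL21", "Sn_Ind": 10},
    {"ID": 1, "Type": "SLN", "Sn": "SL20", "Sn_Ind": 20},
]
for row in pick_preferred(rows):
    print(row["ID"], row["Type"], row["Sn"], row["Sn_Ind"])
```

If the Sn codes ever differ in length (say 'SL9' and 'SL21'), a numeric weight per code would be a safer sort key than the raw string.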

It wasn't difficult; I tried a few things and it worked out. If anyone can optimize the solution below even further, that would be awesome.
First, the dataset:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 ID, 'SLN' Type, 'SL20' Sn, 20 Sn_Ind UNION ALL
SELECT 1 , 'SL' , 'SL' , 0 UNION ALL
SELECT 2 , 'SL' , 'SL' , 0 UNION ALL
SELECT 1 , 'M' , 'SL21' , 10 UNION ALL
SELECT 3 , 'M' , 'SL21' , 10 UNION ALL
SELECT 1 , 'SLN' , 'SL20' , 20
)
, weightage as (
SELECT
*,
MAX(CASE Sn WHEN 'SL' THEN 0 ELSE 1 END) OVER (PARTITION BY Sn) AS weightt,
FROM
`project.dataset.table`
ORDER BY
weightt DESC, Sn DESC
), main as (
select * EXCEPT(rn, weightt)
from (
select * ,ROW_NUMBER() OVER(PARTITION BY ID ORDER BY weightt DESC, Sn DESC) AS rn
from weightage )
where rn = 1
)
select * from main
After this, I get the desired result:
Row ID Type Sn Sn_Ind
1 1 M SL21 10
2 2 SL SL 0
3 3 M SL21 10

Assign column value based on the percentage of rows

In DB2 is there a way to assign a column value based on the first x%, then y% and remaining z% of rows?
I've tried using row_number() function but no luck!
Example below.
Assume that count(id) in the example is already sorted in descending order.
Input:
ID count(id)
5 10
3 8
1 5
4 3
2 1
Output:
The first 30% of the rows in the input above should be assigned code H, the last 30% of the rows code L, and the remaining rows code M. If 30% of the row count is not a whole number, round it to 0 decimal places.
ID code
5 H
3 H
1 M
4 L
2 L

You can use window functions:
select t.id,
(case ntile(3) over (order by count(id) desc)
when 1 then 'H'
when 2 then 'M'
when 3 then 'L'
end) as grp
from t
group by t.id;
This puts them into groups of (nearly) equal size.
For a 30-40-30% split with your conditions, you have to be more careful:
-- compare seqnum against num_ids (the number of ids); a fractional 30% rounds up here
select t.id,
       (case when (seqnum - 1.0) < 0.3 * num_ids then 'H'
             when seqnum > 0.7 * num_ids then 'L'
             else 'M'
        end) as grp
from (select t.id,
             count(*) as cnt,
             count(*) over () as num_ids,
             row_number() over (order by count(*) desc) as seqnum
      from t
      group by t.id
     ) t;
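
The 30-40-30 rule can also be sketched in a few lines of Python to check the expected output of the question. This is an illustrative sketch rather than DB2 code: `assign_codes` is an assumed name, and it assumes that a fractional row count (1.5 rows for 5 ids) is rounded up.

```python
import math

def assign_codes(counts, share=0.3):
    """Map each id to 'H', 'M' or 'L' by its position when sorted by count descending."""
    ordered = sorted(counts, key=lambda pair: pair[1], reverse=True)
    n = len(ordered)
    edge = math.ceil(share * n)  # assumption: a fractional number of rows rounds up
    codes = {}
    for position, (id_, _count) in enumerate(ordered, start=1):
        if position <= edge:
            codes[id_] = "H"        # first 30% of the rows
        elif position > n - edge:
            codes[id_] = "L"        # last 30% of the rows
        else:
            codes[id_] = "M"        # everything in between
    return codes

# Sample data from the question: (ID, count(id))
counts = [(5, 10), (3, 8), (1, 5), (4, 3), (2, 1)]
print(assign_codes(counts))  # {5: 'H', 3: 'H', 1: 'M', 4: 'L', 2: 'L'}
```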

Try this:
with t(ID, count_id) as (values
(5, 10)
, (3, 8)
, (1, 5)
, (4, 3)
, (2, 1)
)
select t.*
, case
when pst <=30 then 'H'
when pst <=70 then 'M'
else 'L'
end as code
from
(
select t.*
, rownumber() over (order by count_id desc) as rn
, 100*rownumber() over (order by count_id desc)/nullif(count(1) over(), 0) as pst
from t
) t;
The result is:
ID COUNT_ID RN PST CODE
-- -------- -- --- ----
5 10 1 20 H
3 8 2 40 M
1 5 3 60 M
4 3 4 80 L
2 1 5 100 L

sql numbering the partition of Numbers

I have a set of numbers like this
ID
===
1
2
3
1
2
1
1
2
3
4
5
...
I want to select a new column whose value increases each time the next 1 is fetched, like this:
ID number
=== ========
1 1
2 1
3 1
1 2
2 2
1 3
1 4
2 4
3 4
4 4
5 4
Any suggestions?

Assuming that you have a column o which specifies the ordering, you can use a self-join like this:
select d1.o, d1.id, count(*)
from data d1
join data d2 on d1.o >= d2.o and d2.id = 1
group by d1.o, d1.id

You can solve this with CTEs and window functions, as follows:
DECLARE @t TABLE (ID INT);
INSERT INTO @t VALUES (1),(2),(3),(1),(2),(1),(1),(2),(3),(4),(5);
WITH cte AS(
    -- there is no real ordering column here, so this row order is not guaranteed
    SELECT ID, ROW_NUMBER() OVER (ORDER BY (SELECT 1)) rn
    FROM @t
),
cte1 AS(
    -- number only the rows where ID = 1
    SELECT ID, rn, ROW_NUMBER() OVER (ORDER BY rn) rn2
    FROM cte
    WHERE ID = 1
)
SELECT c.ID, MAX(rn2) OVER (ORDER BY c.rn) rn
FROM cte c
LEFT JOIN cte1 c1 ON c1.rn = c.rn
ORDER BY c.rn
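
Both answers boil down to a running count of the rows where ID = 1. A minimal Python sketch of that idea follows; `number_runs` is an assumed name, and the list order stands in for the ordering column the SQL needs.

```python
def number_runs(ids):
    """Pair each id with a counter that increases whenever the id is 1."""
    result = []
    group = 0
    for value in ids:
        if value == 1:
            group += 1  # a new run starts at every 1
        result.append((value, group))
    return result

# Sample data from the question
ids = [1, 2, 3, 1, 2, 1, 1, 2, 3, 4, 5]
for value, number in number_runs(ids):
    print(value, number)
```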

Generate group of data based on sum of values

How can I calculate the size sum and the file range for each batch?
Here is my table:
FileId FileSize(MB)
1 5
2 4
3 1
4 6
5 8
6 1
7 7
8 2
Expected result
BatchNo StartId EndId BatchSize
1 1 3 10
2 4 4 6
3 5 6 9
4 7 8 9
Start a new batch when the total file size of the batch would exceed 10 MB,
and also when the file count of the batch reaches 10.
StartId and EndId are based on FileId,
and BatchNo is auto-incremented.

You can use a recursive query like this:
with rdata as
(
    select row_number() over (order by fileId) rn, * from data
), rcte as
(
    -- anchor: the first file opens batch 1 (no = files in the current batch, gr = batch number)
    select 1 no, 1 gr, fileSize fileSizeSum, *
    from rdata where fileid = 1
    union all
    -- start a new batch when the size would exceed 10 or the batch already holds 10 files
    select case when fileSizeSum + d.fileSize > 10 or r.no = 10 then 1 else r.no + 1 end no,
           case when fileSizeSum + d.fileSize > 10 or r.no = 10 then r.gr + 1 else r.gr end gr,
           case when fileSizeSum + d.fileSize > 10 or r.no = 10 then d.fileSize else d.fileSize + fileSizeSum end fileSizeSum,
           d.*
    from rcte r
    join rdata d on r.rn + 1 = d.rn
)
select r.gr,
       min(fileId),
       max(fileId),
       max(fileSizeSum)
from rcte r
group by r.gr
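
The recursion above walks the files in order and carries a running size and file count. The same greedy rule can be sketched in Python as follows; `make_batches`, `max_size` and `max_files` are assumed names, and the limits of 10 MB and 10 files are taken from the question.

```python
def make_batches(files, max_size=10, max_files=10):
    """Greedy batching: returns (batch_no, start_id, end_id, batch_size) tuples."""
    batches = []
    start_id = end_id = None
    size = count = 0
    for file_id, file_size in files:
        # Close the current batch if this file would overflow it
        if count and (size + file_size > max_size or count == max_files):
            batches.append((len(batches) + 1, start_id, end_id, size))
            size = count = 0
        if count == 0:
            start_id = file_id
        end_id = file_id
        size += file_size
        count += 1
    if count:
        batches.append((len(batches) + 1, start_id, end_id, size))
    return batches

# Sample data from the question: (FileId, FileSize in MB)
files = [(1, 5), (2, 4), (3, 1), (4, 6), (5, 8), (6, 1), (7, 7), (8, 2)]
for batch in make_batches(files):
    print(batch)
```

On the sample table this yields the four batches listed in the expected result.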

Here is another solution, with a different approach:
INSERT INTO #batchdetails (FileID, FileSizeTotal, GroupID)
SELECT FileID, (
SELECT SUM(filesize) FROM #filedetails f2
WHERE f1.fileid >= f2.fileid ) AS FileSizeTotal,
1+CONVERT(INT,(
SELECT SUM(filesize) FROM #filedetails f2
WHERE f1.fileid >= f2.fileid
)/(@filesizepergroup+0.1)) AS GroupID
FROM #filedetails f1
SELECT DISTINCT
BatchDetails.GroupID AS BatchNo,
MIN(BatchDetails.FileID) OVER (PARTITION BY BatchDetails.GroupID ORDER BY BatchDetails.GroupID) AS StartID,
MAX(BatchDetails.FileID) OVER (PARTITION BY BatchDetails.GroupID ORDER BY BatchDetails.GroupID) AS EndID,
BatchSizeGroup.BatchSize
FROM #batchdetails BatchDetails
INNER JOIN (
SELECT GroupID, (GroupFileSizeTotal - LAG(GroupFileSizeTotal,1,0) OVER (ORDER BY GroupID)) AS BatchSize
FROM (
SELECT DISTINCT
GroupID,
MAX(FileSizeTotal) OVER (PARTITION BY GroupID ORDER BY GroupID) AS GroupFileSizeTotal
FROM #batchdetails
GROUP BY GroupID, FileSizeTotal
)A
)BatchSizeGroup ON BatchDetails.GroupID = BatchSizeGroup.GroupID
GROUP BY BatchDetails.GroupID, BatchDetails.FileID, BatchSizeGroup.BatchSize
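
The GroupID expression in this approach derives the batch from the running total alone. A small Python sketch of that expression (with the assumed name `bucket_by_running_total` and a limit of 10) shows that it matches the expected batches for the sample data, but that it is not the same rule as "start a new batch when the size would exceed 10": with other inputs a bucket can end up larger than the limit.

```python
def bucket_by_running_total(sizes, limit=10):
    """Group number = 1 + int(running_total / (limit + 0.1)), as in the GroupID expression."""
    groups = []
    running = 0
    for size in sizes:
        running += size
        groups.append(1 + int(running / (limit + 0.1)))
    return groups

# Sample data from the question: same batches as the expected result
print(bucket_by_running_total([5, 4, 1, 6, 8, 1, 7, 2]))  # [1, 1, 1, 2, 3, 3, 4, 4]

# Five files of 4 MB: the second bucket holds 12 MB, above the limit of 10
print(bucket_by_running_total([4, 4, 4, 4, 4]))  # [1, 1, 2, 2, 2]
```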

SQL Random N rows for each distinct value in column

I have the following table:
Name Field
A 1
B 1
C 1
D 1
E 1
F 1
G 1
H 2
I 2
J 2
K 3
L 3
M 3
N 3
O 3
P 3
Q 3
R 3
S 3
T 3
I need a SQL query that returns up to 5 random rows for each distinct value in column Field.
For example, results expected:
Name Field
A 1
B 1
D 1
E 1
G 1
J 2
I 2
H 2
M 3
Q 3
T 3
S 3
P 3
Is there an easy way to do this? Or should I split the table into several tables, pick random rows from each, and then union them?

You can do this with a CTE using a ROW_NUMBER() whilst PARTITIONing on the Field:
;With Cte As
(
Select Name, Field,
Row_Number() Over (Partition By Field Order By NewId()) RN
From YourTable
)
Select Name, Field
From Cte
Where RN <= 5

You can readily do this with row_number():
select name, field
from (select t.*,
row_number() over (partition by field order by newid()) as seqnum
from t
) t
where seqnum <= 5;

An enhancement to Gordon Linoff's code that really helped me when I needed criteria in the query:
select *
from (select t.*,
row_number() over (partition by region order by newid()) as seqnum
from MyTable t
WHERE t.program = 'ACME'
) t
where seqnum <= 1500;
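
The answers above all number the rows of each group in random order and keep the first N. For comparison, here is a minimal Python sketch of the same idea; `sample_per_group` and the optional `seed` parameter are assumptions for illustration, not part of any answer.

```python
import random
from collections import defaultdict

def sample_per_group(rows, n, seed=None):
    """Return up to n randomly chosen rows for each distinct field value."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for name, field in rows:
        groups[field].append((name, field))
    picked = []
    for field in sorted(groups):
        members = groups[field]
        # A group smaller than n is returned whole, like RN <= 5 in the SQL
        picked.extend(rng.sample(members, min(n, len(members))))
    return picked

# Sample data from the question: A-G in field 1, H-J in field 2, K-T in field 3
rows = [(c, 1) for c in "ABCDEFG"] + [(c, 2) for c in "HIJ"] + [(c, 3) for c in "KLMNOPQRST"]
for name, field in sample_per_group(rows, 5):
    print(name, field)
```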
