Grouping by an aggregate column - sql

I have a requirement to show a listing of data associated with a given widget; however, I can only aggregate on the widget while it's in sequence (essentially breaking the running total when the widget changes).
Here's a breakdown of what I mean:
Example data:
ID WIDGET PART
1 A000 B22
2 A000 B23
3 A002 B24
4 A001 B25
5 A001 B26
6 A000 B27
Desired output:
WIDGET MINPART COUNT
A000 B22 2
A002 B24 1
A001 B25 2
A000 B27 1
In SQL Server I've tried running the following:
with a as (
select
WIDGET,
min(PART) over (partition by WIDGET) as MINPART,
1 tcount
from test )
select WIDGET, MINPART, sum(tcount)
from a
group by WIDGET, MINPART
But this just results in the usual aggregation you might expect. I.E.:
WIDGET MINPART COUNT
A000 B22 3
A002 B24 1
A001 B25 2

Does this work for you?
;with x as (
select *,
lag(widget) over(order by id) as lg
from #t
),
y as (
select *, sum(case when widget<>lg then 1 else 0 end) over(order by id) as grp
from x
)
select widget, min(part), count(*)
from y
group by widget, grp

Don't you mean to use the named columns? I.e.:
with a as (
select
WIDGET,
min(PART) over (partition by WIDGET) as MINPART,
1 tcount
from Widget )
select WIDGET, MINPART, sum(tcount)
from a
group by Widget, MinPart;

We can also write this another way, using a derived table:
DECLARE @Widget TABLE
(ID INT,
Widget NVARCHAR(10),
PART NVARCHAR(10));
INSERT INTO @Widget VALUES
(1, 'A000', 'B22'),
(2, 'A000', 'B23'),
(3, 'A002', 'B24'),
(4, 'A001', 'B25'),
(5, 'A001', 'B26'),
(6, 'A000', 'B27');
IF OBJECT_ID('tempdb..#t1') IS NOT NULL
DROP TABLE #t1
select Widget,PART,1 as t into #t1 from
(select Widget,MIN(PART)OVER (PARTITION BY Widget) As Part from @Widget
GROUP BY Widget,ID,PART)AS F
select Widget,PART,SUM(t) from #t1
group by Widget, Part

@dean's solution is neat, but a pure SQL Server 2008 solution is also possible. The idea is the same: the ranker formula
sum(case when widget<>lg then 1 else 0 end) over(order by id) as grp
can be faked by a running total. For the record, the full query is:
WITH part AS (
SELECT a.id, a.widget, a.part
, breaker = CASE WHEN a.widget<>coalesce(b.widget, a.widget)
THEN 1
ELSE 0
END
FROM Widgets a
LEFT JOIN Widgets b ON a.id = b.id + 1
)
, DATA AS(
SELECT a.id, a.widget, a.part
, rank = (SELECT sum(breaker) FROM part b WHERE a.id >= b.id)
FROM part a
)
SELECT widget
, minpart = min(part)
, [count] = count(1)
FROM DATA
GROUP BY widget, rank
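The running-total idea used in the answers above can be checked against the sample data with a small script. This is only a sketch: it uses Python's standard-library sqlite3 module instead of SQL Server (so it assumes a SQLite build with window functions, 3.25 or later), and the table name widgets and the column names breaker and grp are chosen here for illustration.

```python
import sqlite3

# Sample rows from the question: (id, widget, part).
ROWS = [
    (1, "A000", "B22"),
    (2, "A000", "B23"),
    (3, "A002", "B24"),
    (4, "A001", "B25"),
    (5, "A001", "B26"),
    (6, "A000", "B27"),
]

QUERY = """
WITH flagged AS (
    -- breaker is 1 on every row whose widget differs from the previous row's
    SELECT id, widget, part,
           CASE WHEN widget <> LAG(widget) OVER (ORDER BY id)
                THEN 1 ELSE 0 END AS breaker
    FROM widgets
),
grouped AS (
    -- the running total of breaker gives one number per consecutive run
    SELECT id, widget, part,
           SUM(breaker) OVER (ORDER BY id) AS grp
    FROM flagged
)
SELECT widget, MIN(part) AS minpart, COUNT(*) AS cnt
FROM grouped
GROUP BY widget, grp
ORDER BY MIN(id)
"""


def widget_runs(rows):
    """Return (widget, minpart, count) for each consecutive run of a widget."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE widgets (id INTEGER, widget TEXT, part TEXT)")
    conn.executemany("INSERT INTO widgets VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in widget_runs(ROWS):
    print(row)
```

Grouping by both widget and grp is what keeps the two separate runs of A000 apart; grouping by widget alone collapses them, which is the behaviour the question complains about.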

Related

How do I get the count of distinct values in the first row of the result

For the row with the highest amount, I want to check whether the document has multiple colours and multiple cities. If yes, I want to set a bit to 1; if not, it should be 0.
Sample data:
Code doc year amount colour city
AB 123 2021 485 Red Paris
AB 123 2021 416 Red Paris
AB 123 2021 729 Red London
AB 123 2021 645 Red Bengaluru
Expected output:
I want the output in one row
Code Doc Year Amount Colour City Col_Mul City_Mul
AB 123 2021 729 Red London 0 1
Amount, Colour and city should be the maximum one.
What I tried:
To get the data in one row, I used the row number and ordered by the maximum amount and selected the data where the row number is one. But after that I used dense rank for the Colour and City column. But I didn't get the expected output.
You can use CROSS APPLY and get data as given below:
Thanks @Gayani for the test data.
select TOPROW.*,case when T1.colorcount > 1 THEN 1 else 0 end as Multi_color,
case when T2.citycount > 1 THEN 1 else 0 end as Multi_city
from
(SELECT TOP 1 * FROM tes_firstRow
order by amount desc) as toprow
cross apply
(
SELECT count(distinct color) from tes_firstrow WHERE doc = toprow.doc
) as t1(colorcount)
cross apply
(
SELECT count(distinct city) from tes_firstrow WHERE doc = toprow.doc
) as t2(citycount)
Code doc year amount color City Multi_color Multi_city
AB 123 2021 729 RED LONDON 0 1
There you go. Weird requirement though...
SELECT T.Code, Doc, Year, MAX(T.Amount) Amount,
(SELECT TOP 1 Colour FROM T as X WHERE Amount = MAX(T.Amount)) Colour,
(SELECT TOP 1 City FROM T as X WHERE Amount = MAX(T.Amount)) City,
CASE WHEN COUNT(DISTINCT T.Colour) > 1 THEN 1 ELSE 0 END as Col_Mul,
CASE WHEN COUNT(DISTINCT T.City) > 1 THEN 1 ELSE 0 END as City_Mul
FROM T
GROUP BY T.Code, Doc, Year
I think you just want window functions combined with conditional aggregation:
select code, doc, year, max(amount),
max(case when seqnum = 1 then color end) as color,
max(case when seqnum = 1 then city end) as city,
max(case when seqnum = 1 and color_count > 1 then 1 else 0 end) as color_dup,
max(case when seqnum = 1 and city_count > 1 then 1 else 0 end) as city_dup
from (select t.*,
row_number() over (partition by code, doc, year order by amount desc) as seqnum,
count(*) over (partition by code, doc, year, color) as color_count,
count(*) over (partition by code, doc, year, city) as city_count
from t
) t
group by code, doc, year;
I'm not actually sure if you want 1 when the value is duplicated or not, so those values might be backwards.
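As a way to check the expected row from the question, here is a minimal sketch of the "top row plus multiplicity flags" idea. It runs on Python's standard-library sqlite3 module rather than SQL Server (assuming SQLite 3.25 or later for window functions), uses COUNT(DISTINCT ...) for the flags, and the table name t and column names are assumptions taken from the question's sample data.

```python
import sqlite3

# Sample rows from the question: (code, doc, year, amount, colour, city).
ROWS = [
    ("AB", 123, 2021, 485, "Red", "Paris"),
    ("AB", 123, 2021, 416, "Red", "Paris"),
    ("AB", 123, 2021, 729, "Red", "London"),
    ("AB", 123, 2021, 645, "Red", "Bengaluru"),
]

QUERY = """
SELECT code, doc, year,
       MAX(amount) AS amount,
       -- colour and city are taken from the row with the highest amount
       MAX(CASE WHEN seqnum = 1 THEN colour END) AS colour,
       MAX(CASE WHEN seqnum = 1 THEN city END) AS city,
       -- flags are 1 when the document has more than one distinct value
       CASE WHEN COUNT(DISTINCT colour) > 1 THEN 1 ELSE 0 END AS col_mul,
       CASE WHEN COUNT(DISTINCT city) > 1 THEN 1 ELSE 0 END AS city_mul
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY code, doc, year
                              ORDER BY amount DESC) AS seqnum
    FROM t
) AS ranked
GROUP BY code, doc, year
"""


def top_row_with_flags(rows):
    """Return one row per (code, doc, year) with the two multiplicity flags."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (code TEXT, doc INTEGER, year INTEGER, "
        "amount INTEGER, colour TEXT, city TEXT)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(top_row_with_flags(ROWS))
```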
I hope this code sample will help you with this.
Please try the below code and let me know if it helps with what you need. I used temporary tables here. You can use any of the technology to build the logic. CTE (Common table expressions) or derived tables.
CREATE TABLE tes_firstRow
(
Code varchar(100)
, doc int
, [year] int
, amount int
, color varchar(100)
, City varchar(100)
)
insert into tes_firstRow values ('AB', 123,2021,485,'RED','PARIS')
insert into tes_firstRow values ('AB', 123,2021,416,'RED','PARIS')
insert into tes_firstRow values ('AB', 123,2021,729,'RED','LONDON')
insert into tes_firstRow values ('AB', 123,2021,645,'RED','BENGALURU')
SELECT
RANK() OVER (PARTITION BY Code, doc,[year] ORDER BY amount DESC) AS [rank]
,Code
,doc
,[year]
,amount
,color
,City
INTO #temp_1
FROM tes_firstRow
SELECT
[#temp_1].[Code]
,[#temp_1].[doc]
,[#temp_1].[year]
,[#temp_1].[amount]
,[#temp_1].[color]
,[#temp_1].[City]
, (Select COUNT(Distinct [#temp_1].[color] ) where [#temp_1].[rank] = 1 ) as Col_Mul
, (Select COUNT(Distinct [#temp_1].[City]) where [#temp_1].[rank] = 1) as City_Mul
,1 as City_Mul
FROM #temp_1
WHERE #temp_1.[rank] = 1
group by [#temp_1].[Code]
,[#temp_1].[doc]
,[#temp_1].[year]
,[#temp_1].[amount]
,[#temp_1].[color]
,[#temp_1].[City]
,[#temp_1].[rank]
DROP TABLE #temp_1

Perform ranking depend on category

I have a table that looks like this:
RowNum category Rank4A Rank4B
-------------------------------------------
1 A
2 A
3 B
5 A
6 B
9 B
My requirement is to add two new ranking columns that depend on category, based on the RowNum order. Rank4A works like DENSE_RANK() over the rows with category = A, but if the row is for category B, it takes the most recent rank assigned to category A, ordered by RowNum. Rank4B follows similar logic, but it orders by RowNum in DESC order. So the result would look like this (W means I don't care about the value in this cell):
RowNum category Rank4A Rank4B
-------------------------------------------
1 A 1 W
2 A 2 W
3 B 2 3
5 A 3 2
6 B W 2
9 B W 1
One more additional requirement is that CROSS APPLY or CURSOR is not allowed due to dataset being large. Any neat solutions?
Edit: Also no CTE (due to MAX 32767 limit)
You can use the following query:
SELECT RowNum, category,
SUM(CASE
WHEN category = 'A' THEN 1
ELSE 0
END) OVER (ORDER BY RowNum) AS Rank4A,
SUM(CASE
WHEN category = 'B' THEN 1
ELSE 0
END) OVER (ORDER BY RowNum DESC) AS Rank4B
FROM mytable
ORDER BY RowNum
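The conditional running sums can be tried on the sample rows with a short script. This is a sketch only: it runs the same query shape on Python's standard-library sqlite3 module rather than SQL Server (assuming SQLite 3.25 or later), and it relies on RowNum being unique so that the default window frame counts one row at a time.

```python
import sqlite3

# Sample rows from the question: (RowNum, category).
ROWS = [(1, "A"), (2, "A"), (3, "B"), (5, "A"), (6, "B"), (9, "B")]

QUERY = """
SELECT RowNum, category,
       -- running count of 'A' rows seen so far, in ascending RowNum order
       SUM(CASE WHEN category = 'A' THEN 1 ELSE 0 END)
           OVER (ORDER BY RowNum) AS Rank4A,
       -- running count of 'B' rows seen so far, in descending RowNum order
       SUM(CASE WHEN category = 'B' THEN 1 ELSE 0 END)
           OVER (ORDER BY RowNum DESC) AS Rank4B
FROM mytable
ORDER BY RowNum
"""


def category_ranks(rows):
    """Return (RowNum, category, Rank4A, Rank4B) for every row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (RowNum INTEGER, category TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in category_ranks(ROWS):
    print(row)
```

The cells marked W in the question simply receive the carried-over running count, which is acceptable because their value does not matter.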
Giorgos Betsos' answer is better, please read it first.
Try this out. I believe each CTE is clear enough to show the steps.
IF OBJECT_ID('tempdb..#Data') IS NOT NULL
DROP TABLE #Data
CREATE TABLE #Data (
RowNum INT,
Category CHAR(1))
INSERT INTO #Data (
RowNum,
Category)
VALUES
(1, 'A'),
(2, 'A'),
(3, 'B'),
(5, 'A'),
(6, 'B'),
(9, 'B')
;WITH AscendentDenseRanking AS
(
SELECT
D.RowNum,
D.Category,
AscendentDenseRanking = DENSE_RANK() OVER (ORDER BY D.Rownum ASC)
FROM
#Data AS D
WHERE
D.Category = 'A'
),
LaggedRankingA AS
(
SELECT
D.RowNum,
AscendentDenseRankingA = MAX(A.AscendentDenseRanking)
FROM
#Data AS D
INNER JOIN AscendentDenseRanking AS A ON D.RowNum > A.RowNum
WHERE
D.Category = 'B'
GROUP BY
D.RowNum
),
DescendantDenseRanking AS
(
SELECT
D.RowNum,
D.Category,
DescendantDenseRanking = DENSE_RANK() OVER (ORDER BY D.Rownum DESC)
FROM
#Data AS D
WHERE
D.Category = 'B'
),
LaggedRankingB AS
(
SELECT
D.RowNum,
AscendentDenseRankingB = MAX(A.DescendantDenseRanking)
FROM
#Data AS D
INNER JOIN DescendantDenseRanking AS A ON D.RowNum < A.RowNum
WHERE
D.Category = 'A'
GROUP BY
D.RowNum
)
SELECT
D.RowNum,
D.Category,
Rank4A = ISNULL(RA.AscendentDenseRanking, LA.AscendentDenseRankingA),
Rank4B = ISNULL(RB.DescendantDenseRanking, LB.AscendentDenseRankingB)
FROM
#Data AS D
LEFT JOIN AscendentDenseRanking AS RA ON D.RowNum = RA.RowNum
LEFT JOIN LaggedRankingA AS LA ON D.RowNum = LA.RowNum
LEFT JOIN DescendantDenseRanking AS RB ON D.RowNum = RB.RowNum
LEFT JOIN LaggedRankingB AS LB ON D.RowNum = LB.RowNum
/*
Results:
RowNum Category Rank4A Rank4B
----------- -------- -------------------- --------------------
1 A 1 3
2 A 2 3
3 B 2 3
5 A 3 2
6 B 3 2
9 B 3 1
*/
This isn't a recursive CTE, so the limit 32k doesn't apply.

How to split an SQL Table into half and send the other half of the rows to new columns with SQL Query?

Country Percentage
India 12%
USA 20%
Australia 15%
Qatar 10%
Output :
Country1 Percentage1 Country2 Percentage2
India 12% Australia 15%
USA 20% Qatar 10%
For example, there is a table Country which has percentages. I need to divide the table in half and show the second half (i.e. the remaining rows) in new columns. I've also provided the table structure in text.
First, this type of operation should be done at the application layer and not in the database. That said, it can be an interesting exercise to see how to do this in the database.
I would use conditional aggregation or pivot. Note that SQL tables are inherently unordered. Your base table has no apparent ordering, so the values could come out in any order.
select max(case when seqnum % 2 = 0 then country end) as country_1,
max(case when seqnum % 2 = 0 then percentage end) as percentage_1,
max(case when seqnum % 2 = 1 then country end) as country_2,
max(case when seqnum % 2 = 1 then percentage end) as percentage_2
from (select c.*,
(row_number() over (order by (select null)) - 1) as seqnum
from country c
) c
group by seqnum / 2;
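Because tables are unordered, reproducing the exact layout from the question needs an explicit ordering column. The sketch below assumes a hypothetical id column that records the original row order, and pairs row k of the first half with row k of the second half. It runs on Python's standard-library sqlite3 module rather than SQL Server (assuming SQLite 3.25 or later), so treat it as an illustration of the idea, not a drop-in query.

```python
import sqlite3

# Sample rows from the question, plus an assumed id column giving the order.
ROWS = [
    (1, "India", "12%"),
    (2, "USA", "20%"),
    (3, "Australia", "15%"),
    (4, "Qatar", "10%"),
]

QUERY = """
WITH numbered AS (
    SELECT country, percentage,
           ROW_NUMBER() OVER (ORDER BY id) AS rn,
           COUNT(*) OVER () AS n
    FROM country
)
SELECT a.country, a.percentage, b.country, b.percentage
FROM numbered AS a
LEFT JOIN numbered AS b
       -- (n + 1) / 2 is the size of the first half, rounded up
       ON b.rn = a.rn + (a.n + 1) / 2
WHERE a.rn <= (a.n + 1) / 2
ORDER BY a.rn
"""


def split_in_half(rows):
    """Return rows of (country1, percentage1, country2, percentage2)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE country (id INTEGER, country TEXT, percentage TEXT)"
    )
    conn.executemany("INSERT INTO country VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in split_in_half(ROWS):
    print(row)
```

With an odd number of rows, the first half gets the extra row and its right-hand columns come back as NULL because of the LEFT JOIN.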
Try this:
declare @t table
(
Country VARCHAR(20),
percentage INT
)
declare @cnt int
INSERT INTO @T
VALUES('India',12),('USA',20),('Australia',15),('Qatar',12)
select @cnt = count(1)+1 from @t
;with cte
as
(
select
SeqNo = row_number() over(order by Country),
Country,
percentage
from @t
)
select
*
from cte c1
left join cte c2
on c1.seqno = (c2.SeqNo-@cnt/2)
and c2.SeqNo >= (@cnt/2)
where c1.SeqNo <= (@cnt/2)
My variant
SELECT 'A' Country,1 Percentage INTO #Country
UNION ALL SELECT 'B' Country,2 Percentage
UNION ALL SELECT 'C' Country,3 Percentage
UNION ALL SELECT 'D' Country,4 Percentage
UNION ALL SELECT 'E' Country,5 Percentage
;WITH numCTE AS(
SELECT
*,
ROW_NUMBER()OVER(ORDER BY Country) RowNum,
COUNT(*)OVER() CountOfCountry
FROM #Country
),
set1CTE AS(
SELECT Country,Percentage,ROW_NUMBER()OVER(ORDER BY Country) RowNum
FROM numCTE
WHERE RowNum<=CEILING(CountOfCountry/2.)
),
set2CTE AS(
SELECT Country,Percentage,ROW_NUMBER()OVER(ORDER BY Country) RowNum
FROM numCTE
WHERE RowNum>CEILING(CountOfCountry/2.)
)
SELECT
s1.Country,s1.Percentage,
s2.Country,s2.Percentage
FROM set1CTE s1
LEFT JOIN set2CTE s2 ON s1.RowNum=s2.RowNum
DROP TABLE #Country
I just wanted to try something. I have used the OFFSET/FETCH clause. I think it meets the requirement for your sample data, but I don't know if it's bulletproof all the way:
SQL Code
declare @myt table (country nvarchar(50),percentage int)
insert into @myt
values
('India' ,12),
('USA' ,20),
('Australia' ,15),
('Qatar' ,10),
('Denmark',10)
DECLARE @TotalRows int
SET @TotalRows = (select CEILING(count(*) / 2.) from @myt);
WITH dataset1 AS (
SELECT *,ROW_NUMBER() over(order by country ) as rn from (
SELECT Country,percentage from @myt a
ORDER BY country OFFSET 0 rows FETCH FIRST @TotalRows ROWS ONLY
) z
)
,dataset2 AS (
SELECT *,ROW_NUMBER() over(order by country ) as rn from (
SELECT Country,percentage from @myt a
ORDER BY country OFFSET @TotalRows rows FETCH NEXT @TotalRows ROWS ONLY
) z
)
SELECT * FROM dataset1 a LEFT JOIN dataset2 b ON a.rn = b.rn
Assuming you want the country names paired in alphabetical order, with the left column determined by where India falls in the result:
with CoutryCTE as (
select c.*
, row_number() over (order by country)-1 as rn
from country c
)
, Col as (
select rn % 2 as num from CoutryCTE
where Country = 'India'
)
select max(case when rn % 2 = Col.num then country end) as country_1
, max(case when rn % 2 = Col.num then percentage end) as percentage_1
, max(case when rn % 2 <> Col.num then country end) as country_2
, max(case when rn % 2 <> Col.num then percentage end) as percentage_2
from CoutryCTE
cross join Col
group by rn / 2
;
SQLFiddle Demo
| country_1 | percentage_1 | country_2 | percentage_2 |
|-----------|--------------|-----------|--------------|
| India | 12% | Australia | 15% |
| USA | 20% | Qatar | 10% |
nb: this is extremely similar to an earlier answer by Gordon Linoff

Max and Min value's corresponding records

I have a scenario where I need to get the respective field value of the "Max" and "Min" records.
Please find the sample data below:
-----------------------------------------------------------------------
ID Label ProcessedDate
-----------------------------------------------------------------------
1 Label1 11/01/2016
2 Label2 11/02/2016
3 Label3 11/03/2016
4 Label4 11/04/2016
5 Label5 11/05/2016
I have the "ID" field populated in another table as a foreign key. While querying those records in that table based on the "ID" field I need to get the "Label" field of "Max" Processed date and "Min" processed date.
-----------------------------------------------------------------------
ID LabelID GroupingField
-----------------------------------------------------------------------
1 1 101
2 2 101
3 3 101
4 4 101
5 5 101
6 1 102
7 2 102
8 3 102
9 4 102
And the final result set I expect it to look something like this.
-----------------------------------------------------------------------
GroupingField FirstProcessed LastProcessed
-----------------------------------------------------------------------
101 Label1 Label5
102 Label1 Label4
I have 'almost' managed to get this above result using rank function but still not satisfied with it. So I am looking if someone can provide me with a better option.
Thanks,
Prakazz
CREATE TABLE #Details (ID INT,LabelID INT,GroupingField INT)
CREATE TABLE #Details1 (ID INT,Label VARCHAR(100),ProcessedDate VARCHAR(100))
INSERT INTO #Details1 (ID ,Label ,ProcessedDate )
SELECT 1,'Label1','11/01/2016' UNION ALL
SELECT 2,'Label2','11/02/2016' UNION ALL
SELECT 3,'Label3','11/03/2016' UNION ALL
SELECT 4,'Label4','11/04/2016' UNION ALL
SELECT 5,'Label5','11/05/2016'
INSERT INTO #Details (ID ,LabelID ,GroupingField )
SELECT 1,1,101 UNION ALL
SELECT 2,2,101 UNION ALL
SELECT 3,3,101 UNION ALL
SELECT 4,4,101 UNION ALL
SELECT 5,5,101 UNION ALL
SELECT 6,1,102 UNION ALL
SELECT 7,2,102 UNION ALL
SELECT 8,3,102 UNION ALL
SELECT 9,4,102
;WITH CTE (GroupingField , MAXId ,MinId) AS
(
SELECT GroupingField,MAX(LabelID) MAXId,MIN(LabelID) MinId
FROM #Details
GROUP BY GroupingField
)
SELECT GroupingField ,B.Label FirstProcessed, A.Label LastProcessed
FROM CTE
JOIN #Details1 A ON MAXId = A.ID
JOIN #Details1 B ON MinId = B.ID
You can use the SQL ROW_NUMBER() function with PARTITION BY, combined with GROUP BY, as follows:
;with cte as (
select
t.Label, t.ProcessedDate,
g.GroupingField,
ROW_NUMBER() over (partition by GroupingField Order By ProcessedDate ASC) minD,
ROW_NUMBER() over (partition by GroupingField Order By ProcessedDate DESC) maxD
from tbl t
inner join GroupingFieldTbl g
on t.ID = g.LabelID
)
select GroupingField, max(FirstProcessed) FirstProcessed, max(LastProcessed) LastProcessed
from (
select
GroupingField,
FirstProcessed = CASE when minD = 1 then Label else null end,
LastProcessed = CASE when maxD = 1 then Label else null end
from cte
where
minD = 1 or maxD = 1
) t
group by GroupingField
order by GroupingField
I also used a CTE to make the code easier to write and understand.
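The ROW_NUMBER() approach can be checked against the expected result from the question with a small script. This is a sketch under stated assumptions: it runs on Python's standard-library sqlite3 module rather than SQL Server (SQLite 3.25 or later), the table names labels and details are made up for illustration, and the dates are stored as ISO strings on the assumption that the question's values are month/day/year.

```python
import sqlite3

# (id, label, processed_date) with dates rewritten as ISO strings (assumed MM/DD/YYYY).
LABELS = [
    (1, "Label1", "2016-11-01"),
    (2, "Label2", "2016-11-02"),
    (3, "Label3", "2016-11-03"),
    (4, "Label4", "2016-11-04"),
    (5, "Label5", "2016-11-05"),
]
# (id, label_id, grouping_field) from the question.
DETAILS = [
    (1, 1, 101), (2, 2, 101), (3, 3, 101), (4, 4, 101), (5, 5, 101),
    (6, 1, 102), (7, 2, 102), (8, 3, 102), (9, 4, 102),
]

QUERY = """
WITH ranked AS (
    SELECT d.grouping_field, l.label,
           ROW_NUMBER() OVER (PARTITION BY d.grouping_field
                              ORDER BY l.processed_date) AS first_rn,
           ROW_NUMBER() OVER (PARTITION BY d.grouping_field
                              ORDER BY l.processed_date DESC) AS last_rn
    FROM details AS d
    JOIN labels AS l ON l.id = d.label_id
)
SELECT grouping_field,
       MAX(CASE WHEN first_rn = 1 THEN label END) AS first_processed,
       MAX(CASE WHEN last_rn = 1 THEN label END) AS last_processed
FROM ranked
GROUP BY grouping_field
ORDER BY grouping_field
"""


def first_last_labels(labels, details):
    """Return (grouping_field, first label, last label) by processed date."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE labels (id INTEGER, label TEXT, processed_date TEXT)")
    conn.execute("CREATE TABLE details (id INTEGER, label_id INTEGER, grouping_field INTEGER)")
    conn.executemany("INSERT INTO labels VALUES (?, ?, ?)", labels)
    conn.executemany("INSERT INTO details VALUES (?, ?, ?)", details)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(first_last_labels(LABELS, DETAILS))
```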

SELECT records until new value SQL

I have a table
Val | Number
08 | 1
09 | 1
10 | 1
11 | 3
12 | 0
13 | 1
14 | 1
15 | 1
I need to return the last values where Number = 1 (however many that may be) until Number changes, but do not need the first instances where Number = 1. Essentially I need to select back until Number changes to 0 (15, 14, 13)
Is there a proper way to do this in MSSQL?
Based on following:
I need to return the last values where Number = 1
Essentially I need to select back until Number changes to 0 (15, 14, 13)
Try (Fiddle demo ):
select val, number
from T
where val > (select max(val)
from T
where number<>1)
EDIT: to address all possible combinations (Fiddle demo 2)
;with cte1 as
(
select 1 id, max(val) maxOne
from T
where number=1
),
cte2 as
(
select 1 id, isnull(max(val),0) maxOther
from T
where val < (select maxOne from cte1) and number<>1
)
select val, number
from T cross join
(select maxOne, maxOther
from cte1 join cte2 on cte1.id = cte2.id
) X
where val>maxOther and val<=maxOne
I think you can use window functions, something like this:
with cte as (
-- generate two row_number to enumerate distinct groups
select
Val, Number,
row_number() over(partition by Number order by Val) as rn1,
row_number() over(order by Val) as rn2
from Table1
), cte2 as (
-- get groups with Number = 1 and last group
select
Val, Number,
rn2 - rn1 as rn1, max(rn2 - rn1) over() as rn2
from cte
where Number = 1
)
select Val, Number
from cte2
where rn1 = rn2
DEMO: http://sqlfiddle.com/#!3/e7d54/23
DDL
create table T(val int identity(8,1), number int)
insert into T values
(1),(1),(1),(3),(0),(1),(1),(1),(0),(2)
DML
; WITH last_1 AS (
SELECT Max(val) As val
FROM t
WHERE number = 1
)
, last_non_1 AS (
SELECT Coalesce(Max(val), -937) As val
FROM t
WHERE EXISTS (
SELECT val
FROM last_1
WHERE last_1.val > t.val
)
AND number <> 1
)
SELECT t.val
, t.number
FROM t
CROSS
JOIN last_1
CROSS
JOIN last_non_1
WHERE t.val <= last_1.val
AND t.val > last_non_1.val
I know it's a little verbose but I've deliberately kept it that way to illustrate the methodology.
1. Find the highest val where number=1.
2. For all values where the val is less than the number found in step 1, find the largest val where the number<>1.
3. Finally, find the rows that fall within the values we uncovered in steps 1 & 2.
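The three steps can be traced on the DDL sample above with a short script. This is only a sketch: it uses Python's standard-library sqlite3 module instead of SQL Server, fills val explicitly starting at 8 to mimic identity(8,1), and replaces the sentinel default with an explicit NULL check.

```python
import sqlite3

# number values from the DDL sample; val runs 8, 9, 10, ... like identity(8,1).
NUMBERS = [1, 1, 1, 3, 0, 1, 1, 1, 0, 2]
ROWS = [(8 + i, n) for i, n in enumerate(NUMBERS)]

QUERY = """
WITH last_1 AS (
    -- step 1: highest val where number = 1
    SELECT MAX(val) AS val FROM t WHERE number = 1
),
last_non_1 AS (
    -- step 2: highest val below step 1's value where number <> 1
    SELECT MAX(t.val) AS val
    FROM t, last_1
    WHERE t.number <> 1 AND t.val < last_1.val
)
-- step 3: rows between the two boundaries
SELECT t.val, t.number
FROM t, last_1, last_non_1
WHERE t.val <= last_1.val
  AND (last_non_1.val IS NULL OR t.val > last_non_1.val)
ORDER BY t.val
"""


def last_run_of_ones(rows):
    """Return the rows of the final consecutive run where number = 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (val INTEGER, number INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(last_run_of_ones(ROWS))
```

Here step 1 finds 15, step 2 finds 12, and step 3 returns the rows for 13, 14 and 15.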
select val, count (number) from
yourtable
group by val
having count(number) > 1
The having clause is the key here, giving you all the vals that have more than one value of 1.
This is a common approach for getting rows until some value changes. For your specific case use desc in proper spots.
Create sample table
select * into #tmp from
(select 1 as id, 'Alpha' as value union all
select 2 as id, 'Alpha' as value union all
select 3 as id, 'Alpha' as value union all
select 4 as id, 'Beta' as value union all
select 5 as id, 'Alpha' as value union all
select 6 as id, 'Gamma' as value union all
select 7 as id, 'Alpha' as value) t
Pull top rows until value changes:
with cte as (select * from #tmp t)
select * from
(select cte.*, ROW_NUMBER() over (order by id) rn from cte) OriginTable
inner join
(
select cte.*, ROW_NUMBER() over (order by id) rn from cte
where cte.value = (select top 1 cte.value from cte order by cte.id)
) OnlyFirstValueRecords
on OriginTable.rn = OnlyFirstValueRecords.rn and OriginTable.id = OnlyFirstValueRecords.id
On the left side we put an original table. On the right side we put only rows whose value is equal to the value in first line.
Records in both tables will be same until target value changes. After line #3 row numbers will get different IDs associated because of the offset and will never be joined with original table:
LEFT RIGHT
ID Value RN ID Value RN
1 Alpha 1 | 1 Alpha 1
2 Alpha 2 | 2 Alpha 2
3 Alpha 3 | 3 Alpha 3
----------------------- result set ends here
4 Beta 4 | 5 Alpha 4
5 Alpha 5 | 7 Alpha 5
6 Gamma 6 |
7 Alpha 7 |
The ID must be unique. Ordering by this ID must be same in both ROW_NUMBER() functions.