How can I get the sum of one group of data - sql

I'm having trouble producing this list of data. Most of what I tried did not work as expected.
This is my input data:
Code  Name   total
01    First  50
02    Last   20
11    First  10
12    Last   25
21    First  15
22    Last   15
This is the output I would like:
Code  Name   total
01    First  50
02    Last   20
GROUP: 0 - 70
11    First  10
12    Last   25
GROUP: 1 - 35
21    First  15
22    Last   15
GROUP: 2 - 30
I need a third row (not column) after the first two rows that represents their group (group zero) and contains the sum of those two rows, and likewise for the other two groups.

This was my original idea when I saw the question. It's better to use GROUPING SETS than COMPUTE, since Microsoft has deprecated the latter.
with data as (
    select Code, substring(Code, 1, len(Code) - 1) as Prefix, Name, Total
    from T
)
select
    case when grouping(Name) = 1 then Prefix else min(Code) end as Code,
    case when grouping(Name) = 1 then '-' else Name end as Name,
    sum(Total) as Total
from data
group by grouping sets ( (Prefix, Name), (Prefix) )
order by Prefix, grouping(Name), Code
Fixed a few problems with my old query. Here's a SQL Fiddle.
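Against the six sample rows in the question, this should produce something like the following (each subtotal row carries the prefix in the Code column and '-' as the Name):

Code  Name   Total
01    First  50
02    Last   20
0     -      70
11    First  10
12    Last   25
1     -      35
21    First  15
22    Last   15
2     -      30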

declare @table table (code varchar(10), name varchar(10), total int);
insert into @table(code, name, total) values
('01', 'First', 50),
('02', 'Last', 20),
('11', 'First', 10),
('12', 'Last', 25),
('21', 'First', 15),
('22', 'Last', 15);
select * from @table;
--select code, name, sum(total)
--  from @table
--  group by rollup (substring(code,1,1), name);
select code, name, total,
       sum(total) over (partition by substring(code,1,1)) as subtotal
from @table;
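Against the same sample data, the last select should return each detail row with its group subtotal alongside, something like:

code  name   total  subtotal
01    First  50     70
02    Last   20     70
11    First  10     35
12    Last   25     35
21    First  15     30
22    Last   15     30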
COMPUTE is no longer supported:
A reach, and not tested:
SELECT Code, Name, Total
FROM Table
ORDER BY Code, Name
COMPUTE SUM(Total) BY SubString(Code,1,1);
Compute
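Since COMPUTE BY was removed in SQL Server 2012, a rough, untested modern equivalent of the query above would use GROUP BY ROLLUP (with Table standing in for the real table name, as in the question); note that ROLLUP also adds a grand-total row, which COMPUTE BY did not:

SELECT SUBSTRING(Code, 1, 1) AS CodeGroup, Name, SUM(Total) AS Total
FROM [Table]
GROUP BY ROLLUP (SUBSTRING(Code, 1, 1), Name)
ORDER BY CodeGroup, GROUPING(Name), Name;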

Related

Remove duplicates from single field only in rollup query

I have a table of data for individual audits on inventory. Every audit has a location, an expected value, a variance value, and some other data that aren't really important here.
I am writing a query for Cognos 11 which summarizes a week of these audits. Currently, it rolls everything up into sums by location class. My problem is that there may be multiple audits for individual locations. I want the variance field to sum the data from all audits, regardless of whether it's the first count on that location, but I only want the expected value counted once per distinct location (i.e. only SUM the expected value where the location is distinct).
Below is a simplified version of the query. Is this even possible or will I have to write a separate query in Cognos and make it two reports that will have to be combined after the fact? As you can likely tell, I'm fairly new to SQL and Cognos.
SELECT COALESCE(CASE
WHEN location_class = 'A'
THEN 'Active'
WHEN location_class = 'C'
THEN 'Active'
WHEN location_class IN (
'R'
,'0'
)
THEN 'Reserve'
END, 'Grand Total') "Row Labels"
,SUM(NVL(expected_cost, 0)) "Sum of Expected Cost"
,SUM(NVL(variance_cost, 0)) "Sum of Variance Cost"
,SUM(ABS(NVL(variance_cost, 0))) "Sum of Absolute Cost"
,COUNT(DISTINCT location) "Count of Locations"
,(SUM(NVL(variance_cost, 0)) / SUM(NVL(expected_cost, 0))) "Variance"
FROM audit_table
WHERE audit_datetime <= #prompt('EndDate')# AND audit_datetime >= #prompt('StartDate')#
GROUP BY ROLLUP(CASE
WHEN location_class = 'A'
THEN 'Active'
WHEN location_class = 'C'
THEN 'Active'
WHEN location_class IN (
'R'
,'0'
)
THEN 'Reserve'
END)
ORDER BY 1 ASC
This is what I'm hoping to end up with:
Thanks for any help!
Have you tried taking a look at the OVER clause in SQL? It allows you to use windowed functions within a result set so that you can get aggregates based on specific conditions. This would probably help, since you seem to be trying to get a summation of data based on a different grouping within a larger grouping.
For example, let's say we have the below dataset:
group1      group2      val         dateadded
----------- ----------- ----------- -----------------------
1           1           1           2020-11-18
1           1           1           2020-11-20
1           2           10          2020-11-18
1           2           10          2020-11-20
2           3           100         2020-11-18
2           3           100         2020-11-20
2           4           1000        2020-11-18
2           4           1000        2020-11-20
Using a single query we can return both the sums of "val" over "group1" as well as the summation of the first (based on datetime) "val" records in "group2":
declare @table table (group1 int, group2 int, val int, dateadded datetime)
insert into @table values (1, 1, 1, getdate())
insert into @table values (1, 1, 1, dateadd(day, 1, getdate()))
insert into @table values (1, 2, 10, getdate())
insert into @table values (1, 2, 10, dateadd(day, 1, getdate()))
insert into @table values (2, 3, 100, getdate())
insert into @table values (2, 3, 100, dateadd(day, 1, getdate()))
insert into @table values (2, 4, 1000, getdate())
insert into @table values (2, 4, 1000, dateadd(day, 1, getdate()))

select t.group1, sum(t.val) as group1_sum, group2_first_val_sum
from @table t
inner join
(
    select group1, sum(group2_first_val) as group2_first_val_sum
    from
    (
        select group1, val as group2_first_val,
               row_number() over (partition by group2 order by dateadded) as rownumber
        from @table
    ) y
    where rownumber = 1
    group by group1
) x on t.group1 = x.group1
group by t.group1, x.group2_first_val_sum
This returns the below result set:
group1      group1_sum  group2_first_val_sum
----------- ----------- --------------------
1           22          11
2           2200        1100
The innermost subquery in the joined table numbers the rows in the data set based on "group2", resulting in each record having either a "1" or a "2" in the "rownumber" column, since there are only two records in each "group2".
The next subquery takes that data, filters out any rows that are not the first (rownumber = 1), and sums the "val" data.
The main query gets the sum of "val" in each "group1" from the main table and then joins on the subqueried table to get the "val" sum of only the first records in each "group2".
There are more efficient ways to write this, such as moving the summation of the "group1" values to a subquery in the SELECT statement to get rid of one of the nested table subqueries, but I wanted to show how to do it without subqueries in the SELECT statement.
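Applied to the audit table in the question, a rough sketch of the same idea (untested, keeping the grouping as plain location_class rather than the full CASE/ROLLUP, and assuming expected_cost is repeated on every audit of the same location) would number the audits per location and only add expected_cost for the first one:

SELECT location_class,
       SUM(NVL(variance_cost, 0)) AS sum_variance_cost,
       SUM(CASE WHEN rn = 1 THEN NVL(expected_cost, 0) ELSE 0 END) AS sum_expected_cost_distinct
FROM (
    SELECT location_class, location, expected_cost, variance_cost,
           ROW_NUMBER() OVER (PARTITION BY location ORDER BY audit_datetime) AS rn
    FROM audit_table
) a
GROUP BY location_class;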
Have you tried putting the DISTINCT count at the bottom, like this?
(SUM(NVL(variance_cost,0)) / SUM(NVL(expected_cost,0))) "Variance",
COUNT(DISTINCT location) "Count of Locations"
FROM audit_table

SQL check if group is continuous when ordered and return broken groups

I am trying to find a way to list rows that break otherwise continuous groups of records. I say groups because we could use GROUP BY to list the values of the groups (but that does not apply here; we need the particular rows).
Sample data:
CREATE TABLE Test (ID INT, NNO INT, DIDX INT, SIDX INT);
-- Valid sample rows
INSERT INTO Test (ID, NNO, DIDX, SIDX) VALUES
( 1, 107 , 7898, 0 ),
( 2, 102 , 7883, 0 ),
( 3, 53 , 7877, 0 ),
( 4, 62 , 7877, 42 ),
( 5, 101 , 7870, 81 ),
( 6, 103 , 7918, 42 ),
( 7, 110 , 7920, 42 ),
( 8, 100 , 7919, 0 ),
( 9, 24 , 7921, 0 ),
(10, 85 , 7904, 0 ),
(11, 85 , 7905, 0 ),
(12, 85 , 7906, 0 ),
(13, 85 , 7907, 0 ),
(14, 85 , 7908, 0 ),
(15, 85 , 7911, 0 ),
(16, 112 , 7876, 0 ),
(17, 5 , 7891, 42 ),
(18, 80 , 7912, 42 ),
(19, 66 , 7912, 91 ),
(20, 22 , 7912, 81 ),
(21, 60 , 7911, 42 ),
(22, 60 , 7912, 0 ),
(23, 78 , 7891, 81 );
-- Dissecting row
INSERT INTO Test (ID, NNO, DIDX, SIDX) VALUES
(24, 666 , 7906, 120);
EDIT: I probably misled some answerers a bit by providing an example that was too simplified. It made it appear that groups could only be broken by a single row. So please add another row to the example data set:
-- Dissecting row -2-
INSERT INTO Test (ID, NNO, DIDX, SIDX) VALUES
(25, 444 , 7906, 160);
Now, if the rows are ordered in this particular order:
SELECT ID, NNO, DIDX, SIDX
FROM Test
ORDER BY DIDX, SIDX;
...the inserted rows will break (dissect) the group of records that has NNO=85:
ID          NNO         DIDX        SIDX
----------- ----------- ----------- -----------
...
10          85          7904        0
11          85          7905        0
12          85          7906        0
24          666         7906        120     <<<<<<<<<<<<<<<<<<<
25          444         7906        160     <<<<<<<<<<<< after EDIT <<<<<<<
13          85          7907        0
14          85          7908        0
15          85          7911        0
...
The result should say 85, which is the broken group, or NULL if we used healthy data without row ID=24.
Another way to look at the problem: for each group (even if it contains only 1 row), there may be no records of another group whose start or end lies between the start and end of the queried group. In the provided example, the queried group (85) starts with DIDX=7904 and SIDX=0 and ends with DIDX=7911 and SIDX=0, and nothing may fall into that range - which is the case for record ID=24.
So far I have tried approaches such as ROW_NUMBER() OVER (ORDER BY ...), a WITH query using MIN and MAX to go through each group and check whether there are rows that fall within it (I have so far failed to construct it), and GROUP BY with MIN and MAX with the aim of cross-checking against the table rows. No attempt is really worth publishing so far.
Could anyone advise how I could check the continuity of groups defined this way?
WITH CTE AS (
SELECT
ID
,NNO
,DIDX
,SIDX
,LAG(NNO) OVER (ORDER BY DIDX, SIDX) as previousNNO
,LEAD(NNO) OVER (ORDER BY DIDX, SIDX) as nextNNO
FROM Test
)
SELECT
previousNNO as BrokenGroup
FROM CTE
WHERE previousNNO=nextNNO
AND NNO<>previousNNO
I used a CTE and window functions to keep track of the previous and next group (NNO) for each row. A broken group is one where the current group differs while the previous and next groups are the same. From your example with ID 24:
ID NNO DIDX SIDX
----------- ----------- ----------- -----------
...
12 85 < previous group 7906 0
24 666 < current group 7906 120 <<<<<<<<<<<<<<<<<<<
13 85 < next group 7907 0
...
I'm assuming that any consecutive range of DIDX should only have one NNO. As such there will be no two valid groups that abut each other.
This should help identify the offenders:
with data as (
select NNO, DIDX, dense_rank() over (order by DIDX) as rn
from Test
)
select min(DIDX) as range_start, max(DIDX) as range_end
from data
group by DIDX - rn
having count(distinct NNO) > 1;
Getting the actual rows:
with data as (
select ID, NNO, DIDX, dense_rank() over (order by DIDX) as rn
from Test
), groups as (
select DIDX - rn as grp, min(DIDX) as range_start, max(DIDX) as range_end
from data
group by DIDX - rn
having count(distinct NNO) > 1
), data2 as (
select *, lead(NNO) over (partition by grp order by DIDX) as next_NNO
from Test t inner join groups g
on t.DIDX between g.range_start and g.range_end
)
select * from data2 where NNO <> next_NNO;
If you're looking for a test to run prior to inserting a row:
with data as (
select NNO, DIDX, row_number() over (order by DIDX) as rn
from Test
)
select case when min(DIDX) is not null then 'Fail' else 'Pass' end as InsertTest
from data
group by DIDX - rn
having @proposed_DIDX between min(DIDX) and max(DIDX)
and @proposed_NNO <> min(NNO);
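A rough usage sketch for that test (assuming SQL Server, where the @proposed_* placeholders above are ordinary variables), using the dissecting row from the question as the proposed values:

DECLARE @proposed_DIDX INT = 7906;
DECLARE @proposed_NNO  INT = 666;
-- then run the query above; a returned 'Fail' row means the proposed row
-- would land inside an existing group that has a different NNO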
OK, so inspired by the given answers (which I upvoted for helping), I came up with this code, which seems to provide the desired result. I'm not sure, though, if it's the cleanest and shortest possible way.
;WITH CTE AS (
SELECT NNO,
DMIN,
DMAX,
SMIN,
SMAX,
LEAD(DMIN) OVER (ORDER BY DMIN, SMIN) as nextDMIN,
LAG(DMAX) OVER (ORDER BY DMIN, SMIN) as prevDMAX,
LAG(SMAX) OVER (ORDER BY DMIN, SMIN) as prevSMAX,
LEAD(SMIN) OVER (ORDER BY DMIN, SMIN) as nextSMIN,
CNT
FROM (
SELECT NNO,
MIN(DIDX) as DMIN,
MAX(DIDX) as DMAX,
MIN(SIDX) as SMIN,
MAX(SIDX) as SMAX,
COUNT(NNO) as CNT
FROM Test
GROUP BY NNO
) as SRC
)
SELECT *
FROM CTE
WHERE ((prevDMAX > DMIN OR (prevDMAX = DMIN AND prevSMAX > SMIN)) OR
(nextDMIN < DMAX OR (nextDMIN = DMAX AND nextSMIN < SMAX)))
AND CNT > 1
Perhaps I should give a little explanation. The code finds the MIN and MAX border values of DIDX and SIDX for each group and then finds those values for the previous and next groups. We also COUNT the number of rows in each group.
The conditions in brackets basically say that no record of another group may lie within the DMIN-DMAX range, and if it sits on the border of the range, it has to be outside SMIN or SMAX. Finally, a broken group must have more than 1 row; otherwise the query would return not only the offended group but also the first offender.
I should say that this code has a small flaw: the case where an offender with more than one row sits intact between the offended group's rows. I should be able to tackle that in post-processing, where I have to shuffle rows to achieve intact groups.

Max analytical function plus windowed ordering is not working as expected

I have a table with status and location; the data is below. I'd like to get the max status partitioned by location using a custom ordering. Any idea what needs to change? Right now it is returning only the max value, irrespective of the ordering I mentioned.
The custom ordering is 1 > 3 > 2.
status | location
1      | 11
2      | 11
2      | 12
3      | 12
3      | 11
Expected result for location 11 : 1
Expected result for location 12 : 3
Query:
select max(status) over (partition by location order by decode(status, '1',6,'3',5,'2',4,3)
rows between unbounded preceding and unbounded following) mx from items;
http://sqlfiddle.com/#!4/ed9e7e/13
create table items
( status varchar2(1), location number(9)
);
insert into items values('1',123);
insert into items values('2',123);
insert into items values('3',123);
insert into items values('4',123);
I think you want first_value():
select first_value(status) over (partition by location
order by decode(status, '1', 6, '3', 5, '2', 4, 3)
) mx
from items;
I'm not a big fan of decode(), but it is a concise way to express what you want. I prefer case expressions.
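For reference, a sketch of the same query written with a CASE expression instead of DECODE (equivalent ordering, untested):

select first_value(status) over (partition by location
                                 order by case status when '1' then 6 when '3' then 5 when '2' then 4 else 3 end
                                ) mx
from items;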
I see a couple of issues. Your decode doesn't seem to match up with what you say you want. Secondly, I don't think MAX() is the function you want, because it returns the maximum status without respect to your custom order. Instead you should assign row numbers partitioned by location and ordered by your custom sort order, then pick the rows where the row number is 1.
create table items
( status varchar2(1), location number(9)
);
insert into items values('1',11);
insert into items values('2',11);
insert into items values('2',12);
insert into items values('3',12);
insert into items values('3',11);
select x.location, x.status
from (
select ROW_NUMBER() over (partition by location order by decode(status, '1',1,'2',3,'3',2,4)) as rn,
status, location from items) x
where x.rn = 1
SQL Fiddle.
I suggest that using a simple GROUP BY might prove easier:
SELECT LOCATION,
DECODE(MAX(ORDERING), 6, '1', 5, '3', 4, '2', 3) AS STATUS
FROM (SELECT LOCATION,
STATUS AS STATUS,
DECODE(STATUS, '1', 6, '3', 5, '2', 4, 3) AS ORDERING
FROM ITEMS)
GROUP BY LOCATION
ORDER BY LOCATION
db<>fiddle here

SQL Server SSRS Multiple lookup values

I have a huge set of data. Some of the data has multiple values, looking kind of like this:
Column 1   Column 2
A          1
A          10
A          1E
B          2F
B          1BH
C          WBH
D          3X
D          2
D          1
D          10
D          11
I would like to select the unique values in Column 1 and display all relevant values of Column 2 as a comma-separated string (using SSRS), i.e.:
Column 1   Column 2
A          01, 10, 1E
B          2F, 1BH
C          WBH
D          02, 01, 10, 11
In addition, I would like any value in Column 2 that is less than 10 to be preceded by a zero.
I know I can use SELECT DISTINCT to get all unique values of Column 1. But I am unsure how to go about Column 2.
With regards to having a zero preceding numbers less than 10, I can do this:
SELECT RIGHT('0' + convert(varchar(2), value()), 2)
I am unsure how to put it all together to get the result I want.
Thank you.
I think this is what you want.
DECLARE @Input TABLE
(
    ProductID INT,
    Price INT
)
INSERT INTO @Input VALUES (6,22), (6,35), (6,77), (6,88), (6,55), (6,200), (7,6), (7,4), (8,5), (8,5)

;WITH CTE AS
(SELECT ProductID, MAX(Price) AS Max_Price, MIN(Price) AS Min_Price
 FROM @Input
 GROUP BY ProductID
)
SELECT ProductID, CASE WHEN Max_Price > Min_Price THEN CONVERT(VARCHAR(10), Min_Price) + ', ' + CONVERT(VARCHAR(10), Max_Price)
                       ELSE CONVERT(VARCHAR(10), Min_Price) END AS Price_Range
FROM CTE
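That only returns the minimum and maximum price per group, though. For the comma-separated list described in the question, a rough sketch using STRING_AGG (available in SQL Server 2017 and later; YourTable, Column1 and Column2 are placeholder names) with zero-padding of single-character values might look like:

SELECT Column1,
       STRING_AGG(CASE WHEN LEN(Column2) = 1 THEN '0' + Column2 ELSE Column2 END, ', ') AS Column2List
FROM (SELECT DISTINCT Column1, Column2 FROM YourTable) t
GROUP BY Column1;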

SQL Server Sum a specific number of rows based on another column

Here are the important columns in my table
ItemId  RowID  CalculatedNum
1       1      3
1       2      0
1       3      5
1       4      25
1       5      0
1       6      8
1       7      14
1       8      2
.....
The rowID increments to 141 before the ItemID increments to 2. This cycle repeats for about 122 million rows.
I need to SUM the CalculatedNum field in groups of 6. So sum 1-6, then 7-12, etc. I know I end up with an odd number at the end. I can discard the last three rows (numbers 139, 140 and 141). I need it to start the SUM cycle again when I get to the next ItemID.
I know I need to group by the ItemID but I am having trouble trying to figure out how to get SQL to SUM just 6 CalculatedNum's at a time. Everything else I have come across SUMs based on a column where the values are the same.
I did find something on Microsoft's site that used the ROW_NUMBER function but I couldn't quite make sense of it. Please let me know if this question is not clear.
Thank you
You need to group by (RowId - 1) / 6 and ItemId. Like this:
drop table if exists dbo.Items;
create table dbo.Items (
ItemId int
, RowId int
, CalculatedNum int
);
insert into dbo.Items (ItemId, RowId, CalculatedNum)
values (1, 1, 3), (1, 2, 0), (1, 3, 5), (1, 4, 25)
, (1, 5, 0), (1, 6, 8), (1, 7, 14), (1, 8, 2);
select
tt.ItemId
, sum(tt.CalculatedNum) as CalcSum
from (
select
*
, (t.RowId - 1) / 6 as Grp
from dbo.Items t
) tt
group by tt.ItemId, tt.Grp
You could use integer division and group by.
SELECT ItemId, (RowId-1)/6 as Batch, sum(CalculatedNum)
FROM your_table GROUP BY ItemId, (RowId-1)/6
To discard incomplete batches:
SELECT ItemId, (RowId-1)/6 as Batch, sum(CalculatedNum), count(*) as Cnt
FROM your_table GROUP BY ItemId, (RowId-1)/6 HAVING count(*) = 6
EDIT: Fix an off by one error.
To ensure you're querying 6 rows at a time, you can try using the modulo function: https://technet.microsoft.com/fr-fr/library/ms173482(v=sql.110).aspx
Hope this can help.
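For example, a rough sketch combining modulo with the integer division already shown, so you can see both the batch number and each row's position within its batch of 6:

SELECT ItemId, RowId, CalculatedNum,
       (RowId - 1) / 6 AS Batch,        -- which group of 6 the row belongs to
       (RowId - 1) % 6 AS PosInBatch    -- 0 to 5: position within that group
FROM your_table;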
Thanks everyone. This was really helpful.
Here is what we ended up with.
SELECT ItemID, MIN(RowID) AS StartingRow, SUM(CalculatedNum)
FROM dbo.table
GROUP BY ItemID, (RowID - 1) / 6
ORDER BY ItemID, StartingRow
I am not sure why it did not like the integer division in the select statement, but I checked the results against a sample of the data and the math is correct.
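Against the eight sample rows shown at the top of the question, that query should return something like:

ItemID  StartingRow  (sum)
1       1            41
1       7            16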