SQL Aggregation for a smaller result set - sql

I have a database in which I need to aggregate records into another, smaller set. Each entry in this result set should contain the difference between the maximum and minimum of specific columns of the original records, where that difference stays within a certain limit, a constant C (inclusive).
The constant C determines how the original records are aggregated, and no entry in the resulting set ever exceeds it. The aggregation has to run in natural primary key order.
To illustrate: table has:
[key]
[a]
[b]
[minColumn]
[maxColumn]
[N]
...all are int datatype.
I am after a result set in which each entry covers a group of consecutive records whose MAX(maxColumn) - MIN(minColumn) is less than or equal to the constant C.
Apart from the MAX(maxColumn) and MIN(minColumn) value I also need the FIRST record column [a] and LAST record column [b] values before creating a new entry in this result set. Finally, the N column should be SUMmed for all original records in a group.
Is there an efficient way to do this without cursors?
-----[Trivial Sample]------------------------------------------------------------
I am attempting to group-by a slightly complicated form of a running sum, constant C.
There is only one table, columns are all of int type and sample data
declare @t table (
    PK int primary key
    , a int, b int, minColumn int, maxColumn int, N int
)
insert @t values (1,5,6,100,200,1000)
insert @t values (2,7,8,210,300,2000)
insert @t values (3,9,10,420,600,3000)
insert @t values (4,11,12,640,800,4000)
Thus for:
key, a, b, minColumn, maxColumn, N
---------------------------------------
1, 5, 6, 100, 200, 1000
2, 7, 8, 210, 300, 2000
3, 9, 10, 420, 600, 3000
4, 11, 12, 640, 800, 4000
I need the result set to look like, for a constant C of 210 :
firstA | lastB | MIN_minColumn | MAX_maxColumn | SUM_N
5 8 100 300 3000
9 10 420 600 3000
11 12 640 800 4000
[ Adding the bounty and sample as discussed below]
For C = 381, It should contain 2 rows:
firstA | lastB | MIN_minColumn | MAX_maxColumn | SUM_N
5 8 100 300 3000
9 12 420 800 7000
Hope this demonstrates the problem better. For a constant C of, say, 1000 you would get 1 record in the result:
firstA | lastB | MIN_minColumn | MAX_maxColumn | SUM_N
5 12 100 800 10000
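One reading that reproduces all three sample results: walk the rows in key order and keep extending the current group while MAX(maxColumn) - MIN(minColumn) over the group stays <= C, otherwise start a new group. The sketch below tries that reading with a recursive CTE. It is only a sketch under assumptions: it runs on SQLite through Python's sqlite3 module rather than on SQL Server, it needs SQLite 3.25+ for ROW_NUMBER, and the names ordered, walk, agg and aggregate are made up for the illustration. A row whose own span already exceeds C simply ends up in a group by itself.

```python
import sqlite3

# Sample rows from the question: (key, a, b, minColumn, maxColumn, N)
ROWS = [
    (1, 5, 6, 100, 200, 1000),
    (2, 7, 8, 210, 300, 2000),
    (3, 9, 10, 420, 600, 3000),
    (4, 11, 12, 640, 800, 4000),
]

QUERY = """
WITH RECURSIVE
ordered AS (            -- number the rows in primary key order
    SELECT ROW_NUMBER() OVER (ORDER BY pk) AS rn,
           a, b, minColumn, maxColumn, N
    FROM t
),
walk(rn, grp, lo, hi) AS (   -- carry group id and running min/max row by row
    SELECT rn, 1, minColumn, maxColumn FROM ordered WHERE rn = 1
    UNION ALL
    SELECT o.rn,
           CASE WHEN MAX(w.hi, o.maxColumn) - MIN(w.lo, o.minColumn) <= :c
                THEN w.grp ELSE w.grp + 1 END,
           CASE WHEN MAX(w.hi, o.maxColumn) - MIN(w.lo, o.minColumn) <= :c
                THEN MIN(w.lo, o.minColumn) ELSE o.minColumn END,
           CASE WHEN MAX(w.hi, o.maxColumn) - MIN(w.lo, o.minColumn) <= :c
                THEN MAX(w.hi, o.maxColumn) ELSE o.maxColumn END
    FROM walk w
    JOIN ordered o ON o.rn = w.rn + 1
),
agg AS (                -- one row per group
    SELECT w.grp,
           MIN(w.rn) AS first_rn,
           MAX(w.rn) AS last_rn,
           MIN(o.minColumn) AS lo,
           MAX(o.maxColumn) AS hi,
           SUM(o.N) AS total_n
    FROM walk w
    JOIN ordered o ON o.rn = w.rn
    GROUP BY w.grp
)
SELECT f.a, l.b, g.lo, g.hi, g.total_n   -- a of first row, b of last row
FROM agg g
JOIN ordered f ON f.rn = g.first_rn
JOIN ordered l ON l.rn = g.last_rn
ORDER BY g.grp
"""


def aggregate(c):
    """Return (firstA, lastB, MIN_minColumn, MAX_maxColumn, SUM_N) rows for constant c."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (pk INTEGER PRIMARY KEY, a INT, b INT, "
        "minColumn INT, maxColumn INT, N INT)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(QUERY, {"c": c}).fetchall()
    conn.close()
    return rows


for c in (210, 381, 1000):
    print(c, aggregate(c))
```

SQL Server also supports recursive CTEs, so the same shape should carry over, but the two-argument MAX/MIN used here are SQLite scalar functions and would need CASE expressions there.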

DECLARE @c int
SELECT @c = 210
SELECT MIN(a) firstA,
       MAX(b) lastB,
       MIN(minColumn) MIN_minColumn,
       MAX(maxColumn) MAX_maxColumn,
       SUM(N) SUM_N
FROM @t t
JOIN (SELECT PK, FLOOR(running_sum / @c) AS grp_rank
      FROM (SELECT PK,
                   (SELECT SUM(t2.maxColumn - t2.minColumn)
                    FROM @t t2
                    WHERE t2.PK <= t1.PK) AS running_sum
            FROM @t t1) A
     ) B ON B.PK = t.PK
GROUP BY B.grp_rank
/*
Table A: for each key, calculates the running SUM(maxColumn - minColumn) over all keys up to and including it.
Table B: for each key, uses the running sum from A to calculate a rank such that:
running_sum = (grp_rank + y) * @c, where 0 <= y < 1.
e.g. @c = 210: rank(100) = 0, rank(200) = 0, rank(220) = 1, ...
Finally, grouping by that rank gives you what you want.
*/

declare @c int
select @c = 210
select firstA = min(a), lastB = max(b), MIN_minColumn = min(minColumn), MAX_maxColumn = max(maxColumn), SUM_N = sum(N)
from @t
where minColumn <= @c
union all
select a, b, minColumn, maxColumn, N
from @t
where minColumn > @c

I am a little confused about the grouping logic for the result you are trying to produce, but from the description of what you are looking for, I think you need a HAVING clause. You should be able to do something like:
SELECT groupingA, groupingB, MAX(a) - MIN(b)
FROM ...
GROUP BY groupingA, groupingB
HAVING (MAX(a) - MIN(b)) < C
...in order to filter on the difference between your max and min values once you've determined your grouping. Hope this is helpful.

Related

SQL Server SSRS Multiple lookup values

I have a huge set of data. Some of the data has multiple values, looking something like this:
Column 1 Column 2
A 1
A 10
A 1E
B 2F
B 1BH
C WBH
D 3X
D 2
D 1
D 10
D 11
I would like to select the unique values in Column 1 and display all relevant values of Column 2 as a comma-separated string (using SSRS), i.e.
Column 1 Column 2
A 01, 10, 1E
B 2F, 1BH
C WBH
D 02, 01, 10, 11
In addition, I would like any value in Column 2 that is less than 10 to be preceded by a zero.
I know I can use SELECT DISTINCT to get all unique values of Column 1. But I am unsure how to go around Column 2?
With regards to having a zero preceding numbers less than 10, I can do this:
SELECT RIGHT('0' + convert(varchar(2), value()), 2)
I am unsure how to put it all together to get the result I want.
Thank you.
I think this is what you want.
DECLARE @Input TABLE
(
    ProductID INT,
    Price INT
)
INSERT INTO @Input VALUES (6,22), (6,35), (6,77), (6, 88), (6,55),(6,200),(7,6),(7,4),(8,5),(8,5)
;WITH CTE AS
(SELECT ProductID, MAX(Price) AS Max_Price, MIN(Price) AS Min_Price
 FROM @Input
 GROUP BY ProductID
)
SELECT ProductID, CASE WHEN Max_Price > Min_Price THEN CONVERT(VARCHAR(10), Min_Price) + ', ' + CONVERT(VARCHAR(10),Max_Price)
                       ELSE CONVERT(VARCHAR(10), Min_Price) END AS Price_Range
FROM CTE

T-SQL Select to compute a result row on preceding group/condition

How to achieve this result using a T-SQL select query.
Given this sample table :
create table sample (a int, b int)
insert into sample values (999, 10)
insert into sample values (16, 11)
insert into sample values (10, 12)
insert into sample values (25, 13)
insert into sample values (999, 20)
insert into sample values (14, 12)
insert into sample values (90, 45)
insert into sample values (18, 34)
I'm trying to achieve this output:
a b result
----------- ----------- -----------
999 10 10
16 11 10
10 12 10
25 13 10
999 20 20
14 12 20
90 45 20
18 34 20
The rule is fairly simple: if column 'a' has the special value of 999 the result for that row and following rows (unless the value of 'a' is again 999) will be the value of column 'b'. Assume the first record will have 999 on column 'a'.
Any hint how to implement, if possible, the select query without using a stored procedure or function?
Thank you.
António
You can do what you want if you add a column to specify the ordering:
create table sample (
id int identity(1, 1),
a int,
b int
);
Then you can do what you want by finding the "999" version that is most recent and copying that value. Here is a method using window functions:
select a, b, max(case when a = 999 then b end) over (partition by id_999) as result
from (select s.*,
max(case when a = 999 then id end) over (order by id) as id_999
from sample s
) s;
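The window-function query above can be tried without a SQL Server instance. This is only a quick harness under assumptions: it uses SQLite (3.25+ for window functions) through Python's sqlite3 module, replaces identity(1, 1) with INTEGER PRIMARY KEY AUTOINCREMENT, and the alias marked is made up. The inner query tags every row with the id of the most recent 999 row, and the outer query copies that row's b across the partition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# AUTOINCREMENT id stands in for identity(1, 1) and records insertion order.
conn.execute(
    "CREATE TABLE sample (id INTEGER PRIMARY KEY AUTOINCREMENT, a INT, b INT)"
)
conn.executemany(
    "INSERT INTO sample (a, b) VALUES (?, ?)",
    [(999, 10), (16, 11), (10, 12), (25, 13),
     (999, 20), (14, 12), (90, 45), (18, 34)],
)

QUERY = """
SELECT a, b,
       MAX(CASE WHEN a = 999 THEN b END) OVER (PARTITION BY id_999) AS result
FROM (SELECT s.*,
             -- running max: id of the latest row so far with a = 999
             MAX(CASE WHEN a = 999 THEN id END) OVER (ORDER BY id) AS id_999
      FROM sample s) AS marked
ORDER BY id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With the sample data, the result column should come out as 10 for the first four rows and 20 for the last four.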
You need to have an id column:
select cn.a, cn.b
, (select top (1) s.b from sample as s where s.id <= cn.id and s.a = 999 order by s.id desc) as result
from sample as cn
order by cn.id

T-SQL Combine Ranges Based On Value

I am using SQL Server 2012 and have been struggling with this query for hours. I am trying to aggregate mile post ranges based off the value in the Value column. The results should have unique segments with the highest value from the Value field for each segment. Here's an example:
Mile_Marker_Start | Mile_Marker_End | Value
0 100 5
50 150 6
100 200 10
75 300 9
150 200 7
And here's the result I'm looking for:
Mile_Marker_Start | Mile_Marker_End | Value
0 50 5
50 75 6
75 100 9
100 200 10
200 300 9
As you can see, the row with a value of 9 got split into 2 rows because Value 10 was bigger. Also, the row with Value 7 does not display because Value 10 was bigger. Can this be done without using a cursor? Any help would be much appreciated.
Thanks
I believe the following now does what you need. I'd recommend running all the parts separately so you can see what they do and how they work.
DECLARE @input AS TABLE
(Mile_Marker_Start int, Mile_Marker_End int, Value int)
INSERT INTO @input VALUES
(0,100,5), (50,150,6), (100,200,10), (75,300,9), (150,200,7)
DECLARE @staging AS TABLE
(Mile_Marker int)
INSERT INTO @staging
SELECT Mile_Marker_Start FROM @input
UNION -- this will remove duplicates
SELECT Mile_Marker_End FROM @input
; -- we need a semicolon before the following CTE
-- this CTE gets the right values, but the rows aren't "collapsed"
WITH all_markers AS
(
    SELECT
        groups.Mile_Marker_Start,
        groups.Mile_Marker_End,
        MAX(i3.Value) Value
    FROM
    (
        SELECT
            s1.Mile_Marker Mile_Marker_Start,
            MIN(s2.Mile_Marker) Mile_Marker_End
        FROM
            @staging s1
            JOIN @staging s2 ON
                s1.Mile_Marker < s2.Mile_Marker
        GROUP BY
            s1.Mile_Marker
    ) AS groups
    JOIN @input i3 ON
        i3.Mile_Marker_Start < groups.Mile_Marker_End AND
        i3.Mile_Marker_End > groups.Mile_Marker_Start
    GROUP BY
        groups.Mile_Marker_Start,
        groups.Mile_Marker_End
)
SELECT
    MIN(collapse.Mile_Marker_Start) AS Mile_Marker_Start,
    MAX(collapse.Mile_Marker_End) AS Mile_Marker_End,
    collapse.Value
FROM
(-- Subquery gets IDs for the groups we're collapsing together
    SELECT
        am.*,
        ROW_NUMBER() OVER (ORDER BY am.Mile_Marker_Start) - ROW_NUMBER() OVER (PARTITION BY am.Value ORDER BY am.Mile_Marker_Start) GroupID
    FROM
        all_markers am
) AS collapse
GROUP BY
    collapse.GroupID,
    collapse.Value
ORDER BY
    MIN(collapse.Mile_Marker_Start)
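If it helps to see the three stages of the query above outside SQL, here is a rough Python sketch of the same idea: cut the line at every distinct mile marker, take the highest Value among the input ranges overlapping each elementary segment, then merge neighbouring segments that share a value. The function name combine_ranges is made up for the illustration, and this is a sketch of the logic, not a replacement for the query.

```python
# Sample rows from the question: (Mile_Marker_Start, Mile_Marker_End, Value)
ROWS = [(0, 100, 5), (50, 150, 6), (100, 200, 10), (75, 300, 9), (150, 200, 7)]


def combine_ranges(rows):
    # Stage 1: every distinct start/end marker is a potential cut point.
    points = sorted({p for start, end, _ in rows for p in (start, end)})
    segments = []
    for seg_start, seg_end in zip(points, points[1:]):
        # Stage 2: highest value among ranges overlapping this elementary segment.
        values = [v for start, end, v in rows
                  if start < seg_end and end > seg_start]
        if not values:
            continue  # gap not covered by any input range
        best = max(values)
        # Stage 3: collapse with the previous segment if adjacent and equal.
        if segments and segments[-1][2] == best and segments[-1][1] == seg_start:
            segments[-1] = (segments[-1][0], seg_end, best)
        else:
            segments.append((seg_start, seg_end, best))
    return segments


for segment in combine_ranges(ROWS):
    print(segment)
```

For the sample data this yields the five rows listed in the question, with the two elementary segments 100 to 150 and 150 to 200 merged into one row for Value 10.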
Since you are on 2012 you could maybe use LEAD. Here is my code, but as noted on your question by @stevelovell, we need clarification on how you are getting your result table.
--test data
declare @tablename TABLE
(
    Mile_Marker_Start int,
    Mile_Marker_End int,
    Value int
);
insert into @tablename
values(0,100, 5),
(50,150, 6),
(100,200,10),
(75,300, 9),
(150,200, 7);
--query
select *
from @tablename
order by Mile_Marker_Start
select Mile_Marker_Start,
    case when lead(Mile_Marker_Start) over(order by Mile_Marker_Start) < Mile_Marker_End THEN
        lead(Mile_Marker_Start) over(order by Mile_Marker_Start)
    ELSE
        Mile_Marker_End
    END
    AS Mile_Marker_End,
    Value
from @tablename
order by Mile_Marker_Start
Once you update your notes I will come back and update my answer.
Update: wasn't able to get LEAD and the other windowing functions to work with your requirements. With the way you need to move up and down the table current, and calculated values...

SQL Oracle, next free number from numeric column data

I have a table with a numeric column. There are data records, let's take for example { 1, 7, 10, 11, 12, 19, 20 }. I want to use SQL to get the next "free" number starting from a specific x:
>8 for x=7
>8 for x=8
>13 for x=10
>21 for x=20
Does anybody have an idea?
OK: I want to insert something with an 'x'. The column is unique, so I cannot put something with x=7 in the table when there already is a '7' in there. So I want a routine that returns me '8' if there is already a record with '7'. Or '9' if there already is an '8'.
IN Pseudo SQL:
x = 7 // for example
select COL from myTable where COL= (x or if x does not exist, the y : y > x, y - x smallest possible)
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE NUMBERS
("NUM" int)
;
INSERT ALL
INTO NUMBERS ("NUM")
VALUES (1)
INTO NUMBERS ("NUM")
VALUES (7)
INTO NUMBERS ("NUM")
VALUES (10)
INTO NUMBERS ("NUM")
VALUES (11)
INTO NUMBERS ("NUM")
VALUES (12)
INTO NUMBERS ("NUM")
VALUES (19)
INTO NUMBERS ("NUM")
VALUES (20)
SELECT * FROM dual
;
Query 1:
select min(n.VAL) as NextFree
from (
SELECT LEVEL as VAL
FROM DUAL
CONNECT BY LEVEL <= 100000
ORDER BY LEVEL
) n
left outer join NUMBERS d on n.VAL = d.NUM
where d.NUM is null
and n.VAL >= 10
Results:
| NEXTFREE |
|----------|
| 13 |
Similar to the answer above but there's no need for a left join:
select nvl(nextfree, 10) as nextfree from (
select max(d.NUM) + 1 as nextfree
from NUMBERS d
CONNECT BY PRIOR NUM + 1 = NUM
START WITH NUM = 10
)
| NEXTFREE |
|----------|
| 13 |
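A different formulation that needs neither a generated number range nor CONNECT BY: the next free number at or after x is either x itself or the successor of some existing value >= x, so it is enough to take the smallest of those candidates that is not already in the table. The sketch below is an assumption-laden illustration, not tested on Oracle: it runs on SQLite through Python's sqlite3 module, and the bind name :x, the alias c and the helper next_free are made up. On Oracle the first branch of the UNION would need FROM dual.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers (num INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO numbers VALUES (?)",
    [(n,) for n in (1, 7, 10, 11, 12, 19, 20)],
)

QUERY = """
SELECT MIN(c.val)
FROM (SELECT :x AS val                 -- candidate: x itself
      UNION
      SELECT num + 1                   -- candidates: successors of values >= x
      FROM numbers
      WHERE num >= :x) AS c
WHERE NOT EXISTS (SELECT 1 FROM numbers n WHERE n.num = c.val)
"""


def next_free(x):
    """Smallest value >= x that is not present in numbers.num."""
    return conn.execute(QUERY, {"x": x}).fetchone()[0]


for x in (7, 8, 10, 20):
    print(x, "->", next_free(x))
```

For the sample set this gives 8, 8, 13 and 21, matching the four cases in the question. Note that this only finds the number; making the subsequent insert safe under concurrent sessions is a separate concern.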

sql query logic

I have following data set
a b c
`1` 2 3
3 6 9
9 2 11
As you can see column a's first value is fixed (i.e. 1), but from second row it picks up the value of column c of previous record.
Column b's values are random and column c's value is calculated as c = a + b
I need to write a SQL query which will select this data in the above format. I tried using the LAG function but couldn't get it to work.
Please help.
Edit :
Only column b exists in the table; a and c need to be calculated based on the values of b.
Hanumant
SQL> select a
          , b
          , c
       from dual
      model
            dimension by (0 i)
            measures (0 a, 0 b, 0 c)
            rules iterate (5)
            ( a[iteration_number] = nvl(c[iteration_number-1],1)
            , b[iteration_number] = ceil(dbms_random.value(0,10))
            , c[iteration_number] = a[iteration_number] + b[iteration_number]
            )
      order by i
     /
A B C
---------- ---------- ----------
1 4 5
5 8 13
13 8 21
21 2 23
23 10 33
5 rows selected.
Regards,
Rob.
Without knowing how the rows relate to each other, there is no way to carry the previous row's a + b into the current row's a column. I have therefore added two more columns, id and parent, to the table to express that relationship.
parent is the column that identifies the previous row, and id is the primary key of the row.
create table test1 (a number ,b number ,c number ,id number ,parent number);
Insert into TEST1 (A, B, C, ID) Values (1, 2, 3, 1);
Insert into TEST1 (B, PARENT, ID) Values (6, 1, 2);
Insert into TEST1 (B, PARENT, ID) Values (4, 2, 3);
WITH recursive (a, b, c,rn) AS
(SELECT a,b,c,id rn
FROM test1
WHERE parent IS NULL
UNION ALL
SELECT (rec.a+ rec.b) a
,t1.b b
,(rec.a+ rec.b+t1.b) c
,t1.id rn
FROM recursive rec,test1 t1
WHERE t1.parent = rec.rn
)
SELECT a,b,c
FROM recursive;
The WITH keyword defines the name recursive for the subquery that is to follow
WITH recursive (a, b, c,rn) AS
Next comes the first part of the named subquery
SELECT a,b,c,id rn
FROM test1
WHERE parent IS NULL
The named subquery is a UNION ALL of two queries. This, the first query, defines the starting point for the recursion. As in my CONNECT BY query, I want to know what is the start with record.
Next up is the part that was most confusing :
SELECT (rec.a+ rec.b) a
,t1.b b
,(rec.a+ rec.b+t1.b) c
,t1.id rn
FROM recursive rec,test1 t1
WHERE t1.parent = rec.rn
This is how it works:
1. The parent query executes:
SELECT a,b,c
FROM recursive;
This triggers execution of the named subquery.
2. The first query in the subquery's union executes, giving us a seed row with which to begin the recursion:
SELECT a,b,c,id rn
FROM test1
WHERE parent IS NULL
The seed row in this case will be the one with id = 1, whose parent is null. Let's refer to the seed row from here on out as the "new results", new in the sense that we haven't finished processing them yet.
3. The second query in the subquery's union executes:
SELECT (rec.a+ rec.b) a
,t1.b b
,(rec.a+ rec.b+t1.b) c
,t1.id rn
FROM recursive rec,test1 t1
WHERE t1.parent = rec.rn
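As an aside, the recurrence in the question has a closed form: since each a is the previous c and c = a + b, c is just the starting value 1 plus the running total of b, and a is that c minus the current b. That can be written with an analytic SUM instead of recursion. The sketch below is an illustration under assumptions: it runs on SQLite (3.25+) through Python's sqlite3 module rather than Oracle, it assumes an id column that defines the row order, and it uses the b values from the question (2, 6, 2).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Only b is stored; id is assumed to define the row order.
conn.execute("CREATE TABLE test1 (id INTEGER PRIMARY KEY, b INT)")
conn.executemany("INSERT INTO test1 VALUES (?, ?)", [(1, 2), (2, 6), (3, 2)])

QUERY = """
SELECT 1 + SUM(b) OVER (ORDER BY id) - b AS a,   -- running total before this row
       b,
       1 + SUM(b) OVER (ORDER BY id) AS c        -- running total including this row
FROM test1
ORDER BY id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With those b values the rows come out as (1, 2, 3), (3, 6, 9), (9, 2, 11), the data set shown in the question. Oracle supports the same SUM(...) OVER (ORDER BY ...) syntax, though I have not run this exact statement there.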