Subtract value to multiple rows - sql

I am stuck at a point where I need to distribute a value across multiple rows. Since I do not know the specific term for this, I will put it in the form of an example:
Assuming the value of x to be 20, I need to distribute/subtract it to rows in descending order.
TABLE:
ID Value1
1 6
2 5
3 4
4 3
5 9
Result should look like: (x=20)
ID Value1 Answer
1 6 14
2 5 9
3 4 5
4 3 2
5 9 0
Can anyone just give me an idea how I could go with this?

Untested for syntax, but the idea should work in SQL Server 2005 and newer.
SQL Server 2012 has SUM OVER clause which makes this even handier.
SELECT ID, Value1, CASE WHEN 20-SumA < 0 THEN 0 ELSE 20-SumA END AS Answer
FROM TABLE A
CROSS APPLY (SELECT SUM(B.Value1) SumA FROM TABLE B
WHERE B.ID <= A.ID) CA

It is perhaps easier to think of this problem in a different way. You want to calculate the cumulative sum of value1 and then subtract that value from @X. If the difference is negative, then put in 0.
If you are using SQL Server 2012, then you have cumulative sum built-in. You can do this as:
select id, value1,
(case when @X - cumvalue1 < 0 then 0 else @X - cumvalue1 end) as answer
from (select id, value1,
sum(value1) over (order by id) as cumvalue1
from table t
) t;
If you don't have cumulative sum, you can do this with a subquery instead:
select id, value1,
(case when @X - cumvalue1 < 0 then 0 else @X - cumvalue1 end) as answer
from (select id, value1,
(select sum(value1)
from table t2
where t2.id <= t.id
) as cumvalue1
from table t
) t;
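To check the idea against the sample data, here is a minimal sketch that runs the correlated-subquery variant in SQLite through Python's sqlite3 module instead of SQL Server. The table name amounts and the bound parameter :x are assumptions made for the demo; they stand in for the unnamed table and for @X.

```python
import sqlite3

# Assumption: an in-memory SQLite table named "amounts" stands in for the
# unnamed table in the question; the rows are the question's sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE amounts (id INTEGER PRIMARY KEY, value1 INTEGER)")
conn.executemany(
    "INSERT INTO amounts (id, value1) VALUES (?, ?)",
    [(1, 6), (2, 5), (3, 4), (4, 3), (5, 9)],
)

# Running total via a correlated subquery, then subtract it from :x
# and clamp negative results to 0.
QUERY = """
SELECT id, value1,
       CASE WHEN :x - cumvalue1 < 0 THEN 0 ELSE :x - cumvalue1 END AS answer
FROM (SELECT t.id, t.value1,
             (SELECT SUM(t2.value1)
              FROM amounts t2
              WHERE t2.id <= t.id) AS cumvalue1
      FROM amounts t) AS s
ORDER BY id
"""

rows = conn.execute(QUERY, {"x": 20}).fetchall()
for row in rows:
    print(row)  # (id, value1, answer)
```

With x = 20 the answer column comes out as 14, 9, 5, 2, 0, which matches the result table in the question.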

I don't understand your question. I know what I think you're trying to do. But your example doesn't make sense.
You say you want to distribute 20 over the 5 rows, yet the sum of the difference between Value1 and Answer is only 3 (8+4+1+-1+-9).
And how do you want to distribute the values? Using a spread/split based on the value in Value1?
Edit: I made an example which splits 20 over the values you've specified above:
DECLARE @x FLOAT = 20.0
DECLARE @values TABLE (
ID INT,
VALUE FLOAT,
NEWVAL FLOAT)
INSERT INTO @values (ID, VALUE) VALUES (1,6), (2,5),(3,4),(4,3),(5,9)
UPDATE f
SET [NEWVAL] = [newValue]
FROM @values f
INNER JOIN (
SELECT
ID,
value + ((VALUE / [maxValue]) * @x) [newValue]
FROM
@values
CROSS APPLY (
SELECT
SUM(value) [maxValue]
FROM
@values
) m
) a ON a.ID = f.ID
SELECT * FROM @values
Unfortunately I had to change your values to floats for this to work. If you require them as integers, you'll need to use rounding, then calculate the difference of the sum of the new values - @x, and then spread that difference over the rows (if > 1 then add to lowest number, if < 1 subtract from largest value). The rounding error should usually be just 1 or 2.
I don't even know if this is what you're trying to do yet.
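For reference, the proportional split in this answer boils down to newval = value + (value / sum(value)) * x. A rough sketch of that arithmetic, run in SQLite via Python's sqlite3 instead of a T-SQL table variable; the table name vals and the parameter :x are assumptions for the demo.

```python
import sqlite3

# Assumption: "vals" is a stand-in for the @values table variable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vals (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO vals (id, value) VALUES (?, ?)",
    [(1, 6), (2, 5), (3, 4), (4, 3), (5, 9)],
)

# Each row receives a share of :x proportional to its own value.
QUERY = """
SELECT id, value,
       value + (value / (SELECT SUM(value) FROM vals)) * :x AS newval
FROM vals
ORDER BY id
"""

rows = conn.execute(QUERY, {"x": 20.0}).fetchall()
total_new = sum(newval for _, _, newval in rows)
for row in rows:
    print(row)
print("sum of new values:", total_new)
```

The new values add up to the old total plus x (27 + 20 = 47, up to floating-point error), which is a quick sanity check that the whole amount has been distributed.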

Related

SQL aggregate and filter functions

Consider following table:
Number | Value
1 a
1 b
1 a
2 a
2 a
3 c
4 a
5 d
5 a
I want to choose every row, where the value for one number is the same, so my result should be:
Number | Value
2 a
3 c
4 a
I manage to get the right numbers by using nested SQL statements like below. I am wondering if there is a simpler solution for my problem.
SELECT
a.n,
COUNT(n)
FROM
(
SELECT number n , value k
FROM testtable
GROUP BY number, value
) a
GROUP BY n
HAVING COUNT(n) = 1
You can try this
SELECT NUMBER,MAX(VALUE) AS VALUE FROM TESTTABLE
GROUP BY NUMBER
HAVING MAX(VALUE)=MIN(VALUE)
You can also try this:
SELECT DISTINCT t.number, t.value
FROM testtable t
LEFT JOIN testtable t_other
ON t.number = t_other.number AND t.value <> t_other.value
WHERE t_other.number IS NULL
Another alternative using exists.
select distinct number, value from testtable a
where not exists (
select 1 from testtable b
where a.number = b.number
and a.value <> b.value
)
http://sqlfiddle.com/#!9/dd080dd/5
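A quick way to convince yourself that the HAVING and NOT EXISTS variants agree is to run both against the sample data. This is only a sketch using SQLite through Python's sqlite3; the column names number and value are taken from the question's own query, and the engine is an assumption since the question does not name one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testtable (number INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO testtable (number, value) VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "a"), (2, "a"), (2, "a"),
     (3, "c"), (4, "a"), (5, "d"), (5, "a")],
)

# Variant 1: a group has a single distinct value exactly when MIN = MAX.
having_rows = conn.execute("""
    SELECT number, MAX(value) AS value
    FROM testtable
    GROUP BY number
    HAVING MAX(value) = MIN(value)
    ORDER BY number
""").fetchall()

# Variant 2: keep rows for which no other row with the same number
# carries a different value.
not_exists_rows = conn.execute("""
    SELECT DISTINCT a.number, a.value
    FROM testtable a
    WHERE NOT EXISTS (
        SELECT 1 FROM testtable b
        WHERE b.number = a.number AND b.value <> a.value
    )
    ORDER BY a.number
""").fetchall()

print(having_rows)
print(not_exists_rows)
```

Both queries return the rows (2, a), (3, c) and (4, a) for this data.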

Create a new table with columns with case statements and max function

I have some problems in creating a new table from an old one with new columns defined by case statements.
I need to add to a new table three columns, where I compute the maximum based on different conditions. Specifically,
if time is between 1 and 3, I define a variable max_var_1_3 as max((-1)*var),
if time is between 1 and 6, I define a variable max_var_1_6 as max((-1)*var),
if time is between 1 and 12, I define a variable max_var_1_12 as max((-1)*var),
The max function needs to take the maximum value of the variable var in the window between 1 and 3, 1 and 6, 1 and 12 respectively.
I wrote this
create table new as(
select t1.*,
(case when time between 1 and 3 then MAX((-1)*var)
else var
end) as max_var_1_3,
(case when time between 1 and 6 then MAX((-1)*var)
else var
end) as max_var_1_6,
(case when time between 1 and 12 then MAX((-1)*var)
else var
end) as max_var_1_12
from old_table t1
group by time
) with data primary index time
but unfortunately it is not working. The old_table already has some columns, and I would like to import all of them and then compare the old table with the new one. I got an error saying that something is expected between ) and ',', but I cannot work out what. I am using Teradata SQL.
Could you please help me?
Many thanks
The problem is that you have GROUP BY time in your query while trying to return all the other values with your SELECT t1.*. To make your query work as-is, you'd need to add each column from t1.* to your GROUP BY clause.
If you want to find the MAX value within the different time ranges AND also return all the rows, then you can use a window function. Something like this:
CREATE TABLE new AS (
SELECT
t1.*,
CASE
WHEN t1.time BETWEEN 1 AND 3 THEN (
MAX(CASE WHEN t1.time BETWEEN 1 AND 3 THEN (-1 * t1.var) ELSE NULL END) OVER()
)
ELSE t1.var
END AS max_var_1_3,
CASE
WHEN t1.time BETWEEN 1 AND 6 THEN (
MAX(CASE WHEN t1.time BETWEEN 1 AND 6 THEN (-1 * t1.var) ELSE NULL END) OVER()
)
ELSE t1.var
END AS max_var_1_6,
CASE
WHEN t1.time BETWEEN 1 AND 12 THEN (
MAX(CASE WHEN t1.time BETWEEN 1 AND 12 THEN (-1 * t1.var) ELSE NULL END) OVER()
)
ELSE t1.var
END AS max_var_1_12
FROM old_table t1
) WITH DATA PRIMARY INDEX (time)
;
Here's the logic:
check if a row falls in the range
if it does, return the desired MAX value for rows in that range
otherwise, just return that given row's default value (var)
return all rows along with the three new columns
If you have performance issues, you could also move the max_var calculations to a CTE, since they only need to be calculated once. Also to avoid confusion, you may want to explicitly specify the values in your SELECT instead of using t1.*.
I don't have a TD system to test, but try it out and see if that works.
I cannot help with the CREATE TABLE AS, but the query you want is this:
SELECT
t.*,
(SELECT MAX(-1 * var) FROM old_table WHERE time BETWEEN 1 AND 3) AS max_var_1_3,
(SELECT MAX(-1 * var) FROM old_table WHERE time BETWEEN 1 AND 6) AS max_var_1_6,
(SELECT MAX(-1 * var) FROM old_table WHERE time BETWEEN 1 AND 12) AS max_var_1_12
FROM old_table t;
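The scalar-subquery version is easy to try outside Teradata. Below is a sketch in SQLite via Python's sqlite3; the sample rows are invented for illustration, and new_table with SQLite's plain CREATE TABLE ... AS SELECT is an assumption that replaces the Teradata-specific WITH DATA PRIMARY INDEX clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE old_table (time INTEGER, var INTEGER)")
# Invented sample rows (not from the question).
conn.executemany(
    "INSERT INTO old_table (time, var) VALUES (?, ?)",
    [(1, 5), (2, -3), (4, -7), (8, -9), (13, 1)],
)

# Every row gets the same three window maxima, each computed once
# over its own time range.
conn.execute("""
    CREATE TABLE new_table AS
    SELECT t.*,
           (SELECT MAX(-1 * var) FROM old_table
            WHERE time BETWEEN 1 AND 3)  AS max_var_1_3,
           (SELECT MAX(-1 * var) FROM old_table
            WHERE time BETWEEN 1 AND 6)  AS max_var_1_6,
           (SELECT MAX(-1 * var) FROM old_table
            WHERE time BETWEEN 1 AND 12) AS max_var_1_12
    FROM old_table t
""")

rows = conn.execute("SELECT * FROM new_table ORDER BY time").fetchall()
for row in rows:
    print(row)  # (time, var, max_var_1_3, max_var_1_6, max_var_1_12)
```

For these rows the three maxima are 3, 7 and 9, and they are repeated on every row, including the row with time 13 that falls outside all ranges. If rows outside a range should keep var instead, the CASE wrapper from the window-function answer is still needed.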

Can I get the minimum of 2 columns which is greater than a given value using only one scan of a table

This is my example data (there are no indexes and I do not want to create any):
CREATE TABLE tblTest ( a INT , b INT );
INSERT INTO tblTest ( a, b ) VALUES
( 1 , 2 ),
( 5 , 1 ),
( 1 , 4 ),
( 3 , 2 )
I want the minimum value across both column a and column b which is greater than a given value. E.g. if the given value is 3 then I want 4 to be returned.
This is my current solution:
SELECT MIN (subMin) FROM
(
SELECT MIN (a) as subMin FROM tblTest
WHERE a > 3 -- Returns 5
UNION
SELECT MIN (b) as subMin FROM tblTest
WHERE b > 3 -- Returns 4
)
This searches the table twice - once to get min(a) once to get min(b).
I believe it should be faster to do this with just one pass. Is this possible?
You want to use conditional aggregation for this:
select min(case when a > 3 then a end) as minA,
min(case when b > 3 then b end) as minB
from tblTest;
To get the minimum of both values, you can use a SQLite extension, which handles multiple values for min():
select min(min(case when a > 3 then a end),
min(case when b > 3 then b end)
)
from tblTest
The only issue is that the min will return NULL if either argument is NULL. You can fix this by doing:
select coalesce(min(min(case when a > 3 then a end),
min(case when b > 3 then b end)
),
min(case when a > 3 then a end),
min(case when b > 3 then b end)
)
from tblTest
This version will return the minimum value, subject to your conditions. If one of the conditions has no rows, it will still return the minimum of the other value.
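Since this relies on SQLite's multi-argument min(), it can be tried directly with Python's sqlite3 module. A small sketch on the question's data; the helper name min_above and the parameter :v are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblTest (a INT, b INT)")
conn.executemany(
    "INSERT INTO tblTest (a, b) VALUES (?, ?)",
    [(1, 2), (5, 1), (1, 4), (3, 2)],
)

# One pass over the table: two conditional aggregates, combined with
# SQLite's scalar two-argument min(). That scalar min() yields NULL if
# either argument is NULL, hence the COALESCE fallbacks.
QUERY = """
SELECT COALESCE(min(min(CASE WHEN a > :v THEN a END),
                    min(CASE WHEN b > :v THEN b END)),
                min(CASE WHEN a > :v THEN a END),
                min(CASE WHEN b > :v THEN b END))
FROM tblTest
"""

def min_above(threshold):
    """Smallest value in column a or b that is strictly greater than threshold."""
    return conn.execute(QUERY, {"v": threshold}).fetchone()[0]

print(min_above(3))  # 4
print(min_above(4))  # 5 (column b has nothing above 4, so the fallback is used)
print(min_above(5))  # None (nothing above 5 in either column)
```

Whether this is actually faster than the two-scan UNION is something to measure on real data; the sketch only shows that the result is the same.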
Off the top of my head, you could modify the table and add a column that stores the minimum of the two columns, then query that column.
Or you can do this:
select min(val)
from
(
select min(col1, col2) as val
from table1
)
where
val > 3
The outer SELECT queries the in-memory result of the subquery, not the table itself.
Check SQL Fiddle

SQL (TSQL) - Select values in a column where another column is not null?

I will keep this simple- I would like to know if there is a good way to select all the values in a column when it never has a null in another column. For example.
A B
----- -----
1 7
2 7
NULL 7
4 9
1 9
2 9
From the above set I would just want 9 from B and not 7 because 7 has a NULL in A. Obviously I could wrap this as a subquery and USE the IN clause etc. but this is already part of a pretty unique set and am looking to keep this efficient.
I should note that for my purposes this would only be a one-way comparison... I would only be returning values in B and examining A.
I imagine there is an easy way to do this that I am missing, but being in the thick of things I don't see it right now.
You can do something like this:
select *
from t
where t.b not in (select b from t where a is null);
If you want only distinct b values, then you can do:
select b
from t
group by b
having sum(case when a is null then 1 else 0 end) = 0;
And, finally, you could use window functions:
select a, b
from (select t.*,
sum(case when a is null then 1 else 0 end) over (partition by b) as NullCnt
from t
) t
where NullCnt = 0;
The query below outputs only one column in the final result. The records are grouped by column B, and for each group the SUM counts how many records have a NULL in column A (each NULL adds 1). The HAVING clause keeps only the groups where that count is 0.
SELECT B
FROM TableName
GROUP BY B
HAVING SUM(CASE WHEN A IS NULL THEN 1 ELSE 0 END) = 0
If you want to get all the rows from the records, you can use join.
SELECT a.*
FROM TableName a
INNER JOIN
(
SELECT B
FROM TableName
GROUP BY B
HAVING SUM(CASE WHEN A IS NULL THEN 1 ELSE 0 END) = 0
) b ON a.b = b.b
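Both queries can be checked against the sample data from the question. This is a sketch in SQLite via Python's sqlite3 instead of SQL Server; the table name t is borrowed from the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO t (a, b) VALUES (?, ?)",
    [(1, 7), (2, 7), (None, 7), (4, 9), (1, 9), (2, 9)],
)

# Values of b whose group contains no NULL in a.
clean_b = conn.execute("""
    SELECT b
    FROM t
    GROUP BY b
    HAVING SUM(CASE WHEN a IS NULL THEN 1 ELSE 0 END) = 0
""").fetchall()

# All rows belonging to those groups, via a join on the grouped result.
clean_rows = conn.execute("""
    SELECT x.a, x.b
    FROM t x
    INNER JOIN (
        SELECT b
        FROM t
        GROUP BY b
        HAVING SUM(CASE WHEN a IS NULL THEN 1 ELSE 0 END) = 0
    ) g ON x.b = g.b
    ORDER BY x.rowid
""").fetchall()

print(clean_b)     # [(9,)]
print(clean_rows)  # [(4, 9), (1, 9), (2, 9)]
```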

group by on a range

I have a table with employee names and their vendor experience.
I have to create a table with the following data.
The data given to me looks like this:
empname vendor experience
a 1
b 2
c 10
d 11
e 20
f 12
g 21
h 22
I want to generate a SQL query to display data like this
vendor_experience(months) count
0-6 2
0-12 5
0-18 5
more 8
Please help me with the query.
You might employ case statement to get counts of exclusive ranges:
select case when [vendor experience] <= 6 then '0-6'
when [vendor experience] <= 12 then '0-12'
when [vendor experience] <= 18 then '0-18'
else 'more'
end [vendor_experience(months)],
count (*) [count]
from experiences
group by
case when [vendor experience] <= 6 then '0-6'
when [vendor experience] <= 12 then '0-12'
when [vendor experience] <= 18 then '0-18'
else 'more'
end
This produces the same result as yours (inclusive ranges):
; with ranges as
(
select 6 as val, 0 as count_all
union all
select 12, 0
union all
select 18, 0
union all
select 0, 1
)
select case when ranges.count_all = 1
then 'more'
else '0-' + convert (varchar(10), ranges.val)
end [vendor_experience(months)],
sum (case when ranges.count_all = 1
or experiences.[vendor experience] <= ranges.val
then 1 end) [count]
from experiences
cross join ranges
group by ranges.val, ranges.count_all
count_all is set to 1 to mark open-ending range.
Sql Fiddle is here.
UPDATE: an attempt at explanation.
The first part, starting with with and ending with the closing bracket, is called a CTE (common table expression). Sometimes it is referred to as an inline view because it can be used multiple times in the same query and under some circumstances is updateable. Here it is used to prepare data for the ranges and is appropriately named ranges. This is the name used in the main query. val is the maximum value of a range; count_all is 1 if the range has no upper end (18+, more, or however you wish to call it). Data rows are combined by means of union all. You can copy/paste just the section between the parentheses and run it to see the results.
Main body joins experiences table with ranges using cross join. This creates combinations of all rows from experiences and ranges. For row d 11 there will be 4 rows,
empname vendor experience val count_all
d 11 6 0
d 11 12 0
d 11 18 0
d 11 0 1
The first case statement in the select list produces the caption by checking count_all: if it is 1, it outputs more, otherwise it constructs the caption from the upper range value. The second case statement does the counting by means of sum(1). As aggregate functions ignore nulls, and a case with no else evaluates to null when nothing matches, it is sufficient to check whether count_all is 1 (meaning that every row from experiences is counted in this range) or whether vendor experience is less than or equal to the upper value of the current range. In the example above, 11 will not be counted for the first range but will be counted for all the rest.
Results are then grouped by val and count_all. To better see how it works, you might remove the group by and sum() and look at the numbers before aggregation. Ordering by empname, val will help to see how the values of [count] change depending on val for each employee.
Note: I did my best with my current level of english language. Please don't hesitate to ask for clarification if you need one (or two, or as many as you need).
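The walkthrough above can be reproduced step by step. Here is a sketch in SQLite via Python's sqlite3; the column name vendor_experience (with an underscore) and the || concatenation operator are SQLite-side assumptions that replace [vendor experience] and + / convert from the T-SQL original.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE experiences (empname TEXT, vendor_experience INTEGER)")
conn.executemany(
    "INSERT INTO experiences VALUES (?, ?)",
    [("a", 1), ("b", 2), ("c", 10), ("d", 11),
     ("e", 20), ("f", 12), ("g", 21), ("h", 22)],
)

RANGES_CTE = """
WITH ranges(val, count_all) AS (
    SELECT 6, 0 UNION ALL
    SELECT 12, 0 UNION ALL
    SELECT 18, 0 UNION ALL
    SELECT 0, 1          -- open-ended range, flagged by count_all = 1
)
"""

# Step 1: the cross join before aggregation, shown for employee d only.
d_rows = conn.execute(RANGES_CTE + """
    SELECT e.empname, e.vendor_experience, r.val, r.count_all
    FROM experiences e CROSS JOIN ranges r
    WHERE e.empname = 'd'
    ORDER BY r.count_all, r.val
""").fetchall()

# Step 2: the aggregated, inclusive counts per range.
counts = conn.execute(RANGES_CTE + """
    SELECT CASE WHEN r.count_all = 1 THEN 'more'
                ELSE '0-' || r.val END AS label,
           SUM(CASE WHEN r.count_all = 1
                      OR e.vendor_experience <= r.val
                    THEN 1 END) AS cnt
    FROM experiences e CROSS JOIN ranges r
    GROUP BY r.val, r.count_all
    ORDER BY r.count_all, r.val
""").fetchall()

print(d_rows)
print(counts)
```

The first result shows the four combinations for d 11 listed in the explanation, and the second gives 0-6: 2, 0-12: 5, 0-18: 5, more: 8, the same counts as in the question.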
A bit more dynamic, implement a table for the groupings:
create table #t (name varchar(10),e int)
insert into #t values ('a',0)
insert into #t values ('b',4)
insert into #t values ('c',3)
insert into #t values ('d',13)
insert into #t values ('e',25)
insert into #t values ('f',4)
insert into #t values ('g',19)
insert into #t values ('h',15)
insert into #t values ('i',7)
create table #g (t int, n varchar(10))
insert into #g values (6, '0-6')
insert into #g values (12, '0-12')
insert into #g values (18, '0-18')
insert into #g values (99999, 'more')
select #g.n
,COUNT(*)
from #g
inner join #t on #t.e <= #g.t
group by #g.n
you might want to play around with the value 99999 for example.
Here is a way to get the cumulative values:
select sum(mon0_6) as mon0_6, sum(mon0_12) as mon0_12, sum(mon0_18) as mon0_18,
sum(more) as more
from (select e.*,
(case when [vendor experience] <= 6 then 1 else 0 end) as mon0_6,
(case when [vendor experience] <= 12 then 1 else 0 end) as mon0_12,
(case when [vendor experience] <= 18 then 1 else 0 end) as mon0_18,
1 as more
from experiences e -- table name assumed
) e
This puts them in separate columns. You can then use unpivot to put them in separate rows.
However, you might consider doing the cumulative sum at the application layer. I often do this sort of thing in Excel.
Doing a cumulative sum in SQL Server 2008 requires a self-join, either explicitly or via a correlated subquery. SQL Server 2012 supports much simpler syntax for cumulative sums (the over clause takes an order by argument).
Try this:
INSERT INTO ResultTable ([vendor_experience(months)], [count])
SELECT * FROM
(
SELECT '0-6' AS label, COUNT(*) AS cnt FROM TableA WHERE [vendor experience] <= 6
UNION ALL
SELECT '0-12', COUNT(*) FROM TableA WHERE [vendor experience] <= 12
UNION ALL
SELECT '0-18', COUNT(*) FROM TableA WHERE [vendor experience] <= 18
UNION ALL
SELECT 'more', COUNT(*) FROM TableA
) AS Temp
If you do not need the overlapping (cumulative) counts, then try this:
select t.[vendor_experience(months)], count(*) as count
from (
select case
when [vendor experience] between 0 and 6 then ' 0-6'
when [vendor experience] between 7 and 12 then '0-12'
when [vendor experience] between 13 and 18 then '0-18'
when [vendor experience] >= 19 then 'more'
else 'other' end as [vendor_experience(months)]
from TableA) t
group by t.[vendor_experience(months)]