How to get a difference between two rows - sql

I want to find the difference between two rows in the same column, grouped by ID.
ID Value1 Value2
a 500 200
b 300 200
a 100 300
b 300 400
....
Expected output
ID Value1 Value2
a 400 -100
b 0 -200
....
How can I write a query that does this?

You can use the following:
SELECT
ID,
MAX(Value1) - MIN(Value1),
MIN(Value2) - MAX(Value2)
FROM
myTableName
GROUP BY
ID
But there is one assumption: for each ID, the first row always has a Value1 greater than or equal to, and a Value2 less than or equal to, the second row's.

You can try:
SELECT t1.ID, max(t2.VALUE1 - t1.VALUE1)
FROM TABLE1 t1
left join TABLE1 t2 on t1.id = t2.id
group by t1.id

This query gives the absolute difference between the maximum and minimum values for each ID:
SELECT ID
, ABS(MAX(VALUE1) - MIN(VALUE1)) AS v1Diff
, ABS(MAX(VALUE2) - MIN(VALUE2)) AS v2Diff
FROM TABLE1
GROUP BY ID
But if you want the real (signed, possibly negative) difference, then we need to know which row is first and which row is next. Then we can compute the difference as firstRowValue - nextRowValue.
Maybe your table has a RowID or DateTime column that can be used to order the rows within the same ID.
Which column or columns make up the primary key in your table?
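To make the ordering requirement concrete, here is a minimal sketch. It assumes the table has an extra ordering column, here called seq, which is hypothetical and not part of the question, and it runs the SQL through Python's sqlite3 module only so that the example is self-contained. The same join should carry over to other databases, but that is not tested here.

```python
import sqlite3

# "seq" is an assumed ordering column (hypothetical, not in the question's table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (seq INTEGER, ID TEXT, Value1 INTEGER, Value2 INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?, ?)",
    [(1, "a", 500, 200), (2, "b", 300, 200), (3, "a", 100, 300), (4, "b", 300, 400)],
)

# Pair each ID's earliest row (f) with its latest row (n), then subtract first - next.
query = """
SELECT f.ID,
       f.Value1 - n.Value1 AS v1Diff,
       f.Value2 - n.Value2 AS v2Diff
FROM Table1 AS f
JOIN Table1 AS n ON n.ID = f.ID
WHERE f.seq = (SELECT MIN(seq) FROM Table1 WHERE ID = f.ID)
  AND n.seq = (SELECT MAX(seq) FROM Table1 WHERE ID = f.ID)
ORDER BY f.ID
"""
signed_diffs = conn.execute(query).fetchall()
print(signed_diffs)  # [('a', 400, -100), ('b', 0, -200)]
```

With the sample rows this reproduces the expected output from the question, including the negative values, without relying on MAX/MIN happening to line up with row order.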

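One option is a CTE with the ROW_NUMBER() ranking function. Note that ORDER BY 1/0 is only a placeholder that defines no real ordering, so which row is treated as the first one is not guaranteed: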
;WITH cte AS
(
SELECT ID,
CASE ROW_NUMBER() OVER(PARTITION BY ID ORDER BY 1/0) % 2
WHEN 1 THEN Value1
WHEN 0 THEN -1 * Value1 END AS Value1,
CASE ROW_NUMBER() OVER(PARTITION BY ID ORDER BY 1/0) % 2
WHEN 1 THEN Value2
WHEN 0 THEN -1 * Value2 END AS Value2
FROM dbo.test22
)
SELECT ID, SUM(Value1) AS Value1, SUM(Value2) AS Value2
FROM cte
GROUP BY ID

Related

How to select the top 3 values from a group based on date and exclude duplicate value?

I have three columns: one holds an ID, one a value, and one a date. For example, the ID column has ID1, ID2, ID3, and each ID has several numeric values, say 1, 2, 3, 4, 5.
How do I get only the 3 most recent results for each ID, ordered by date descending?
I am using Sybase SQL. Is there any way I can write this?
I tried to use Row_number() and rank(), but neither function is available in my SQL tool.
ID value Date
1 3 20190511
1 1 20190503
1 5 20190401
2 2 20190520
2 1 20190514
2 4 20190503
3 1 20190516
3 5 20190415
3 3 20190402
If you don't have ROW_NUMBER(), try this:
SELECT *
FROM yourTable t1
WHERE (SELECT COUNT(*)
FROM yourTable t2
WHERE t1.id = t2.id
AND t1.date < t2.date) < 3
So a row that has 3 or more newer rows for the same id won't appear.
With ROW_NUMBER():
SELECT *
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) as rn
FROM YourTable t1
) as t
WHERE t.rn <= 3
I assume you can't have multiple rows with the same date for an id. If you can, you may want to use RANK() or DENSE_RANK() and decide how to handle ties.
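As a quick check of the correlated-count version, here is a small sketch. It runs the query in SQLite through Python's sqlite3 module rather than Sybase, and it adds one extra older row for id 1 (made up for illustration) so that something actually gets filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id INTEGER, value INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        (1, 3, "20190511"), (1, 1, "20190503"), (1, 5, "20190401"),
        (1, 2, "20190315"),  # hypothetical 4th row for id 1, not in the question
        (2, 2, "20190520"), (2, 1, "20190514"),
    ],
)

# Keep a row only if fewer than 3 rows of the same id have a newer date.
# yyyymmdd strings compare correctly as text.
query = """
SELECT t1.id, t1.value, t1.date
FROM yourTable t1
WHERE (SELECT COUNT(*)
       FROM yourTable t2
       WHERE t2.id = t1.id
         AND t2.date > t1.date) < 3
ORDER BY t1.id, t1.date DESC
"""
top3 = conn.execute(query).fetchall()
for row in top3:
    print(row)
```

The 20190315 row has three newer rows for id 1, so it is the only one dropped. If two rows shared a date, both would have the same count of newer rows, so more than 3 rows per id could come back.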
One method uses a correlated subquery with in:
select t.*
from t
where t.date in (select top (3) t2.date
from t t2
where t2.id = t.id
order by t2.date desc
);
Note that this assumes that the dates are unique.

sql - getting sum of same column from multiple tables

I have a few tables in my DB. Let's call them table1, table2, table3.
All of them have a column named value.
I need to create a query that will return a single number, where this number is the sum of all the value columns from all the tables together...
I've tried the following way:
SELECT (SELECT SUM(value) FROM table1) + (SELECT SUM(value) FROM table2) + (SELECT SUM(value) FROM table3) as total_sum
But when at least one of the inner SUMs is NULL, the entire total (total_sum here) is NULL, so the result can't be trusted.
When there is no value in a certain inner SUM query, I need it to return 0, so it doesn't affect the rest of the SUM.
To make it more clear, let's say I have the following 2 tables:
TABLE1:
ID | NAME | VALUE
1 Name1 1000
2 Name2 2000
3 Name3 3000
TABLE2:
ID | NAME | VALUE
1 Name1 1500
2 Name2 2500
3 Name3 3500
Eventually, the query I need will return a single value - 13500, which is the total sum of all the values under the VALUE column of all the tables here.
All the other columns have no meaning for the needed query, and I don't even care much about performance in this case.
You can achieve it using COALESCE as follows:
SELECT
(SELECT coalesce(SUM(value),0) FROM table1) +
(SELECT coalesce(SUM(value),0) FROM table2) +
(SELECT coalesce(SUM(value),0) FROM table3) as total_sum
Another approach is to use UNION ALL to merge all the values into a single derived table:
select coalesce(sum(a.value), 0) as total_sum from
(select value from table1
union all
select value from table2
union all
select value from table3) a;
You can use the ISNULL function to take care of the NULLs.
SELECT ISNULL((
SELECT SUM(value) FROM table1
)
, 0
) + ISNULL((
SELECT SUM(value) FROM table2
)
, 0
) + ISNULL((
SELECT SUM(value) FROM table3
)
, 0
) AS total_sum;
You could simply sum all of them:
select sum(total) as Total
from (
select sum(value) as total from Table1
union all
select sum(value) as total from Table2
union all
select sum(value) as total from Table3
) t;
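To see why the original query returns NULL and how the suggestions above behave, here is a small sketch using the two sample tables plus an empty table3. Leaving table3 empty is my assumption, made to trigger the problem. It runs in SQLite via Python's sqlite3 module; ISNULL is SQL Server specific, so only the COALESCE and UNION ALL variants are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, name TEXT, value INTEGER);
CREATE TABLE table2 (id INTEGER, name TEXT, value INTEGER);
CREATE TABLE table3 (id INTEGER, name TEXT, value INTEGER);  -- left empty on purpose
INSERT INTO table1 VALUES (1, 'Name1', 1000), (2, 'Name2', 2000), (3, 'Name3', 3000);
INSERT INTO table2 VALUES (1, 'Name1', 1500), (2, 'Name2', 2500), (3, 'Name3', 3500);
""")

def scalar(sql):
    """Run a query that returns one row with one column and return that value."""
    return conn.execute(sql).fetchone()[0]

# SUM over zero rows is NULL, and NULL + anything is NULL.
naive = scalar("""
SELECT (SELECT SUM(value) FROM table1)
     + (SELECT SUM(value) FROM table2)
     + (SELECT SUM(value) FROM table3)
""")

# COALESCE turns each missing sum into 0 before adding.
with_coalesce = scalar("""
SELECT (SELECT COALESCE(SUM(value), 0) FROM table1)
     + (SELECT COALESCE(SUM(value), 0) FROM table2)
     + (SELECT COALESCE(SUM(value), 0) FROM table3)
""")

# Summing the per-table sums also works, because the outer SUM skips NULL rows.
sum_of_sums = scalar("""
SELECT SUM(total) FROM (
    SELECT SUM(value) AS total FROM table1
    UNION ALL
    SELECT SUM(value) AS total FROM table2
    UNION ALL
    SELECT SUM(value) AS total FROM table3
) t
""")

print(naive, with_coalesce, sum_of_sums)  # None 13500 13500
```

One caveat on the sum-of-sums form: if every table were empty, the outer SUM would see only NULLs and return NULL again, so wrapping it in COALESCE is still worthwhile.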

find row number by group in SQL server table with duplicated rows

I need to count the row number by group in a table with some duplications.
Table:
id value1 value2
1 3974 39
1 3974 39
1 972 5
1 972 10
SQL:
select id, value1, value2, COUNT(*) cnt
FROM table
group by id, value1, value2
having COUNT(*) > 1
This code only counts the duplicated rows.
I need:
id, value1, value2
1 972 5
1 972 10
I do not need to count the duplicated rows; I only need the rows where value1 has more than one distinct value in the value2 column.
Thanks
Use DISTINCT:
select id, value1, count(distinct value2) cnt
from table
group by id, value1
having count(distinct value2) > 1
If you want the detail rows, then:
select * from table t1
cross apply(select cnt from(
select count(distinct value2) cnt
from table t2
where t1.id = t2.id and t1.value1 = t2.value1) t
where cnt > 1)ca
In SQL Server 2008, you can use a trick to count distinct values using window functions. You might find this a nice solution:
select t.id, t.value1, t.value2
from (select t.*, sum(case when seqnum = 1 then 1 else 0 end) over (partition by value1) as numvals
from (select t.*, row_number() over (partition by value1, value2 order by (select null)) as seqnum
from table t
) t
) t
where numvals > 1;
Try it this way without a GROUP BY:
select id, value1, value2
FROM table AS T1
where 1 < (
select COUNT(DISTINCT T2.value2) -- count distinct value2 so exact duplicate rows are not returned
FROM table AS T2
where T1.value1 = T2.value1)
Try this
;WITH CTE
AS ( SELECT id ,
value1 ,
value2 ,
COUNT(*) cnt
FROM table
GROUP BY id ,
value1 ,
value2
HAVING COUNT(*) > 1
)
SELECT *
FROM table
WHERE value1 IN ( SELECT value1
FROM CTE )
Simply use a NOT after HAVING, which precisely gets you the rows which are NOT duplicated.
select id, value1, value2
FROM [table]
group by id, value1, value2
having NOT COUNT(*) > 1
If you want the actual rows from the table, not just the qualifying id, value1 pairs, you could do this:
WITH discrepancies AS (
SELECT
id,
value1,
value2,
distinctcount = COUNT(DISTINCT value2) OVER (PARTITION BY id, value1)
FROM
dbo.atable
)
SELECT
id,
value1,
value2
FROM
discrepancies
WHERE
distinctcount > 1
;
if SQL Server 2008 supported COUNT(DISTINCT ...) with an OVER clause.
Basically, it would be the same idea as Giorgi Nakeuri's one, more or less, except you would not be hitting the table more than once.
Alas, there is no support for COUNT(DISTINCT ...) OVER ... in SQL Server so far. Still, you can use a different method, which will still allow you to touch the table just once and return detail rows nevertheless:
WITH discrepancies AS (
SELECT
id,
value1,
value2,
minvalue2 = MIN(value2) OVER (PARTITION BY id, value1),
maxvalue2 = MAX(value2) OVER (PARTITION BY id, value1)
FROM
dbo.atable
)
SELECT
id,
value1,
value2
FROM
discrepancies
WHERE
minvalue2 <> maxvalue2
;
The idea here is to get MIN(value2) and MAX(value2) per each id, value1 and to see if those differ. If they do, that means you have a discrepancy in this id, value1 subset and you want that row to be returned.
The method takes advantage of aggregates with an OVER clause to avoid a self-join, and that is precisely the reason why the table is accessed just once here.
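The MIN/MAX idea can be tried on the question's sample data with the sketch below. It runs in SQLite (3.25 or later, for window function support) through Python's sqlite3 module instead of SQL Server, so the alias syntax is adapted to AS. The table name atable follows the answer; the final ORDER BY is only there to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE atable (id INTEGER, value1 INTEGER, value2 INTEGER)")
conn.executemany(
    "INSERT INTO atable VALUES (?, ?, ?)",
    [(1, 3974, 39), (1, 3974, 39), (1, 972, 5), (1, 972, 10)],
)

# Per (id, value1) group, compare the smallest and largest value2.
# If they differ, the group holds more than one distinct value2.
query = """
WITH discrepancies AS (
    SELECT id, value1, value2,
           MIN(value2) OVER (PARTITION BY id, value1) AS minvalue2,
           MAX(value2) OVER (PARTITION BY id, value1) AS maxvalue2
    FROM atable
)
SELECT id, value1, value2
FROM discrepancies
WHERE minvalue2 <> maxvalue2
ORDER BY value2
"""
detail_rows = conn.execute(query).fetchall()
print(detail_rows)  # [(1, 972, 5), (1, 972, 10)]
```

The two identical 3974 rows drop out because their minimum and maximum value2 are both 39. One thing to keep in mind: MIN and MAX ignore NULLs, so a group whose value2 is either NULL or a single value would not be reported as a discrepancy.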

Duplicate Counts - TSQL

I want to get all records that have duplicate values for some of the fields (i.e. the key columns).
My code:
CREATE TABLE #TEMP (ID int, Descp varchar(5), Extra varchar(6))
INSERT INTO #Temp
SELECT 1,'One','Extra1'
UNION ALL
SELECT 2,'Two','Extra2'
UNION ALL
SELECT 3,'Three','Extra3'
UNION ALL
SELECT 1,'One','Extra4'
SELECT ID, Descp, Extra FROM #TEMP
;WITH Temp_CTE AS
(SELECT *
, ROW_NUMBER() OVER (PARTITION BY ID, Descp ORDER BY (SELECT 0))
AS DuplicateRowNumber
FROM #TEMP
)
SELECT * FROM Temp_cte
DROP TABLE #TEMP
The last column tells me how many times each row has appeared based on ID and Descp values.
I want that column, but I also need another column* that indicates that both rows with ID = 1 and Descp = 'One' have shown up more than once.
So an extra column* (i.e. MultipleOccurances (bool)) which has 1 for the two rows with ID = 1 and Descp = 'One' and 0 for the other rows, as they only show up once.
How can I achieve that? (I want to avoid using Count(1)>1 or something similar if possible.)
Edit:
Desired output:
ID Descp Extra DuplicateRowNumber IsMultiple
1 One Extra1 1 1
1 One Extra4 2 1
2 Two Extra2 1 0
3 Three Extra3 1 0
You say "I want to avoid using Count", but it is probably the best way. It reuses the partitioning you already have for the ROW_NUMBER():
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ID, Descp
ORDER BY (SELECT 0)) AS DuplicateRowNumber,
CASE
WHEN COUNT(*) OVER (PARTITION BY ID, Descp) > 1 THEN 1
ELSE 0
END AS IsMultiple
FROM #Temp
And the execution plan shows just a single sort.
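For anyone who wants to check the windowed COUNT against the desired output, here is a rough sketch. It uses SQLite (3.25 or later) via Python's sqlite3 module, an ordinary table named Temp instead of #Temp, and ORDER BY Extra inside ROW_NUMBER() so the numbering is deterministic. Those three choices are mine, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Temp (ID INTEGER, Descp TEXT, Extra TEXT)")
conn.executemany(
    "INSERT INTO Temp VALUES (?, ?, ?)",
    [(1, "One", "Extra1"), (2, "Two", "Extra2"), (3, "Three", "Extra3"), (1, "One", "Extra4")],
)

# COUNT(*) OVER the same partition tells every row how many rows share its key.
query = """
SELECT ID, Descp, Extra,
       ROW_NUMBER() OVER (PARTITION BY ID, Descp ORDER BY Extra) AS DuplicateRowNumber,
       CASE WHEN COUNT(*) OVER (PARTITION BY ID, Descp) > 1 THEN 1 ELSE 0 END AS IsMultiple
FROM Temp
ORDER BY ID, Extra
"""
flagged = conn.execute(query).fetchall()
for row in flagged:
    print(row)
```

Both ID = 1 rows get IsMultiple = 1 and the other two rows get 0, which matches the desired output in the question.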
Well, I have this solution, but using a Count...
SELECT T1.*,
ROW_NUMBER() OVER (PARTITION BY T1.ID, T1.Descp ORDER BY (SELECT 0)) AS DuplicateRowNumber,
CASE WHEN T2.C = 1 THEN 0 ELSE 1 END MultipleOcurrences FROM #temp T1
INNER JOIN
(SELECT ID, Descp, COUNT(1) C FROM #TEMP GROUP BY ID, Descp) T2
ON T1.ID = T2.ID AND T1.Descp = T2.Descp

SQL Server / T-SQL : How to update equal percentages of a resultset?

I need a way to take a resultset of KeyIDs and divide it up as equally as possible and update records differently for each division based on the KeyIDs. In other words, there is
SELECT KeyID
FROM TableA
WHERE (some criteria exists)
I want to update TableA 3 different ways by 3 equal portions of KeyIDs.
UPDATE TableA
SET FieldA = Value1
WHERE KeyID IN (the first 1/3 of the SELECT resultset above)
UPDATE TableA
SET FieldA = Value2
WHERE KeyID IN (the second 1/3 of the SELECT resultset above)
UPDATE TableA
SET FieldA = Value3
WHERE KeyID IN (the third 1/3 of the SELECT resultset above)
or something to that effect. Thanks for any and all of your responses.
With TiledItems As
(
Select KeyId
, NTILE(3) OVER( ORDER BY ... ) As NTileNum
From TableA
Where ...
)
Update TableA
Set FieldA = Case TI.NTileNum
When 1 Then Value1
When 2 Then Value2
When 3 Then Value3
End
From TableA As A
Join TiledItems As TI
On TI.KeyId = A.KeyId
Unfortunately I haven't got time to knock up a complete solution but the gist of one would be to use a CTE with the NTILE function http://msdn.microsoft.com/en-us/library/ms175126.aspx to divide into 3 groups then join onto that CTE in your UPDATE statement and do a CASE statement against the NTILE group to determine whether to use Value1, Value2, or Value3.
Edit
See Thomas's answer for the code for this, as it looks like he had the same idea!
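To show how NTILE splits rows when the count does not divide evenly, here is a small sketch with 7 made-up keys. It uses SQLite (3.25 or later) through Python's sqlite3 module, and it applies the update from Python in a second step rather than with SQL Server's UPDATE ... FROM syntax. The literal strings 'Value1' to 'Value3' stand in for whatever the real values are.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (KeyID INTEGER PRIMARY KEY, FieldA TEXT)")
conn.executemany("INSERT INTO TableA (KeyID) VALUES (?)", [(k,) for k in range(1, 8)])

# NTILE(3) numbers the rows 1..3 in order; leftover rows go to the earlier tiles.
tiles = conn.execute(
    "SELECT KeyID, NTILE(3) OVER (ORDER BY KeyID) AS NTileNum FROM TableA ORDER BY KeyID"
).fetchall()
print(tiles)  # [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 3), (7, 3)]

# Map each tile number to its value and update row by row.
value_for_tile = {1: "Value1", 2: "Value2", 3: "Value3"}
conn.executemany(
    "UPDATE TableA SET FieldA = ? WHERE KeyID = ?",
    [(value_for_tile[tile], key) for key, tile in tiles],
)
updated = conn.execute("SELECT KeyID, FieldA FROM TableA ORDER BY KeyID").fetchall()
```

With 7 rows the tiles come out as sizes 3, 2 and 2, which is about as even as the split can get.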
For a simple distribution, create a random ranking and modulo by 3...
UPDATE
A
SET
FieldA =
CASE Ranking % 3
WHEN 1 THEN B.Value1
WHEN 2 THEN B.Value2
WHEN 0 THEN B.Value3
END
FROM
TableA A
inner join
(SELECT
ID,
ROW_NUMBER() OVER (ORDER BY ID /*or something*/) AS Ranking,
Value1, Value2, Value3
FROM
TableA
) B on A.ID = B.ID
where (some criteria exists)
You can change the ORDER BY for the ROW_NUMBER(), or use NTILE and remove the modulo.
If the keys are evenly-distributed, then you could use the modulus (%) operator to select out unique thirds of the result set.
update TableA set FieldA = Value1 where KeyID % 3 = 0;
update TableA set FieldA = Value2 where KeyID % 3 = 1;
update TableA set FieldA = Value3 where KeyID % 3 = 2;
Interpreting what you say literally, you could number the rows in the returned row set and then select the different segments based on their row number.
E.g.
-- <yourTable> and <anyColumn> are placeholders; Total is the row count of the numbered set
UPDATE TableA
SET FieldA = Value1
WHERE KeyID IN (SELECT KeyID FROM (SELECT KeyID, ROW_NUMBER() OVER (ORDER BY <anyColumn>) AS RowNumber, COUNT(*) OVER () AS Total FROM <yourTable> ) base
WHERE RowNumber <= Total / 3)
UPDATE TableA
SET FieldA = Value2
WHERE KeyID IN (SELECT KeyID FROM (SELECT KeyID, ROW_NUMBER() OVER (ORDER BY <anyColumn>) AS RowNumber, COUNT(*) OVER () AS Total FROM <yourTable> ) base
WHERE RowNumber > Total / 3 AND RowNumber <= Total * 2 / 3)
UPDATE TableA
SET FieldA = Value3
WHERE KeyID IN (SELECT KeyID FROM (SELECT KeyID, ROW_NUMBER() OVER (ORDER BY <anyColumn>) AS RowNumber, COUNT(*) OVER () AS Total FROM <yourTable> ) base
WHERE RowNumber > Total * 2 / 3)
WITH Query (OtherKeyID, PCT)
AS
(
SELECT KeyID, (ROW_NUMBER() OVER (ORDER BY KeyID)) / foo.CNT AS PCT
FROM TableA
JOIN (SELECT CONVERT(float, COUNT(1)) AS CNT FROM TableA WHERE (criteria)) foo ON 1 = 1 -- count only the rows that match the same criteria
WHERE (criteria)
)
UPDATE TableA
SET FieldA = (CASE
WHEN PCT < .3333 THEN Value1
WHEN PCT BETWEEN .3333 and .6666 THEN Value2
WHEN PCT > .6666 THEN Value3 ELSE NULL END)
FROM Query
WHERE KeyID = OtherKeyID
Note that you can alter the ORDER BY clause in the query to any valid expression, which will allow you to define your "first third" by any criteria.