SQL - subtract value from same column

I have a table as follows
fab_id x y z m
12 14 10 3 5
12 10 10 3 4
Here I'm using a GROUP BY clause on the id. Now I want to subtract the column values of the rows that share the same id.
E.g. group by on id (12), then subtract: (14-10) x, (10-10) y, (3-3) z, (5-4) m.
I know there is an aggregate function SUM for addition, but is there any function I can use to subtract these values?
Or is there any other method to achieve the result?
Note: there is a chance that the result may be negative. Can the function handle this?
One more example (ordered by correction_date desc, so the most recent correction is shown first):
fab_id x y z m correction_date
14 20 12 4 4 2014-05-05 09:03
14 24 12 4 3 2014-05-05 08:05
14 26 12 4 6 2014-05-05 07:12
So the result to achieve with group by on id (14) is: (26-20) x, (12-12) y, (4-4) z, (6-4) m.

Now that you have explained how to handle more than two records and revealed that a time column is involved, here is a possible solution. The query selects the first and last record per fab_id and subtracts the values:
select
fab_info.fab_id,
earliest_fab.x - latest_fab.x,
earliest_fab.y - latest_fab.y,
earliest_fab.z - latest_fab.z,
earliest_fab.m - latest_fab.m
from
(
select
fab_id,
min(correction_date) as min_correction_date,
max(correction_date) as max_correction_date
from fab
group by fab_id
) as fab_info
inner join fab as earliest_fab on
earliest_fab.fab_id = fab_info.fab_id and
earliest_fab.correction_date = fab_info.min_correction_date
inner join fab as latest_fab on
latest_fab.fab_id = fab_info.fab_id and
latest_fab.correction_date = fab_info.max_correction_date;
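As a rough check of the first-minus-last idea, here is a minimal sketch that runs the query against the three fab_id 14 rows from the question. It assumes an in-memory SQLite database driven from Python (the question does not say which engine is used), that correction_date is stored as sortable text, and that the joins compare each row's correction_date with the per-group minimum and maximum.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fab (fab_id INT, x INT, y INT, z INT, m INT, correction_date TEXT)"
)
conn.executemany(
    "INSERT INTO fab VALUES (?, ?, ?, ?, ?, ?)",
    [
        (14, 20, 12, 4, 4, "2014-05-05 09:03"),
        (14, 24, 12, 4, 3, "2014-05-05 08:05"),
        (14, 26, 12, 4, 6, "2014-05-05 07:12"),
    ],
)

FIRST_MINUS_LAST = """
select fab_info.fab_id,
       earliest_fab.x - latest_fab.x,
       earliest_fab.y - latest_fab.y,
       earliest_fab.z - latest_fab.z,
       earliest_fab.m - latest_fab.m
from (
    select fab_id,
           min(correction_date) as min_correction_date,
           max(correction_date) as max_correction_date
    from fab
    group by fab_id
) as fab_info
inner join fab as earliest_fab
    on earliest_fab.fab_id = fab_info.fab_id
   and earliest_fab.correction_date = fab_info.min_correction_date
inner join fab as latest_fab
    on latest_fab.fab_id = fab_info.fab_id
   and latest_fab.correction_date = fab_info.max_correction_date
"""

first_minus_last = conn.execute(FIRST_MINUS_LAST).fetchall()
print(first_minus_last)  # [(14, 6, 0, 0, 2)]
```

The middle row (08:05) is ignored, which matches the (26-20), (12-12), (4-4), (6-4) result asked for in the question.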

Provided you always want to subtract the least value from the greatest value:
select
fab_id,
max(x) - min(x),
max(y) - min(y),
max(z) - min(z),
max(m) - min(m)
from fab
group by fab_id;

Seeing as you say there will always be two rows, you can simply do a 'self join' and subtract the values from each other:
SELECT t1.fab_id, t1.x - t2.x as diffx, t1.y - t2.y as diffy, <remainder columns here>
from <table> t1
-- t1 is the earlier row, so each difference is earlier minus later, as in the question
inner join <table> t2 on t1.fab_id = t2.fab_id and t1.correction_date < t2.correction_date
If you have more than two rows, you'll need subqueries or window ranking functions to find the largest and smallest correction_date for each fab_id; you can then do the same as above by joining those two subqueries together instead of joining the table to itself.

Unfortunately, it's SQL Server 2012 that has the handy FIRST_VALUE()/LAST_VALUE() OLAP functions, so in the case of more than 2 rows we have to do something a little different:
SELECT fab_id, SUM(CASE WHEN latest = 1 THEN -x ELSE x END) AS x,
SUM(CASE WHEN latest = 1 THEN -y ELSE y END) AS y,
SUM(CASE WHEN latest = 1 THEN -z ELSE z END) AS z,
SUM(CASE WHEN latest = 1 THEN -m ELSE m END) AS m
FROM (SELECT fab_id, x, y, z, m,
ROW_NUMBER() OVER(PARTITION BY fab_id
ORDER BY correction_date ASC) AS earliest,
ROW_NUMBER() OVER(PARTITION BY fab_id
ORDER BY correction_date DESC) AS latest
FROM myTable) fab
WHERE earliest = 1
OR latest = 1
GROUP BY fab_id
HAVING COUNT(*) >= 2
(and working fiddle. Thanks to #AK47 for the initial setup.)
Which yields the expected:
FAB_ID X Y Z M
12 4 0 0 1
14 6 0 0 2
Note that HAVING COUNT(*) >= 2 ensures that only fab_ids with at least two rows are considered (a fab_id with a single row would otherwise be reported with its values negated).

;with Ordered as
(
select
fab_id,x,y,z,m,correction_date,
row_Number() over (partition by fab_id order by correction_date desc) as Latest,
row_Number() over (partition by fab_id order by correction_date) as Oldest
from fab
)
select
O1.fab_id,
O1.x-O2.x,
O1.y-O2.y,
O1.z-O2.z,
O1.m-O2.m
from Ordered O1
join Ordered O2 on
O1.fab_id = O2.fab_id
where O1.oldest = 1 and O2.latest = 1

I think if you have a consistent set of two rows, then the following code should work for you.
select fab_id ,max(x) - min(x) as x
,max(y) - min(y) as y
,max(z) - min(z) as z
,max(m) - min(m) as m
from Mytable
group by fab_id
It will work even if you get more than 2 rows in a group, but the subtraction will always be the max value minus the min value. Hope it helps.
EDIT : SQL Fiddle DEMO

A CTE could help:
WITH cte AS (
SELECT
-- Get the row numbers per fab_id ordered by the correction date
ROW_NUMBER() OVER (PARTITION BY fab_id ORDER BY correction_date ASC) AS rid
, fab_id, x, y, z, m
FROM
YourTable
)
SELECT
fab_id
-- If the row number is 1 then, this is our base value
-- If the row number is not 1 then, we want to subtract it (or add the negative value)
, SUM(CASE WHEN rid = 1 THEN x ELSE x * -1 END) AS x
, SUM(CASE WHEN rid = 1 THEN y ELSE y * -1 END) AS y
, SUM(CASE WHEN rid = 1 THEN z ELSE z * -1 END) AS z
, SUM(CASE WHEN rid = 1 THEN m ELSE m * -1 END) AS m
FROM
cte
GROUP BY
fab_id
Remember, 40-10-20 equals 40 + (-10) + (-20)
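To see what this CTE produces when a group has more than two rows, here is a small sketch run on the fab_id 14 sample from the question. It assumes an in-memory SQLite database (version 3.25 or later, for window functions) driven from Python; YourTable is the placeholder name used above.

```python
import sqlite3

# Assumption: in-memory SQLite (3.25+) stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (fab_id INT, x INT, y INT, z INT, m INT, correction_date TEXT)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?, ?, ?)",
    [
        (14, 20, 12, 4, 4, "2014-05-05 09:03"),
        (14, 24, 12, 4, 3, "2014-05-05 08:05"),
        (14, 26, 12, 4, 6, "2014-05-05 07:12"),
    ],
)

BASE_MINUS_REST = """
WITH cte AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY fab_id ORDER BY correction_date ASC) AS rid,
           fab_id, x, y, z, m
    FROM YourTable
)
SELECT fab_id,
       SUM(CASE WHEN rid = 1 THEN x ELSE x * -1 END) AS x,
       SUM(CASE WHEN rid = 1 THEN y ELSE y * -1 END) AS y,
       SUM(CASE WHEN rid = 1 THEN z ELSE z * -1 END) AS z,
       SUM(CASE WHEN rid = 1 THEN m ELSE m * -1 END) AS m
FROM cte
GROUP BY fab_id
"""

base_minus_rest = conn.execute(BASE_MINUS_REST).fetchall()
print(base_minus_rest)  # [(14, -18, -12, -4, -1)]
```

Here x is 26 - 24 - 20 = -18, because every later row is subtracted from the earliest one. With exactly two rows per fab_id this equals earliest minus latest; with three rows it differs from the (26-20) result given in the question.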

Related

How to compare a number with count result then use it in limit statement in redshift/sql

I have a table with two columns id and flag.
The data is very imbalanced. Only a few flag has value 1 and others are 0.
id flag
1 0
2 0
3 0
4 0
5 1
6 1
7 0
Now I want to create a balanced table. Therefore, I want to get a subset of the flag = 0 records whose size is based on the number of records where flag = 1. Also, I don't want that number to be greater than 1000.
I am thinking about a code like this:
select *
from table
where flag = 0
order by random()
limit (least(1000,
select count(*)
from table
where flag = 1));
Expected result (only two records have flag = 1, so I get two records with flag = 0; if more than 1000 records have flag = 1, I will only get 1000):
id flag
2 0
7 0
If you want a balanced sample:
select t.*
from (select t.*, row_number() over (partition by flag order by flag) as seqnum,
sum(case when flag = 1 then 1 else 0 end) over () as cnt_1
from t
) t
where seqnum <= cnt_1;
You can change this to:
where seqnum <= least(cnt_1, 1000)
if you want an overall maximum.
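The balanced-sample query can be tried on the seven rows from the question. This is only a sketch under a few assumptions: it runs on in-memory SQLite (3.25+) from Python rather than on Redshift, so SQLite's two-argument min() stands in for least(), and the window is ordered by random() (as the question asked for a random subset) instead of by flag.

```python
import sqlite3

# Assumption: in-memory SQLite (3.25+) stands in for Redshift.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INT, flag INT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 0), (2, 0), (3, 0), (4, 0), (5, 1), (6, 1), (7, 0)],
)

BALANCED_SAMPLE = """
select id, flag
from (select t.*,
             -- number the rows of each flag value in random order
             row_number() over (partition by flag order by random()) as seqnum,
             -- total number of flag = 1 rows, repeated on every row
             sum(case when flag = 1 then 1 else 0 end) over () as cnt_1
      from t
     ) as s
where seqnum <= min(cnt_1, 1000)  -- SQLite's scalar min() plays the role of least()
"""

sample = conn.execute(BALANCED_SAMPLE).fetchall()
ones = sorted(row for row in sample if row[1] == 1)
zeros = [row for row in sample if row[1] == 0]
print(ones, len(zeros))  # both flag = 1 rows, plus two randomly chosen flag = 0 rows
```

Which two flag = 0 rows come back varies between runs, since the ordering is random.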
You can use row_number to simulate LIMIT.
select * from (
select column1, column2, row_number() OVER() AS rownum
from table
where flag = 0 ) t
where rownum <= 1000
If I’ve made a bad assumption please comment and I’ll refocus my answer.

Select rows until condition met

I would like to write an Oracle query which returns a specific set of information. Using the table below, if given an id, it will return the id and value of B. Also, if B=T, it will return the next row as well. If that next row has B=T, it will return the row after that too, and so on until an F is encountered.
So, given 3 it would just return one row: (3,F). Given 4 it would return 3 rows: ((4,T),(5,T),(6,F))
id B
1 F
2 F
3 F
4 T
5 T
6 F
7 T
8 F
Thank you in advance!
Use a sub-query to find out at what point you should stop, then return all rows from your starting point to the calculated stop point.
SELECT
*
FROM
yourTable
WHERE
id >= 4
AND id <= (SELECT MIN(id) FROM yourTable WHERE b = 'F' AND id >= 4)
Note, this assumes that the last record is always an 'F'. You can deal with the last record being a 'T' using a COALESCE.
SELECT
*
FROM
yourTable
WHERE
id >= 4
AND id <= COALESCE(
(SELECT MIN(id) FROM yourTable WHERE b = 'F' AND id >= 4),
(SELECT MAX(id) FROM yourTable )
)
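A quick way to check the stop-point logic is to run it on the eight-row table from the question. The sketch below assumes an in-memory SQLite database driven from Python instead of Oracle; the helper name rows_until_f is made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Oracle; the query uses only portable SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id INT, b TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [(1, "F"), (2, "F"), (3, "F"), (4, "T"), (5, "T"), (6, "F"), (7, "T"), (8, "F")],
)

def rows_until_f(conn, start_id):
    """Return rows from start_id up to and including the next 'F' row.

    Hypothetical helper: falls back to the last row of the table when
    no 'F' exists at or after start_id.
    """
    query = """
        SELECT id, b
        FROM yourTable
        WHERE id >= :start
          AND id <= COALESCE(
                (SELECT MIN(id) FROM yourTable WHERE b = 'F' AND id >= :start),
                (SELECT MAX(id) FROM yourTable)
              )
        ORDER BY id
    """
    return conn.execute(query, {"start": start_id}).fetchall()

print(rows_until_f(conn, 3))  # [(3, 'F')]
print(rows_until_f(conn, 4))  # [(4, 'T'), (5, 'T'), (6, 'F')]
```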

how to calculate count(*) in various percentiles

Say, I have a table holding integer values from 0 up to 9,999 and I want to make a distribution plot of the population of values in each percentile.
Below is what comes to mind. Is there a better way?
CREATE TABLE A(x INTEGER);
SELECT
(SELECT COUNT(*) FROM A WHERE x>=0 AND x<10) AS prcntl_01,
(SELECT COUNT(*) FROM A WHERE x>=10 AND x<20) AS prcntl_02,
(SELECT COUNT(*) FROM A WHERE x>=20 AND x<30) AS prcntl_03,
(SELECT COUNT(*) FROM A WHERE x>=30 AND x<40) AS prcntl_04,
(SELECT COUNT(*) FROM A WHERE x>=40 AND x<50) AS prcntl_05,
...
(SELECT COUNT(*) FROM A WHERE x>=990 AND x<1000) AS prcntl_100;
The size of the SQL statement is not a consideration as I can generate it on the fly. I am just wondering if there is an idiomatic way to get population counts in each percentile.
Use conditional aggregation instead of multiple queries:
SELECT sum(case when x >= 0 AND x < 10 then 1 else 0 end) as prcntl_01,
sum(case when x >= 10 AND x < 20 then 1 else 0 end) as prcntl_02,
. . .
sum(case when x >= 990 AND x < 1000 then 1 else 0 end) as prcntl_100
FROM A;
If you want the values in separate rows rather than columns, you can simply do:
select n as which,
sum(case when x >= (n - 1)*10 and x < n*10 then 1 else 0 end) as percentile
from A cross join
generate_series(1, 100) as n
group by n;
This limits the amount of code you have to write.
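The one-row-per-bucket version can be sketched on an engine without generate_series by building the numbers 1 to 100 with a recursive CTE. The example below assumes in-memory SQLite driven from Python, five made-up sample values, and half-open buckets [(n - 1)*10, n*10), so that 9 falls in bucket 1 and 10 in bucket 2.

```python
import sqlite3

# Assumption: in-memory SQLite; a recursive CTE replaces generate_series(1, 100).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (x INTEGER)")
conn.executemany("INSERT INTO A VALUES (?)", [(0,), (9,), (10,), (19,), (995,)])

BUCKET_COUNTS = """
WITH RECURSIVE series(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM series WHERE n < 100
)
SELECT n AS which,
       SUM(CASE WHEN x >= (n - 1) * 10 AND x < n * 10 THEN 1 ELSE 0 END) AS percentile
FROM series CROSS JOIN A
GROUP BY n
ORDER BY n
"""

bucket_counts = dict(conn.execute(BUCKET_COUNTS).fetchall())
print(bucket_counts[1], bucket_counts[2], bucket_counts[100])  # 2 2 1
```

Empty buckets still appear with a count of 0, because every bucket number is paired with every row of A before counting.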

T-Sql: turn multiple rows into one row

How does one turn these multiple rows into one row? N and Y are bool values.
Id IsPnt IsPms IsPdt
1 N Y N
1 N Y N
1 Y N N
into this
Id IsPnt IsPms IsPdt
1 Y Y N
Edit:
The query that produces the resultset looks like this
select b.id,
CASE mpft.PlanIndCd WHEN 'PBMN' THEN 1 ELSE 0 END AS IsPnt,
CASE mpft.PlanIndCd WHEN 'PBMT' THEN 1 ELSE 0 END AS IsPbt,
CASE mpft.PlanIndCd WHEN 'PBMS' THEN 1 ELSE 0 END AS IsPms
from vw_D_SomveViewName pb
-- bunch of joins
where mpft.PlanIndCd in ('HANR', 'PBMN','PBMT','PBMS','HAWR')
You can simply use MAX() on this if the values are really Y and N only.
SELECT ID, MAX(IsPnt) IsPnt, MAX(IsPms) IsPms, MAX(IsPdt) IsPdt
FROM tableName
GROUP BY ID
UPDATE 1
SELECT b.id,
MAX(CASE mpft.PlanIndCd WHEN 'PBMN' THEN 1 ELSE 0 END) AS IsPnt,
MAX(CASE mpft.PlanIndCd WHEN 'PBMT' THEN 1 ELSE 0 END) AS IsPbt,
MAX(CASE mpft.PlanIndCd WHEN 'PBMS' THEN 1 ELSE 0 END) AS IsPms
FROM vw_D_SomveViewName pb
-- bunch of joins
WHERE mpft.PlanIndCd in ('HANR', 'PBMN','PBMT','PBMS','HAWR')
GROUP BY b.ID
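The reason MAX() collapses the rows correctly is that 'Y' sorts after 'N' (just as 1 is greater than 0), so any 'Y' in a group wins. A minimal sketch on the three sample rows from the question, assuming in-memory SQLite driven from Python and the placeholder table name tableName:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server; flags are stored as 'Y'/'N' text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (Id INT, IsPnt TEXT, IsPms TEXT, IsPdt TEXT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?, ?, ?)",
    [(1, "N", "Y", "N"), (1, "N", "Y", "N"), (1, "Y", "N", "N")],
)

collapsed = conn.execute(
    """
    SELECT Id, MAX(IsPnt) AS IsPnt, MAX(IsPms) AS IsPms, MAX(IsPdt) AS IsPdt
    FROM tableName
    GROUP BY Id
    """
).fetchall()
print(collapsed)  # [(1, 'Y', 'Y', 'N')]
```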
Will this work?
select
id,
max(IsPnt),
max(IsPms),
max(IsPdt)
from
table
GROUP BY
id
After the edit of your question, you can simply use the PIVOT table operator directly instead of using the MAX expression, something like:
SELECT
Id,
PBMN AS IsPnt,
PBMT AS IsPbt,
PBMS AS IsPms
FROM
(
SELECT
id,
mpft.PlanIndCd,
ROW_NUMBER() OVER(PARTITION BY id
ORDER BY ( SELECT 1)) AS RN
from vw_D_SomveViewName pb
-- bunch of joins
where mpft.PlanIndCd in ('HANR', 'PBMN','PBMT','PBMS','HAWR')
) AS t
PIVOT
(
MAX(RN)
FOR PlanIndCd IN ([PBMN], [PBMT], [PBMS])
) AS p;
You can see it in action in the following demo example:
Demo on SQL Fiddle
select Id, MAX(IsPnt), MAX(IsPms), MAX(IsPdt)
from table etc
group by Id

Group and update multiple records sql based on another table

I am stuck trying to figure out a way to execute an update over an entire table on field 'Factor'.
'Factor' is determined by the number of repeated records with the same 'location' and 'date'.
There is a factor calculation table:
Location - Count - Factor
X - 1 - 1.0
X - 2 - 0.8
X - 3+ - 0.5
Please help me out!
Try this:
UPDATE A
SET Factor = CASE WHEN B.N = 1 THEN 1.0
WHEN B.N = 2 THEN 0.8
WHEN B.N >= 3 THEN 0.5 END
FROM YourTable A
INNER JOIN (SELECT Location, COUNT(*) N
FROM YourTable
GROUP BY Location) B
ON A.Location = B.Location
Another way:
;WITH CTE AS
(
SELECT *, COUNT(*) OVER(PARTITION BY Location) N
FROM YourTable
)
UPDATE CTE
SET Factor = CASE WHEN N = 1 THEN 1.0
WHEN N = 2 THEN 0.8
WHEN N >= 3 THEN 0.5 END
Use derived table with OVER clause
UPDATE x
SET x.Factor = x.NewFactor
FROM (
SELECT Factor,
CASE COUNT(*) OVER (PARTITION BY Location, [date])
WHEN 1 THEN 1.0
WHEN 2 THEN 0.8
ELSE 0.5 END AS NewFactor
FROM dbo.test30
) x
Demo on SQLFiddle
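For engines that cannot update through a derived table, the same factor rule can be sketched with a correlated subquery instead. This swaps in a different technique from the answers above. The example assumes in-memory SQLite driven from Python, and the table name visits, the column name visit_date and the sample rows are all made up for illustration.

```python
import sqlite3

# Assumption: hypothetical table "visits"; a correlated subquery replaces
# the T-SQL updatable derived table / CTE used in the answers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (Location TEXT, visit_date TEXT, Factor REAL)")
conn.executemany(
    "INSERT INTO visits (Location, visit_date) VALUES (?, ?)",
    [
        ("X", "2020-01-01"),                                              # 1 record
        ("Y", "2020-01-01"), ("Y", "2020-01-01"),                        # 2 records
        ("Z", "2020-01-02"), ("Z", "2020-01-02"), ("Z", "2020-01-02"),  # 3 records
    ],
)

# For each row, count the rows sharing its location and date, then map the count to a factor.
conn.execute(
    """
    UPDATE visits
    SET Factor = CASE (SELECT COUNT(*)
                       FROM visits AS v2
                       WHERE v2.Location = visits.Location
                         AND v2.visit_date = visits.visit_date)
                     WHEN 1 THEN 1.0
                     WHEN 2 THEN 0.8
                     ELSE 0.5
                 END
    """
)

factors = conn.execute(
    "SELECT DISTINCT Location, Factor FROM visits ORDER BY Location"
).fetchall()
print(factors)  # [('X', 1.0), ('Y', 0.8), ('Z', 0.5)]
```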