How to subtract two columns in different table - sql

I have a table of ward
ward_number | class | capacity
________________________________________
1 | A1 | 1
2 | A1 | 2
3 | B1 | 3
4 | C | 4
5 | B2 | 5
capacity = how many beds there are in the ward
I also have a table called ward_stay:
ward_number | from_date | to_date
_____________________________________________
2 | 2015-01-01 | 2015-03-08
3 | 2015-01-16 | 2015-02-18
6 | 2015-03-05 | 2015-03-18
3 | 2015-04-15 | 2015-04-20
1 | 2015-05-19 | 2015-05-30
I want to count the number of beds available in ward with class 'B1' on date '2015-04-15':
ward_number | count
_____________________
3 | 2
How to get the count is basically capacity - the number of times ward_number 3 appears
I managed to get the number of times ward_number 3 appears but I don't know how to subtract capacity from this result.
Here's my code:
select count(ward_number) AS 'result'
from ward_stay
where ward_number = (select ward_number
from ward
where class = 'B1');
How do I subtract capacity from this result?

SQL Fiddle Demo
Using 2015-01-17 instead, I calculate the number of occupied beds on that day, then join back to subtract it from the original capacity. If all beds are free, the LEFT JOIN returns NULL, so COALESCE substitutes 0.
SELECT w."ward_number", "capacity" - COALESCE(occupied, 0) as "count"
FROM ward w
LEFT JOIN (
SELECT "ward_number", COUNT(*) occupied
FROM ward_stay
WHERE to_date('2015-01-17', 'yyyy-mm-dd') BETWEEN "from_date" and "to_date"
GROUP BY "ward_number"
) o
ON w."ward_number" = o."ward_number"
WHERE w."class" = 'B1'
OUTPUT
| ward_number | count |
|-------------|-------|
| 3 | 2 |
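The same LEFT JOIN plus COALESCE idea can be tried end to end with a small sketch. This one assumes SQLite through Python's sqlite3 module rather than the database used in the fiddle, stores dates as ISO strings so that BETWEEN compares them correctly, and uses a hypothetical helper name, available_beds:

```python
import sqlite3

def available_beds(ward_class, day):
    """Return (ward_number, free_beds) rows for one class on one ISO date."""
    con = sqlite3.connect(":memory:")
    # Sample rows copied from the question; dates kept as ISO text.
    con.executescript("""
        CREATE TABLE ward (ward_number INTEGER, class TEXT, capacity INTEGER);
        CREATE TABLE ward_stay (ward_number INTEGER, from_date TEXT, to_date TEXT);
        INSERT INTO ward VALUES (1,'A1',1),(2,'A1',2),(3,'B1',3),(4,'C',4),(5,'B2',5);
        INSERT INTO ward_stay VALUES
            (2,'2015-01-01','2015-03-08'),
            (3,'2015-01-16','2015-02-18'),
            (6,'2015-03-05','2015-03-18'),
            (3,'2015-04-15','2015-04-20'),
            (1,'2015-05-19','2015-05-30');
    """)
    rows = con.execute("""
        SELECT w.ward_number, w.capacity - COALESCE(o.occupied, 0) AS free_beds
        FROM ward w
        LEFT JOIN (
            -- stays that cover the requested day, counted per ward
            SELECT ward_number, COUNT(*) AS occupied
            FROM ward_stay
            WHERE ? BETWEEN from_date AND to_date
            GROUP BY ward_number
        ) o ON o.ward_number = w.ward_number
        WHERE w.class = ?
        ORDER BY w.ward_number
    """, (day, ward_class)).fetchall()
    con.close()
    return rows
```

Calling available_beds('B1', '2015-04-15') returns one row for ward 3 with 2 free beds, which is the result asked for in the question.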

select w.ward_number,
w.capacity - count(ws.ward_number) AS "result"
from ward as w left join ward_stay as ws
on ws.ward_number = w.ward_number
and date '2015-05-19' between ws.from_date and ws.to_date
where w.class = 'B1' -- which class
-- bed not occupied on that date
group by w.ward_number, w.capacity
having w.capacity - count(ws.ward_number) > 0 -- only available wards
See fiddle

You need to aggregate both tables before joining them, because each ward class can have multiple rows in both. So:
select c.class, (c.capacity - coalesce(wc.occupied, 0)) as available
from (select class, sum(capacity) as capacity
from ward
group by class
) c left join
(select w.class, count(*) as occupied
from ward_stay ws join
ward w
on ws.ward_number = w.ward_number and
'2015-05-19' between ws.from_date and ws.to_date
group by w.class
) wc
on c.class = wc.class;
Note: this is standard SQL except for the date constant. This works in most databases; some might have other formats (or it might depend on internationalization settings).
Strictly speaking the aggregation on ward is not necessary for "B1". But it is clearly necessary for "A1".
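To see why the per-class aggregation matters for "A1", here is a minimal sketch under the same assumptions as before: SQLite via Python's sqlite3 module, ISO date strings, and a hypothetical function name, free_beds_by_class. It sums capacity per class first and only then subtracts the occupied count:

```python
import sqlite3

def free_beds_by_class(day):
    """Return (class, available) rows for every ward class on one ISO date."""
    con = sqlite3.connect(":memory:")
    # Sample rows copied from the question.
    con.executescript("""
        CREATE TABLE ward (ward_number INTEGER, class TEXT, capacity INTEGER);
        CREATE TABLE ward_stay (ward_number INTEGER, from_date TEXT, to_date TEXT);
        INSERT INTO ward VALUES (1,'A1',1),(2,'A1',2),(3,'B1',3),(4,'C',4),(5,'B2',5);
        INSERT INTO ward_stay VALUES
            (2,'2015-01-01','2015-03-08'),
            (3,'2015-01-16','2015-02-18'),
            (6,'2015-03-05','2015-03-18'),
            (3,'2015-04-15','2015-04-20'),
            (1,'2015-05-19','2015-05-30');
    """)
    rows = con.execute("""
        SELECT c.class, c.capacity - COALESCE(wc.occupied, 0) AS available
        FROM (SELECT class, SUM(capacity) AS capacity
              FROM ward GROUP BY class) c
        LEFT JOIN (
            -- occupied beds per class on the requested day
            SELECT w.class, COUNT(*) AS occupied
            FROM ward_stay ws
            JOIN ward w ON ws.ward_number = w.ward_number
            WHERE ? BETWEEN ws.from_date AND ws.to_date
            GROUP BY w.class
        ) wc ON wc.class = c.class
        ORDER BY c.class
    """, (day,)).fetchall()
    con.close()
    return rows
```

On 2015-05-19 only ward 1 is occupied, so class A1 (wards 1 and 2, three beds in total) reports 2 available beds.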

Related

Sum of two tables using SQL

I'm trying to get the sum of two columns, but it seems to be adding incorrectly. I have a table Tbl_Booths and another table called Tbl_Extras.
In the Tbl_Booths:
BoothId | ExhId | BoothPrice
1 | 1 | 400
2 | 1 | 500
3 | 2 | 400
4 | 3 | 600
So totalBoothPrice for ExhId = 1 is 900
Tbl_Extras:
ExtraId | ExhId | Item | ItemCost
1 | 1 | PowerSupply | 400
2 | 2 | PowerSupply | 400
3 | 1 | Lights | 600
4 | 3 | PowerSupply | 400
5 | 4 | Lights | 400
So totalItemCost for ExhId = 1 is 1000
I need to find a way to get the sum of totalBoothPrice + totalItemCost
The value should of course be 900 + 1000 = 1900
I'm a total beginner to SQL so please have patience :-)
Thank you in advance for any input you can give me, since I'm going mad here!
It is used in a Caspio database system.
You can use union all to combine the two tables and then aggregate:
select exhid, sum(price)
from ((select exhid, boothprice as price
from tbl_booths
) union all
(select exhid, itemcost as price
from tbl_extras
)
) e
group by exhid;
This returns the sum for all exhid values. If you want to filter them, then you can use a where clause in either the outer query or both subqueries.
Here is a db<>fiddle.
Booth totals:
select exhid, sum(boothprice) as total_booth_price
from tbl_booths
group by exhid;
Extra totals:
select exhid, sum(itemcost) as total_item_cost
from tbl_extras
group by exhid;
Joined:
select
exhid,
b.total_booth_price,
e.total_item_cost,
b.total_booth_price + e.total_item_cost as total
from
(
select exhid, sum(boothprice) as total_booth_price
from tbl_booths
group by exhid
) b
join
(
select exhid, sum(itemcost) as total_item_cost
from tbl_extras
group by exhid
) e using (exhid)
order by exhid;
This only shows exhids that have both booth and extras, though. If one can be missing use a left outer join. If one or the other can be missing, you'd want a full outer join, which MySQL doesn't support.
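The difference between the two approaches shows up on the sample data: ExhId 4 has extras but no booth, so an inner join drops it while UNION ALL keeps it. A minimal sketch of the UNION ALL variant, assuming SQLite via Python's sqlite3 module (not Caspio) and a hypothetical function name, totals_per_exhibitor:

```python
import sqlite3

def totals_per_exhibitor():
    """Return (exhid, booth prices + item costs) for every exhid in either table."""
    con = sqlite3.connect(":memory:")
    # Sample rows copied from the question.
    con.executescript("""
        CREATE TABLE tbl_booths (boothid INTEGER, exhid INTEGER, boothprice INTEGER);
        CREATE TABLE tbl_extras (extraid INTEGER, exhid INTEGER, item TEXT, itemcost INTEGER);
        INSERT INTO tbl_booths VALUES (1,1,400),(2,1,500),(3,2,400),(4,3,600);
        INSERT INTO tbl_extras VALUES
            (1,1,'PowerSupply',400),(2,2,'PowerSupply',400),
            (3,1,'Lights',600),(4,3,'PowerSupply',400),(5,4,'Lights',400);
    """)
    rows = con.execute("""
        SELECT exhid, SUM(price) AS total
        FROM (
            SELECT exhid, boothprice AS price FROM tbl_booths
            UNION ALL  -- keep duplicates: equal prices must both be counted
            SELECT exhid, itemcost AS price FROM tbl_extras
        ) e
        GROUP BY exhid
        ORDER BY exhid
    """).fetchall()
    con.close()
    return rows
```

For ExhId 1 this gives 1900, as expected in the question, and ExhId 4 still appears with its 400 of extras.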

Find uncovered periods without exploding each combination

I have the following two tables
People
+--------+---------------+-------------+
| Name | ContractStart | ContractEnd |
+--------+---------------+-------------+
| Kate | 20180101 | 20181231 |
| Sawyer | 20180101 | 20181231 |
| Ben | 20170601 | 20181231 |
+--------+---------------+-------------+
Shifts
+---------+--------+------------+----------+
| Station | Name | ShiftStart | ShiftEnd |
+---------+--------+------------+----------+
| Swan | Kate | 20180101 | 20180131 |
| Arrow | Kate | 20180301 | 20180331 |
| Arrow | Kate | 20180401 | 20181231 |
| Flame | Sawyer | 20180101 | 20181231 |
| Swan | Ben | 20180101 | 20181231 |
+---------+--------+------------+----------+
It means that, for example, Kate will be available from 20180101 to 20181231. In this period of time she will work at station Swan from 20180101 to 20180131, at station Arrow from 20180301 to 20180331 and from 20180401 to 20181231.
My goal is to come to the following table
+------+---------------+-------------+
| | VacationStart | VacationEnd |
+------+---------------+-------------+
| Kate | 20180201 | 20180228 |
| Ben | 20170601 | 20171231 |
+------+---------------+-------------+
that means that Kate will be free from 20180201 to 20180228.
My first idea was to create a table with every day of the 2017 and 2018, let's say a CalTable, then JOIN the table with People to find every day that every person should be available. At this point JOIN again the resulting table with Shifts to have evidence of the days NOT BETWEEN ShiftStart AND ShiftEnd.
These steps give me correct results but are very slow, considering that I have almost 1,000,000 people and there are usually 10-20 years between ContractStart and ContractEnd.
What could be a correct approach to get the results in a more clever and fast way?
Thanks.
This is the data of the example on db<>Fiddle
For @A_Name_Does_Not_Matter, this is my attempt:
CREATE TABLE #CalTable([ID] VARCHAR(8) NOT NULL)
DECLARE @num int
SET @num = 20170101
WHILE (@num <= 20181231)
BEGIN
INSERT INTO #CalTable([ID])
SELECT @num AS [ID]
SET @num = @num + 1
END
SELECT X.[Name], X.[TIMEID]
FROM (
-- All day availables
SELECT DISTINCT A.[Name],B.[ID] AS [TIMEID]
FROM #People A INNER JOIN #CalTable B
ON B.[ID] BETWEEN A.[ContractStart] AND A.[ContractEnd]
) X
LEFT JOIN (
-- Working day
SELECT DISTINCT A.[Name],B.[ID] AS [TIMEID]
FROM #People A INNER JOIN #CalTable B
ON B.[ID] BETWEEN A.[ContractStart] AND A.[ContractEnd]
INNER JOIN #Shifts C ON A.[Name]=C.[Name] AND B.[ID] BETWEEN C.[ShiftStart] AND C.[ShiftEnd]
) Z
ON X.[Name]=Z.[Name] AND X.[TIMEID]=Z.[TIMEID]
WHERE Z.[Name] IS NULL
ORDER BY X.[Name],X.[TIMEID]
and then aggregate the dates with this query.
A person's contract start date could be the start of a vacation, and you can find the end of that vacation by finding the date of their first shift (minus 1 day), using CROSS APPLY to get the TOP 1 shift ordered by date.
In the unusual situation that they have no shifts, their vacation ends on their contract end date.
Later vacations then start the day after a shift and end the day before the next shift (which can be found with OUTER APPLY), defaulting to the contract end date if there is no further shift.
SELECT p.name, p.contractStart vacationstart, p.ContractEnd vacationend from people p WHERE not exists(select 1 from shifts s where p.name = s.name)
UNION
SELECT p2.name,
p2.contractStart vacationstart,
dateadd(day,-1,DQ.ShiftStart) as vacationend
from PEOPLE P2
CROSS APPLY
(SELECT TOP 1 s2.ShiftStart FROM shifts s2 WHERE p2.name = s2.name order by s2.ShiftStart) DQ
WHERE DQ.ShiftStart > p2.contractstart
UNION
select P3.NAME,
dateadd(day,1,s3.ShiftEnd) vacationstart,
COALESCE(dateadd(day,-1, DQ2.shiftStart),P3.ContractEnd) --you might have to add handling yourself for removing a case where they work on their contract end date
FROM people p3 JOIN shifts s3 on p3.name = s3.name
OUTER APPLY (SELECT TOP 1 s4.shiftStart
from shifts s4
where s4.name = p3.name
and
s4.shiftstart > s3.shiftstart
order by s4.shiftstart) DQ2
it's hard for me to verify without test data.
For an employee, what I seek is.
Contract Start, Shift1Start - 1
Shift1End + 1, Shift2Start - 1
Shift2End + 1, Shift3Start - 1
Shift3End + 1, ContractEnd
then add the case with 'no shifts'
Finally, shifts may be contiguous, leading to vacations with a duration of zero or less. You could filter these out by making the query a subquery and keeping only rows where the vacation end is not before the vacation start.
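The pairing described above (contract start to first shift, each shift end to the next shift start, last shift end to contract end) can be sketched outside SQL to check the logic. This is plain Python rather than T-SQL, it assumes the integer dates have been converted to real dates, and uncovered_periods is a hypothetical helper name:

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

def uncovered_periods(contract_start, contract_end, shifts):
    """Return (start, end) gaps inside a contract that no shift covers.

    shifts is a list of (shift_start, shift_end) date pairs; all bounds
    are inclusive. Contiguous or overlapping shifts produce no gap.
    """
    gaps = []
    cursor = contract_start  # first day not yet known to be covered
    for start, end in sorted(shifts):
        if start > cursor:
            # days from cursor up to the day before this shift are free
            gaps.append((cursor, min(start - ONE_DAY, contract_end)))
        cursor = max(cursor, end + ONE_DAY)
        if cursor > contract_end:
            break
    if cursor <= contract_end:
        # tail of the contract after the last shift
        gaps.append((cursor, contract_end))
    return gaps

# Kate's rows from the question.
kate = uncovered_periods(
    date(2018, 1, 1), date(2018, 12, 31),
    [(date(2018, 1, 1), date(2018, 1, 31)),
     (date(2018, 3, 1), date(2018, 3, 31)),
     (date(2018, 4, 1), date(2018, 12, 31))],
)
```

Because each person's shifts are visited once in sorted order, the work grows with the number of shifts rather than with the number of calendar days, which is the point of avoiding the exploded calendar table.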

SQL union / join / intersect multiple select statements

I have two select statements. One gets a list (if any) of logged voltage data in the past 60 seconds and related chamber names, and one gets a list (if any) of logged arc event data in the past 5 minutes. I am trying to append the arc count data as new columns to the voltage data table. I cannot figure out how to do this.
Note that, there may or may not be arc count rows, for a given chamber name that is in the voltage data table. If there are no rows, I want to set the arc count column value to zero.
Any ideas on how to accomplish this?
Voltage Data:
SELECT DISTINCT dbo.CoatingChambers.Name,
AVG(dbo.CoatingGridVoltage_Data.ChanA_DCVolts) AS ChanADC,
AVG(dbo.CoatingGridVoltage_Data.ChanB_DCVolts) AS ChanBDC,
AVG(dbo.CoatingGridVoltage_Data.ChanA_RFVolts) AS ChanARF,
AVG(dbo.CoatingGridVoltage_Data.ChanB_RFVolts) AS ChanBRF FROM
dbo.CoatingGridVoltage_Data LEFT OUTER JOIN dbo.CoatingChambers ON
dbo.CoatingGridVoltage_Data.CoatingChambersID =
dbo.CoatingChambers.CoatingChambersID WHERE
(dbo.CoatingGridVoltage_Data.DT > DATEADD(second, - 60,
SYSUTCDATETIME())) GROUP BY dbo.CoatingChambers.Name
Returns
Name | ChanADC | ChanBDC | ChanARF | ChanBRF
-----+-------------------+--------------------+---------------------+------------------
OX2 | 2.9099999666214 | -0.485000004371007 | 0.344801843166351 | 0.49748428662618
S2 | 0.100000001490116 | -0.800000016887983 | 0.00690172302226226 | 0.700591623783112
S3 | 4.25666658083598 | 0.5 | 0.96554297208786 | 0.134956782062848
Arc count table:
SELECT CoatingChambers.Name,
SUM(ArcCount) as ArcCount
FROM CoatingChambers
LEFT JOIN CoatingArc_Data
ON dbo.[CoatingArc_Data].CoatingChambersID = dbo.CoatingChambers.CoatingChambersID
where EventDT > DATEADD(mi,-5, GETDATE())
Group by Name
Returns
Name | ArcCount
-----+---------
L1 | 283
L4 | 0
L6 | 1
S2 | 55
To be clear, I want this table (with added arc count column), given the two tables above:
Name | ChanADC | ChanBDC | ChanARF | ChanBRF | ArcCount
-----+-------------------+--------------------+---------------------+-------------------+---------
OX2 | 2.9099999666214 | -0.485000004371007 | 0.344801843166351 | 0.49748428662618 | 0
S2 | 0.100000001490116 | -0.800000016887983 | 0.00690172302226226 | 0.700591623783112 | 55
S3 | 4.25666658083598 | 0.5 | 0.96554297208786 | 0.134956782062848 | 0
You can treat the select statements as virtual tables and just join them together:
select
x.Name,
x.ChanADC,
x.ChanBDC,
x.ChanARF,
x.ChanBRF,
isnull( y.ArcCount, 0 ) ArcCount
from
(
select distinct
cc.Name,
AVG(cgv.ChanA_DCVolts) AS ChanADC,
AVG(cgv.ChanB_DCVolts) AS ChanBDC,
AVG(cgv.ChanA_RFVolts) AS ChanARF,
AVG(cgv.ChanB_RFVolts) AS ChanBRF
from
dbo.CoatingGridVoltage_Data cgv
left outer join
dbo.CoatingChambers cc
on
cgv.CoatingChambersID = cc.CoatingChambersID
where
cgv.DT > dateadd(second, - 60, sysutcdatetime())
group by
cc.Name
) as x
left outer join
(
select
cc.Name,
sum(ac.ArcCount) as ArcCount
from
dbo.CoatingChambers cc
left outer join
dbo.CoatingArc_Data ac
on
ac.CoatingChambersID = cc.CoatingChambersID
where
EventDT > dateadd(mi,-5, getdate())
group by
Name
) as y
on
x.Name = y.Name
Also, it's worthwhile to simplify your names with aliases and format the queries for readability...which I shamelessly took a stab at.
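The join-two-aggregates pattern can be tried in isolation with a minimal sketch. It assumes SQLite via Python's sqlite3 module, so COALESCE stands in for SQL Server's ISNULL. The two tables are a deliberately simplified, hypothetical schema with made-up readings, and the time-window filters are left out:

```python
import sqlite3

def voltage_with_arcs():
    """Return (name, avg_volts, arc_count) with 0 where a chamber has no arcs."""
    con = sqlite3.connect(":memory:")
    # Hypothetical simplified tables and illustrative values only.
    con.executescript("""
        CREATE TABLE voltage (name TEXT, volts REAL);
        CREATE TABLE arcs (name TEXT, arc_count INTEGER);
        INSERT INTO voltage VALUES ('OX2',2.0),('OX2',4.0),('S2',1.0),('S3',5.0);
        INSERT INTO arcs VALUES ('S2',50),('S2',5),('L1',283);
    """)
    rows = con.execute("""
        SELECT x.name, x.avg_volts, COALESCE(y.arc_count, 0) AS arc_count
        FROM (SELECT name, AVG(volts) AS avg_volts
              FROM voltage GROUP BY name) x
        LEFT JOIN (SELECT name, SUM(arc_count) AS arc_count
                   FROM arcs GROUP BY name) y
          ON y.name = x.name
        ORDER BY x.name
    """).fetchall()
    con.close()
    return rows
```

Chambers that appear only in the arc data (L1 here) are dropped, and chambers without arc rows get 0, matching the shape of the table requested in the question.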

Select multiple (non-aggregate function) columns with GROUP BY

I am trying to select the max value from one column, while grouping by another non-unique id column which has multiple duplicate values. The original database looks something like:
mukey | comppct_r | name | type
65789 | 20 | a | 7n
65789 | 15 | b | 8m
65789 | 1 | c | 1o
65790 | 10 | a | 7n
65790 | 26 | b | 8m
65790 | 5 | c | 1o
...
This works just fine using:
SELECT c.mukey, Max(c.comppct_r) AS ComponentPercent
FROM c
GROUP BY c.mukey;
Which returns a table like:
mukey | ComponentPercent
65789 | 20
65790 | 26
65791 | 50
65792 | 90
I want to be able to add other columns in without affecting the GROUP BY function, to include columns like name and type into the output table like:
mukey | comppct_r | name | type
65789 | 20 | a | 7n
65790 | 26 | b | 8m
65791 | 50 | c | 7n
65792 | 90 | d | 7n
but it always outputs an error saying I need to use an aggregate function with select statement. How should I go about doing this?
You have yourself a greatest-n-per-group problem. This is one of the possible solutions:
select yt.mukey, yt.comppct_r, yt.name, yt.type
from c yt
inner join(
select c.mukey, max(c.comppct_r) comppct_r
from c
group by c.mukey
) ss on yt.mukey = ss.mukey and yt.comppct_r = ss.comppct_r
Another possible approach, same output:
select c1.*
from c c1
left outer join c c2
on (c1.mukey = c2.mukey and c1.comppct_r < c2.comppct_r)
where c2.mukey is null;
There's a comprehensive and explanatory answer on the topic here: SQL Select only rows with Max Value on a Column
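A quick way to check the join-to-the-maximum approach against the sample rows is a small sketch. It assumes SQLite via Python's sqlite3 module, and top_component_per_mukey is a hypothetical name:

```python
import sqlite3

def top_component_per_mukey():
    """Return the full row holding the largest comppct_r for each mukey."""
    con = sqlite3.connect(":memory:")
    # The six sample rows shown in the question.
    con.executescript("""
        CREATE TABLE c (mukey INTEGER, comppct_r INTEGER, name TEXT, type TEXT);
        INSERT INTO c VALUES
            (65789,20,'a','7n'),(65789,15,'b','8m'),(65789,1,'c','1o'),
            (65790,10,'a','7n'),(65790,26,'b','8m'),(65790,5,'c','1o');
    """)
    rows = con.execute("""
        SELECT yt.mukey, yt.comppct_r, yt.name, yt.type
        FROM c yt
        JOIN (SELECT mukey, MAX(comppct_r) AS comppct_r
              FROM c GROUP BY mukey) ss
          ON yt.mukey = ss.mukey AND yt.comppct_r = ss.comppct_r
        ORDER BY yt.mukey
    """).fetchall()
    con.close()
    return rows
```

If two rows in one mukey tie for the maximum, both are returned, so a tie-breaking rule is needed when exactly one row per group is required.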
Any non-aggregate column should be there in Group By clause .. why??
t1
x1 y1 z1
1 2 5
2 2 7
Now you are trying to write a query like:
select x1,y1,max(z1) from t1 group by y1;
Now this query will result only one row, but what should be the value of x1?? This is basically an undefined behaviour. To overcome this, SQL will error out this query.
Now, coming to the point, you can either choose an aggregate function for x1 or add x1 to the GROUP BY. Note that this all depends on your requirement.
If you want all rows with aggregation on z1 grouping by y1, you may use SubQ approach.
Select x1,y1,(select max(z1) from t1 where tt.y1=y1 group by y1)
from t1 tt;
This will produce a result like:
t1
x1 y1 max(z1)
1 2 7
2 2 7
Try using a virtual table as follows:
SELECT vt.*, c.name FROM (
SELECT c.mukey, Max(c.comppct_r) AS ComponentPercent
FROM c
GROUP BY c.mukey
) as VT, c
WHERE VT.mukey = c.mukey
AND VT.ComponentPercent = c.comppct_r
You can't just add additional columns without adding them to the GROUP BY or applying an aggregate function. The reason for that is, that the values of a column can be different inside one group. For example, you could have two rows:
mukey | comppct_r | name | type
65789 | 20 | a | 7n
65789 | 20 | b | 9f
What should the aggregated group look like for the columns name and type?
If name and type are always the same inside a group, just add them to the GROUP BY clause:
SELECT c.mukey, c.name, c.type, Max(c.comppct_r) AS ComponentPercent
FROM c
GROUP BY c.mukey, c.name, c.type;
Use a 'Having' clause
SELECT *
FROM c
GROUP BY c.mukey
HAVING c.comppct_r = Max(c.comppct_r);

query that returns rows where time difference past threshold

this is an odd question. i dunno if it is quite doable.
let's say i have the following table:
person | product | trans | purchase_date
-------+----------+--------+---------------
jim | square | aaaa | 2013-03-04 00:01:00
sarah | circle | aaab | 2013-03-04 00:02:00
john | square | aac1 | 2013-03-04 00:03:00
john | circle | aac2 | 2013-03-04 00:03:10
jim | triangle | aad1 | 2013-03-04 00:04:00
jim | square | abcd | 2013-03-04 00:05:00
sarah | square | efgh | 2013-03-04 00:07:00
jim | circle | ijkl | 2013-03-04 00:22:00
sarah | circle | mnop | 2013-03-04 00:24:00
sarah | square | qrst | 2013-03-04 00:26:00
sarah | circle | uvwx | 2013-03-04 00:44:00
i need to know when the difference between any person's purchases between a square and a circle (or a circle and a square) have exceeded 10 minutes. ideally, i'd like to know that difference as well, but that isn't required.
so as a result, here is what i need:
person | product | trans | purchase_date
-------+----------+--------+---------------
jim | square | abcd | 2013-03-04 00:05:00
jim | circle | ijkl | 2013-03-04 00:22:00
sarah | square | efgh | 2013-03-04 00:07:00
sarah | circle | mnop | 2013-03-04 00:24:00
sarah | square | qrst | 2013-03-04 00:26:00
sarah | circle | uvwx | 2013-03-04 00:44:00
this will run daily, so i will add a "where" clause to ensure the query doesn't get out of hand. also, i am aware that multiple transactions could show up (say there were 20 minutes between the purchase of a circle, then 20 minutes for a square, then 20 minutes for a circle again, which would mean there were 2 instances where the time difference was over 10 minutes).
any advice? i am on postgres 8.1.23
Modern day solution
With modern day Postgres (8.4 or later) you can use the window function row_number() to get a continuous numbering per group. Then you can left join to the previous and next row and see if either of them matches the criteria. Voilà.
WITH x AS (
SELECT *
,row_number() OVER (PARTITION BY person ORDER BY purchase_date) AS rn
FROM tbl
WHERE product IN ('circle', 'square')
)
SELECT x.person, x.product, x.trans, x.purchase_date
FROM x
LEFT JOIN x y ON y.person = x.person AND y.rn = x.rn + 1
LEFT JOIN x z ON z.person = x.person AND z.rn = x.rn - 1
WHERE (y.product <> x.product
AND y.purchase_date > x.purchase_date + interval '10 min')
OR (z.product <> x.product
AND z.purchase_date < x.purchase_date - interval '10 min')
ORDER BY x.person, x.purchase_date;
SQLfiddle.
Solution for Postgres 8.1
I can't test this on Postgres 8.1, no surviving instance available. Tested and works on v8.4 and should work for you, too. Temporary sequences, temporary tables and CREATE TABLE AS were already available.
Temporary sequence and table are only visible to you, so you can get continuous numbers even with concurrent queries.
CREATE TEMP SEQUENCE s;
CREATE TEMP TABLE x AS
SELECT *, nextval('s') AS rn -- get row-numbers from sequence
FROM (
SELECT *
FROM tbl
WHERE product IN ('circle', 'square')
ORDER BY person, purchase_date -- need to order in a subquery first!
) a;
Then the same SELECT as above should work:
SELECT x.person, x.product, x.trans, x.purchase_date
FROM x
LEFT JOIN x y ON y.person = x.person AND y.rn = x.rn + 1
LEFT JOIN x z ON z.person = x.person AND z.rn = x.rn - 1
WHERE (y.product <> x.product
AND y.purchase_date > x.purchase_date + interval '10 min')
OR (z.product <> x.product
AND z.purchase_date < x.purchase_date - interval '10 min')
ORDER BY x.person, x.purchase_date;
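The neighbour comparison that both queries perform (number the circle/square rows per person, then look at the row before and after) can be checked against the sample data with a short procedural sketch. This is plain Python, not Postgres, and slow_switches and at are hypothetical helper names:

```python
from datetime import datetime, timedelta
from itertools import groupby

def at(hms):
    """Timestamp on 2013-03-04, the day used in the question's sample."""
    return datetime.strptime("2013-03-04 " + hms, "%Y-%m-%d %H:%M:%S")

# (person, product, trans, purchase_date) rows copied from the question.
PURCHASES = [
    ("jim", "square", "aaaa", at("00:01:00")),
    ("sarah", "circle", "aaab", at("00:02:00")),
    ("john", "square", "aac1", at("00:03:00")),
    ("john", "circle", "aac2", at("00:03:10")),
    ("jim", "triangle", "aad1", at("00:04:00")),
    ("jim", "square", "abcd", at("00:05:00")),
    ("sarah", "square", "efgh", at("00:07:00")),
    ("jim", "circle", "ijkl", at("00:22:00")),
    ("sarah", "circle", "mnop", at("00:24:00")),
    ("sarah", "square", "qrst", at("00:26:00")),
    ("sarah", "circle", "uvwx", at("00:44:00")),
]

def slow_switches(purchases, threshold=timedelta(minutes=10)):
    """Return trans ids of adjacent circle/square purchases by one person
    that differ in product and lie more than `threshold` apart."""
    relevant = sorted(
        (p for p in purchases if p[1] in ("circle", "square")),
        key=lambda p: (p[0], p[3]),  # person, then purchase_date
    )
    flagged = []
    for _, group in groupby(relevant, key=lambda p: p[0]):
        rows = list(group)
        hit = set()
        # compare each row with the next one, like joining rn to rn + 1
        for prev, cur in zip(rows, rows[1:]):
            if prev[1] != cur[1] and cur[3] - prev[3] > threshold:
                hit.update((prev[2], cur[2]))
        flagged.extend(r[2] for r in rows if r[2] in hit)
    return flagged
```

On the sample data this flags abcd and ijkl for jim and efgh, mnop, qrst and uvwx for sarah, the same six rows listed as the desired result.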
You could try joining the table to itself with an 'ON' clause like this:
SELECT a.Person, CAST((DATEDIFF(mi, a.purchaseDate, b.purchaseDate)/60.0) AS Decimal) AS TimeDiff, a.Product, b.Product FROM <TABLE> a
JOIN <TABLE> b
ON a.Person = b.Person AND b.purchaseDate > a.purchaseDate
WHERE
(a.Product = 'Circle' AND b.Product = 'Square')
OR
(a.Product = 'Square' AND b.Product = 'Circle')
By joining the table to itself you get rows which combine two purchases by the same person. By limiting it to 'b.purchaseDate > a.purchaseDate' you prevent rows matching themselves. Then you can simply check for different products purchased.
The time difference is the last tricky part. What I included above is based on an answer I found here. It looks like it should work, and there's a couple of variations there you can use if what this outputs doesn't work for you.
You'll need to add a clause on the WHERE statement which uses the same DATEDIFF function to test for time > 10 minutes, but that should pose no great challenge.
Please note that this won't return exactly what you have in your question - this will include a row for Jim's first transaction as well as one for Jim's 2nd square purchase. Both will match to the same circle, and you will get both times (ijkl-abcd AND ijkl-aaaa). Thanks for xQbert's comment for pointing this out.
Assumptions:
You want to know differences in minutes for purchases on the same day. If dates don't matter, eliminate the WHERE clause.
You only want to consider circle/square pairs where the second purchase follows the first by purchase_date, not ones preceding it.
SELECT A.person, A.product, a.Trans, A.Purchase_date, B.Purchase_date,
DATE_PART('hour', B.purchase_date - A.purchase_date) * 60 + DATE_PART('minute', B.purchase_date - A.purchase_date) as minuteDifference
FROM yourTable A
LEFT JOIN yourTable B
on A.person = B.Person
and ((A.product = 'square' and b.product = 'circle')
OR (A.Product = 'circle' and b.product = 'square'))
and A.purchase_date <= B.Purchase_date
WHERE (A.purchase_Date::date = B.purchase_date::date OR B.purchase_date is null)
Null B.purchase_dates will tell you when you don't have a circle/square or square circle combo.