SQL: SUMming certain items in a column and subtracting it from another figure in that column

Sorry about the title. It might be a bit confusing! The sample table I'm working with is given below:
ID  Quantity  Type
------------------
1   14        PO
1   2         PO
1   4         MH
1   3         MH
1   2         MH
2   16        PO
2   12        MH
2   9         MH
Here's what I want to do. I want to sum all quantities where ID = 1 and Type = PO (14 + 2) as SUM_IN. I then want to sum all quantities where ID = 1 and Type = MH (4 + 3 + 2) as SUM_OUT. Once I have both, I want to compare them and return only the IDs where SUM_OUT > SUM_IN. So ID = 1 would not be returned, whereas ID = 2 would, because (12 + 9) > 16.
Is there a way to do this in SQL, or will I need to use PL/SQL and variables for the task? I have very little experience with PL/SQL, but logically it seems that variables would be the easiest way to solve the problem. I know that select statements can be stored in variables, but I'm not sure how. Here are my two SQL selects anyway:
SELECT SUM(QUANTITY) AS SUM_IN
FROM TRANSLOG
WHERE TYPE IN ('PO')
AND ID = '1'
SELECT SUM(QUANTITY) AS SUM_OUT
FROM TRANSLOG
WHERE TYPE IN ('MH')
AND ID = '1'
So if I could set both of these to variables, the task shouldn't be too difficult, right?
Thanks in advance for the help.
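For reference, a minimal PL/SQL sketch of the variable approach mentioned above (assuming Oracle, the TRANSLOG table, and a fixed ID; the answers below show how to do the whole thing in plain SQL):
DECLARE
  v_sum_in  NUMBER;
  v_sum_out NUMBER;
BEGIN
  -- store each SUM in a variable with SELECT ... INTO
  SELECT NVL(SUM(QUANTITY), 0) INTO v_sum_in
    FROM TRANSLOG WHERE TYPE = 'PO' AND ID = '1';

  SELECT NVL(SUM(QUANTITY), 0) INTO v_sum_out
    FROM TRANSLOG WHERE TYPE = 'MH' AND ID = '1';

  -- only report the ID when more went out than came in
  IF v_sum_out > v_sum_in THEN
    DBMS_OUTPUT.PUT_LINE('ID 1: SUM_OUT ' || v_sum_out || ' > SUM_IN ' || v_sum_in);
  END IF;
END;
/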

select ID,
       sum(Quantity * case Type when 'PO' then 1 else 0 end) as SUM_IN,
       sum(Quantity * case Type when 'MH' then 1 else 0 end) as SUM_OUT
from translog
group by ID
-- SUM_IN minus SUM_OUT is negative exactly when SUM_OUT > SUM_IN
having sum(Quantity * case Type when 'PO' then 1 else -1 end) < 0

As you have tagged your question with the plsql tag, I assume the RDBMS you are going to execute the query against is Oracle. If so, then here is another approach (using the DECODE function) to get the result set you want.
select *
  from (select id
             , sum(Quantity * decode(tp, 'PO', 1, 0)) as sum_in
             , sum(Quantity * decode(tp, 'MH', 1, 0)) as sum_out
          from t1
         group by id
         order by id)
 where sum_out > sum_in
Result:
ID  SUM_IN  SUM_OUT
-------------------
 2      16       21
If you want to display the rest of the columns along with sum_in and sum_out, the following query might come in handy:
select id
     , quantity
     , tp
     , sum_in
     , sum_out
  from (select id
             , quantity
             , tp
             , sum(Quantity * decode(tp, 'PO', 1, 0)) over (partition by id) as sum_in
             , sum(Quantity * decode(tp, 'MH', 1, 0)) over (partition by id) as sum_out
          from t1)
 where sum_out > sum_in
Result:
ID  Quantity  Tp  Sum_In  Sum_Out
---------------------------------
 2        16  PO      16       21
 2        12  MH      16       21
 2         9  MH      16       21

SELECT CASE WHEN b.SUM_OUT > a.SUM_IN THEN b.SUM_OUT ELSE NULL END AS SUM_OUT,
       CASE WHEN b.SUM_OUT > a.SUM_IN THEN a.SUM_IN ELSE NULL END AS SUM_IN
FROM
    (SELECT ID, SUM(QUANTITY) AS SUM_IN
     FROM TRANSLOG
     WHERE TYPE IN ('PO')
       AND ID = '1'
     GROUP BY ID, Type
    ) a
INNER JOIN
    (SELECT ID, SUM(QUANTITY) AS SUM_OUT
     FROM TRANSLOG
     WHERE TYPE IN ('MH')
       AND ID = '1'
     GROUP BY ID, Type
    ) b
ON a.ID = b.ID

Related

Calculate conversion rate with the specified conditions

Here is my sample data table:
ID  Status
-------------
1   New
1   Processed
2   New
2   Processed
3   New
3   Processed
4   Processed
5   New
What I am trying to solve here is calculating the conversion rate from status 'New' to status 'Processed'. From the dataset, only IDs 1, 2 and 3 fulfil the requirement; IDs 4 and 5 do not have both stages. So in theory the conversion rate should be 3/5 * 100% = 60%. How can I select the data so that I only count the IDs that have both 'New' and 'Processed' status?
This is the code that I have tried, but I know it's wrong since it counts all the IDs with no link between the two statuses.
SELECT 'Conversion rate from Assigned enquiry to In progess leads' as 'Name', round((a.processed / b.new),3) * 100 as 'Answer'
FROM
(
SELECT cast(count(ID)as float) as processed
from table1
WHERE STATUS_ID = 'Processed'
) as a
cross join
(
SELECT cast(count(ID)as float) as new
from table1
WHERE STATUS_ID = 'NEW'
) as b
We can use conditional aggregation here:
WITH cte AS (
    SELECT CASE WHEN COUNT(CASE WHEN Status = 'New' THEN 1 END) > 0
                 AND COUNT(CASE WHEN Status = 'Processed' THEN 1 END) > 0
                THEN 1 ELSE 0 END AS cnt
    FROM yourTable
    GROUP BY ID
)
SELECT 100.0 * SUM(cnt) / COUNT(*)
FROM cte;
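With the sample data above this returns 60.0. The same figure can also be reached by first selecting the IDs that have both statuses and then dividing by the number of distinct IDs; a rough equivalent of my own, assuming the same yourTable name:
SELECT 100.0 * COUNT(*) / (SELECT COUNT(DISTINCT ID) FROM yourTable) AS conversion_rate
FROM (
    SELECT ID
    FROM yourTable
    GROUP BY ID
    HAVING COUNT(CASE WHEN Status = 'New' THEN 1 END) > 0
       AND COUNT(CASE WHEN Status = 'Processed' THEN 1 END) > 0
) ids_with_both;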

Creating SQL values from two columns using the selective aggregate of each column

I have the following four tables:
region_reference, community_grants, HealthWorkers and currency_exchange
and the following SQL query, which works:
SELECT HealthWorkers.worker_id
     , community_grants.percentage_price_adjustment
     , community_grants.payment_status
     , community_grants.chosen
     , (region_reference.base_price * currency_exchange.euro_value) AS price
FROM currency_exchange
INNER JOIN (
    region_reference INNER JOIN (
        HealthWorkers INNER JOIN community_grants
        ON HealthWorkers.worker_id = community_grants.worker_id
    ) ON (
        region_reference.community_id = community_grants.community_id
    ) AND (region_reference.region_id = community_grants.region_id)
)
ON currency_exchange.currency = HealthWorkers.preferred_currency
WHERE (
    HealthWorkers.worker_id = "malawi_01"
    AND community_grants.chosen = True
);
It gives me the following result set:
However, my task is to create an entity that includes just 4 values.
type OverallPriceSummary struct {
Worker_id string `json:"worker_id"`
Total_paid decimal.Decimal `json:"total_paid"`
Total_pledged decimal.Decimal `json:"total_pledged"`
Total_outstanding decimal.Decimal `json:"total_outstanding"`
}
Total_paid is the sum of values for the specified worker_id where payment_status = '1' (combined for all records).
Total_outstanding is the sum of values where payment_status is '0' and chosen is true (combined for all records).
Total_pledged is the sum of Total_paid and Total_outstanding (also combined for all records).
I currently obtain these values by aggregating them manually in my code as I iterate through the result set, but I believe there is a way to avoid this interim step and get what I need from a single SQL query.
I suspect it involves the use of SUM ... AS and inner queries, but I don't know how to bring it all together. Any help or direction would be much appreciated.
EDIT:
I have provided some sample data below:
region_reference
region_id  region_name  base_price  community_id
------------------------------------------------
1          Lilongwe     100         19
2          Mzuzu        50          19

HealthWorkers
worker_id  worker_name      preferred_currency  billing_address                               charity_logo
-----------------------------------------------------------------------------------------------------------
malawi_01  Raphael Salanga  EUR                 Nkhunga Health Centre in Nkhotakota District  12345

community_grants
region_id  campaign_id  worker_id  percentage_price_adjustment  community_id  payment_status  chosen  paid_price
-----------------------------------------------------------------------------------------------------------------
1          1            malawi_01  10                           19            0               Yes     0
2          1            malawi_01  0                            19            1               Yes     20
3          1            malawi_01  1                            19            0               Yes     0
1          1            malawi_01  0                            23            0               Yes     30

currency_exchange
currency  currency_symbol  euro_value
-------------------------------------
EUR       €                1
USD       $                0.84
Consider conditional aggregation using Postgres' FILTER clause, where you pivot data into calculated conditional columns.
The query below assumes the "sum of values" is the sum of the calculated price, expressed as region_reference.base_price * currency_exchange.euro_value. Adjust as needed.
SELECT h.worker_id
, SUM(r.base_price * ce.euro_value) FILTER(WHERE
cg.payment_status = 1
) AS total_paid
, SUM(r.base_price * ce.euro_value) FILTER(WHERE
cg.payment_status = 0 AND
cg.chosen=True
) AS total_outstanding
, SUM(r.base_price * ce.euro_value) FILTER(WHERE
(cg.payment_status = 1) OR
(cg.payment_status = 0 AND cg.chosen=True)
) AS total_pledged
FROM community_grants cg
INNER JOIN region_reference r
ON r.community_id = cg.community_id
AND r.region_id = cg.region_id
INNER JOIN HealthWorkers h
ON h.worker_id = cg.worker_id
AND h.worker_id = 'malawi_01'
INNER JOIN currency_exchange ce
ON ce.currency = h.preferred_currency
GROUP BY h.worker_id
Try something like:
SELECT worker_id
     , sum(case when payment_status = '1'
                then paid_price else 0 end) as Total_paid
     , sum(case when payment_status = '0' and chosen = true
                then paid_price else 0 end) as Total_outstanding
     , sum(case when (payment_status = '1')
                  or (payment_status = '0' and chosen = true)
                then paid_price else 0 end) as Total_pledged
from community_grants
group by worker_id

How to use SQL (postgresql) query to conditionally change value within each group?

I am pretty new to postgresql (or sql), and have not learned how to deal with this kind of "within group" operation. My data is like this:
p_id number
97313 4
97315 10
97315 10
97325 0
97325 15
97326 4
97335 0
97338 0
97338 1
97338 2
97344 5
97345 14
97349 0
97349 5
p_id is not unique and can be viewed as a grouping variable. I would like to change the number within each p_id as follows:
If, for a given p_id, one of the values is 0 but any of the other "number" values for that p_id is > 2, then set the 0 value to NULL. For example, p_id 97325 has a "0" and a "15" associated with it, so I replace the 0 with NULL and keep the 15 unchanged.
But for p_id 97338, the three rows associated with it have the numbers "0", "1" and "2", so I do not replace the 0 with NULL.
The final data should be like:
p_id number
97313 4
97315 10
97315 10
97325 NULL
97325 15
97326 4
97335 0
97338 0
97338 1
97338 2
97344 5
97345 14
97349 NULL
97349 5
Thank you very much for the help!
A CASE in a COUNT OVER in a CASE:
SELECT
p_id,
(CASE
WHEN number = 0 AND COUNT(CASE WHEN number > 2 THEN number END) OVER (PARTITION BY p_id) > 0
THEN NULL
ELSE number
END) AS number
FROM yourtable
Test it here on rextester.
Works for PostgreSQL 10:
SELECT p_id, CASE WHEN number = 0 AND maxnum > 2 AND counts >= 2 THEN NULL ELSE number END AS number
FROM
(
SELECT a.p_id AS p_id, a.number AS number, b.maxnum AS maxnum, b.counts AS counts
FROM trans a
LEFT JOIN
(
SELECT p_id, MAX(number) AS maxnum, COUNT(1) AS counts
FROM trans
GROUP BY p_id
) b
ON a.p_id = b.p_id
) a1
use case when with a window max:
select p_id,
       case when number = 0
             and max(number) over (partition by p_id) > 2
            then null else number end as number
from yourtable
http://sqlfiddle.com/#!17/898c3/1
I would express this as:
SELECT p_id,
(CASE WHEN number <> 0 OR MAX(number) OVER (PARTITION BY p_id) <= 2
THEN number
END) as number
FROM t;
If the fate of a record depends on the existence of other records within (the same or another) table, you could use EXISTS(...) :
UPDATE ztable zt
SET number = NULL
WHERE zt.number = 0
AND EXISTS ( SELECT *
FROM ztable x
WHERE x.p_id = zt.p_id
AND x.number > 2
);

Looping in select query

I want to do something like this:
select id,
count(*) as total,
FOR temp IN SELECT DISTINCT somerow FROM mytable ORDER BY somerow LOOP
sum(case when somerow = temp then 1 else 0 end) temp,
END LOOP;
from mytable
group by id
order by id
I created a working select:
select id,
count(*) as total,
sum(case when somerow = 'a' then 1 else 0 end) somerow_a,
sum(case when somerow = 'b' then 1 else 0 end) somerow_b,
sum(case when somerow = 'c' then 1 else 0 end) somerow_c,
sum(case when somerow = 'd' then 1 else 0 end) somerow_d,
sum(case when somerow = 'e' then 1 else 0 end) somerow_e,
sum(case when somerow = 'f' then 1 else 0 end) somerow_f,
sum(case when somerow = 'g' then 1 else 0 end) somerow_g,
sum(case when somerow = 'h' then 1 else 0 end) somerow_h,
sum(case when somerow = 'i' then 1 else 0 end) somerow_i,
sum(case when somerow = 'j' then 1 else 0 end) somerow_j,
sum(case when somerow = 'k' then 1 else 0 end) somerow_k
from mytable
group by id
order by id
This works, but it is 'static': if a new value is added to 'somerow', I will have to change the SQL manually to pick up all the values from the somerow column. That is why I'm wondering whether it is possible to do something with a for loop.
So what I want to get is this:
id  somerow_a  somerow_b  ....
0   3          2          ....
1   2          10         ....
2   19         3          ....
..  ...        ...
So what I'd like to do is count all the rows that contain a specific letter and group them by id (this id isn't a primary key, but it repeats; there are about 80 different possible values for id).
http://sqlfiddle.com/#!15/18feb/2
Are arrays good for you? (SQL Fiddle)
select
id,
sum(totalcol) as total,
array_agg(somecol) as somecol,
array_agg(totalcol) as totalcol
from (
select id, somecol, count(*) as totalcol
from mytable
group by id, somecol
) s
group by id
;
id | total | somecol | totalcol
----+-------+---------+----------
1 | 6 | {b,a,c} | {2,1,3}
2 | 5 | {d,f} | {2,3}
In 9.2 it is possible to have a set of JSON objects (Fiddle)
select row_to_json(s)
from (
select
id,
sum(totalcol) as total,
array_agg(somecol) as somecol,
array_agg(totalcol) as totalcol
from (
select id, somecol, count(*) as totalcol
from mytable
group by id, somecol
) s
group by id
) s
;
row_to_json
---------------------------------------------------------------
{"id":1,"total":6,"somecol":["b","a","c"],"totalcol":[2,1,3]}
{"id":2,"total":5,"somecol":["d","f"],"totalcol":[2,3]}
In 9.3, with the addition of lateral, a single object (Fiddle)
select to_json(format('{%s}', (string_agg(j, ','))))
from (
select format('%s:%s', to_json(id), to_json(c)) as j
from
(
select
id,
sum(totalcol) as total_sum,
array_agg(somecol) as somecol_array,
array_agg(totalcol) as totalcol_array
from (
select id, somecol, count(*) as totalcol
from mytable
group by id, somecol
) s
group by id
) s
cross join lateral
(
select
total_sum as total,
somecol_array as somecol,
totalcol_array as totalcol
) c
) s
;
to_json
---------------------------------------------------------------------------------------------------------------------------------------
"{1:{\"total\":6,\"somecol\":[\"b\",\"a\",\"c\"],\"totalcol\":[2,1,3]},2:{\"total\":5,\"somecol\":[\"d\",\"f\"],\"totalcol\":[2,3]}}"
In 9.2 it is also possible to get a single object, in a more convoluted way, using subqueries instead of lateral.
SQL is very rigid about the return type. It demands to know what to return beforehand.
For a completely dynamic number of resulting values, you can only use arrays like #Clodoaldo posted. That is effectively a static return type; you do not get individual columns for each value.
If you know the number of columns at call time ("semi-dynamic"), you can create a function taking (and returning) polymorphic parameters. Closely related answer with lots of details:
Dynamic alternative to pivot with CASE and GROUP BY
(You also find a related answer with arrays from #Clodoaldo there.)
Your remaining option is to use two round-trips to the server. The first to determine the actual query with the actual return type. The second to execute the query based on the first call.
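A sketch of what the first round-trip could look like, building the static query text from the distinct values (my illustration, reusing the mytable / somerow names from the question; the generated string would then be executed as the second call):
SELECT 'select id, count(*) as total, '
       || string_agg(
            format('sum(case when somerow = %L then 1 else 0 end) as somerow_%s',
                   somerow, somerow),
            ', ' ORDER BY somerow)
       || ' from mytable group by id order by id' AS generated_sql
FROM (SELECT DISTINCT somerow FROM mytable) s;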
Else, you have to go with a static query. While doing that, I see two nicer options for what you have right now:
1. Simpler expression
select id
, count(*) AS total
, count(somerow = 'a' OR NULL) AS somerow_a
, count(somerow = 'b' OR NULL) AS somerow_b
, ...
from mytable
group by id
order by id;
How does it work?
Compute percents from SUM() in the same SELECT sql query
SQL Fiddle.
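In short, somerow = 'a' OR NULL evaluates to TRUE for matching rows and to NULL otherwise, and count() only counts non-NULL values. A quick illustration of my own:
SELECT count(x = 1 OR NULL) AS ones  -- TRUE OR NULL -> TRUE, FALSE OR NULL -> NULL
     , count(*)             AS total
FROM (VALUES (1), (2), (1)) AS t(x);
-- ones = 2, total = 3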
2. crosstab()
crosstab() is more complex at first, but written in C, optimized for the task and shorter for long lists. You need the additional module tablefunc installed. Read the basics here if you are not familiar:
PostgreSQL Crosstab Query
SELECT * FROM crosstab(
$$
SELECT id
, count(*) OVER (PARTITION BY id)::int AS total
, somecol
, count(*)::int AS ct -- casting to int, don't think you need bigint?
FROM mytable
GROUP BY 1,3
ORDER BY 1,3
$$
,
$$SELECT unnest('{a,b,c,d}'::text[])$$
) AS f (id int, total int, a int, b int, c int, d int);
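As a side note, enabling the module (if it is not already installed) is a single statement per database (PostgreSQL 9.1+):
CREATE EXTENSION IF NOT EXISTS tablefunc;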

Counting if data exists in a row

Hey guys, I have the sample data below which I want to query.
MemberID  AGEQ1  AGEQ2  AGEQ3
------------------------------
1217      2      null   null
58458     3      2      null
58459     null   null   null
58457     null   5      null
299576    6      5      7
What I need to do is look at the table and, for each row, count how many of the AGEx columns contain data.
Results example:
for memberID 1217 the count would be 1
for memberID 58458 the count would be 2
for memberID 58459 the count would be 0 or null
for memberID 58457 the count would be 1
for memberID 299576 the count would be 3
This is what the result should look like if I query the entire table:
1 Children - 2
2 Children - 1
3 Children - 1
0 Children - 1
So far I have been doing it using the following query, which isn't very efficient and gives incorrect tallies, since there are multiple combinations of columns people can fill in for the AGE question. I also have to write multiple queries and switch IS NULL to IS NOT NULL depending on how many children I am counting.
select COUNT (*) as '1 Children' from Member
where AGEQ1 is not null
and AGEQ2 is null
and AGEQ3 is null
The above query only gives me an answer of 1, but I want to be able to count the other columns for data as well.
Hope this is nice and clear, and thank you in advance.
If all of the columns are integers, you can take advantage of integer math - dividing the column by itself will yield 1, unless the value is NULL, in which case COALESCE can convert the resulting NULL to 0.
SELECT
MemberID,
COALESCE(AGEQ1 / AGEQ1, 0)
+ COALESCE(AGEQ2 / AGEQ2, 0)
+ COALESCE(AGEQ3 / AGEQ3, 0)
+ COALESCE(AGEQ4 / AGEQ4, 0)
+ COALESCE(AGEQ5 / AGEQ5, 0)
+ COALESCE(AGEQ6 / AGEQ6, 0)
FROM dbo.table_name;
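A caveat of my own, not from the answer: if an age value can ever be 0, dividing the column by itself raises a divide-by-zero error. A NULL-safe variant of the same counting idea uses CASE instead of division (the last answer below takes essentially this approach):
SELECT
    MemberID,
    CASE WHEN AGEQ1 IS NULL THEN 0 ELSE 1 END
  + CASE WHEN AGEQ2 IS NULL THEN 0 ELSE 1 END
  + CASE WHEN AGEQ3 IS NULL THEN 0 ELSE 1 END
  + CASE WHEN AGEQ4 IS NULL THEN 0 ELSE 1 END
  + CASE WHEN AGEQ5 IS NULL THEN 0 ELSE 1 END
  + CASE WHEN AGEQ6 IS NULL THEN 0 ELSE 1 END AS ChildCount
FROM dbo.table_name;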
To get the number of people with each count of children, then:
;WITH y(y) AS
(
SELECT TOP (7) rn = ROW_NUMBER() OVER
(ORDER BY [object_id]) - 1 FROM sys.objects
),
x AS
(
SELECT
MemberID,
x = COALESCE(AGEQ1 / AGEQ1, 0)
+ COALESCE(AGEQ2 / AGEQ2, 0)
+ COALESCE(AGEQ3 / AGEQ3, 0)
+ COALESCE(AGEQ4 / AGEQ4, 0)
+ COALESCE(AGEQ5 / AGEQ5, 0)
+ COALESCE(AGEQ6 / AGEQ6, 0)
FROM dbo.table_name
)
SELECT
NumberOfChildren = y.y,
NumberOfPeopleWithThatMany = COUNT(x.x)
FROM y LEFT OUTER JOIN x ON y.y = x.x
GROUP BY y.y ORDER BY y.y;
I'd look at using UNPIVOT. That will turn your wide columns into rows. Since you don't care about what value was in a column, just the presence or absence of a value, this will generate a row per non-null column.
The trick then becomes mashing that into the desired output format. It could probably be done more cleanly, but I'm a fan of "showing my work" so that others can adapt it to their needs.
SQLFiddle
-- Using the above logic
WITH HadAges AS
(
-- Find everyone and determine number of rows
SELECT
UP.MemberID
, count(1) AS rc
FROM
dbo.Member AS M
UNPIVOT
(
ColumnValue for ColumnName in (AGEQ1, AGEQ2, AGEQ3)
) AS UP
GROUP BY
UP.MemberID
)
, NoAge AS
(
-- Account for those that didn't show up
SELECT M.MemberID
FROM
dbo.Member AS M
EXCEPT
SELECT
H.MemberID
FROM
HadAges AS H
)
, NUMBERS AS
(
-- Allowable range is 1-6
SELECT TOP 6
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS TheCount
FROM
sys.all_columns AS SC
)
, COMBINATION AS
(
-- Link those with rows to their count
SELECT
N.TheCount AS ChildCount
, H.MemberID
FROM
NUMBERS AS N
LEFT OUTER JOIN
HadAges AS H
ON H.rc = N.TheCount
UNION ALL
-- Deal with the unlinked
SELECT
0
, NA.MemberID
FROM
NoAge AS NA
)
SELECT
C.ChildCount
, COUNT(C.MemberID) AS Instances
FROM
COMBINATION AS C
GROUP BY
C.ChildCount;
Try this:
select id, a+b+c+d+e+f
from ( select id,
case when age1 is null then 0 else 1 end a,
case when age2 is null then 0 else 1 end b,
case when age3 is null then 0 else 1 end c,
case when age4 is null then 0 else 1 end d,
case when age5 is null then 0 else 1 end e,
case when age6 is null then 0 else 1 end f
from ages
) as t
See here in fiddle http://sqlfiddle.com/#!3/88020/1
To get the number of persons per child count:
select childs, count(*) as ct
from (
select id, a+b+c+d+e+f childs
from
(
select
id,
case when age1 is null then 0 else 1 end a,
case when age2 is null then 0 else 1 end b,
case when age3 is null then 0 else 1 end c,
case when age4 is null then 0 else 1 end d,
case when age5 is null then 0 else 1 end e,
case when age6 is null then 0 else 1 end f
from ages ) as t
) ct
group by childs
order by 1
See it here at fiddle http://sqlfiddle.com/#!3/88020/24