SQL query to GROUP BY by a condition on a column

If I want to make a query that gets the count of users grouped by age,
so I get the count for each age on its own:
select count(*)
from tbl_user
group by age
How can I make a custom GROUP BY so that I get ages in ranges, for example
like this:
group by ages as 0-18 , 19-25 , 26-...

Use a CASE expression in a subquery and group by that expression in the outer query:
select age_group, count(*)
from (
    select case when age between 0 and 18 then '0-18'
                when age between 19 and 25 then '19-25'
                ...
           end as age_group
    from tbl_user
) t
group by age_group
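A quick way to sanity-check the subquery-plus-GROUP BY shape is Python's bundled sqlite3; the sample ages below are made up, and the table name tbl_user follows the question:

```python
import sqlite3

# In-memory database with a handful of made-up sample ages.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_user (age INTEGER)")
conn.executemany("INSERT INTO tbl_user (age) VALUES (?)",
                 [(5,), (17,), (20,), (25,), (30,), (41,)])

# Same shape as the answer: bucket in a subquery, group in the outer query.
rows = conn.execute("""
    SELECT age_group, COUNT(*)
    FROM (
        SELECT CASE WHEN age BETWEEN 0 AND 18 THEN '0-18'
                    WHEN age BETWEEN 19 AND 25 THEN '19-25'
                    ELSE '26-...'
               END AS age_group
        FROM tbl_user
    ) t
    GROUP BY age_group
    ORDER BY age_group
""").fetchall()
```

Each bucket appears as its own row, just as an ordinary GROUP BY age would produce one row per age.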

SUM over 1 and 0 with CASE WHEN works in MS SQL Server; which version of SQL are you using?
SELECT
    SUM(CASE WHEN Age >= 0 AND Age <= 18 THEN 1 ELSE 0 END) AS [0-18],
    SUM(CASE WHEN Age >= 19 AND Age <= 25 THEN 1 ELSE 0 END) AS [19-25]
FROM
    YourTable

You could use a CASE expression:
SELECT Sum(CASE WHEN age BETWEEN 0 AND 18 THEN 1 ELSE 0 END) as [0-18],
       Sum(CASE WHEN age BETWEEN 19 AND 25 THEN 1 ELSE 0 END) as [19-25],
       Sum(CASE WHEN age BETWEEN 26 AND 34 THEN 1 ELSE 0 END) as [26-34]
FROM tbl_user
This will "flatten" the data into one row; to get one row per grouping, use this as the basis for a view and then select from that.
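The pivoted SUM(CASE ...) shape can be checked with Python's sqlite3 (made-up ages; double-quoted aliases stand in for SQL Server's bracketed [0-18] names):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_user (age INTEGER)")
conn.executemany("INSERT INTO tbl_user (age) VALUES (?)",
                 [(5,), (17,), (20,), (25,), (30,)])

# Each SUM(CASE ...) counts one bucket; the result is a single pivoted row.
row = conn.execute("""
    SELECT SUM(CASE WHEN age BETWEEN 0 AND 18 THEN 1 ELSE 0 END) AS "0-18",
           SUM(CASE WHEN age BETWEEN 19 AND 25 THEN 1 ELSE 0 END) AS "19-25",
           SUM(CASE WHEN age BETWEEN 26 AND 34 THEN 1 ELSE 0 END) AS "26-34"
    FROM tbl_user
""").fetchone()
```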

Data belongs in a table, not in the code. The age categories are data, IMHO.
CREATE TABLE one
( val SERIAL NOT NULL PRIMARY KEY
, age INTEGER NOT NULL
);
INSERT INTO one (age) SELECT generate_series(0,31, 1);
CREATE TABLE age_category
( low INTEGER NOT NULL PRIMARY KEY
, high INTEGER NOT NULL
, description varchar
);
INSERT INTO age_category (low,high,description) VALUES
( 0,19, '0-18')
, ( 19,26, '19-25')
, ( 26,1111, '26-...')
;
SELECT ac.description, COUNT(*)
FROM one o
JOIN age_category ac ON o.age >= ac.low AND o.age < ac.high
GROUP BY ac.description
;
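The lookup-table approach ports to other engines too; a minimal check with Python's sqlite3 (SQLite lacks generate_series by default, so the 0..31 ages are inserted from Python):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE one (age INTEGER NOT NULL);
    CREATE TABLE age_category (
        low INTEGER NOT NULL PRIMARY KEY,
        high INTEGER NOT NULL,
        description TEXT
    );
    INSERT INTO age_category (low, high, description) VALUES
        (0, 19, '0-18'), (19, 26, '19-25'), (26, 1111, '26-...');
""")
# Replacement for generate_series(0, 31, 1): ages 0..31 from Python.
conn.executemany("INSERT INTO one (age) VALUES (?)", [(i,) for i in range(32)])

# Half-open ranges: low inclusive, high exclusive, as in the answer.
rows = conn.execute("""
    SELECT ac.description, COUNT(*)
    FROM one o
    JOIN age_category ac ON o.age >= ac.low AND o.age < ac.high
    GROUP BY ac.description
    ORDER BY ac.description
""").fetchall()
```

Adding a new age band is now an INSERT into age_category rather than a change to the query.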


Creating SQL values from two columns using the selective aggregate of each column

I have the following four tables:
region_reference, community_grants, HealthWorkers and currency_exchange
and the following SQL query, which works:
SELECT HealthWorkers.worker_id
, community_grants.percentage_price_adjustment
, community_grants.payment_status
, community_grants.chosen
, (region_reference.base_price * currency_exchange.euro_value) AS price
FROM currency_exchange
INNER JOIN (
region_reference INNER JOIN (
HealthWorkers INNER JOIN community_grants
ON HealthWorkers.worker_id = community_grants.worker_id
) ON (
region_reference.community_id = community_grants.community_id
) AND (region_reference.region_id = community_grants.region_id)
)
ON currency_exchange.currency = HealthWorkers.preferred_currency
WHERE (
HealthWorkers.worker_id="malawi_01"
AND community_grants.chosen=True
);
It gives me the following result set:
However, my task is to create an entity that includes just 4 values.
type OverallPriceSummary struct {
Worker_id string `json:"worker_id"`
Total_paid decimal.Decimal `json:"total_paid"`
Total_pledged decimal.Decimal `json:"total_pledged"`
Total_outstanding decimal.Decimal `json:"total_outstanding"`
}
Total_paid is the sum of values for the specified worker_id where payment_status = '1' (combined across all records)
Total_outstanding is the sum of values where payment_status is '0' and chosen is true (combined across all records)
Total_pledged is the sum of Total_paid and Total_outstanding (also combined across all records)
I currently obtain these values by aggregating manually in my code as I iterate through the result set, but I believe there is a way to avoid this interim step and get what I need from a single SQL query.
I suspect it involves the use of SUM AS and inner queries but I don’t know how to bring it all together. Any help or direction would be much appreciated.
EDIT:
I have provided some sample data below:
region_reference
region_id | region_name | base_price | community_id
1         | Lilongwe    | 100        | 19
2         | Mzuzu       | 50         | 19
HealthWorkers
worker_id | worker_name     | preferred_currency | billing_address                              | charity_logo
malawi_01 | Raphael Salanga | EUR                | Nkhunga Health Centre in Nkhotakota District | 12345
community_grants
region_id | campaign_id | worker_id | percentage_price_adjustment | community_id | payment_status | chosen | paid_price
1         | 1           | malawi_01 | 10                          | 19           | 0              | Yes    | 0
2         | 1           | malawi_01 | 0                           | 19           | 1              | Yes    | 20
3         | 1           | malawi_01 | 1                           | 19           | 0              | Yes    | 0
1         | 1           | malawi_01 | 0                           | 23           | 0              | Yes    | 30
currency_exchange
currency | currency_symbol | euro_value
EUR      | €               | 1
USD      | $               | 0.84
Consider conditional aggregation using Postgres' FILTER clause, where you pivot the data into calculated conditional columns.
The below assumes "sum of values" means the sum of the calculated price, expressed as region_reference.base_price * currency_exchange.euro_value. Adjust as needed.
SELECT h.worker_id
     , SUM(r.base_price * ce.euro_value) FILTER (WHERE
           cg.payment_status = 1
       ) AS total_paid
     , SUM(r.base_price * ce.euro_value) FILTER (WHERE
           cg.payment_status = 0 AND
           cg.chosen = true
       ) AS total_outstanding
     , SUM(r.base_price * ce.euro_value) FILTER (WHERE
           (cg.payment_status = 1) OR
           (cg.payment_status = 0 AND cg.chosen = true)
       ) AS total_pledged
FROM community_grants cg
INNER JOIN region_reference r
        ON r.community_id = cg.community_id
       AND r.region_id = cg.region_id
INNER JOIN HealthWorkers h
        ON h.worker_id = cg.worker_id
       AND h.worker_id = 'malawi_01'
INNER JOIN currency_exchange ce
        ON ce.currency = h.preferred_currency
GROUP BY h.worker_id
Try something like:
SELECT worker_id
     , sum(case when payment_status = '1'
                then paid_price else 0 end) as Total_paid
     , sum(case when payment_status = '0' and chosen = true
                then paid_price else 0 end) as Total_outstanding
     , sum(case when (payment_status = '1')
                 or (payment_status = '0' and chosen = true)
                then paid_price else 0 end) as Total_pledged
from community_grants
group by worker_id
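Both answers boil down to conditional aggregation. A minimal check of the SUM(CASE ...) version with Python's sqlite3, using the question's four community_grants rows (chosen stored as 1 for "Yes", and payment_status as a plain integer rather than a quoted '1'):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE community_grants
                (worker_id TEXT, payment_status INTEGER,
                 chosen INTEGER, paid_price INTEGER)""")
# The four sample rows from the question.
conn.executemany("INSERT INTO community_grants VALUES (?, ?, ?, ?)", [
    ("malawi_01", 0, 1, 0),
    ("malawi_01", 1, 1, 20),
    ("malawi_01", 0, 1, 0),
    ("malawi_01", 0, 1, 30),
])

# total_paid: status 1; total_outstanding: status 0 and chosen;
# total_pledged: the union of the two conditions.
row = conn.execute("""
    SELECT worker_id,
           SUM(CASE WHEN payment_status = 1 THEN paid_price ELSE 0 END),
           SUM(CASE WHEN payment_status = 0 AND chosen = 1
                    THEN paid_price ELSE 0 END),
           SUM(CASE WHEN payment_status = 1
                     OR (payment_status = 0 AND chosen = 1)
                    THEN paid_price ELSE 0 END)
    FROM community_grants
    GROUP BY worker_id
""").fetchone()
```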

UPDATE psql table with calculated value

Table A:
id | dob
Table B:
id | type
I want to calculate age from A.dob and, based on that, update B.type. I tried the following but it gives me an error.
UPDATE B
SET B.type = CASE
WHEN AGE <= 16 THEN 'C'
WHEN AGE>25 and age<=40 THEN 'Y'
WHEN AGE>40 THEN 'O'
END
from AGE as ( EXTRACT(YEAR FROM age(now(),A.dob)) ), A inner join B on A.id=B.id
where A.dob is not null;
Try something like this:
update b
set type = (case when age <= 16 then 'C'
when age > 25 and age <= 40 then 'Y'
when age > 40 then 'O'
end)
from (select a.*, extract(year from age(now(), a.dob)) as age
from a
) a
where a.id = b.id and a.age is not null;
Notes:
Your logic has no type for 17 to 25 year-olds.
Don't repeat the table being updated in the FROM clause. That is not how Postgres works.
The JOIN conditions -- alas -- go in the WHERE clause.
This uses a subquery to calculate age.
I figure you might as well test for age not being NULL rather than the base column.
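A correlated-subquery variant of the same idea can be sketched with Python's sqlite3; here a stored age column stands in for Postgres' extract(year from age(now(), a.dob)), and the bucket logic follows the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- age stands in for the dob-based calculation done in Postgres
    CREATE TABLE a (id INTEGER PRIMARY KEY, age INTEGER);
    CREATE TABLE b (id INTEGER PRIMARY KEY, type TEXT);
    INSERT INTO a VALUES (1, 12), (2, 30), (3, 50);
    INSERT INTO b VALUES (1, NULL), (2, NULL), (3, NULL);
""")

# Correlated subquery instead of UPDATE ... FROM: each b row looks up
# its matching a row and derives the type from the age buckets.
conn.execute("""
    UPDATE b
    SET type = (SELECT CASE WHEN a.age <= 16 THEN 'C'
                            WHEN a.age > 25 AND a.age <= 40 THEN 'Y'
                            WHEN a.age > 40 THEN 'O'
                       END
                FROM a WHERE a.id = b.id)
""")
rows = conn.execute("SELECT id, type FROM b ORDER BY id").fetchall()
```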

How to use SQL (postgresql) query to conditionally change value within each group?

I am pretty new to postgresql (or sql) and have not learned how to deal with this kind of "within group" operation. My data is like this:
p_id number
97313 4
97315 10
97315 10
97325 0
97325 15
97326 4
97335 0
97338 0
97338 1
97338 2
97344 5
97345 14
97349 0
97349 5
p_id is not unique and can be viewed as a grouping variable. I would like to change the number within each p_id to achieve such operation:
if, for a given p_id, one of the values is 0 but any other "number" for that p_id is > 2, then set the 0 value to NULL. Take p_id 97325: there are "0" and "15" associated with it, so I replace the 0 with NULL and keep the 15 unchanged.
But for p_id 97338, the three rows associated with it have the numbers "0", "1" and "2", so I do not replace the 0 with NULL.
The final data should be like:
p_id number
97313 4
97315 10
97315 10
97325 NULL
97325 15
97326 4
97335 0
97338 0
97338 1
97338 2
97344 5
97345 14
97349 NULL
97349 5
Thank you very much for the help!
A CASE in a COUNT OVER in a CASE:
SELECT
p_id,
(CASE
WHEN number = 0 AND COUNT(CASE WHEN number > 2 THEN number END) OVER (PARTITION BY p_id) > 0
THEN NULL
ELSE number
END) AS number
FROM yourtable
Test it here on rextester.
Works for PostgreSQL 10:
SELECT p_id,
       CASE WHEN number = 0 AND maxnum > 2 AND counts >= 2
            THEN NULL ELSE number END AS number
FROM (
    SELECT a.p_id AS p_id, a.number AS number, b.maxnum AS maxnum, b.counts AS counts
    FROM trans a
    LEFT JOIN (
        SELECT p_id, MAX(number) AS maxnum, COUNT(1) AS counts
        FROM trans
        GROUP BY p_id
    ) b ON a.p_id = b.p_id
) a1
use case when with a window function:
select p_id,
       case when number = 0
             and max(number) over (partition by p_id) > 2
            then null else number end as number
from yourtable
http://sqlfiddle.com/#!17/898c3/1
I would express this as:
SELECT p_id,
(CASE WHEN number <> 0 OR MAX(number) OVER (PARTITION BY p_id) <= 2
THEN number
END) as number
FROM t;
If the fate of a record depends on the existence of other records within (the same or another) table, you could use EXISTS(...):
UPDATE ztable zt
SET number = NULL
WHERE zt.number = 0
AND EXISTS ( SELECT *
FROM ztable x
WHERE x.p_id = zt.p_id
AND x.number > 2
);
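The EXISTS-based UPDATE can be checked directly with Python's sqlite3, using sample rows taken from the question's 97325 and 97338 groups (ztable is the answer's placeholder name):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ztable (p_id INTEGER, number INTEGER)")
conn.executemany("INSERT INTO ztable VALUES (?, ?)",
                 [(97325, 0), (97325, 15), (97338, 0), (97338, 1), (97338, 2)])

# NULL out a 0 only when some other row in the same group exceeds 2.
conn.execute("""
    UPDATE ztable
    SET number = NULL
    WHERE number = 0
      AND EXISTS (SELECT 1 FROM ztable x
                  WHERE x.p_id = ztable.p_id AND x.number > 2)
""")
rows = conn.execute(
    "SELECT p_id, number FROM ztable ORDER BY p_id, number").fetchall()
```

The 97325 group's 0 becomes NULL (15 > 2 exists in the group), while 97338's rows stay untouched.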

grouping with oracle with clause

I have a table from which I am trying to pull the frequency distribution by age-group and score-group. Below is the query I am running.
with age_map as (
    select distinct age,
           case when age is not null and AGE >= 16 AND AGE < 36 then '16-35'
                when age is not null and AGE >= 36 AND AGE < 56 then '36-56'
                when age is not null and AGE >= 56 then '56+'
                when age is null then 'NA'
           end as age_group
    from rec_table
    where monthofsale = 'Apr 2017'
)
select name,location,b.age_group,sum(weight),count(*)
from rec_table a, age_map b
where a.age = b.age
group by name,location,b.age_group
When running the query, I keep getting the error:
ORA-00979: not a GROUP BY expression
I am pretty sure I am including all the columns, so I am wondering what is incorrect here.
My expected output is:
Name | location | age_group | weight | count
x    | y        | 16-35     | 15     | 3
p    | q        | 36-56     | 48     | 7
Any ideas on this?
The group by error is coming from the with clause.
Do the select like this:
select age,
       case when age is not null and AGE >= 16 AND AGE < 36 then '16-35'
            when age is not null and AGE >= 36 AND AGE < 56 then '36-56'
            when age is not null and AGE >= 56 then '56+'
            when age is null then 'NA'
       end as age_group
from rec_table
where monthofsale = 'Apr 2017'
group by age,
         case when age is not null and AGE >= 16 AND AGE < 36 then '16-35'
              when age is not null and AGE >= 36 AND AGE < 56 then '36-56'
              when age is not null and AGE >= 56 then '56+'
              when age is null then 'NA'
         end
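Oracle requires the full CASE expression to be repeated in the GROUP BY; some dialects are more lenient and accept the output alias instead. A quick sketch of the bucketing itself with Python's sqlite3 (made-up ages; SQLite lets you group by the alias where Oracle would need the expression repeated):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rec_table (age INTEGER)")
conn.executemany("INSERT INTO rec_table VALUES (?)",
                 [(20,), (40,), (60,), (None,)])

# Group by the same CASE expression that produces the bucket label;
# NULL ages fall through to the 'NA' branch.
rows = conn.execute("""
    SELECT CASE WHEN age >= 16 AND age < 36 THEN '16-35'
                WHEN age >= 36 AND age < 56 THEN '36-56'
                WHEN age >= 56 THEN '56+'
                ELSE 'NA'
           END AS age_group,
           COUNT(*)
    FROM rec_table
    GROUP BY age_group
    ORDER BY age_group
""").fetchall()
```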

Looping in select query

I want to do something like this:
select id,
count(*) as total,
FOR temp IN SELECT DISTINCT somerow FROM mytable ORDER BY somerow LOOP
sum(case when somerow = temp then 1 else 0 end) temp,
END LOOP;
from mytable
group by id
order by id
I created a working select:
select id,
count(*) as total,
sum(case when somerow = 'a' then 1 else 0 end) somerow_a,
sum(case when somerow = 'b' then 1 else 0 end) somerow_b,
sum(case when somerow = 'c' then 1 else 0 end) somerow_c,
sum(case when somerow = 'd' then 1 else 0 end) somerow_d,
sum(case when somerow = 'e' then 1 else 0 end) somerow_e,
sum(case when somerow = 'f' then 1 else 0 end) somerow_f,
sum(case when somerow = 'g' then 1 else 0 end) somerow_g,
sum(case when somerow = 'h' then 1 else 0 end) somerow_h,
sum(case when somerow = 'i' then 1 else 0 end) somerow_i,
sum(case when somerow = 'j' then 1 else 0 end) somerow_j,
sum(case when somerow = 'k' then 1 else 0 end) somerow_k
from mytable
group by id
order by id
This works, but it is "static": if a new value is added to the somerow column I will have to change the SQL manually to pick it up, which is why I am wondering whether it is possible to do something with a FOR loop.
So what I want to get is this:
id somerow_a somerow_b ....
0 3 2 ....
1 2 10 ....
2 19 3 ....
. ... ...
. ... ...
. ... ...
So what I'd like to do is count all the rows that have some specific letter in them and group that by id (this id isn't a primary key, but it repeats; about 80 different values of id are possible).
http://sqlfiddle.com/#!15/18feb/2
Are arrays good for you? (SQL Fiddle)
select
id,
sum(totalcol) as total,
array_agg(somecol) as somecol,
array_agg(totalcol) as totalcol
from (
select id, somecol, count(*) as totalcol
from mytable
group by id, somecol
) s
group by id
;
id | total | somecol | totalcol
----+-------+---------+----------
1 | 6 | {b,a,c} | {2,1,3}
2 | 5 | {d,f} | {2,3}
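The array_agg shape above can be approximated locally with SQLite's group_concat via Python's sqlite3 (made-up rows matching the fiddle output; group_concat is only a stand-in for Postgres array_agg here, and its element order is not guaranteed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, somecol TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)",
                 [(1, 'a'), (1, 'b'), (1, 'b'), (1, 'c'), (1, 'c'), (1, 'c'),
                  (2, 'd'), (2, 'd'), (2, 'f'), (2, 'f'), (2, 'f')])

# Inner query: per-(id, somecol) counts; outer query: aggregate them
# into one row per id, collecting values and counts into lists.
rows = conn.execute("""
    SELECT id,
           SUM(totalcol) AS total,
           GROUP_CONCAT(somecol) AS somecol,
           GROUP_CONCAT(totalcol) AS totalcol
    FROM (SELECT id, somecol, COUNT(*) AS totalcol
          FROM mytable
          GROUP BY id, somecol) s
    GROUP BY id
    ORDER BY id
""").fetchall()
```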
In 9.2 it is possible to have a set of JSON objects (Fiddle)
select row_to_json(s)
from (
select
id,
sum(totalcol) as total,
array_agg(somecol) as somecol,
array_agg(totalcol) as totalcol
from (
select id, somecol, count(*) as totalcol
from mytable
group by id, somecol
) s
group by id
) s
;
row_to_json
---------------------------------------------------------------
{"id":1,"total":6,"somecol":["b","a","c"],"totalcol":[2,1,3]}
{"id":2,"total":5,"somecol":["d","f"],"totalcol":[2,3]}
In 9.3, with the addition of lateral, a single object (Fiddle)
select to_json(format('{%s}', (string_agg(j, ','))))
from (
select format('%s:%s', to_json(id), to_json(c)) as j
from
(
select
id,
sum(totalcol) as total_sum,
array_agg(somecol) as somecol_array,
array_agg(totalcol) as totalcol_array
from (
select id, somecol, count(*) as totalcol
from mytable
group by id, somecol
) s
group by id
) s
cross join lateral
(
select
total_sum as total,
somecol_array as somecol,
totalcol_array as totalcol
) c
) s
;
to_json
---------------------------------------------------------------------------------------------------------------------------------------
"{1:{\"total\":6,\"somecol\":[\"b\",\"a\",\"c\"],\"totalcol\":[2,1,3]},2:{\"total\":5,\"somecol\":[\"d\",\"f\"],\"totalcol\":[2,3]}}"
In 9.2 it is also possible to have a single object, in a more convoluted way, using subqueries instead of LATERAL.
SQL is very rigid about the return type. It demands to know what to return beforehand.
For a completely dynamic number of resulting values, you can only use arrays as @Clodoaldo posted. That is effectively a static return type: you do not get individual columns for each value.
If you know the number of columns at call time ("semi-dynamic"), you can create a function taking (and returning) polymorphic parameters. Closely related answer with lots of details:
Dynamic alternative to pivot with CASE and GROUP BY
(You will also find a related answer with arrays from @Clodoaldo there.)
Your remaining option is to use two round trips to the server: the first to determine the actual query with the actual return type, the second to execute the query based on the first call.
Else, you have to go with a static query. While doing that, I see two nicer options for what you have right now:
1. Simpler expression
select id
, count(*) AS total
, count(somecol = 'a' OR NULL) AS somerow_a
, count(somecol = 'b' OR NULL) AS somerow_b
, ...
from mytable
group by id
order by id;
How does it work?
Compute percents from SUM() in the same SELECT sql query
SQL Fiddle.
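The count(expr OR NULL) trick also works in SQLite, since FALSE OR NULL is NULL and COUNT skips NULLs; a tiny check with Python's sqlite3 on made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, somecol TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)",
                 [(1, 'a'), (1, 'a'), (1, 'b'), (2, 'a'), (2, 'c')])

# (somecol = 'a' OR NULL) is NULL for non-matches, so COUNT skips them.
rows = conn.execute("""
    SELECT id,
           COUNT(*) AS total,
           COUNT(somecol = 'a' OR NULL) AS somerow_a,
           COUNT(somecol = 'b' OR NULL) AS somerow_b
    FROM mytable
    GROUP BY id
    ORDER BY id
""").fetchall()
```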
2. crosstab()
crosstab() is more complex at first, but written in C, optimized for the task and shorter for long lists. You need the additional module tablefunc installed. Read the basics here if you are not familiar:
PostgreSQL Crosstab Query
SELECT * FROM crosstab(
$$
SELECT id
, count(*) OVER (PARTITION BY id)::int AS total
, somecol
, count(*)::int AS ct -- casting to int, don't think you need bigint?
FROM mytable
GROUP BY 1,3
ORDER BY 1,3
$$
,
$$SELECT unnest('{a,b,c,d}'::text[])$$
) AS f (id int, total int, a int, b int, c int, d int);