Optimize SQL query for given Conditions - sql

I have a query of the form:
select SUM(some_column) from (table)
where
IF x then a
ELSE y then b
ELSE z then c
...
Now in my Java code, I call this query once for every value (x, y, z, ...), and each call returns the required sum. My objective is to calculate the total sum over all those values, i.e.,
Total = SUM_for_x + SUM_for_y + SUM_for_z + ....
Of course, this hits the DB once per value, which is costly. Can I do this in a single query that hits the DB just once?

Assuming that you are only interested in the total sum and not in the partial sums and the conditions are mutually exclusive, you can do this:
SELECT SUM(some_column)
FROM (table)
WHERE (a)
OR (b)
OR (c)
OR ...
Another way (if your conditions are not mutually exclusive) would be:
SELECT SUM(some_column)
FROM (SELECT SUM(some_column) AS some_column FROM (table) WHERE (a) UNION ALL
SELECT SUM(some_column) AS some_column FROM (table) WHERE (b) UNION ALL
SELECT SUM(some_column) AS some_column FROM (table) WHERE (c) -- UNION ALL a.s.o.
) t -- most databases require an alias on a derived table
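To make the difference between the OR form and the UNION ALL form concrete, here is a minimal sketch using an in-memory SQLite database as a stand-in for the real one. The table `t`, its columns, the sample rows, and the two conditions are all hypothetical; the second condition deliberately overlaps the first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (category TEXT, amount INTEGER)")  # hypothetical schema
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("x", 10), ("y", 20), ("z", 30), ("other", 40)],
)

# Two hypothetical conditions; cond_b also matches the 'x' row.
cond_a = "category = 'x'"
cond_b = "category IN ('x', 'y')"

# OR form: every matching row contributes once.
or_total = conn.execute(
    f"SELECT SUM(amount) FROM t WHERE ({cond_a}) OR ({cond_b})"
).fetchone()[0]

# UNION ALL form: equals SUM_for_a + SUM_for_b, so the 'x' row counts twice.
union_total = conn.execute(
    f"""
    SELECT SUM(amount) FROM (
        SELECT SUM(amount) AS amount FROM t WHERE {cond_a}
        UNION ALL
        SELECT SUM(amount) AS amount FROM t WHERE {cond_b}
    ) AS parts
    """
).fetchone()[0]

print(or_total, union_total)
```

With overlapping conditions the two totals differ, so the choice depends on whether a row matching several conditions should count once or once per condition.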

SELECT SUM(case when x then a end) sum_a,
SUM(case when y then b end) sum_b,
SUM(case when z then c end) sum_c
FROM <table>
and sum up in Java,
or get the total in SQL:
-- else 0 keeps one empty sum from turning the whole total into NULL
SELECT SUM(case when x then a else 0 end) +
SUM(case when y then b else 0 end) +
SUM(case when z then c else 0 end) AS total
FROM <table>
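As a rough illustration of the single round trip, the sketch below computes the partial sums in one pass and adds them up on the client side. It uses Python with an in-memory SQLite database only so that it is self-contained; the table `t`, the `category` and `amount` columns, and the data are assumptions, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (category TEXT, amount INTEGER)")  # hypothetical schema
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("x", 10), ("x", 5), ("y", 20), ("z", 30), ("other", 40)],
)

# One scan of the table. ELSE 0 keeps a sum from being NULL when
# no row matches its condition (as happens for 'q' here).
sum_x, sum_y, sum_q = conn.execute(
    """
    SELECT SUM(CASE WHEN category = 'x' THEN amount ELSE 0 END) AS sum_x,
           SUM(CASE WHEN category = 'y' THEN amount ELSE 0 END) AS sum_y,
           SUM(CASE WHEN category = 'q' THEN amount ELSE 0 END) AS sum_q
    FROM t
    """
).fetchone()

# Client-side total, the equivalent of "sum up in Java".
total = sum_x + sum_y + sum_q
print(sum_x, sum_y, sum_q, total)
```

Keeping the partial sums as separate columns costs nothing extra and leaves them available if they are needed later.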

Related

SQL : Group by and check if all, some or none are set

Let's say I have the following table:
FKEY A B C D E F
'A' 1 0 1 0 1 0
'A' 0 1 1 1 0 0
Now I want to group by FKEY, but I just want to know whether each of the columns A-F has a 1 in some, all, or none of the grouped rows. The result for the above table would be:
FKEY A B C D E F
'A' S S A S S N
..where S is "some", A is "all" and N is "none".
What would be the best approach for this query? I could use some nested queries, but isn't there a smarter way?
In my real-life data, the 1s and 0s are actually DATETIME values and NULLs.
You can use case and aggregation:
select fkey,
(case when sum(a) = 0 then 'N'
when sum(a) = count(*) then 'A'
else 'S'
end) as a,
(case when sum(b) = 0 then 'N'
when sum(b) = count(*) then 'A'
else 'S'
end) as b,
. . .
from t
group by fkey;
The above assumes that the values are only 0 and 1. If that is the case, you can actually phrase this as:
(case when max(a) = 0 then 'N'
when min(a) = 1 then 'A'
else 'S'
end) as a,
You mentioned that your 0 and 1 are actually null or non null dates. Here's a modified version of Gordon's query that caters for that:
select fkey,
(case when count(datecol) = 0 then 'all dates are null'
when count(datecol) = count(*) then 'all dates are filled'
else 'some are null, some filled'
end) as a,
...
from t
group by fkey;
COUNT(null) is 0, COUNT('2001-01-01') is 1, and COUNT(*) is the row count regardless of any column's values. Hence, if the count of the dates is 0, all must be null. If the count of the dates equals the count of the rows, all must be filled with some value; otherwise it's a mix.
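The COUNT-based reasoning can be checked with a small sketch. It assumes a hypothetical table `t` with three nullable date columns `a`, `b`, `c` stored as text, and runs against in-memory SQLite; the helper `flag` is made up for this example and only builds the repeated CASE expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (fkey TEXT, a TEXT, b TEXT, c TEXT)")  # hypothetical schema
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("A", "2001-01-01", None, "2001-01-03"),
        ("A", None, None, "2001-01-04"),
    ],
)

def flag(col):
    """Build the none/all/some CASE expression for one column name."""
    return (
        f"CASE WHEN COUNT({col}) = 0 THEN 'N' "
        f"WHEN COUNT({col}) = COUNT(*) THEN 'A' ELSE 'S' END AS {col}"
    )

# COUNT(col) skips NULLs, COUNT(*) does not, which is all the CASE needs.
sql = f"SELECT fkey, {flag('a')}, {flag('b')}, {flag('c')} FROM t GROUP BY fkey"
rows = conn.execute(sql).fetchall()
print(rows)
```

Column `a` is filled in one of two rows, `b` in none, and `c` in both, so the three flags come out as some, none, and all.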

Looping in select query

I want to do something like this:
select id,
count(*) as total,
FOR temp IN SELECT DISTINCT somerow FROM mytable ORDER BY somerow LOOP
sum(case when somerow = temp then 1 else 0 end) temp,
END LOOP;
from mytable
group by id
order by id
I created a working select:
select id,
count(*) as total,
sum(case when somerow = 'a' then 1 else 0 end) somerow_a,
sum(case when somerow = 'b' then 1 else 0 end) somerow_b,
sum(case when somerow = 'c' then 1 else 0 end) somerow_c,
sum(case when somerow = 'd' then 1 else 0 end) somerow_d,
sum(case when somerow = 'e' then 1 else 0 end) somerow_e,
sum(case when somerow = 'f' then 1 else 0 end) somerow_f,
sum(case when somerow = 'g' then 1 else 0 end) somerow_g,
sum(case when somerow = 'h' then 1 else 0 end) somerow_h,
sum(case when somerow = 'i' then 1 else 0 end) somerow_i,
sum(case when somerow = 'j' then 1 else 0 end) somerow_j,
sum(case when somerow = 'k' then 1 else 0 end) somerow_k
from mytable
group by id
order by id
This works, but it is 'static': if a new value is added to 'somerow', I will have to change the SQL manually to cover all the values in the somerow column. That is why I'm wondering whether it is possible to do something with a for loop.
So what I want to get is this:
id somerow_a somerow_b ....
0 3 2 ....
1 2 10 ....
2 19 3 ....
. ... ...
. ... ...
. ... ...
So what I'd like to do is count all the rows that contain a specific letter and group them by id (this id isn't a primary key; it repeats, and there are about 80 different values possible for it).
http://sqlfiddle.com/#!15/18feb/2
Are arrays good for you? (SQL Fiddle)
select
id,
sum(totalcol) as total,
array_agg(somecol) as somecol,
array_agg(totalcol) as totalcol
from (
select id, somecol, count(*) as totalcol
from mytable
group by id, somecol
) s
group by id
;
id | total | somecol | totalcol
----+-------+---------+----------
1 | 6 | {b,a,c} | {2,1,3}
2 | 5 | {d,f} | {2,3}
In 9.2 it is possible to have a set of JSON objects (Fiddle)
select row_to_json(s)
from (
select
id,
sum(totalcol) as total,
array_agg(somecol) as somecol,
array_agg(totalcol) as totalcol
from (
select id, somecol, count(*) as totalcol
from mytable
group by id, somecol
) s
group by id
) s
;
row_to_json
---------------------------------------------------------------
{"id":1,"total":6,"somecol":["b","a","c"],"totalcol":[2,1,3]}
{"id":2,"total":5,"somecol":["d","f"],"totalcol":[2,3]}
In 9.3, with the addition of lateral, a single object (Fiddle)
select to_json(format('{%s}', (string_agg(j, ','))))
from (
select format('%s:%s', to_json(id), to_json(c)) as j
from
(
select
id,
sum(totalcol) as total_sum,
array_agg(somecol) as somecol_array,
array_agg(totalcol) as totalcol_array
from (
select id, somecol, count(*) as totalcol
from mytable
group by id, somecol
) s
group by id
) s
cross join lateral
(
select
total_sum as total,
somecol_array as somecol,
totalcol_array as totalcol
) c
) s
;
to_json
---------------------------------------------------------------------------------------------------------------------------------------
"{1:{\"total\":6,\"somecol\":[\"b\",\"a\",\"c\"],\"totalcol\":[2,1,3]},2:{\"total\":5,\"somecol\":[\"d\",\"f\"],\"totalcol\":[2,3]}}"
In 9.2 it is also possible to have a single object in a more convoluted way, using subqueries instead of lateral.
SQL is very rigid about the return type. It demands to know what to return beforehand.
For a completely dynamic number of resulting values, you can only use arrays like @Clodoaldo posted. Effectively a static return type, you do not get individual columns for each value.
If you know the number of columns at call time ("semi-dynamic"), you can create a function taking (and returning) polymorphic parameters. Closely related answer with lots of details:
Dynamic alternative to pivot with CASE and GROUP BY
(You also find a related answer with arrays from @Clodoaldo there.)
Your remaining option is to use two round-trips to the server: the first to determine the actual query with the actual return type, the second to execute the query built from the first call.
Else, you have to go with a static query. While doing that, I see two nicer options for what you have right now:
1. Simpler expression
select id
, count(*) AS total
, count(somecol = 'a' OR NULL) AS somerow_a
, count(somecol = 'b' OR NULL) AS somerow_b
, ...
from mytable
group by id
order by id;
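How does it work? The expression somecol = 'a' OR NULL is TRUE for a match and NULL otherwise (FALSE OR NULL is NULL), and count() only counts non-null values. More detail in: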
Compute percents from SUM() in the same SELECT sql query
SQL Fiddle.
2. crosstab()
crosstab() is more complex at first, but written in C, optimized for the task and shorter for long lists. You need the additional module tablefunc installed. Read the basics here if you are not familiar:
PostgreSQL Crosstab Query
SELECT * FROM crosstab(
$$
SELECT id
, sum(count(*)) OVER (PARTITION BY id)::int AS total -- window runs after GROUP BY, so sum the per-group counts
, somecol
, count(*)::int AS ct -- casting to int, don't think you need bigint?
FROM mytable
GROUP BY 1,3
ORDER BY 1,3
$$
,
$$SELECT unnest('{a,b,c,d}'::text[])$$
) AS f (id int, total int, a int, b int, c int, d int);
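The "two round-trips" option mentioned above can be sketched from the client side. This is only an illustration under assumptions: it uses Python with in-memory SQLite rather than PostgreSQL, the sample rows are invented, and the positional aliases `col_0`, `col_1`, ... are a hypothetical naming choice made so that no data value ends up in the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, somecol TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "b"), (2, "a"), (2, "c")],
)

# Round trip 1: discover which distinct values will become columns.
values = [r[0] for r in conn.execute(
    "SELECT DISTINCT somecol FROM mytable ORDER BY somecol"
)]

# Build one counting expression per value. The values themselves are
# passed as bound parameters; this assumes at least one value exists.
exprs = ", ".join(
    f"SUM(CASE WHEN somecol = ? THEN 1 ELSE 0 END) AS col_{i}"
    for i in range(len(values))
)
sql = f"SELECT id, COUNT(*) AS total, {exprs} FROM mytable GROUP BY id ORDER BY id"

# Round trip 2: run the generated query with its now-known column list.
rows = conn.execute(sql, values).fetchall()

# Readable column names are rebuilt on the client.
header = ["id", "total"] + [f"somerow_{v}" for v in values]
print(header)
print(rows)
```

The return type is fixed by the time the second query runs, which is what SQL requires; the price is the extra round trip and a result shape that can change between calls.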

how to calculate count(*) in various percentiles

Say, I have a table holding integer values from 0 up to 9,999 and I want to make a distribution plot of the population of values in each percentile.
Below is what comes to mind. Is there a better way?
CREATE TABLE A(x INTEGER);
SELECT
(SELECT COUNT(*) FROM A WHERE x>=0 AND x<10) AS prcntl_01,
(SELECT COUNT(*) FROM A WHERE x>=10 AND x<20) AS prcntl_02,
(SELECT COUNT(*) FROM A WHERE x>=20 AND x<30) AS prcntl_03,
(SELECT COUNT(*) FROM A WHERE x>=30 AND x<40) AS prcntl_04,
(SELECT COUNT(*) FROM A WHERE x>=40 AND x<50) AS prcntl_05,
...
(SELECT COUNT(*) FROM A WHERE x>=990 AND x<1000) AS prcntl_100;
The size of the SQL statement is not a consideration as I can generate it on the fly. I am just wondering if there is an idiomatic way to get population counts in each percentile.
Use conditional aggregation instead of multiple queries:
SELECT sum(case when x >= 0 AND x < 10 then 1 else 0 end) as prcntl_01,
sum(case when x >= 10 AND x < 20 then 1 else 0 end) as prcntl_02,
. . .
sum(case when x >= 990 AND x < 1000 then 1 else 0 end) as prcntl_100
FROM A;
If you want the values in separate rows rather than columns, you can simply do:
select n as which,
sum(case when x >= (n - 1)*10 and x < n*10 then 1 else 0 end) as percentile
from A cross join
generate_series(1, 100) as n
group by n;
This limits the amount of code you have to write.
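A different way to get one row per bucket, not taken from the answer above, is to group on integer division. The sketch below assumes non-negative integer values and buckets of width 10, and runs on in-memory SQLite; unlike the generate_series version, it only returns buckets that contain at least one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (x INTEGER)")
conn.executemany(
    "INSERT INTO A VALUES (?)",
    [(v,) for v in (0, 9, 10, 19, 25, 999)],  # made-up sample values
)

# x / 10 is integer division for integer operands, so 0..9 -> 0, 10..19 -> 1, ...
# Adding 1 makes the bucket numbers start at 1.
rows = conn.execute(
    """
    SELECT x / 10 + 1 AS which, COUNT(*) AS population
    FROM A
    GROUP BY x / 10
    ORDER BY which
    """
).fetchall()
print(rows)
```

Boundary values such as 9 and 19 land in the first and second bucket respectively, which is a quick way to check that the bucket edges are half-open intervals.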

T-Sql: turn multiple rows into one row

How does one turn these multiple rows into one row? N and Y are bool values.
Id IsPnt IsPms IsPdt
1 N Y N
1 N Y N
1 Y N N
into this
Id IsPnt IsPms IsPdt
1 Y Y N
Edit:
The query that produces the resultset looks like this
select b.id,
CASE mpft.PlanIndCd WHEN 'PBMN' THEN 1 ELSE 0 END AS IsPnt,
CASE mpft.PlanIndCd WHEN 'PBMT' THEN 1 ELSE 0 END AS IsPbt,
CASE mpft.PlanIndCd WHEN 'PBMS' THEN 1 ELSE 0 END AS IsPms
from vw_D_SomveViewName pb
-- bunch of joins
where mpft.PlanIndCd in ('HANR', 'PBMN','PBMT','PBMS','HAWR')
You can simply use MAX() on this if the values are really Y and N only.
SELECT ID, MAX(IsPnt) IsPnt, MAX(IsPms) IsPms, MAX(IsPdt) IsPdt
FROM tableName
GROUP BY ID
UPDATE 1
SELECT b.id,
MAX(CASE mpft.PlanIndCd WHEN 'PBMN' THEN 1 ELSE 0 END) AS IsPnt,
MAX(CASE mpft.PlanIndCd WHEN 'PBMT' THEN 1 ELSE 0 END) AS IsPbt,
MAX(CASE mpft.PlanIndCd WHEN 'PBMS' THEN 1 ELSE 0 END) AS IsPms
FROM vw_D_SomveViewName pb
-- bunch of joins
WHERE mpft.PlanIndCd in ('HANR', 'PBMN','PBMT','PBMS','HAWR')
GROUP BY b.ID
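Why MAX() works on Y/N flags: 'Y' sorts after 'N', so the maximum of a group is 'Y' as soon as any row has 'Y'. A minimal sketch on in-memory SQLite, with a hypothetical table `flags` holding the three sample rows from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flags (Id INTEGER, IsPnt TEXT, IsPms TEXT, IsPdt TEXT)"  # hypothetical table
)
conn.executemany(
    "INSERT INTO flags VALUES (?, ?, ?, ?)",
    [(1, "N", "Y", "N"), (1, "N", "Y", "N"), (1, "Y", "N", "N")],
)

# MAX over text compares strings, and 'Y' > 'N', so this acts as a per-group OR.
rows = conn.execute(
    "SELECT Id, MAX(IsPnt), MAX(IsPms), MAX(IsPdt) FROM flags GROUP BY Id"
).fetchall()
print(rows)
```

The same idea carries over to the 1/0 version in the updated query, since 1 is greater than 0.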
Will this work?
select
id,
max(IsPnt),
max(IsPms),
max(IsPdt)
from
table
GROUP BY
id
After the edit of your question, you can simply use the PIVOT table operator directly instead of using the MAX expression, something like:
SELECT
Id,
PBMN AS IsPnt,
PBMT AS IsPbt,
PBMS AS IsPms
FROM
(
SELECT
id,
mpft.PlanIndCd,
ROW_NUMBER() OVER(PARTITION BY id
ORDER BY ( SELECT 1)) AS RN
from vw_D_SomveViewName pb
-- bunch of joins
where mpft.PlanIndCd in ('HANR', 'PBMN','PBMT','PBMS','HAWR')
) AS t
PIVOT
(
MAX(RN)
FOR PlanIndCd IN ([PBMN], [PBMT], [PBMS])
) AS p;
You can see it in action in the following demo example:
Demo on SQL Fiddle
select Id, MAX(IsPnt), MAX(IsPms), MAX(IsPdt)
from table etc

doing a group by within a group

I have two columns, business_line (with values X, Y) and business_segment (with values X, Y, Z; the same X and Y as business_line), in a table named "Sometable". I have another column named type_of_payment (with values A, B, C, D, E) and a final column named transaction_value. This is what I want to do:
Sum the transactions grouped by business_line and business_segment, and also find out what proportion of these payments came from A, C, and E. So my output table would be something like this
(the last three columns can be named anything,
but they specify the proportions of A, C, E)
Business_line        SUM(transaction_value)   A     C     E
and business seg.
X                    100                      20%   30%   50%
Y                    200                      11%   12%   77%
X                    300                      and so on
Y                    170
Z                    230
How do I do this?
PS: the proportions of A, C, E need not add up to 100%, as B and D are also present.
This is standard SQL, should work on Oracle (but untested):
SELECT
business_line,
business_segment,
grand_total,
A_total * 100.0 / grand_total as A,
C_total * 100.0 / grand_total as C,
E_total * 100.0 / grand_total as E
FROM
(
SELECT
business_line,
business_segment,
SUM(transaction_value) as grand_total,
SUM(CASE WHEN type_of_payment = 'A' THEN transaction_value END) as A_total,
SUM(CASE WHEN type_of_payment = 'C' THEN transaction_value END) as C_total,
SUM(CASE WHEN type_of_payment = 'E' THEN transaction_value END) as E_total
FROM
SomeTable
GROUP BY
business_line,
business_segment
) t -- Oracle does not accept AS before a table alias
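The conditional-aggregation query can be tried on a few made-up rows. This sketch uses the column name type_of_payment from the question, folds the percentage arithmetic into a single SELECT, and adds ELSE 0 so that a payment type with no rows shows 0.0 rather than NULL; it runs on in-memory SQLite, not Oracle, and the data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SomeTable (business_line TEXT, business_segment TEXT, "
    "type_of_payment TEXT, transaction_value INTEGER)"
)
conn.executemany(
    "INSERT INTO SomeTable VALUES (?, ?, ?, ?)",
    [
        ("X", "X", "A", 20),
        ("X", "X", "C", 30),
        ("X", "X", "B", 50),
        ("Y", "Z", "E", 40),
        ("Y", "Z", "D", 160),
    ],
)

def pct(kind):
    """Percentage of the group's total that comes from one payment type."""
    return (
        f"SUM(CASE WHEN type_of_payment = '{kind}' THEN transaction_value ELSE 0 END)"
        f" * 100.0 / SUM(transaction_value) AS pct_{kind}"
    )

rows = conn.execute(
    f"""
    SELECT business_line, business_segment,
           SUM(transaction_value) AS grand_total,
           {pct('A')}, {pct('C')}, {pct('E')}
    FROM SomeTable
    GROUP BY business_line, business_segment
    ORDER BY business_line, business_segment
    """
).fetchall()
print(rows)
```

Multiplying by 100.0 before dividing forces floating-point division; the percentages do not add up to 100 whenever B or D payments are present, as the question anticipates.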
For Oracle 11g and above, you can use PIVOT:
select *
from
(
select sometable.line, paymenttype,total, 100.0*transaction_value/total as percentage
from sometable
inner join
(select line, sum(transaction_value) as total
from sometable
group by line) total
on sometable.line = total.line
)
pivot
(
sum(percentage) for paymenttype in ('A' as a, 'C' as c, 'E' as e)
)