SQL Nested Sum(probably ?) - sql

I have a select that groups and sums a column, but I also need each sum expressed as a percentage of the total sum.
As an example, this is what it returns today:
id|sum
1| 10
2| 50
3| 80
4| 20
5| 60
What I'm looking for: (the total is 220)
id|sum| %
1| 10|10/220
2| 50|50/220
3| 80|80/220
4| 20|20/220
5| 60|60/220
I am not sure if there's an easy way to get that result. I could do this with a sub-select, but I thought that was not a good approach and that there should be a better way.
The select is simple:
Select
id,
sum(value)
From
table
Group By
id
But, that's the real select:
Select
OrcaItem.Cd_Produto,
OrcaItem.Ds_Produto,
OrcaItem.Cd_Produto || ' - ' || OrcaItem.Ds_Produto CdDs_Produto,
Estoque.Qt_Disponivel,
Sum(OrcaItem.Qt_Vendida) Qt_Vendida,
Sum(OrcaItem.Vr_TotalLiquido) Vr_Liquido, -- HERE!
Cast(Sum(
Case :piTipoCusto
When 0 Then (Produto.Vr_CustoAtual * OrcaItem.Qt_Vendida)
When 1 Then (OrcaItem.Vr_CustoAtual * OrcaItem.Qt_Vendida)
When 2 Then (OrcaItem.Vr_CustoFin * OrcaItem.Qt_Vendida)
When 3 Then (Produto.Vr_UltPcoCompra * OrcaItem.Qt_Vendida)
End
) As Numeric(15,2)) Vr_Custo,
Cast(Sum(
Case :piTipoCusto
When 0 Then OrcaItem.Vr_TotalLiquido - (Produto.Vr_CustoAtual * OrcaItem.Qt_Vendida)
When 1 Then OrcaItem.Vr_TotalLiquido - (OrcaItem.Vr_CustoAtual * OrcaItem.Qt_Vendida)
When 2 Then OrcaItem.Vr_TotalLiquido - (OrcaItem.Vr_CustoFin * OrcaItem.Qt_Vendida)
When 3 Then OrcaItem.Vr_TotalLiquido - (Produto.Vr_UltPcoCompra * OrcaItem.Qt_Vendida)
End
) As Numeric(15,2)) Vr_Lucro,
Sum(
Case :piTipoCusto
When 0 Then OrcaItem.Vr_TotalLiquido - (Produto.Vr_CustoAtual * OrcaItem.Qt_Vendida)
When 1 Then OrcaItem.Vr_TotalLiquido - (OrcaItem.Vr_CustoAtual * OrcaItem.Qt_Vendida)
When 2 Then OrcaItem.Vr_TotalLiquido - (OrcaItem.Vr_CustoFin * OrcaItem.Qt_Vendida)
When 3 Then OrcaItem.Vr_TotalLiquido - (Produto.Vr_UltPcoCompra * OrcaItem.Qt_Vendida)
End
) / Sum(OrcaItem.Vr_TotalLiquido) * 100 Pc_Lucro,
Produto.Cd_Linha,
Linha.Ds_Linha,
Produto.Cd_Grupo,
Grupo.Ds_Grupo
From
OrcaItem
Inner Join Orca On Orca.Nr_Orcamento = OrcaItem.Nr_Orcamento
Inner Join Estoque On Estoque.Cd_Produto = OrcaItem.Cd_Produto
Inner Join Produto On Produto.Cd_Produto = OrcaItem.Cd_Produto
Inner Join Linha On Linha.Cd_Linha = Produto.Cd_Linha
Inner Join Grupo On Grupo.Cd_Linha = Produto.Cd_Linha And
Grupo.Cd_Grupo = Produto.Cd_Grupo
Where
Orca.Fg_Situacao In ('F', 'R') And
Orca.Dt_Atendido Between :piDt_Inicio And :piDt_Final
Group By
OrcaItem.Cd_Produto,
OrcaItem.Ds_Produto,
CdDs_Produto,
Estoque.Qt_Disponivel,
Produto.Cd_Linha,
Linha.Ds_Linha,
Produto.Cd_Grupo,
Grupo.Ds_Grupo
Order By
Produto.Cd_Linha,
Produto.Cd_Grupo,
OrcaItem.Ds_Produto
Sum(OrcaItem.Vr_TotalLiquido) Vr_Liquido = the total of each group
The 'global' total is the sum of Vr_Liquido.

For the simple select you show, you can use a window function, provided you're using Firebird 3.0 or higher:
select id, sum_value, cast(sum_value as numeric(18,2)) / sum(sum_value) over()
from (
select id, sum("VALUE") as sum_value
from testdata
group by id
)
The over() will aggregate over all rows. If sum_value is an integer, the cast to NUMERIC(18,2) is needed because dividing two integers performs integer division, which would truncate every ratio to 0. Use a higher scale if you need more digits (or cast to double precision).
For the more complex select, you take the same approach. Make the original query a derived table (or common table expression), and use the window function in the enclosing select list.
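The derived-table approach can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Firebird (an assumption made only so the example runs anywhere; SQLite needs version 3.25 or later for window functions), and the table name, column names and sample rows are illustrative.

```python
import sqlite3

# Hypothetical table and rows; the per-id totals mirror the question's sample (10, 50, 80, 20, 60).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testdata (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO testdata VALUES (?, ?)",
    [(1, 4), (1, 6), (2, 50), (3, 80), (4, 20), (5, 60)],
)

# Inner query: one row per id. Outer query: SUM(...) OVER () is the grand total.
# Multiplying by 100.0 forces floating-point division instead of integer division.
rows = conn.execute(
    """
    SELECT id,
           sum_value,
           ROUND(100.0 * sum_value / SUM(sum_value) OVER (), 2) AS pct
    FROM (
        SELECT id, SUM(value) AS sum_value
        FROM testdata
        GROUP BY id
    )
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```

The same shape should carry over to the larger query: keep it unchanged as the inner select and compute the percentage from Vr_Liquido in the outer select.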

Related

Oracle SQL : Calculating weighted probability

I'm struggling to retrieve a "weighted probability" from a database table in my SQL statement.
What do I need to do:
I have tabular information of probable financial values like:
Table my_table
ID|P [%]|Value [$]
1| 50| 200
2| 50| 200
3| 60| 100
I need to calculate the weighted probability of reasonable worst case financial value to occur.
The formula is:
P_weighted = 1 - (1 - P_1 * Value_1 / Max(Value_1..n)) * (1 - P_2 * Value_2 / Max(Value_1..n)) * ...
i.e.
P_weighted = 1 - Product(1 - P_i * Value_i / Max(Value_1..n))
P_weighted = 1 - (1 - 50% * 200 / 200) * (1 - 50% * 200 / 200) * (1 - 60% * 100 / 200) = 82.5%
I know there is no product function in (Oracle) SQL, and that it can be substituted by EXP(SUM(LN(x))), provided x is always positive.
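The identity behind that substitution can be checked outside the database. The following is a minimal sketch in plain Python; the helper name product_via_logs is made up for illustration, and the rows are the three sample rows with the probabilities written as fractions.

```python
import math


def product_via_logs(factors):
    """Multiply strictly positive numbers as exp(sum(ln(x)))."""
    if any(x <= 0 for x in factors):
        raise ValueError("ln(x) is undefined for x <= 0")
    return math.exp(sum(math.log(x) for x in factors))


# Sample rows: (probability as a fraction, value).
rows = [(0.50, 200), (0.50, 200), (0.60, 100)]
max_value = max(v for _, v in rows)

# One factor per row: 1 - P_i * Value_i / Max(Value)
factors = [1 - p * v / max_value for p, v in rows]
p_weighted = 1 - product_via_logs(factors)
print(round(p_weighted, 3))
```

The guard makes the precondition explicit: a row with P = 100% and the maximum value gives a factor of 0, which the logarithm cannot handle.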
Hence, if I only had to calculate the combined probability (regardless of the value), I could do something like:
SELECT EXP(SUM(LN(1 - t.P))) FROM my_table t WHERE condition
When I need to include the Max(t.Value) I've got the following problem:
A SELECT list cannot include both a group function, such as AVG, COUNT, MAX, MIN, SUM, STDDEV, or VARIANCE, and an individual column expression, unless the individual column expression is included in a GROUP BY clause.
So I tried the following:
SELECT ROUND(1-EXP(SUM(LN(1 - t.P*t.Value/max(t.Value)))),1) FROM my_table t WHERE condition GROUP BY t.P, t.Value
But this obviously groups the output by probability rather than multiplying across rows, and just returns 0.5 (50%) instead of the expected product result of 0.825 (82.5%).
How do I get the weighted probability from my table above using (Oracle) SQL?
Does this do it:
with da as (select .50 as p, 200 as v from dual union all select .50 , 200 from dual union all select .60,100 from dual),
mx as (select max(v) mx from da)
select exp(sum(ln(1-da.p*da.v/mx))) from da, mx;
EXP(SUM(LN(1-DA.P*DA.V/MX)))
----------------------------
.175
with
test1 as(
select max(value) v_max from my_table
),
test2 as(
select 1-(my.p/100* value/t1.v_max) rez
from my_table my, test1 t1
)
select to_char(round((1-(EXP (SUM (LN (rez)))))*100,2))||'%' "Weighted probability"
from test2
RESULT:
Weighted probability
--------------------
82,5%
If you want the calculation per-row then you can use an analytic SUM:
SELECT id,
ROUND(1 - EXP(SUM(LN(1 - wp)) OVER (ORDER BY id)), 3) AS cwp
FROM (
SELECT id,
p * value / MAX(value) OVER () AS wp
FROM table_name
)
Which, for the sample data:
CREATE TABLE table_name (ID, P, Value) AS
SELECT 1, .50, 200 FROM DUAL UNION ALL
SELECT 2, .50, 200 FROM DUAL UNION ALL
SELECT 3, .60, 100 FROM DUAL;
Outputs the cumulative weighted probabilities:
ID|CWP
1| .5
2| .75
3| .825
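The running calculation can be reproduced outside Oracle as a cross-check. This is a sketch in plain Python, with itertools.accumulate standing in for the analytic SUM(...) OVER (ORDER BY id); the rows are the sample data and are assumed to be sorted by id already.

```python
import math
from itertools import accumulate

# Sample rows: (id, p, value), already ordered by id.
rows = [(1, 0.50, 200), (2, 0.50, 200), (3, 0.60, 100)]
max_value = max(v for _, _, v in rows)

# ln(1 - wp) per row, where wp = p * value / max(value)
log_terms = [math.log(1 - p * v / max_value) for _, p, v in rows]

# Running sum of the log terms, then back through exp() for each prefix.
cwp = [
    (row[0], round(1 - math.exp(running), 3))
    for row, running in zip(rows, accumulate(log_terms))
]
print(cwp)
```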
If you just want the total weighted probability then:
SELECT ROUND(1 - EXP(SUM(LN(1 - wp))), 3) AS twp
FROM (
SELECT id,
p * value / MAX(value) OVER () AS wp
FROM table_name
)
Which, for the sample data, outputs:
TWP
.825

SQL count DISTINCT ONCE user_id multiple attributes

Hello there, I can't manage to get a good result for the following case:
I have a table which is like this:
UserID | Label
-------- ------
1 | Private
1 | Public
2 | Private
3 | Hidden
4 | Public
5 | Hidden
I want to classify each user by the labels assigned to them:
Private and Hidden are treated the same: let's say Business
Public: BtoC
Public and Private and/or Hidden: both
So in the end I have a count(DISTINCT UserID) of
Business 3
BtoC 1
both 1
I have tried to use CASE WHEN, but it doesn't work. My current full query looks like this:
SELECT gen_month,
count(DISTINCT cu.id) as leads,
a.label
FROM generate_series(DATE_TRUNC('month', CURRENT_DATE::date - 96*INTERVAL '1 month'), CURRENT_DATE::date, '1 month') m(gen_month)
LEFT OUTER JOIN company_user AS cu
ON (date_trunc('month', cu.creation_date) = date_trunc('month', gen_month))
LEFT JOIN user u
ON u.user_id = cu.id
LEFT join user_account_status as uas
on cu.id = uas.user_id
LEFT JOIN account as a
on uas.account_id = a.id
where gen_month >= DATE_TRUNC('month',NOW() - INTERVAL '5 months')
group by m.gen_month, a.label
order by gen_month
So my main problem now is that a user is counted once under every label they have.
How can I make a user_id count only once, under a condition like CASE WHEN user_id appears as Public and (Private or Hidden) THEN count(DISTINCT user_id) as Both?
Addition: it's for MySQL/MariaDB and PostgreSQL, but I would be happy with Postgres first.
This is not implemented in your total query, but for counting users for each category, you can:
with the_table(UserID , Label) as(
select 1 ,'Private' union all
select 1 ,'Public' union all
select 2 ,'Private' union all
select 3 ,'Hidden' union all
select 4 ,'Public' union all
select 5 ,'Hidden'
)
select result, count(*) from (
select UserID, case when min(Label) = 'Public' then 'BtoC' when max(Label) in('Private','Hidden') then 'Business' else 'both' end as result
from the_table
group by UserID
) t
group by result
with
my_table(user_id, label) as (values
(1,'Private'),
(1,'Public'),
(2,'Private'),
(3,'Hidden'),
(4,'Public'),
(5,'Hidden')),
t as (
select
user_id,
string_agg('{'||label||'}', '') as labels
from my_table
group by user_id),
tt as (
select
user_id,
labels,
case
when
position('{Public}' in labels) > 0 and (position('{Private}' in labels) > 0 or position('{Hidden}' in labels) > 0) then 'Both'
when
position('{Private}' in labels) > 0 or position('{Hidden}' in labels) > 0 then 'Business'
when
position('{Public}' in labels) > 0 then 'BtoC'
end as kind
from t)
select kind, count(*) from tt group by kind;
For MariaDB use GROUP_CONCAT() instead of PostgreSQL string_agg().
Note that the CASE expression checks its conditions in order of appearance and returns the value for the first one that is satisfied.
PS: Using PostgreSQL's arrays the conditions would be more elegant.
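Another way to express the same classification, without string concatenation, is conditional aggregation: compute one 0/1 flag per label family for each user and branch on the flags. The sketch below is not taken from either answer; it runs the SQL through Python's sqlite3 module purely so it can be executed as-is, and it should port to PostgreSQL and MariaDB since it uses only CASE, MAX and GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (user_id INTEGER, label TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [(1, "Private"), (1, "Public"), (2, "Private"),
     (3, "Hidden"), (4, "Public"), (5, "Hidden")],
)

# Inner query: classify each user exactly once. Outer query: count users per class.
rows = conn.execute(
    """
    SELECT kind, COUNT(*) AS users
    FROM (
        SELECT user_id,
               CASE
                   WHEN MAX(CASE WHEN label = 'Public' THEN 1 ELSE 0 END) = 1
                    AND MAX(CASE WHEN label IN ('Private', 'Hidden') THEN 1 ELSE 0 END) = 1
                       THEN 'Both'
                   WHEN MAX(CASE WHEN label = 'Public' THEN 1 ELSE 0 END) = 1
                       THEN 'BtoC'
                   ELSE 'Business'
               END AS kind
        FROM my_table
        GROUP BY user_id
    ) AS per_user
    GROUP BY kind
    ORDER BY kind
    """
).fetchall()

counts = dict(rows)
print(counts)
```

Because every user collapses to a single row before counting, a plain COUNT(*) is enough and DISTINCT is no longer needed.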

SQL - Return all rows for ID where one row meets condition A,B,or C

I'm trying to return all rows for a particular IDs where a condition is met in any one of the rows tied to those IDs. Pardon me being a newbie to SQL... Example below:
ID * Line * # *
12 * 1 * A *
12 * 2 * B *
12 * 3 * X *
12 * 4 * Y *
15 * 1 * A *
15 * 2 * B *
15 * 3 * C *
Not sure what the code would be other than my select and condition = (X, Y, or Z) to return:
ID * Line * # *
12 * 1 * A * <-- doesn't include X, Y, or Z but is part of the ID which
12 * 2 * B * <-- has X in another row of that ID
12 * 3 * X *
12 * 4 * Y *
I'm wanting to pull all row records despite not meeting the condition as long as they're part of the ID that has a row that meets the condition.
Thanks for the help!
Edit: including the code I attempted
SELECT ID
,LINE
,#
WHERE ID,
IN (
SELECT ID
WHERE # IN ('X','Y','Z'))
Results:
ID LINE #
12 3 X
12 4 Y
What I need:
ID LINE #
12 1 A
12 2 B
12 3 X
12 4 Y
I almost feel like I need to create a temp table of ID & LINE using my condition of IN('X','Y','Z') and then inner join on ID for all LINE(s) not X,Y,Z. I think that may work, but I haven't learned how to use temp tables yet. I'm a little troubled because I'm using a query, which I've simplified a ton here, where I'm selecting 18 fields that join in 7 other tables. I think this is just complicating my understanding of the subquery, not so much the subquery being affected by that.
Thanks all for the help and answers so far!
You can use a subquery and IN for this.
Select *
From YourTable
where ID in (select ID from YourTable where # in ('X','Y','Z'))
Just a note: there is no 12 * 4 * C * in your data, but I think it's just a typo in your results and it should be 12 * 4 * Y *
Besides the subquery approach you might also try an OLAP-function (Depending on the actual data this might be better or worse, of course)
In Teradata you can apply QUALIFY:
Select *
From YourTable
qualify -- check if any row with the same ID has X/Y/Z
max(case when # in ('X','Y','Z') then 1 else 0 end)
over (partition by ID) = 1
In SQL Server you have to use a Derived Table/CTE:
Select *
From
( Select *,
max(case when # in ('X','Y','Z') then 1 else 0 end)
over (partition by ID) as flag
from YourTable
) as dt
where flag = 1
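Both approaches can be compared side by side on the sample data. The sketch below uses Python's sqlite3 module as a neutral stand-in for Teradata or SQL Server (an assumption, and SQLite 3.25 or later is needed for the window function); the column that the question calls # is renamed to code here to avoid quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER, line INTEGER, code TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [(12, 1, "A"), (12, 2, "B"), (12, 3, "X"), (12, 4, "Y"),
     (15, 1, "A"), (15, 2, "B"), (15, 3, "C")],
)

# Approach 1: subquery collects the qualifying ids, outer query returns all their rows.
in_rows = conn.execute(
    """
    SELECT id, line, code
    FROM your_table
    WHERE id IN (SELECT id FROM your_table WHERE code IN ('X', 'Y', 'Z'))
    ORDER BY id, line
    """
).fetchall()

# Approach 2: flag every row of an id if any row of that id matches, then filter on the flag.
flag_rows = conn.execute(
    """
    SELECT id, line, code
    FROM (
        SELECT id, line, code,
               MAX(CASE WHEN code IN ('X', 'Y', 'Z') THEN 1 ELSE 0 END)
                   OVER (PARTITION BY id) AS flag
        FROM your_table
    ) AS dt
    WHERE flag = 1
    ORDER BY id, line
    """
).fetchall()

print(in_rows)
print(flag_rows)
```

Note that the condition is tested against the value column, not against the ID; the ID only defines which rows belong together.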

Using Random function in the WHERE Clause

In my WHERE Clause I'm using the random function to get a random number between 1 and 5. However the result is always empty without any error.
Here it is:
Select Question._id, question_text, question_type, topic, favorite,
picture_text, picture_src, video_text, video_src, info_title, info_text,
info_picture_src, topic_text
FROM Question
LEFT JOIN Question_Lv ON Question._id = Question_Lv.question_id
LEFT JOIN Info ON Question._id = Info.question_id
LEFT JOIN Info_Lv ON Question._id = Info_Lv.question_id
LEFT JOIN Picture ON Question._id = Picture.question_id
LEFT JOIN Picture_Lv ON Question._id = Picture_Lv.question_id
LEFT JOIN Video ON Question._id = Video.question_id
LEFT JOIN Video_Lv ON Question._id = Video_Lv.question_id
LEFT JOIN Topic ON Question.topic = Topic._id
LEFT JOIN Topic_Lv ON Topic._id = Topic_Lv.topic_id
LEFT JOIN Exam ON Question._id = Exam.question_id
WHERE Exam.exam = (random() * 5+ 1)
What is the random function doing in this case and how to use it correctly?
From Docs
random()
The random() function returns a pseudo-random integer between
-9223372036854775808 and +9223372036854775807.
Hence your random value is not between 0 and 1 as you assumed, so the comparison practically never matches and no rows are returned.
You can get it between 0 and 1 by dividing it by 2×9223372036854775808 and adding 0.5 to it.
random() / 18446744073709551616 + 0.5
So, your where clause becomes:
WHERE Exam.exam = ((random() / 18446744073709551616 + 0.5) * 5 + 1)
which is same as:
WHERE Exam.exam = 5 * random() / 18446744073709551616 + 3.5
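Also, you'll probably need an integer on the right-hand side. Truncating is safer than rounding here, because round() would map the range from 1 up to (but not including) 6 onto the values 1 to 6, with 1 and 6 only half as likely as the others:
WHERE Exam.exam = cast(5 * random() / 18446744073709551616 + 3.5 as integer)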
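A quick way to see how the final integer conversion behaves is to push an evenly spaced stand-in for the 0-to-1 value through the same "times 5 plus 1" scaling. The sketch below is plain Python rather than SQL, and the grid of 1000 points is an assumption made for illustration; it compares rounding with truncation.

```python
# u stands in for a uniform value in [0, 1), sampled on an even grid.
grid = [k / 1000 for k in range(1000)]

# Same scaling as in the WHERE clause: the result lies in [1, 6).
scaled = [5 * u + 1 for u in grid]

rounded = {round(x) for x in scaled}    # nearest integer
truncated = {int(x) for x in scaled}    # drop the fractional part

print(sorted(rounded))
print(sorted(truncated))
```

Rounding reaches 6, which is outside the intended range of 1 to 5, while truncation stays inside it.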
I'll answer this question using Vertica as a database.
Vertica has the function RANDOM(), which returns a random double precision number between 0 and 1, and the function RANDOMINT(<integer>), which returns an integer between 0 and <integer>-1. I'll use RANDOMINT(5) for this example.
As a general suggestion - Isolate your specific problem in your question. The joins in your query are not part of the problem. And use a sample table, like I do in the code below.
As some of the previous answers suggested, RANDOMINT(5) will return a new random integer between 0 and 4 for each of the rows that are read from the exam table.
See here:
WITH exam(id,exam,exam_res) AS (
SELECT 1,1,'exam_res_1'
UNION ALL SELECT 2,2,'exam_res_2'
UNION ALL SELECT 3,3,'exam_res_3'
UNION ALL SELECT 4,4,'exam_res_4'
UNION ALL SELECT 5,5,'exam_res_5'
UNION ALL SELECT 6,1,'exam_res_1'
UNION ALL SELECT 7,2,'exam_res_2'
UNION ALL SELECT 8,3,'exam_res_3'
UNION ALL SELECT 9,4,'exam_res_4'
UNION ALL SELECT 10,5,'exam_res_5'
UNION ALL SELECT 11,1,'exam_res_1'
UNION ALL SELECT 12,2,'exam_res_2'
UNION ALL SELECT 13,3,'exam_res_3'
UNION ALL SELECT 14,4,'exam_res_4'
UNION ALL SELECT 15,5,'exam_res_5'
)
SELECT * FROM exam WHERE exam=RANDOMINT(5)+1
;
id|exam|exam_res
3| 3|exam_res_3
4| 4|exam_res_4
5| 5|exam_res_5
6| 1|exam_res_1
7| 2|exam_res_2
9| 4|exam_res_4
12| 2|exam_res_2
What you need to do is make sure that you call your random number generator only once.
If your database conforms to the SQL:1999 standard and supports the WITH clause (the common table expression, which I also use to generate the sample data), do that in a common table expression too, which I call search_exam:
WITH search_exam(exam) AS (
SELECT RANDOMINT(5)+1
)
, exam(id,exam,exam_res) AS (
SELECT 1,1,'exam_res_1'
UNION ALL SELECT 2,2,'exam_res_2'
UNION ALL SELECT 3,3,'exam_res_3'
UNION ALL SELECT 4,4,'exam_res_4'
UNION ALL SELECT 5,5,'exam_res_5'
UNION ALL SELECT 6,1,'exam_res_1'
UNION ALL SELECT 7,2,'exam_res_2'
UNION ALL SELECT 8,3,'exam_res_3'
UNION ALL SELECT 9,4,'exam_res_4'
UNION ALL SELECT 10,5,'exam_res_5'
UNION ALL SELECT 11,1,'exam_res_1'
UNION ALL SELECT 12,2,'exam_res_2'
UNION ALL SELECT 13,3,'exam_res_3'
UNION ALL SELECT 14,4,'exam_res_4'
UNION ALL SELECT 15,5,'exam_res_5'
)
SELECT id,exam,exam_res FROM exam
WHERE exam=(SELECT exam FROM search_exam)
;
id|exam|exam_res
1| 1|exam_res_1
6| 1|exam_res_1
11| 1|exam_res_1
Alternatively, you can go SELECT id,exam,exam_res FROM exam INNER JOIN search_exam USING(exam) .
Happy playing -
Marco the Sane
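When the query is issued from application code, another way to make sure the random number is drawn only once is to draw it in the host language and bind it as a parameter. This is a sketch of that alternative, not something taken from the answers above; it uses Python's random and sqlite3 modules, and the helper name pick_exam_rows is made up for illustration.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exam (id INTEGER, exam INTEGER, exam_res TEXT)")
# Same 15 sample rows as above: exam cycles through 1..5.
conn.executemany(
    "INSERT INTO exam VALUES (?, ?, ?)",
    [(i, (i - 1) % 5 + 1, f"exam_res_{(i - 1) % 5 + 1}") for i in range(1, 16)],
)


def pick_exam_rows(connection, rng=random):
    """Draw one exam number in 1..5, then filter every row against that same value."""
    wanted = rng.randint(1, 5)  # evaluated once, outside the query
    rows = connection.execute(
        "SELECT id, exam, exam_res FROM exam WHERE exam = ? ORDER BY id",
        (wanted,),
    ).fetchall()
    return wanted, rows


wanted, rows = pick_exam_rows(conn)
print(wanted, rows)
```

Every row is compared against the same bound value, so the result is always the complete set of rows for one exam number.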

Group by aggregate with some arithmetics (merging similar rows)

I need to combine the following rows:
id num_votes avg_vote
1 2 4
1 3 1
1 0 0
To end up with this:
id num_votes avg_votes
1 2+3+0=5 4*2/5 + 3*1/5 = 2.2
I've tried the following, but nested aggregate functions don't work, of course:
select id
, sum(num_votes) as _num_votes
, sum(num_votes/sum(num_votes)*avg_vote) as _avg_vote
from mytable
GROUP BY id, num_votes, avg_vote;
SELECT id
, sum(num_votes) as _num_votes
, round( sum(num_votes * avg_vote)::numeric
/ sum(num_votes)
, 2) AS avg_votes
FROM mytable
GROUP BY id; -- you cannot GROUP BY aggregated columns, just: id
You don't need window functions for this. Aggregate functions do the job.
The calculation:
4*2/5 + 3*1/5 + 0*0/5
Can be rewritten as:
(4*2 + 3*1 + 0*0)/5
And implemented as:
sum(num_votes * avg_vote) / sum(num_votes)
The rest is casting and rounding to preserve fractional digits. (Integer division would truncate.)
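The rewritten formula can be verified on the sample rows. The sketch below runs it through Python's sqlite3 module instead of PostgreSQL (an assumption made only so it is runnable as-is). The NULLIF guard is an addition not present in the answer above: it returns NULL instead of failing when an id has no votes at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, num_votes INTEGER, avg_vote REAL)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, 2, 4), (1, 3, 1), (1, 0, 0)],
)

# Weighted average = sum(weight * value) / sum(weight), grouped by id only.
result = conn.execute(
    """
    SELECT id,
           SUM(num_votes) AS num_votes,
           ROUND(SUM(num_votes * avg_vote) * 1.0 / NULLIF(SUM(num_votes), 0), 2) AS avg_vote
    FROM mytable
    GROUP BY id
    """
).fetchall()

print(result)
```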