make numeric values homonyms fractions in sql table - sql

I have a table like this:
ID  num_A  num_B
1   1      168
2   1      4
2   5      24
2   6      24
3   1      36
So num_A and num_B represent a fraction. That means for ID=1 I have 1/168, for ID=2 I have (1/4)+(5/24)+(6/24) = 17/24, and for ID=3 I have 1/36.
I need to add two columns for the rows that share the same ID: one with the summed numerator (sumA) and one with the common denominator (denom_B). For the example, the result should be:
ID  num_A  num_B  sumA  denom_B
1   1      168    1     168
2   1      4      17    24
2   5      24     17    24
2   6      24     17    24
3   1      36     1     36
My problem is that I don't know how to calculate the common denominator for each group of fractions in Postgres.

PostgreSQL provides the LCM function, which returns the least common multiple (the smallest strictly positive number that is an integral multiple of both inputs), but it takes only two arguments and is not an aggregate, so it cannot be applied directly to all values of a column.
Thus, to get the LCM of rows with the same ID value, you can use a recursive CTE to process the rows one by one, using the LCM function with the LCM calculated in the previous step (in the first step equal to the value of num_B ) and the current value of num_B as arguments. This will produce the LCM value of all previous num_B and the current value for each row.
Finally, take the maximum calculated LCM value for the rows grouped by ID (strictly speaking the last one, but it is the maximum anyway); that is the LCM of all num_B values with the same ID.
The rest is simple - divide, multiply and sum.
Query:
WITH t_rn AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY num_b) AS rn FROM t
),
least_common_multiple AS (
    WITH RECURSIVE least_multiples AS (
        SELECT
            id,
            num_b,
            num_b AS lm,
            rn
        FROM t_rn
        WHERE rn = 1
        UNION ALL
        SELECT
            t_rn.id,
            t_rn.num_b,
            LCM(t_rn.num_b, lm.lm),
            t_rn.rn
        FROM t_rn
        JOIN least_multiples lm ON lm.id = t_rn.id AND t_rn.rn = lm.rn + 1
    )
    SELECT
        id,
        MAX(lm) AS lcm
    FROM least_multiples
    GROUP BY id
)
SELECT
    t.*,
    SUM(t.num_a * lm.lcm / t.num_b) OVER (PARTITION BY t.id) AS suma,
    lm.lcm AS denom_b
FROM t
JOIN least_common_multiple lm ON t.id = lm.id
Output
id  num_a  num_b  suma  denom_b
1   1      168    1     168
2   1      4      17    24
2   5      24     17    24
2   6      24     17    24
3   1      36     1     36
DEMO
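As a cross-check outside the database, the same idea (fold a two-argument LCM over the denominators of each ID, then rescale and sum the numerators) can be sketched in Python. This is only an illustration of the logic, not the answer's query; the function names `lcm` and `add_common_denominator` are made up for this sketch, and the rows are the sample data from the question.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def add_common_denominator(rows):
    """rows: (id, num_a, num_b) tuples. Returns rows extended with (suma, denom_b)."""
    by_id = defaultdict(list)
    for id_, num_a, num_b in rows:
        by_id[id_].append((num_a, num_b))

    result = []
    for id_, num_a, num_b in rows:
        fractions = by_id[id_]
        # Fold the two-argument lcm over all denominators of this id,
        # which is what the recursive CTE does row by row.
        denom = reduce(lcm, (b for _, b in fractions))
        # Rescale every numerator to the common denominator and sum.
        suma = sum(a * denom // b for a, b in fractions)
        result.append((id_, num_a, num_b, suma, denom))
    return result


rows = [(1, 1, 168), (2, 1, 4), (2, 5, 24), (2, 6, 24), (3, 1, 36)]
result = add_common_denominator(rows)
for row in result:
    print(row)
```

Like the query, this sketch does not reduce the resulting fraction; it only brings the fractions of one ID to their least common denominator.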

I think you are trying to simulate fraction addition.
Try the following query:
with find_mutiplication As
(
Select Id, num_a, num_b,
ROUND(EXP(SUM(LN(ABS(num_b))) over (partition by id))) as mutiplication,
ROUND(EXP(SUM(LN(ABS(num_b))) over (partition by id))) / num_b * num_a as unified
From mytable
)
,
calc as
(
Select *,
mutiplication/ GCD(mutiplication::int, SUM(unified::int)over (partition by id)) denom_B,
num_a * (mutiplication/ GCD(mutiplication::int, SUM(unified::int)over (partition by id)) / num_b) as dv
From find_mutiplication
)
Select id, num_a, num_b,
SUM(dv) Over (Partition By id) As sumA,
denom_b
From calc
Order By id
See demo from db<>fiddle.
To understand how the query works, note that ROUND(EXP(SUM(LN(ABS(num_b))) over (partition by id))) computes the product of the denominators (num_b) for each id, since the sum of logarithms is the logarithm of the product. (According to this post)
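The arithmetic behind this answer can be sketched in Python: build the product of the denominators via exp(sum(ln)), rescale the numerators to that product, then divide both by their GCD. The function name `reduce_sum` is hypothetical, and unlike the query (which rescales each row to the reduced denominator) this sketch divides the summed numerator by the GCD directly.

```python
from math import exp, gcd, log


def reduce_sum(fractions):
    """fractions: list of (numerator, denominator) pairs with non-zero denominators.

    Returns (numerator, denominator) of their sum in lowest terms.
    """
    # Product of denominators via exp(sum(ln)), mirroring the SQL expression.
    # Floating point makes this exact only while the product stays small.
    product = round(exp(sum(log(abs(b)) for _, b in fractions)))
    # Numerators rescaled to the common (non-minimal) denominator.
    unified = sum(product // b * a for a, b in fractions)
    g = gcd(product, unified)
    return unified // g, product // g


id_2 = reduce_sum([(1, 4), (5, 24), (6, 24)])
print(id_2)
```

Because the product is computed in floating point, this approach (in SQL as well as here) should be treated with care for long lists or large denominators.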

Related

Is it possible to use an aggregate function over partition by as a case condition in SQL?

Problem statement is to calculate median from a table that has two columns. One specifying a number and the other column specifying the frequency of the number.
For e.g.
Table "Numbers":
Num  Freq
1    3
2    3
This median needs to be found for the flattened array with values:
1,1,1,2,2,2
Query:
with ct1 as
(select num,frequency, sum(frequency) over(order by num) as sf from numbers o)
select case when count(num) over(order by num) = 1 then num
when count(num) over (order by num) > 1 then sum(num)/2 end median
from ct1 b where sf <= (select max(sf)/2 from ct1) or (sf-frequency) <= (select max(sf)/2 from ct1)
Is it not possible to use count(num) over(order by num) as the condition in the case statement?
Find the relevant row (or two rows) based on the accumulated frequencies, and take the average of num.
The example and Fiddle also show the computations leading to the result.
If you already know that num is unique, rowid can be removed from the ORDER BY clauses.
with
t1 as
(
select t.*
,nvl(sum(freq) over (order by num,rowid rows between unbounded preceding and 1 preceding),0) as freq_acc_sum_1
,sum(freq) over (order by num, rowid) as freq_acc_sum_2
,sum(freq) over () as freq_sum
from t
)
select t1.*
,case
when freq_sum/2 between freq_acc_sum_1 and freq_acc_sum_2
then 'V'
end as relevant_record
from t1
order by num, rowid
Fiddle
Example:
ID  NUM  FREQ  FREQ_ACC_SUM_1  FREQ_ACC_SUM_2  FREQ_SUM  RELEVANT_RECORD
7   8    1     0               1               18
5   10   1     1               2               18
1   29   3     2               5               18
6   31   1     5               6               18
3   33   2     6               8               18
4   41   1     8               9               18        V
9   49   2     9               11              18        V
2   52   1     11              12              18
8   56   3     12              15              18
10  92   3     15              18              18
MEDIAN
45
Fiddle for 1M records
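The "relevant record" test can be traced in Python as a rough sketch: a row is relevant when half of the total frequency lies between the accumulated frequency before the row and the accumulated frequency including it. The function name `weighted_median` is made up here, and the sketch assumes every frequency is positive.

```python
def weighted_median(pairs):
    """pairs: (num, freq) tuples with freq > 0. Returns the median of the expanded list."""
    pairs = sorted(pairs)
    half = sum(freq for _, freq in pairs) / 2

    acc = 0
    relevant = []
    for num, freq in pairs:
        before, after = acc, acc + freq  # FREQ_ACC_SUM_1 and FREQ_ACC_SUM_2
        if before <= half <= after:      # "freq_sum/2 between ... and ..."
            relevant.append(num)
        acc = after
    # One relevant row for an odd total, possibly two for an even total.
    return sum(relevant) / len(relevant)


example = [(8, 1), (10, 1), (29, 3), (31, 1), (33, 2),
           (41, 1), (49, 2), (52, 1), (56, 3), (92, 3)]
median = weighted_median(example)
print(median)
```

For the example table the relevant rows are NUM 41 and 49, which average to the median shown above.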
You can find the one (or two) middle value(s) and then average:
SELECT AVG(num) AS median
FROM (
SELECT num,
freq,
SUM(freq) OVER (ORDER BY num) AS cum_freq,
(SUM(freq) OVER () + 1)/2 AS median_freq
FROM table_name
)
WHERE cum_freq - freq < median_freq
AND median_freq < cum_freq + 1
Or, expand the values using a LATERAL join to a hierarchical query and then use the MEDIAN function:
SELECT MEDIAN(num) AS median
FROM table_name t
CROSS JOIN LATERAL (
SELECT LEVEL
FROM DUAL
WHERE freq > 0
CONNECT BY LEVEL <= freq
)
Which, for the sample data:
CREATE TABLE table_name (Num, Freq) AS
SELECT 1, 3 FROM DUAL UNION ALL
SELECT 2, 3 FROM DUAL;
Outputs:
MEDIAN
1.5
(Note: For your sample data, there are 6 items, an even number, so the MEDIAN will be halfway between the values of the 3rd and 4th items; that is, halfway between 1 and 2 = 1.5.)
db<>fiddle here
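Both variants of this answer can be compared in a short Python sketch: one expands each value freq times and takes the median of the expanded list, the other keeps only the rows that satisfy the cumulative-frequency condition and averages them. The function names are hypothetical, and true (non-integer) division is assumed for median_freq.

```python
from statistics import median


def median_by_expansion(pairs):
    """Repeat each num freq times, like the LATERAL / CONNECT BY expansion."""
    expanded = [num for num, freq in pairs for _ in range(freq)]
    return median(expanded)


def median_by_cumulative_freq(pairs):
    """Average the middle row(s) selected by the cum_freq condition."""
    median_freq = (sum(freq for _, freq in pairs) + 1) / 2
    cum_freq = 0
    middle = []
    for num, freq in sorted(pairs):
        cum_freq += freq
        if cum_freq - freq < median_freq < cum_freq + 1:
            middle.append(num)
    return sum(middle) / len(middle)


sample = [(1, 3), (2, 3)]
print(median_by_expansion(sample), median_by_cumulative_freq(sample))
```

The expansion is simpler to reason about but grows with the total frequency, while the cumulative version only touches one entry per distinct value.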

(SQL) Per ID, starting from the first row, return all successive rows with a value N greater than the prior returned row

I have the following example dataset:
ID  Value  Row index (for reference purposes only, does not need to exist in final output)
a   4      1
a   7      2
a   12     3
a   12     4
a   13     5
b   1      6
b   2      7
b   3      8
b   4      9
b   5      10
I would like to write a SQL script that, starting from the first row per ID and ordering ascending by [Value], returns each next row whose Value is at least N greater than that of the previously returned row. For N = 3, the final table should look like the following:
ID  Value  Row index
a   4      1
a   7      2
a   12     3
b   1      6
b   4      9
Can this script be written in a vectorised manner? Or must a loop be utilised? Any advice would be greatly appreciated. Thanks!
SQL tables represent unordered sets. There is no definition of "previous" value, unless you have a column that specifies the ordering. With such a column, you can use lag():
select t.*
from (select t.*,
lag(value) over (partition by id order by <ordering column>) as prev_value
from t
) t
where prev_value is null or prev_value <= value - 3;
EDIT:
I think I misunderstood what you want to do. You seem to want to start with the first row for each id. Then get the next row that is 3 or higher in value. Then hold onto that value and get the next that is 3 or higher than that. And so on.
You can do this in SQL using a recursive CTE:
with ts as (
select distinct t.id, t.value, dense_rank() over (partition by id order by value) as seqnum
from t
),
cte as (
select id, value, value as grp_value, 1 as within_seqnum, seqnum
from ts
where seqnum = 1
union all
select ts.id, ts.value,
(case when ts.value >= cte.grp_value + 3 then ts.value else cte.grp_value end),
(case when ts.value >= cte.grp_value + 3 then 1 else cte.within_seqnum + 1 end),
ts.seqnum
from cte join
ts
on ts.id = cte.id and ts.seqnum = cte.seqnum + 1
)
select *
from cte
where within_seqnum = 1
order by id, value;
Here is a db<>fiddle.
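The reason a recursive CTE is needed is that each decision depends on the last row that was kept, not on the immediately preceding row. A procedural sketch in Python makes that state explicit; the function name `pick_rows` is made up, and the data is the example from the question.

```python
from itertools import groupby


def pick_rows(rows, n):
    """rows: (id, value) tuples. Keep the first row per id, then every row
    whose value is at least n greater than the last kept value."""
    kept = []
    for id_, group in groupby(sorted(rows), key=lambda r: r[0]):
        anchor = None  # value of the most recently kept row for this id
        for _, value in group:
            if anchor is None or value >= anchor + n:
                kept.append((id_, value))
                anchor = value
    return kept


rows = [("a", 4), ("a", 7), ("a", 12), ("a", 12), ("a", 13),
        ("b", 1), ("b", 2), ("b", 3), ("b", 4), ("b", 5)]
picked = pick_rows(rows, 3)
print(picked)
```

The `anchor` variable plays the role of grp_value in the query: it only changes when a row is kept.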

how to set auto increment column value with condition

I have table like this:
value nextValue
1 2
2 3
3 20
20 21
21 22
22 23
23 NULL
value is ordered ascending, and nextValue is the value of the next row.
The requirement is to start a new group whenever nextValue - value > 10, and then count how many values fall in each group.
For example, there should be two groups, (1,2,3) and (20,21,22,23); the first group's count is 3 and the second's is 4.
I'm trying to mark each group with a unique number so that I can group by these marks:
value nextValue mark
1 2 1
2 3 1
3 20 1
20 21 2
21 22 2
22 23 2
23 NULL 2
But I don't know how to produce the mark column; I need a variable that auto-increments whenever nextValue - value > 10.
Can I do this in Hive? Or is there a better solution for this requirement?
If I understand correctly, you can use a cumulative sum. The idea is to set a flag when next_value - value > 10. This identifies the groups. So, this query adds a group number:
select t.*,
sum(case when nextvalue > value + 10 then 1 else 0 end) over (order by value desc) as mark
from t
order by value;
You might not find this solution satisfying, because the numbering is in descending order. So, a bit more arithmetic fixes that:
select t.*,
(sum(case when nextvalue > value + 10 then 1 else 0 end) over () + 1 -
sum(case when nextvalue > value + 10 then 1 else 0 end) over (order by value desc)
) as mark
from t
order by value;
Here is a db<>fiddle.
Calculate previous value, then calculate new_group_flag if value-prev_value >10, then calculate cumulative sum of new_group_flag to get group number (mark). Finally you can calculate group count using analytics function or group-by (in my example analytics count is used to show you the full dataset with all intermediate calculations). See comments in the code.
Demo:
with your_data as (--use your table instead of this
select stack(10, --the number of tuples generated
1 ,
2 ,
3 ,
20 ,
21 ,
22 ,
23 ,
40 ,
41 ,
42
) as value
)
select --4. Calculate group count, etc, etc
value, prev_value, new_group_flag, group_number,
count(*) over(partition by group_number) as group_count
from
(
select --3. Calculate cumulative sum of new group flag to get group number
value, prev_value, new_group_flag,
sum(new_group_flag) over(order by value rows between unbounded preceding and current row)+1 as group_number
from
(
select --2. calculate new_group_flag
value, prev_value, case when value-prev_value >10 then 1 else 0 end as new_group_flag
from
(
select --1 Calculate previous value
value, lag(value) over(order by value) prev_value
from your_data
)s
)s
)s
Result:
value prev_value new_group_flag group_number group_count
1 \N 0 1 3
2 1 0 1 3
3 2 0 1 3
20 3 1 2 4
21 20 0 2 4
22 21 0 2 4
23 22 0 2 4
40 23 1 3 3
41 40 0 3 3
42 41 0 3 3
This works for me
It needs "rows between unbounded preceding and current row" in my case.
select t.*,
sum(case when nextvalue > value + 10 then 1 else 0 end) over (order by value desc rows between unbounded preceding and current row) as mark
from t
order by value;
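All of the answers above rely on the same pattern: flag the rows where a new group starts, then take a running sum of the flags. A minimal Python sketch of that pattern follows, using the ten-value demo data; the names `mark_groups` and `gap` are assumptions for this illustration.

```python
from collections import Counter


def mark_groups(values, gap=10):
    """Return (value, mark) pairs; the mark increases by one whenever a value
    exceeds the previous value by more than gap."""
    marks = []
    mark = 1
    prev = None
    for value in sorted(values):
        if prev is not None and value - prev > gap:
            mark += 1  # running sum of the new-group flag
        marks.append((value, mark))
        prev = value
    return marks


marked = mark_groups([1, 2, 3, 20, 21, 22, 23, 40, 41, 42])
group_counts = Counter(mark for _, mark in marked)
print(marked)
print(dict(group_counts))
```

Here the marks run in ascending order of value, which matches the lag-based answer rather than the descending cumulative sum.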

How to union a new row for every user until the last occurrence in sql?

I want to select a new row and union it with my original table, which should double the rows. I want this addition to occur for every row until the last occurrence, like this:
Original table:
Name record Score
John-1 1 13
John-2 1 12
John-2 2 21
John-2 3 23
John-3 1 24
John-3 2 25
Matt-1 1 10
Matt-1 2 13
This is my query:
SELECT Name, record, Score
FROM Table1
UNION
SELECT Name + '-start', record, 0.1
FROM Table1
and I am getting the following output:
Name record Score
John-1 1 13
John-1-start 1 0.1
John-2 1 12
John-2-start 1 0.1
John-2 2 21
John-2-start 2 0.1
John-2 3 23
John-2-start 3 0.1
John-3 1 24
John-3-start 1 0.1
John-3 2 25
John-3-start 2 0.1
Matt-1 1 10
Matt-1-start 1 0.1
Matt-1 2 13
Matt-1-start 2 0.1
This is the desired output:
Name record Score
John-1 1 13
John-1-start 1 0.1
John-2 1 12
John-2-start 1 0.1
John-2 2 21
John-2-start 2 0.1
John-2 3 23
John-2-start 3 0.1
John-3 1 24
John-3-start 1 0.1
John-3 2 25
Matt-1 1 10
Matt-1-start 1 0.1
Matt-1 2 13
I would do this as:
SELECT v.Name, v.record, v.Score
FROM Table1 t1 CROSS APPLY
(VALUES (name, record, score),
(name + '-start', record, 0.1)
) v(name, record, score)
ORDER BY v.Name;
If you want to NULL out the final row for each name, you can do something like:
SELECT (case when row_number() over (partition by left(name, 4) order by name desc, record desc) > 1 then v.Name end) as Name,
(case when row_number() over (partition by left(name, 4) order by name desc, record desc) > 1 then v.record end) as record,
(case when row_number() over (partition by left(name, 4) order by name desc, record desc) > 1 then v.Score end) as score
FROM Table1 t1 CROSS APPLY
(VALUES (name, record, score),
(name + '-start', record, 0.1)
) v(name, record, score)
ORDER BY v.Name;
But that type of transformation should really be done at the application level.
You could add a row number, then when the row number is equal to the max row number it doesn't pull that record.
Edited for update. This adds a partition by the row number of the name column left of the -, then removes the max row number of each group by joining back to the first table with the same row number partitions.
select name, record, score
from table1
union
select name + '-start', record, 0.1
from
(select
Name,
LEFT(Name+'-', CHARINDEX('-',Name+'-')-1) leftnamet1,
record,
score,
row_number() over(partition by LEFT(Name+'-', CHARINDEX('-',Name+'-')-1) order by name, record) r
from Table1) t1
left join
(select
leftnamet2,
max(r) maxrow
from (select
LEFT(Name+'-', CHARINDEX('-',Name+'-')-1) leftnamet2,
record,
score,
row_number() over(partition by LEFT(Name+'-', CHARINDEX('-',Name+'-')-1) order by name, record) r
From Table1) t3
group by leftnamet2
) t2
on leftnamet1=leftnamet2
where r<>maxrow
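The intended result (a '-start' row after every original row except the last row of each user) can be sketched in Python to check the logic against the desired output. This assumes the user is the part of Name before the first '-' and that the rows are already ordered by Name and record; `add_start_rows` is a hypothetical name.

```python
def add_start_rows(rows):
    """rows: (name, record, score) tuples, ordered by name and record."""
    def user(name):
        return name.split("-", 1)[0]

    # Index of the last row for each user (the row that gets no '-start' copy).
    last_index = {}
    for i, (name, _, _) in enumerate(rows):
        last_index[user(name)] = i

    out = []
    for i, (name, record, score) in enumerate(rows):
        out.append((name, record, score))
        if i != last_index[user(name)]:
            out.append((name + "-start", record, 0.1))
    return out


table1 = [("John-1", 1, 13), ("John-2", 1, 12), ("John-2", 2, 21), ("John-2", 3, 23),
          ("John-3", 1, 24), ("John-3", 2, 25), ("Matt-1", 1, 10), ("Matt-1", 2, 13)]
with_starts = add_start_rows(table1)
for row in with_starts:
    print(row)
```

The `last_index` lookup corresponds to the max(r) per user in the query above.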

SQL Local Minima and Maxima

I have this data:
row_id type value
1 a 1
2 a 2
3 a 3
4 a 5 --note that type a, value 4 is missing
5 a 6
6 a 7
7 b 1
8 b 2
9 b 3
10 b 4
11 b 5 --note that type b is missing no values from 1 to 5
12 c 1
13 c 3 --note that type c, value 2 is missing
I want to find the minimum and maximum values for each consecutive "run" within each type. That is, I want to return
row_id type group_num min_value max_value
1 a 1 1 3
2 a 2 5 7
3 b 1 1 5
4 c 1 1 1
5 c 2 3 3
I am a fairly experienced SQL user, but I've never solved this problem. Obviously I know how to get the overall minimum and maximum for each type, using GROUP, MIN, and MAX, but I'm really at a loss for these local minima and maxima. I haven't found anything on other questions that answers my question.
I'm using PLSQL Developer with Oracle 11g. Thanks!
This is a gaps-and-islands problem. You can use an analytic function trick to find the chains of contiguous values for each type:
select type,
min(value) as min_value,
max(value) as max_value
from (
select type, value,
dense_rank() over (partition by type order by value)
- dense_rank() over (partition by null order by value) as chain
from your_table
)
group by type, chain
order by type, min(value);
The inner query uses the difference between the ranking of the values within the type and within the entire result set to create the 'chain' number. The outer query just uses that for the grouping.
SQL Fiddle including the result of the inner query.
This is one way to achieve the result you require:
with step_1 as (
select w.type,
w.value,
w.value - row_number() over (partition by w.type order by w.row_id) as grp
from window_test w
), step_2 as (
select x.type,
x.value,
dense_rank() over (partition by x.type order by x.grp) as grp
from step_1 x
)
select rank() over (order by y.type, y.grp) as row_id,
y.type,
y.grp as group_num,
min(y.value) as min_val,
max(y.value) as max_val
from step_2 y
group by y.type, y.grp
order by 1;
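The key observation behind the second query is that, within one type, value minus its row position stays constant across a run of consecutive values and changes at every gap. A small Python sketch of that trick follows; it assumes the values are distinct within each type, and `islands` is a made-up name.

```python
from itertools import groupby


def islands(rows):
    """rows: (type, value) tuples. Returns (type, group_num, min_value, max_value)."""
    out = []
    for type_, group in groupby(sorted(rows), key=lambda r: r[0]):
        values = [value for _, value in group]
        # value - position is constant inside a run of consecutive values.
        keyed = [(value - position, value) for position, value in enumerate(values)]
        runs = groupby(keyed, key=lambda kv: kv[0])
        for group_num, (_, run) in enumerate(runs, start=1):
            run_values = [value for _, value in run]
            out.append((type_, group_num, run_values[0], run_values[-1]))
    return out


data = ([("a", v) for v in (1, 2, 3, 5, 6, 7)]
        + [("b", v) for v in (1, 2, 3, 4, 5)]
        + [("c", v) for v in (1, 3)])
found = islands(data)
for row in found:
    print(row)
```

Since the values are sorted within each type, the first and last element of a run are its minimum and maximum.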